aboutsummaryrefslogtreecommitdiff
path: root/drivers/cpuidle
AgeCommit message (Collapse)AuthorFilesLines
2025-12-05Merge tag 'soc-drivers-6.19' of ↵Linus Torvalds1-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "This is the first half of the driver changes: - A treewide interface change to the "syscore" operations for power management, as a preparation for future Tegra specific changes - Reset controller updates with added drivers for LAN969x, eic770 and RZ/G3S SoCs - Protection of system controller registers on Renesas and Google SoCs, to prevent trivially triggering a system crash from e.g. debugfs access - soc_device identification updates on Nvidia, Exynos and Mediatek - debugfs support in the ST STM32 firewall driver - Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI - Cleanups for memory controller support on Nvidia and Renesas" * tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits) memory: tegra186-emc: Fix missing put_bpmp Documentation: reset: Remove reset_controller_add_lookup() reset: fix BIT macro reference reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe reset: th1520: Support reset controllers in more subsystems reset: th1520: Prepare for supporting multiple controllers dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets reset: remove legacy reset lookup code clk: davinci: psc: drop unused reset lookup reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support reset: eswin: Add eic7700 reset driver dt-bindings: reset: eswin: Documentation for eic7700 SoC reset: sparx5: add LAN969x support dt-bindings: reset: microchip: Add LAN969x support soc: rockchip: grf: Add select correct PWM implementation on RK3368 soc/tegra: pmc: Add USB wake events for Tegra234 amba: tegra-ahb: Fix device leak on SMMU enable ...
2025-12-04Merge tag 'devicetree-for-6.19' of ↵Linus Torvalds1-10/+1
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "DT bindings: - Convert lattice,ice40-fpga-mgr, apm,xgene-storm-dma, brcm,sr-thermal, amazon,al-thermal, brcm,ocotp, mt8173-mdp, Actions Owl SPS, Marvell AP80x System Controller, Marvell CP110 System Controller, cznic,moxtet, and apm,xgene-slimpro-mbox to DT schema format - Add i.MX95 fsl,irqsteer, MT8365 Mali Bifrost GPU, Anvo ANV32C81W EEPROM, and Microchip pic64gx PLIC - Add missing LGE, AMD Seattle, and APM X-Gene SoC platform compatibles - Updates to brcm,bcm2836-l1-intc, brcm,bcm2835-hvs, and bcm2711-hdmi bindings to fix warnings on BCM2712 platforms - Drop obsolete db8500-thermal.txt - Treewide clean-up of extra blank lines and inconsistent quoting - Ensure all .dtbo targets are applied to a base .dtb - Speed up dt_binding_check by skipping running validation on empty examples DT core: - Add of_machine_device_match() and of_machine_get_match_data() helpers and convert users treewide - Fix bounds checking of address properties in FDT code. Rework the code to have a single implementation of the bounds checks. - Rework of_irq_init() to ignore any implicit interrupt-parent (i.e. in a parent node) on nodes without an interrupt. This matches the spec description and fixes some RISC-V platforms. - Avoid a spurious message on overlay removal - Skip DT kunit tests on RISCV+ACPI" * tag 'devicetree-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (55 commits) dt-bindings: kbuild: Skip validating empty examples dt-bindings: interrupt-controller: brcm,bcm2836-l1-intc: Drop interrupt-controller requirement dt-bindings: display: Fix brcm,bcm2835-hvs bindings for BCM2712 dt-bindings: display: bcm2711-hdmi: Add interrupt details for BCM2712 of: Skip devicetree kunit tests when RISCV+ACPI doesn't populate root node soc: tegra: Simplify with of_machine_device_match() soc: qcom: ubwc: Simplify with of_machine_get_match_data() powercap: dtpm: Simplify with of_machine_get_match_data() platform: surface: Simplify with of_machine_get_match_data() irqchip/atmel-aic: Simplify with of_machine_get_match_data() firmware: qcom: scm: Simplify with of_machine_device_match() cpuidle: big_little: Simplify with of_machine_device_match() cpufreq: sun50i: Simplify with of_machine_device_match() cpufreq: mediatek: Simplify with of_machine_get_match_data() cpufreq: dt-platdev: Simplify with of_machine_get_match_data() of: Add wrappers to match root node with OF device ID tables dt-bindings: eeprom: at25: Add Anvo ANV32C81W of/reserved_mem: Simplify the logic of __reserved_mem_alloc_size() of/reserved_mem: Simplify the logic of fdt_scan_reserved_mem_reg_nodes() of/reserved_mem: Simplify the logic of __reserved_mem_reserve_reg() ...
2025-12-04Merge tag 'pmdomain-v6.19' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain updates from Ulf Hansson: "pmdomain core: - Allow power-off for out-of-band wakeup-capable devices - Drop the redundant call to dev_pm_domain_detach() for the amba bus - Extend the genpd governor for CPUs to account for IPIs pmdomain providers: - bcm: Add support for BCM2712 - mediatek: Add support for MFlexGraphics power domains - mediatek: Add support for MT8196 power domains - qcom: Add RPMh power domain support for Kaanapali - rockchip: Add support for RV1126B pmdomain consumers: - usb: dwc3: Enable out of band wakeup for i.MX95 - usb: chipidea: Enable out of band wakeup for i.MX95" * tag 'pmdomain-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (26 commits) pmdomain: Extend the genpd governor for CPUs to account for IPIs smp: Introduce a helper function to check for pending IPIs pmdomain: mediatek: convert from clk round_rate() to determine_rate() amba: bus: Drop dev_pm_domain_detach() call pmdomain: bcm: bcm2835-power: Prepare to support BCM2712 pmdomain: mediatek: mtk-mfg: select MAILBOX in Kconfig pmdomain: mediatek: Add support for MFlexGraphics pmdomain: mediatek: Fix build-errors cpuidle: psci: Replace deprecated strcpy in psci_idle_init_cpu pmdomain: rockchip: Add support for RV1126B pmdomain: mediatek: Add support for MT8196 HFRPSYS power domains pmdomain: mediatek: Add support for MT8196 SCPSYS power domains pmdomain: mediatek: Add support for secure HWCCF infra power on pmdomain: mediatek: Add support for Hardware Voter power domains pmdomain: qcom: rpmhpd: Add RPMh power domain support for Kaanapali usb: dwc3: imx8mp: Set out of band wakeup for i.MX95 usb: chipidea: ci_hdrc_imx: Set out of band wakeup for i.MX95 usb: chipidea: core: detach power domain for ci_hdrc platform device pmdomain: core: Allow power-off for out-of-band wakeup-capable devices PM: wakeup: Add out-of-band system wakeup support for devices ...
2025-11-28Merge branches 'pm-qos' and 'pm-tools'Rafael J. Wysocki2-5/+11
Merge PM QoS updates and a cpupower utility update for 6.19-rc1: - Introduce and document a QoS limit on CPU exit latency during wakeup from suspend-to-idle (Ulf Hansson) - Add support for building libcpupower statically (Zuo An) * pm-qos: Documentation: power/cpuidle: Document the CPU system wakeup latency QoS cpuidle: Respect the CPU system wakeup QoS limit for cpuidle sched: idle: Respect the CPU system wakeup QoS limit for s2idle pmdomain: Respect the CPU system wakeup QoS limit for cpuidle pmdomain: Respect the CPU system wakeup QoS limit for s2idle PM: QoS: Introduce a CPU system wakeup QoS limit * pm-tools: tools/power/cpupower: Support building libcpupower statically
2025-11-28Merge branches 'pm-cpuidle' and 'pm-powercap'Rafael J. Wysocki4-91/+91
Merge cpuidle and power capping updates for 6.19-rc1: - Use residency threshold in polling state override decisions in the menu cpuidle governor (Aboorva Devarajan) - Add sanity check for exit latency and target residency in the cpufreq core (Rafael Wysocki) - Use this_cpu_ptr() where possible in the teo governor (Christian Loehle) - Rework the handling of tick wakeups in the teo cpuidle governor to increase the likelihood of stopping the scheduler tick in the cases when tick wakeups can be counted as non-timer ones (Rafael Wysocki) - Fix a reverse condition in the teo cpuidle governor and drop a misguided target residency check from it (Rafael Wysocki) - Clean up muliple minor defects in the teo cpuidle governor (Rafael Wysocki) - Update header inclusion to make it follow the Include What You Use principle (Andy Shevchenko) - Enable MSR-based RAPL PMU support in the intel_rapl power capping driver and arrange for using it on the Panther Lake and Wildcat Lake processors (Kuppuswamy Sathyanarayanan) - Add support for Nova Lake and Wildcat Lake processors to the intel_rapl power capping driver (Kaushlendra Kumar, Srinivas Pandruvada) * pm-cpuidle: cpuidle: Warn instead of bailing out if target residency check fails cpuidle: Update header inclusion cpuidle: governors: teo: Add missing space to the description cpuidle: governors: teo: Simplify intercepts-based state lookup cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() cpuidle: governors: teo: Rework the handling of tick wakeups cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold cpuidle: governors: teo: Use s64 consistently in teo_update() cpuidle: governors: teo: Drop redundant function parameter cpuidle: governors: teo: Drop misguided target residency check cpuidle: teo: Use this_cpu_ptr() where possible cpuidle: Add sanity check for exit latency and target residency cpuidle: menu: Use residency threshold in polling state override decisions * pm-powercap: powercap: intel_rapl: Enable MSR-based RAPL PMU support powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers powercap: intel_rapl: Add support for Nova Lake processors powercap: intel_rapl: Add support for Wildcat Lake platform
2025-11-26cpuidle: big_little: Simplify with of_machine_device_match()Krzysztof Kozlowski1-10/+1
Replace open-coded getting root OF node and matching against it with new of_machine_device_match() helper. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20251112-b4-of-match-matchine-data-v2-5-d46b72003fd6@linaro.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-11-25cpuidle: Warn instead of bailing out if target residency check failsRafael J. Wysocki1-10/+8
It turns out that the change in commit 76934e495cdc ("cpuidle: Add sanity check for exit latency and target residency") goes too far because there are systems in the field on which the check introduced by that commit does not pass. For this reason, change __cpuidle_driver_init() return type back to void and make it print a warning when the check mentioned above does not pass. Fixes: 76934e495cdc ("cpuidle: Add sanity check for exit latency and target residency") Reported-by: Val Packett <val@packett.cool> Closes: https://lore.kernel.org/linux-pm/20251121010756.6687-1-val@packett.cool/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2808566.mvXUDI8C0e@rafael.j.wysocki
2025-11-25cpuidle: Update header inclusionAndy Shevchenko1-0/+4
While cleaning up some headers, I got a build error on this file: drivers/cpuidle/poll_state.c:52:2: error: call to undeclared library function 'snprintf' with type 'int (char *restrict, unsigned long, const char *restrict, ...)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] Update header inclusions to follow IWYU (Include What You Use) principle. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251124205752.1328701-1-andriy.shevchenko@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25cpuidle: Respect the CPU system wakeup QoS limit for cpuidleUlf Hansson1-0/+4
The CPU system wakeup QoS limit must be respected for the regular cpuidle state selection. Therefore, let's extend the common governor helper cpuidle_governor_latency_req(), to take the constraint into account. Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com> Tested-by: Kevin Hilman (TI) <khilman@baylibre.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20251125112650.329269-6-ulf.hansson@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25sched: idle: Respect the CPU system wakeup QoS limit for s2idleUlf Hansson1-5/+7
A CPU system wakeup QoS limit may have been requested by user space. To avoid breaking this constraint when entering a low power state during s2idle, let's start to take into account the QoS limit. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com> Tested-by: Kevin Hilman (TI) <khilman@baylibre.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20251125112650.329269-5-ulf.hansson@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-24cpuidle: governors: teo: Add missing space to the descriptionRafael J. Wysocki1-1/+1
There is a missing space in the governor description comment, so add it. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5059034.31r3eYUQgx@rafael.j.wysocki
2025-11-20cpuidle: governors: teo: Simplify intercepts-based state lookupRafael J. Wysocki1-46/+16
Simplify the loop looking up a candidate idle state in the case when an intercept is likely to occur by adding a search for the state index limit if the tick is stopped before it. First, call tick_nohz_tick_stopped() just once and if it returns true, look for the shallowest state index below the current candidate one with target residency at least equal to the tick period length. Next, simply look for a state that is not shallower than the one found in the previous step and satisfies the intercepts majority condition (if there are no such states, the shallowest state that is not shallower than the one found in the previous step becomes the new candidate). Since teo_state_ok() has no callers any more after the above changes, drop it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> [ rjw: Changelog clarification and code comment edit ] Link: https://patch.msgid.link/2418792.ElGaqSPkdT@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()Rafael J. Wysocki1-1/+1
The condition deciding whether or not to increase cpu_data->tick_intercepts in teo_update() is reverse, so fix it. Fixes: d619b5cc6780 ("cpuidle: teo: Simplify counting events used for tick management") Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: 0796ddf4a7f0: cpuidle: teo: Use this_cpu_ptr() where possible Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: 8f3f01082d7a: cpuidle: governors: teo: Use s64 consistently in teo_update() Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: b54df61c7428: cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold Cc: 6.14+ <stable@vger.kernel.org> 6.14+: 083654ded547: cpuidle: governors: teo: Rework the handling of tick wakeups Cc: 6.14+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/5085160.31r3eYUQgx@rafael.j.wysocki
2025-11-20cpuidle: governors: teo: Rework the handling of tick wakeupsRafael J. Wysocki1-15/+24
If the wakeup pattern is clearly dominated by tick wakeups, count those wakeups as hits on the deepest available idle state to increase the likelihood of stopping the tick, especially on systems where there are only 2 usable idle states and the tick can only be stopped when the deeper state is selected. This change is expected to reduce power on some systems where state 0 is selected relatively often even though they are almost idle. Without it, the governor may end up selecting the shallowest idle state all the time even if the system is almost completely idle due all tick wakeups being counted as hits on that state and preventing the tick from being stopped at all. Fixes: 4b20b07ce72f ("cpuidle: teo: Don't count non-existent intercepts") Reported-by: Reka Norman <rekanorman@chromium.org> Closes: https://lore.kernel.org/linux-pm/CAEmPcwsNMNnNXuxgvHTQ93Mx-q3Oz9U57THQsU_qdcCx1m4w5g@mail.gmail.com/ Tested-by: Reka Norman <rekanorman@chromium.org> Tested-by: Christian Loehle <christian.loehle@arm.com> Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 92ce5c07b7a1: cpuidle: teo: Reorder candidate state index checks Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: ea185406d1ed: cpuidle: teo: Combine candidate state index checks against 0 Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: b9a6af26bd83: cpuidle: teo: Drop local variable prev_intercept_idx Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: e24f8a55de50: cpuidle: teo: Clarify two code comments Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: d619b5cc6780: cpuidle: teo: Simplify counting events used for tick management Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 13ed5c4a6d9c: cpuidle: teo: Skip getting the sleep length if wakeups are very frequent Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: ddcfa7964677: cpuidle: teo: Simplify handling of total events count Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 65e18e654475: cpuidle: teo: Replace time_span_ns with a flag Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 0796ddf4a7f0: cpuidle: teo: Use this_cpu_ptr() where possible Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 8f3f01082d7a: cpuidle: governors: teo: Use s64 consistently in teo_update() Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: b54df61c7428: cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold Cc: 6.11+ <stable@vger.kernel.org> # 6.11+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ rjw: Rebase on commit 0796ddf4a7f0, changelog update ] Link: https://patch.msgid.link/6228387.lOV4Wx5bFT@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-19cpuidle: psci: Replace deprecated strcpy in psci_idle_init_cpuThorsten Blum1-2/+2
strcpy() is deprecated; use strscpy() instead. Link: https://github.com/KSPP/linux/issues/88 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-11-14cpuidle: governors: teo: Decay metrics below DECAY_SHIFT thresholdRafael J. Wysocki1-7/+19
If a given governor metric falls below a certain value (8 for DECAY_SHIFT equal to 3), it will not decay any more due to the simplistic decay implementation. This may in some cases lead to subtle inconsistencies in the governor behavior, so change the decay implementation to take it into account and set the metric at hand to 0 in that case. Suggested-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2819353.mvXUDI8C0e@rafael.j.wysocki
2025-11-14cpuidle: governors: teo: Use s64 consistently in teo_update()Rafael J. Wysocki1-4/+3
Two local variables in teo_update() are defined as u64, but their values are then compared with s64 values, so it is more consistent to use s64 as their data type. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/3026616.e9J7NaK4W3@rafael.j.wysocki
2025-11-14cpuidle: governors: teo: Drop redundant function parameterRafael J. Wysocki1-6/+4
The last no_poll parameter of teo_find_shallower_state() is always false, so drop it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2253109.irdbgypaU6@rafael.j.wysocki
2025-11-14cpuidle: governors: teo: Drop misguided target residency checkRafael J. Wysocki1-5/+2
When the target residency of the current candidate idle state is greater than the expected time till the closest timer (the sleep length), it does not matter whether or not the tick has already been stopped or if it is going to be stopped. The closest timer will trigger anyway at its due time, so if an idle state with target residency above the sleep length is selected, energy will be wasted and there may be excess latency. Of course, if the closest timer were canceled before it could trigger, a deeper idle state would be more suitable, but this is not expected to happen (generally speaking, hrtimers are not expected to be canceled as a rule). Accordingly, the teo_state_ok() check done in that case causes energy to be wasted more often than it allows any energy to be saved (if it allows any energy to be saved at all), so drop it and let the governor use the teo_find_shallower_state() return value as the new candidate idle state index. Fixes: 21d28cd2fa5f ("cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/5955081.DvuYhMxLoT@rafael.j.wysocki
2025-11-14syscore: Pass context data to callbacksThierry Reding1-4/+8
Several drivers can benefit from registering per-instance data along with the syscore operations. To achieve this, move the modifiable fields out of the syscore_ops structure and into a separate struct syscore that can be registered with the framework. Add a void * driver data field for drivers to store contextual data that will be passed to the syscore ops. Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-11-12cpuidle: teo: Use this_cpu_ptr() where possibleChristian Loehle1-3/+3
The cpuidle governor callbacks for update, select and reflect are always running on the actual idle entering/exiting CPU, so use the more optimized this_cpu_ptr() to access the internal teo data. This brings down the latency-critical teo_reflect() from static void teo_reflect(struct cpuidle_device *dev, int state) { ffffffc080ffcff0: hint #0x19 ffffffc080ffcff4: stp x29, x30, [sp, #-48]! struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); ffffffc080ffcff8: adrp x2, ffffffc0848c0000 <gicv5_global_data+0x28> { ffffffc080ffcffc: add x29, sp, #0x0 ffffffc080ffd000: stp x19, x20, [sp, #16] ffffffc080ffd004: orr x20, xzr, x0 struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); ffffffc080ffd008: add x0, x2, #0xc20 { ffffffc080ffd00c: stp x21, x22, [sp, #32] struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); ffffffc080ffd010: adrp x19, ffffffc083eb5000 <cpu_devices+0x78> ffffffc080ffd014: add x19, x19, #0xbb0 ffffffc080ffd018: ldr w3, [x20, #4] dev->last_state_idx = state; to static void teo_reflect(struct cpuidle_device *dev, int state) { ffffffc080ffd034: hint #0x19 ffffffc080ffd038: stp x29, x30, [sp, #-48]! ffffffc080ffd03c: add x29, sp, #0x0 ffffffc080ffd040: stp x19, x20, [sp, #16] ffffffc080ffd044: orr x20, xzr, x0 struct teo_cpu *cpu_data = this_cpu_ptr(&teo_cpus); ffffffc080ffd048: adrp x19, ffffffc083eb5000 <cpu_devices+0x78> { ffffffc080ffd04c: stp x21, x22, [sp, #32] struct teo_cpu *cpu_data = this_cpu_ptr(&teo_cpus); ffffffc080ffd050: add x19, x19, #0xbb0 dev->last_state_idx = state; This saves us: adrp x2, ffffffc0848c0000 <gicv5_global_data+0x28> add x0, x2, #0xc20 ldr w3, [x20, #4] Signed-off-by: Christian Loehle <christian.loehle@arm.com> [ rjw: Subject tweak ] Link: https://patch.msgid.link/20251110120819.714560-1-christian.loehle@arm.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12cpuidle: Add sanity check for exit latency and target residencyRafael J. Wysocki1-2/+14
Make __cpuidle_driver_init() fail if the exit latency of one of the driver's idle states is less than its target residency which would break cpuidle assumptions. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> [ rjw: Changelog fix ] Link: https://patch.msgid.link/12779486.O9o76ZdvQC@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-06Merge tag 'riscv-for-linus-6.18-rc5' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - A fix to disable KASAN checks while walking a non-current task's stackframe (following x86) - A fix for a kvrealloc()-related memory leak in module_frob_arch_sections() - Two replacements of strcpy() with strscpy() - A change to use the RISC-V .insn assembler directive when possible to assemble instructions from hex opcodes - Some low-impact fixes in the ptdump code and kprobes test code * tag 'riscv-for-linus-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: cpuidle: riscv-sbi: Replace deprecated strcpy in sbi_cpuidle_init_cpu riscv: KGDB: Replace deprecated strcpy in kgdb_arch_handle_qxfer_pkt riscv: asm: use .insn for making custom instructions riscv: tests: Make RISCV_KPROBES_KUNIT tristate riscv: tests: Rename kprobes_test_riscv to kprobes_riscv riscv: Fix memory leak in module_frob_arch_sections() riscv: ptdump: use seq_puts() in pt_dump_seq_puts() macro riscv: stacktrace: Disable KASAN checks for non-current tasks
2025-10-27cpuidle: riscv-sbi: Replace deprecated strcpy in sbi_cpuidle_init_cpuThorsten Blum1-2/+3
strcpy() is deprecated; use strscpy() instead. Link: https://github.com/KSPP/linux/issues/88 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20251021135155.1409-2-thorsten.blum@linux.dev Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-27cpuidle: menu: Use residency threshold in polling state override decisionsAboorva Devarajan1-4/+5
On virtualized PowerPC (pseries) systems, where only one polling state (Snooze) and one deep state (CEDE) are available, selecting CEDE when the predicted idle duration is less than the target residency of CEDE state can hurt performance. In such cases, the entry/exit overhead of CEDE outweighs the power savings, leading to unnecessary state transitions and higher latency. Menu governor currently contains a special-case rule that prioritizes the first non-polling state over polling, even when its target residency is much longer than the predicted idle duration. On PowerPC/pseries, where the gap between the polling state (Snooze) and the first non-polling state (CEDE) is large, this behavior causes performance regressions. Refine that special case by adding an extra requirement: the first non-polling state can only be chosen if its target residency is below the defined RESIDENCY_THRESHOLD_NS. If this condition is not satisfied, polling is allowed instead, avoiding suboptimal non-polling state entries. This change is limited to the single special-case rule for the first non-polling state. The general non-polling state selection logic in the menu governor remains unchanged. Performance improvement observed with pgbench on PowerPC (pseries) system: +---------------------------+------------+------------+------------+ | Metric | Baseline | Patched | Change (%) | +---------------------------+------------+------------+------------+ | Transactions/sec (TPS) | 495,210 | 536,982 | +8.45% | | Avg latency (ms) | 0.163 | 0.150 | -7.98% | +---------------------------+------------+------------+------------+ CPUIdle state usage: +--------------+--------------+-------------+ | Metric | Baseline | Patched | +--------------+--------------+-------------+ | Total usage | 12,735,820 | 13,918,442 | | Above usage | 11,401,520 | 1,598,210 | | Below usage | 20,145 | 702,395 | +--------------+--------------+-------------+ Above/Total and Below/Total usage percentages: +------------------------+-----------+---------+ | Metric | Baseline | Patched | +------------------------+-----------+---------+ | Above % (Above/Total) | 89.56% | 11.49% | | Below % (Below/Total) | 0.16% | 5.05% | | Total cpuidle miss (%) | 89.72% | 16.54% | +------------------------+-----------+---------+ The results indicate that restricting CEDE selection to cases where its residency matches the predicted idle time reduces mispredictions, lowers unnecessary state transitions, and improves overall throughput. Reviewed-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com> [ rjw: Changelog edits, rebase ] Link: https://patch.msgid.link/20251006013954.17972-1-aboorvad@linux.ibm.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27cpuidle: governors: menu: Select polling state in some more casesRafael J. Wysocki1-2/+5
A throughput regression of 11% introduced by commit 779b1a1cb13a ("cpuidle: governors: menu: Avoid selecting states with too much latency") has been reported and it is related to the case when the menu governor checks if selecting a proper idle state instead of a polling one makes sense. In particular, it is questionable to do so if the exit latency of the idle state in question exceeds the predicted idle duration, so add a check for that, which is sufficient to make the reported regression go away, and update the related code comment accordingly. Fixes: 779b1a1cb13a ("cpuidle: governors: menu: Avoid selecting states with too much latency") Closes: https://lore.kernel.org/linux-pm/004501dc43c9$ec8aa930$c59ffb90$@telus.net/ Reported-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/12786727.O9o76ZdvQC@rafael.j.wysocki
2025-10-20Revert "cpuidle: menu: Avoid discarding useful information"Rafael J. Wysocki1-12/+9
It is reported that commit 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") led to a performance regression on Intel Jasper Lake systems because it reduced the time spent by CPUs in idle state C7 which is correlated to the maximum frequency the CPUs can get to because of an average running power limit [1]. Before that commit, get_typical_interval() would have returned UINT_MAX whenever it had been unable to make a high-confidence prediction which had led to selecting the deepest available idle state too often and both power and performance had been inadequate as a result of that on some systems. However, this had not been a problem on systems with relatively aggressive average running power limits, like the Jasper Lake systems in question, because on those systems it was compensated by the ability to run CPUs faster. It was addressed by causing get_typical_interval() to return a number based on the recent idle duration information available to it even if it could not make a high-confidence prediction, but that clearly did not take the possible correlation between idle power and available CPU capacity into account. For this reason, revert most of the changes made by commit 85975daeaa4d, except for one cosmetic cleanup, and add a comment explaining the rationale for returning UINT_MAX from get_typical_interval() when it is unable to make a high-confidence prediction. Fixes: 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") Closes: https://lore.kernel.org/linux-pm/36iykr223vmcfsoysexug6s274nq2oimcu55ybn6ww4il3g3cv@cohflgdbpnq7/ [1] Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3663603.iIbC2pHGDl@rafael.j.wysocki
2025-09-20cpuidle: Fail cpuidle device registration if there is one alreadyRafael J. Wysocki1-1/+7
Refuse to register a cpuidle device if the given CPU has a cpuidle device already and print a message regarding it. Without this, an attempt to register a new cpuidle device without unregistering the existing one leads to the removal of the existing cpuidle device without removing its sysfs interface. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-20cpuidle: sysfs: Use sysfs_emit()/sysfs_emit_at() instead of ↵Vivek Yadav1-17/+17
sprintf()/scnprintf() The ->show() callbacks in sysfs should use sysfs_emit() or sysfs_emit_at() when formatting values for user space. These helpers are the recommended way to ensure correct buffer handling and consistency across the kernel. See Documentation/filesystems/sysfs.rst for details. No functional change intended. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com> Link: https://patch.msgid.link/20250919165657.233349-1-vivekyadav1207731111@gmail.com [ rjw: Minor subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-10cpuidle: qcom-spm: drop unnecessary initialisationsJohan Hovold1-2/+2
Drop the unnecessary initialisations of the platform device and driver data pointers which are assigned on first use when registering the cpuidle device during probe. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-10cpuidle: qcom-spm: fix device and OF node leaks at probeJohan Hovold1-2/+5
Make sure to drop the reference to the saw device taken by of_find_device_by_node() after retrieving its driver data during probe(). Also drop the reference to the CPU node sooner to avoid leaking it in case there is no saw node or device. Fixes: 60f3692b5f0b ("cpuidle: qcom_spm: Detach state machine from main SPM handling") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-21cpuidle: governors: menu: Rearrange main loop in menu_select()Rafael J. Wysocki1-34/+36
Reduce the indentation level in the main loop of menu_select() by rearranging some checks and assignments in it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2389215.ElGaqSPkdT@rafael.j.wysocki
2025-08-18cpuidle: governors: menu: Avoid selecting states with too much latencyRafael J. Wysocki1-17/+12
Occasionally, the exit latency of the idle state selected by the menu governor may exceed the PM QoS CPU wakeup latency limit. Namely, if the scheduler tick has been stopped already and predicted_ns is greater than the tick period length, the governor may return an idle state whose exit latency exceeds latency_req because that decision is made before checking the current idle state's exit latency. For instance, say that there are 3 idle states, 0, 1, and 2. For idle states 0 and 1, the exit latency is equal to the target residency and the values are 0 and 5 us, respectively. State 2 is deeper and has the exit latency and target residency of 200 us and 2 ms (which is greater than the tick period length), respectively. Say that predicted_ns is equal to TICK_NSEC and the PM QoS latency limit is 20 us. After the first two iterations of the main loop in menu_select(), idx becomes 1 and in the third iteration of it the target residency of the current state (state 2) is greater than predicted_ns. State 2 is not a polling one and predicted_ns is not less than TICK_NSEC, so the check on whether or not the tick has been stopped is done. Say that the tick has been stopped already and there are no imminent timers (that is, delta_tick is greater than the target residency of state 2). In that case, idx becomes 2 and it is returned immediately, but the exit latency of state 2 exceeds the latency limit. Address this issue by modifying the code to compare the exit latency of the current idle state (idle state i) with the latency limit before comparing its target residency with predicted_ns, which allows one more exit_latency_ns check that becomes redundant to be dropped. However, after the above change, latency_req cannot take the predicted_ns value any more, which takes place after commit 38f83090f515 ("cpuidle: menu: Remove iowait influence"), because it may cause a polling state to be returned prematurely. In the context of the previous example say that predicted_ns is 3000 and the PM QoS latency limit is still 20 us. Additionally, say that idle state 0 is a polling one. Moving the exit_latency_ns check before the target_residency_ns one causes the loop to terminate in the second iteration, before the target_residency_ns check, so idle state 0 will be returned even though previously state 1 would be returned if there were no imminent timers. For this reason, remove the assignment of the predicted_ns value to latency_req from the code. Fixes: 5ef499cd571c ("cpuidle: menu: Handle stopped tick more aggressively") Cc: 4.17+ <stable@vger.kernel.org> # 4.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/5043159.31r3eYUQgx@rafael.j.wysocki
2025-08-11cpuidle: governors: menu: Avoid using invalid recent intervals dataRafael J. Wysocki1-4/+17
Marc has reported that commit 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") caused the number of wakeup interrupts to increase on an idle system [1], which was not expected to happen after merely allowing shallower idle states to be selected by the governor in some cases. However, on the system in question, all of the idle states deeper than WFI are rejected by the driver due to a firmware issue [2]. This causes the governor to only consider the recent interval duriation data corresponding to attempts to enter WFI that are successful and the recent invervals table is filled with values lower than the scheduler tick period. Consequently, the governor predicts an idle duration below the scheduler tick period length and avoids stopping the tick more often which leads to the observed symptom. Address it by modifying the governor to update the recent intervals table also when entering the previously selected idle state fails, so it knows that the short idle intervals might have been the minority had the selected idle states been actually entered every time. Fixes: 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") Link: https://lore.kernel.org/linux-pm/86o6sv6n94.wl-maz@kernel.org/ [1] Link: https://lore.kernel.org/linux-pm/7ffcb716-9a1b-48c2-aaa4-469d0df7c792@arm.com/ [2] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2793874.mvXUDI8C0e@rafael.j.wysocki
2025-07-29Merge tag 'pmdomain-v6.17' of ↵Linus Torvalds2-28/+0
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain updates from Ulf Hansson: "pmdomain core: - Leave powered-on genpds on until ->sync_state() or late_initcall_sync - Export a common ->sync_state() helper for genpd providers - Add generic ->sync_state() support - Add a bus/driver for genpd provider-devices - Introduce dev_pm_genpd_is_on() for consumers pmdomain providers: - cpuidle-psci: Drop redundant ->sync_state() support - cpuidle-riscv-sbi: Drop redundant ->sync_state() support - imx: Set ISI panic write for imx8m-blk-ctrl - qcom: Add support for Glymur and Milos RPMh power-domains - qcom: Use of_genpd_sync_state() for power-domains - rockchip: Add support for the RK3528 variant - samsung: Fix splash-screen handover by enforcing a ->sync_state() - sunxi: Add support for Allwinner A523's PCK600 power-controller - tegra: Opt-out from genpd's common ->sync_state() support for pmc - thead: Instantiate a GPU power sequencer via the auxiliary bus - renesas: Move init to postcore_initcalls - xilinx: Move ->sync_state() support to firmware driver - xilinx: Use of_genpd_sync_state() for power-domains pmdomain consumers: - remoteproc: imx_rproc: Fixup the detect/attach procedure for pre-booted cores" * tag 'pmdomain-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (44 commits) pmdomain: qcom: rpmhpd: Add Glymur RPMh Power Domains dt-bindings: power: rpmpd: Add Glymur power domains remoteproc: imx_rproc: detect and attach to pre-booted remote cores remoteproc: imx_rproc: skip clock enable when M-core is managed by the SCU pmdomain: core: introduce dev_pm_genpd_is_on() pmdomain: ti: Select PM_GENERIC_DOMAINS pmdomain: sunxi: sun20i-ppu: change to tristate and enable for ARCH_SUNXI pmdomain: sunxi: add driver for Allwinner A523's PCK-600 power controller pmdomain: sunxi: sun20i-ppu: add A523 support pmdomain: samsung: Fix splash-screen handover by enforcing a sync_state cpuidle: riscv-sbi: Drop redundant sync_state support cpuidle: psci: Drop redundant sync_state support pmdomain: core: Leave powered-on genpds on until sync_state pmdomain: core: Leave powered-on genpds on until late_initcall_sync pmdomain: core: Default to use of_genpd_sync_state() for genpd providers driver core: Add dev_set_drv_sync_state() pmdomain: core: Add common ->sync_state() support for genpd providers driver core: Export get_dev_from_fwnode() firmware: xilinx: Use of_genpd_sync_state() firmware: xilinx: Don't share zynqmp_pm_init_finalize() ...
2025-07-28Merge tag 'pm-6.17-rc1' of ↵Linus Torvalds1-9/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "As is tradition, cpufreq is the part with the largest number of updates that include core fixes and cleanups as well as updates of several assorted drivers, but there are also quite a few updates related to system sleep, mostly focused on asynchronous suspend and resume of devices and on making the integration of system suspend and resume with runtime PM easier. Runtime PM is also updated to allow some code duplication in drivers to be eliminated going forward and to work more consistently overall in some cases. Apart from that, there are some driver core updates related to PM domains that should help to address ordering issues with devm_ cleanup routines relying on PM domains, some assorted devfreq updates including core fixes and cleanups, tooling updates, and documentation and MAINTAINERS updates. Specifics: - Fix two initialization ordering issues in the cpufreq core and a governor initialization error path in it, and clean it up (Lifeng Zheng) - Add Granite Rapids support in no-HWP mode to the intel_pstate cpufreq driver (Li RongQing) - Make intel_pstate always use HWP_DESIRED_PERF when operating in the passive mode (Rafael Wysocki) - Allow building the tegra124 cpufreq driver as a module (Aaron Kling) - Do minor cleanups for Rust cpufreq and cpumask APIs and fix MAINTAINERS entry for cpu.rs (Abhinav Ananthu, Ritvik Gupta, Lukas Bulwahn) - Clean up assorted cpufreq drivers (Arnd Bergmann, Dan Carpenter, Krzysztof Kozlowski, Sven Peter, Svyatoslav Ryhel, Lifeng Zheng) - Add the NEED_UPDATE_LIMITS flag to the CPPC cpufreq driver (Prashant Malani) - Fix minimum performance state label error in the amd-pstate driver documentation (Shouye Liu) - Add the CPUFREQ_GOV_STRICT_TARGET flag to the userspace cpufreq governor and explain HW coordination influence on it in the documentation (Shashank Balaji) - Fix opencoded for_each_cpu() in idle_state_valid() in the DT cpuidle driver (Yury Norov) - Remove info about non-existing QoS interfaces from the PM QoS documentation (Ulf Hansson) - Use c_* types via kernel prelude in Rust for OPP (Abhinav Ananthu) - Add HiSilicon uncore frequency scaling driver to devfreq (Jie Zhan) - Allow devfreq drivers to add custom sysfs ABIs (Jie Zhan) - Simplify the sun8i-a33-mbus devfreq driver by using more devm functions (Uwe Kleine-König) - Fix an index typo in trans_stat() in devfreq (Chanwoo Choi) - Check devfreq governor before using governor->name (Lifeng Zheng) - Remove a redundant devfreq_get_freq_range() call from devfreq_add_device() (Lifeng Zheng) - Limit max_freq with scaling_min_freq in devfreq (Lifeng Zheng) - Replace sscanf() with kstrtoul() in set_freq_store() (Lifeng Zheng) - Extend the asynchronous suspend and resume of devices to handle suppliers like parents and consumers like children (Rafael Wysocki) - Make pm_runtime_force_resume() work for drivers that set the DPM_FLAG_SMART_SUSPEND flag and allow PCI drivers and drivers that collaborate with the general ACPI PM domain to set it (Rafael Wysocki) - Add kernel parameter to disable asynchronous suspend/resume of devices (Tudor Ambarus) - Drop redundant might_sleep() calls from some functions in the device suspend/resume core code (Zhongqiu Han) - Fix the handling of monitors connected right before waking up the system from sleep (tuhaowen) - Clean up MAINTAINERS entries for suspend and hibernation (Rafael Wysocki) - Fix error code path in the KEXEC_JUMP flow and drop a redundant pm_restore_gfp_mask() call from it (Rafael Wysocki) - Rearrange suspend/resume error handling in the core device suspend and resume code (Rafael Wysocki) - Fix up white space that does not follow coding style in the hibernation core code (Darshan Rathod) - Document return values of suspend-related API functions in the runtime PM framework (Sakari Ailus) - Mark last busy stamp in multiple autosuspend-related functions