aboutsummaryrefslogtreecommitdiff
path: root/include/linux/suspend.h
AgeCommit message (Collapse)AuthorFilesLines
2024-02-05PM: sleep: stats: Define suspend_stats next to the code using itRafael J. Wysocki1-56/+15
It is not necessary to define struct suspend_stats in a header file and the suspend_stats variable in the core device system-wide PM code. They both can be defined in kernel/power/main.c, next to the sysfs and debugfs code accessing suspend_stats, which can be static. Modify the code in question in accordance with the above observation and replace the static inline functions manipulating suspend_stats with regular ones defined in kernel/power/main.c. While at it, move the enum suspend_stat_step to the end of suspend.h which is a more suitable place for it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-02-05PM: sleep: stats: Use unsigned int for success and failure countersRafael J. Wysocki1-2/+2
Change the type of the "success" and "fail" fields in struct suspend_stats to unsigned int, because they cannot be negative. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-02-05PM: sleep: stats: Use an array of step failure countersRafael J. Wysocki1-8/+4
Instead of using a set of individual struct suspend_stats fields representing suspend step failure counters, use an array of counters indexed by enum suspend_stat_step for this purpose, which allows dpm_save_failed_step() to increment the appropriate counter automatically, so that its callers don't need to do that directly. It also allows suspend_stats_show() to carry out a loop over the counters array to print their values. Because the counters cannot become negative, use unsigned int for representing them. The only user-observable impact of this change is a different ordering of entries in the suspend_stats debugfs file which is not expected to matter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-02-05PM: sleep: stats: Use array of suspend step namesRafael J. Wysocki1-1/+2
Replace suspend_step_name() in the suspend statistics code with an array of suspend step names which has fewer lines of code and less overhead. While at it, remove two unnecessary line breaks in suspend_stats_show() and adjust some white space in there to the kernel coding style for a more consistent code layout. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-28Merge tag 'mm-stable-2023-06-24-19-15' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull mm updates from Andrew Morton: - Yosry Ahmed brought back some cgroup v1 stats in OOM logs - Yosry has also eliminated cgroup's atomic rstat flushing - Nhat Pham adds the new cachestat() syscall. It provides userspace with the ability to query pagecache status - a similar concept to mincore() but more powerful and with improved usability - Mel Gorman provides more optimizations for compaction, reducing the prevalence of page rescanning - Lorenzo Stoakes has done some maintanance work on the get_user_pages() interface - Liam Howlett continues with cleanups and maintenance work to the maple tree code. Peng Zhang also does some work on maple tree - Johannes Weiner has done some cleanup work on the compaction code - David Hildenbrand has contributed additional selftests for get_user_pages() - Thomas Gleixner has contributed some maintenance and optimization work for the vmalloc code - Baolin Wang has provided some compaction cleanups, - SeongJae Park continues maintenance work on the DAMON code - Huang Ying has done some maintenance on the swap code's usage of device refcounting - Christoph Hellwig has some cleanups for the filemap/directio code - Ryan Roberts provides two patch series which yield some rationalization of the kernel's access to pte entries - use the provided APIs rather than open-coding accesses - Lorenzo Stoakes has some fixes to the interaction between pagecache and directio access to file mappings - John Hubbard has a series of fixes to the MM selftesting code - ZhangPeng continues the folio conversion campaign - Hugh Dickins has been working on the pagetable handling code, mainly with a view to reducing the load on the mmap_lock - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from 128 to 8 - Domenico Cerasuolo has improved the zswap reclaim mechanism by reorganizing the LRU management - Matthew Wilcox provides some fixups to make gfs2 work better with the buffer_head code - Vishal Moola also has done some folio conversion work - Matthew Wilcox has removed the remnants of the pagevec code - their functionality is migrated over to struct folio_batch * tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits) mm/hugetlb: remove hugetlb_set_page_subpool() mm: nommu: correct the range of mmap_sem_read_lock in task_mem() hugetlb: revert use of page_cache_next_miss() Revert "page cache: fix page_cache_next/prev_miss off by one" mm/vmscan: fix root proactive reclaim unthrottling unbalanced node mm: memcg: rename and document global_reclaim() mm: kill [add|del]_page_to_lru_list() mm: compaction: convert to use a folio in isolate_migratepages_block() mm: zswap: fix double invalidate with exclusive loads mm: remove unnecessary pagevec includes mm: remove references to pagevec mm: rename invalidate_mapping_pagevec to mapping_try_invalidate mm: remove struct pagevec net: convert sunrpc from pagevec to folio_batch i915: convert i915_gpu_error to use a folio_batch pagevec: rename fbatch_count() mm: remove check_move_unevictable_pages() drm: convert drm_gem_put_pages() to use a folio_batch i915: convert shmem_sg_free_table() to use a folio_batch scatterlist: add sg_set_folio() ...
2023-06-26Merge tag 'pm-6.5-rc1' of ↵Linus Torvalds1-4/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add Intel TPMI (Topology Aware Register and PM Capsule Interface) support to the power capping subsystem, extend the intel_idle driver to work in VM guests where MWAIT is not available, extend the system-wide power management diagnostics, fix bugs and clean up code. Specifics: - Introduce power capping core support for Intel TPMI (Topology Aware Register and PM Capsule Interface) and a TPMI interface driver for Intel RAPL (Zhang Rui, Dan Carpenter) - Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping driver (Zhang Rui) - Fix invalid initialization for pl4_supported field in the Intel RAPL power capping driver (Sumeet Pawnikar) - Clean up the intel_idle driver, make it work with VM guests that cannot use the MWAIT instruction and address the case in which the host may enter a deep idle state when the guest is idle (Arjan van de Ven) - Prevent cpufreq drivers that provide the ->adjust_perf() callback without a ->fast_switch() one which is used as a fallback from the former in some cases (Wyes Karny) - Fix some issues related to the AMD P-state cpufreq driver (Mario Limonciello, Wyes Karny) - Fix the energy_performance_preference attribute handling in the intel_pstate driver in passive mode (Tero Kristo) - Fix the handling of pm_suspend_target_state when CONFIG_PM is unset (Kai-Heng Feng) - Correct spelling mistake in a comment in the hibernation code (Wang Honghui) - Add arch_resume_nosmt() prototype to avoid a "missing prototypes" build warning (Arnd Bergmann) - Restrict pm_pr_dbg() to system-wide power transitions and use it in a few additional places (Mario Limonciello) - Drop verification of in-params from genpd_add_device() and ensure that all of its callers will do it (Ulf Hansson) - Prevent possible integer overflows from occurring in genpd_parse_state() (Nikita Zhandarovich) - Reorder fieldls in 'struct devfreq_dev_status' to reduce its size somewhat (Christophe JAILLET) - Ensure that the Exynos PPMU driver is already loaded before the Exynos Bus driver starts probing so as to avoid a possible freeze loading of the kernel modules (Marek Szyprowski) - Fix variable deferencing before NULL check in the mtk-cci devfreq driver (Sukrut Bellary)" * tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (42 commits) intel_idle: Add a "Long HLT" C1 state for the VM guest mode cpufreq: intel_pstate: Fix energy_performance_preference for passive cpufreq: amd-pstate: Add a kernel config option to set default mode cpufreq: amd-pstate: Set a fallback policy based on preferred_profile ACPI: CPPC: Add definition for undefined FADT preferred PM profile value cpufreq: amd-pstate: Set default governor to schedutil PM: domains: Move the verification of in-params from genpd_add_device() cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated cpufreq: amd-pstate: Write CPPC enable bit per-socket intel_idle: Add support for using intel_idle in a VM guest using just hlt cpufreq: Fail driver register if it has adjust_perf without fast_switch intel_idle: clean up the (new) state_update_enter_method function intel_idle: refactor state->enter manipulation into its own function platform/x86/amd: pmc: Use pm_pr_dbg() for suspend related messages pinctrl: amd: Use pm_pr_dbg to show debugging messages ACPI: x86: Add pm_debug_messages for LPS0 _DSM state tracking include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resume powercap: RAPL: Fix a NULL vs IS_ERR() bug powercap: RAPL: Fix CONFIG_IOSF_MBI dependency powercap: RAPL: fix invalid initialization for pl4_supported field ...
2023-06-12include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resumeMario Limonciello1-3/+5
All uses in the kernel are currently already oriented around suspend/resume. As some other parts of the kernel may also use these messages in functions that could also be used outside of suspend/resume, only enable in suspend/resume path. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-09mm: page_alloc: move pm_* function into powerKefeng Wang1-0/+6
pm_restrict_gfp_mask()/pm_restore_gfp_mask() only used in power, let's move them out of page_alloc.c. Adding a general gfp_has_io_fs() function which return true if gfp with both __GFP_IO and __GFP_FS flags, then use it inside of pm_suspended_storage(), also the pm_suspended_storage() is moved into suspend.h. Link: https://lkml.kernel.org/r/20230516063821.121844-11-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09mm: page_alloc: move mark_free_page() into snapshot.cKefeng Wang1-3/+0
The mark_free_page() is only used in kernel/power/snapshot.c, move it out to reduce a bit of page_alloc.c Link: https://lkml.kernel.org/r/20230516063821.121844-10-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-24PM: suspend: add a arch_resume_nosmt() prototypeArnd Bergmann1-0/+2
The arch_resume_nosmt() has a __weak definition, plus an x86 specific override, but no prototype that ensures the two have the same arguments. This causes a W=1 warning: arch/x86/power/hibernate.c:189:5: error: no previous prototype for 'arch_resume_nosmt' [-Werror=missing-prototypes] Add the prototype in linux/suspend.h, which is included in both places. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24PM: suspend: Fix pm_suspend_target_state handling for !CONFIG_PMKai-Heng Feng1-1/+3
Move the pm_suspend_target_state definition for CONFIG_SUSPEND unset from the wakeup code into the headers so as to allow it to still be used elsewhere when CONFIG_SUSPEND is not set. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> [ rjw: Changelog and subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-18x86/hibernate: Declare global functions in suspend.hArnd Bergmann1-0/+4
Three functions that are defined in x86 specific code to override generic __weak implementations cause a warning because of a missing prototype: arch/x86/power/cpu.c:298:5: error: no previous prototype for 'hibernate_resume_nonboot_cpu_disable' [-Werror=missing-prototypes] arch/x86/power/hibernate.c:129:5: error: no previous prototype for 'arch_hibernation_header_restore' [-Werror=missing-prototypes] arch/x86/power/hibernate.c:91:5: error: no previous prototype for 'arch_hibernation_header_save' [-Werror=missing-prototypes] Move the declarations into a global header so it can be included by any file defining one of these. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/all/20230516193549.544673-14-arnd%40kernel.org
2023-04-20PM: Add sysfs files to represent time spent in hardware sleep stateMario Limonciello1-0/+8
Userspace can't easily discover how much of a sleep cycle was spent in a hardware sleep state without using kernel tracing and vendor specific sysfs or debugfs files. To make this information more discoverable, introduce 3 new sysfs files: 1) The time spent in a hw sleep state for last cycle. 2) The time spent in a hw sleep state since the kernel booted 3) The maximum time that the hardware can report for a sleep cycle. All of these files will be present only if the system supports s2idle. Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-10Merge tag 'sched-core-2022-10-07' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Debuggability: - Change most occurances of BUG_ON() to WARN_ON_ONCE() - Reorganize & fix TASK_ state comparisons, turn it into a bitmap - Update/fix misc scheduler debugging facilities Load-balancing & regular scheduling: - Improve the behavior of the scheduler in presence of lot of SCHED_IDLE tasks - in particular they should not impact other scheduling classes. - Optimize task load tracking, cleanups & fixes - Clean up & simplify misc load-balancing code Freezer: - Rewrite the core freezer to behave better wrt thawing and be simpler in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting all the fallout. Deadline scheduler: - Fix the DL capacity-aware code - Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period() - Relax/optimize locking in task_non_contending() Cleanups: - Factor out the update_current_exec_runtime() helper - Various cleanups, simplifications" * tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) sched: Fix more TASK_state comparisons sched: Fix TASK_state comparisons sched/fair: Move call to list_last_entry() in detach_tasks sched/fair: Cleanup loop_max and loop_break sched/fair: Make sure to try to detach at least one movable task sched: Show PF_flag holes freezer,sched: Rewrite core freezer logic sched: Widen TAKS_state literals sched/wait: Add wait_event_state() sched/completion: Add wait_for_completion_state() sched: Add TASK_ANY for wait_task_inactive() sched: Change wait_task_inactive()s match_state freezer,umh: Clean up freezer/initrd interaction freezer: Have {,un}lock_system_sleep() save/restore flags sched: Rename task_running() to task_on_cpu() sched/fair: Cleanup for SIS_PROP sched/fair: Default to false in test_idle_cores() sched/fair: Remove useless check in select_idle_core() sched/fair: Avoid double search on same cpu sched/fair: Remove redundant check in select_idle_smt() ...
2022-10-05Merge tag 'platform-drivers-x86-v6.1-1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver updates from Hans de Goede: - AMD Platform Management Framework (PMF) driver with AMT and QnQF support - AMD PMC: Improved logging for debugging s2idle issues - Big refactor of the ACPI/x86 backlight handling, ensuring that we only register 1 /sys/class/backlight device per LCD panel - Microsoft Surface: - Surface Laptop Go 2 support - Surface Pro 8 HID sensor support - Asus WMI: - Lots of cleanups - Support for TUF RGB keyboard backlight control - Add support for ROG X13 tablet mode - Siemens Simatic: IPC227G and IPC427G support - Toshiba ACPI laptop driver: Fan hwmon and battery ECO mode support - tools/power/x86/intel-speed-select: Various improvements - Various cleanups - Various small bugfixes * tag 'platform-drivers-x86-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (153 commits) platform/x86: use PLATFORM_DEVID_NONE instead of -1 platform/x86/amd: pmc: Dump idle mask during "check" stage instead platform/x86/intel/wmi: thunderbolt: Use dev_groups callback platform/x86/amd: pmc: remove CONFIG_DEBUG_FS checks platform/surface: Split memcpy() of struct ssam_event flexible array platform/x86: compal-laptop: Get rid of a few forward declarations platform/x86: intel-uncore-freq: Use sysfs_emit() to instead of scnprintf() platform/x86: dell-smbios-base: Use sysfs_emit() platform/x86/amd/pmf: Remove unused power_delta instances platform/x86/amd/pmf: install notify handler after acpi init Documentation/ABI/testing/sysfs-amd-pmf: Add ABI doc for AMD PMF platform/x86/amd/pmf: Add sysfs to toggle CnQF platform/x86/amd/pmf: Add support for CnQF platform/x86/amd: pmc: Fix build without debugfs platform/x86: hp-wmi: Support touchpad on/off platform/x86: int3472/discrete: Drop a forward declaration platform/x86: toshiba_acpi: change turn_on_panel_on_resume to static platform/x86: wmi: Drop forward declaration of static functions platform/x86: toshiba_acpi: Remove duplicate include platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading ...
2022-09-09ACPI: s2idle: Add a new ->check() callback for platform_s2idle_opsMario Limonciello1-0/+1
On some platforms it is found that Linux more aggressively enters s2idle than Windows enters Modern Standby and this uncovers some synchronization issues for the platform. To aid in debugging this class of problems in the future, add support for an extra optional callback intended for drivers to emit extra debugging. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20220829162953.5947-2-mario.limonciello@amd.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-09-07freezer: Have {,un}lock_system_sleep() save/restore flagsPeter Zijlstra1-4/+4
Rafael explained that the reason for having both PF_NOFREEZE and PF_FREEZER_SKIP is that {,un}lock_system_sleep() is callable from kthread context that has previously called set_freezable(). In preparation of merging the flags, have {,un}lock_system_slee() save and restore current->flags. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20220822114648.725003428@infradead.org
2022-08-31PM: suspend: move from strlcpy() with unused retval to strscpy()Wolfram Sang1-1/+1
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13PM: sleep: enable dynamic debug support within pm_pr_dbg()David Cohen1-5/+39
Currently pm_pr_dbg() is used to filter kernel pm debug messages based on pm_debug_messages_on flag. The problem is if we enable/disable this flag it will affect all pm_pr_dbg() calls at once, so we can't individually control them. This patch changes pm_pr_dbg() implementation as such: - If pm_debug_messages_on is enabled, print the message. - If pm_debug_messages_on is disabled and CONFIG_DYNAMIC_DEBUG is enabled, only print the messages explicitly enabled on /sys/kernel/debug/dynamic_debug/control. - If pm_debug_messages_on is disabled and CONFIG_DYNAMIC_DEBUG is disabled, don't print the message. Signed-off-by: David Cohen <dacohen@pm.me> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-07PM: s2idle: ACPI: Fix wakeup interrupts handlingRafael J. Wysocki1-2/+2
After commit e3728b50cd9b ("ACPI: PM: s2idle: Avoid possible race related to the EC GPE") wakeup interrupts occurring immediately after the one discarded by acpi_s2idle_wake() may be missed. Moreover, if the SCI triggers again immediately after the rearming in acpi_s2idle_wake(), that wakeup may be missed too. The problem is that pm_system_irq_wakeup() only calls pm_system_wakeup() when pm_wakeup_irq is 0, but that's not the case any more after the interrupt causing acpi_s2idle_wake() to run until pm_wakeup_irq is cleared by the pm_wakeup_clear() call in s2idle_loop(). However, there may be wakeup interrupts occurring in that time frame and if that happens, they will be missed. To address that issue first move the clearing of pm_wakeup_irq to the point at which it is known that the interrupt causing acpi_s2idle_wake() to tun will be discarded, before rearming the SCI for wakeup. Moreover, because that only reduces the size of the time window in which the issue may manifest itself, allow pm_system_irq_wakeup() to register two second wakeup interrupts in a row and, when discarding the first one, replace it with the second one. [Of course, this assumes that only one wakeup interrupt can be discarded in one go, but currently that is the case and I am not aware of any plans to change that.] Fixes: e3728b50cd9b ("ACPI: PM: s2idle: Avoid possible race related to the EC GPE") Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-25PM: hibernate: Remove register_nosave_region_late()Amadeusz Sławiński1-10/+1
It is an unused wrapper forcing kmalloc allocation for registering nosave regions. Also, rename __register_nosave_region() to register_nosave_region() now that there is no need for disambiguation. Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-08PM: hibernate: Allow ACPI hardware signature to be honouredDavid Woodhouse1-0/+1
Theoretically, when the hardware signature in FACS changes, the OS is supposed to gracefully decline to attempt to resume from S4: "If the signature has changed, OSPM will not restore the system context and can boot from scratch" In practice, Windows doesn't do this and many laptop vendors do allow the signature to change especially when docking/undocking, so it would be a bad idea to simply comply with the specification by default in the general case. However, there are use cases where we do want the compliant behaviour and we know it's safe. Specifically, when resuming virtual machines where we know the hypervisor has changed sufficiently that resume will fail. We really want to be able to *tell* the guest kernel not to try, so it boots cleanly and doesn't just crash. This patch provides a way to opt in to the spec-compliant behaviour on the command line. A follow-up patch may do this automatically for certain "known good" machines based on a DMI match, or perhaps just for all hypervisor guests since there's no good reason a hypervisor would change the hardware_signature that it exposes to guests *unless* it wants them to obey the ACPI specification. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-23PM: rewrite is_hibernate_resume_dev to not require an inodeChristoph Hellwig1-2/+2
Just check the dev_t to help simplifying the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-28PM, libnvdimm: Add runtime firmware activation supportDan Williams1-0/+6
Abstract platform specific mechanics for nvdimm firmware activation behind a handful of generic ops. At the bus level ->activate_state() indicates the unified state (idle, busy, armed) of all DIMMs on the bus, and ->capability() indicates the system state expectations for activate. At the DIMM level ->activate_state() indicates the per-DIMM state, ->activate_result() indicates the outcome of the last activation attempt, and ->arm() attempts to transition the DIMM from 'idle' to 'armed'. A new hibernate_quiet_exec() facility is added to support firmware activation in an OS defined system quiesce state. It leverages the fact that the hibernate-freeze state wants to assert that a memory hibernation snapshot can be taken. This is in contrast to a platform firmware defined quiesce state that may forcefully quiet the memory controller independent of whether an individual device-driver properly supports hibernate-freeze. The libnvdimm sysfs interface is extended to support detection of a firmware activate capability. The mechanism supports enumeration and triggering of firmware activate, optionally in the hibernate_quiet_exec() context. [rafael: hibernate_quiet_exec() proposal] [vishal: fix up sparse warning, grammar in Documentation/] Cc: Pavel Machek <pavel@ucw.cz> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Reported-by: kernel test robot <lkp@intel.com> Co-developed-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-05-27PM: hibernate: Restrict writes to the resume deviceDomenico Andreoli1-0/+6
Hibernation via snapshot device requires write permission to the swap block device, the one that more often (but not necessarily) is used to store the hibernation image. With this patch, such permissions are granted iff: 1) snapshot device config option is enabled 2) swap partition is used as resume device In other circumstances the swap device is not writable from userspace. In order to achieve this, every write attempt to a swap device is checked against the device configured as part of the uswsusp API [0] using a pointer to the inode struct in memory. If the swap device being written was not configured for resuming, the write request is denied. NOTE: this implementation works only for swap block devices, where the inode configured by swapon (which sets S_SWAPFILE) is the same used by SNAPSHOT_SET_SWAP_AREA. In case of swap file, SNAPSHOT_SET_SWAP_AREA indeed receives the inode of the block device containing the filesystem where the swap file is located (+ offset in it) which is never passed to swapon and then has not set S_SWAPFILE. As result, the swap file itself (as a file) has never an option to be written from userspace. Instead it remains writable if accessed directly from the containing block device, which is always writeable from root. [0] Documentation/power/userland-swsusp.rst v2: - rename is_hibernate_snapshot_dev() to is_hibernate_resume_dev() - fix description so to correctly refer to the resume device Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-23PM: remove s390 specific callbacksHeiko Carstens1-34/+0
ARCH_SAVE_PAGE_KEYS has been introduced in order to be able to save and restore s390 specific storage keys into a hibernation image. With hibernation support removed from s390 there is no point in keeping the callbacks. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-11ACPI: PM: s2idle: Avoid possible race related to the EC GPERafael J. Wysocki1-1/+1
It is theoretically possible for the ACPI EC GPE to be set after the s2idle_ops->wake() called from s2idle_loop() has returned and before the subsequent pm_wakeup_pending() check is carried out. If that happens, the resulting wakeup event will cause the system to resume even though it may be a spurious one. To avoid that race, first make the ->wake() callback in struct platform_s2idle_ops return a bool value indicating whether or not to let the system resume and rearrange s2idle_loop() to use that value instad of the direct pm_wakeup_pending() call if ->wake() is present. Next, rework acpi_s2idle_wake() to process EC events and check pm_wakeup_pending() before re-arming the SCI for system wakeup to prevent it from triggering prematurely and add comments to that function to explain the rationale for the new code flow. Fixes: 56b991849009 ("PM: sleep: Simplify suspend-to-idle control flow") Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-01-16PM: suspend: Add sysfs attribute to control the "sync on suspend" behaviorJonas Meurer1-0/+2
The sysfs attribute `/sys/power/sync_on_suspend` controls, whether or not filesystems are synced by the kernel before system suspend. Congruously, the behaviour of build-time switch CONFIG_SUSPEND_SKIP_SYNC is slightly changed: It now defines the run-tim default for the new sysfs attribute `/sys/power/sync_on_suspend`. The run-time attribute is added because the existing corresponding build-time Kconfig flag for (`CONFIG_SUSPEND_SKIP_SYNC`) is not flexible enough. E.g. Linux distributions that provide pre-compiled kernels usually want to stick with the default (sync filesystems before suspend) but under special conditions this needs to be changed. One example for such a special condition is user-space handling of suspending block devices (e.g. using `cryptsetup luksSuspend` or `dmsetup suspend`) before system suspend. The Kernel trying to sync filesystems after the underlying block device already got suspended obviously leads to dead-locks. Be aware that you have to take care of the filesystem sync yourself before suspending the system in those scenarios. Signed-off-by: Jonas Meurer <jonas@freesources.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-08-08ACPI: PM: s2idle: Execute LPS0 _DSM functions with suspended devicesRafael J. Wysocki1-0/+2
According to Section 3.5 of the "Intel Low Power S0 Idle" document [1], Function 5 of the LPS0 _DSM is expected to be invoked when the system configuration matches the criteria for entering the target low-power state of the platform. In particular, this means that all devices should be suspended and in low-power states already when that function is invoked. This is not the case currently, however, because Function 5 of the LPS0 _DSM is invoked by it before the "noirq" phase of device suspend, which means that some devices may not have been put into low-power states yet at that point. That is a consequence of the previous design of the suspend-to-idle flow that allowed the "noirq" phase of device suspend and the "noirq" phase of device resume to be carried out for multiple times while "suspended" (if any spurious wakeup events were detected) and the point of the LPS0 _DSM Function 5 invocation was chosen so as to call it (and LPS0 _DSM Function 6 analogously) once per suspend-resume cycle (regardless of how many times the "noirq" phases of device suspend and resume were carried out while "suspended"). Now that the suspend-to-idle flow has been redesigned to carry out the "noirq" phases of device suspend and resume once in each cycle, the code can be reordered to follow the specification that it is based on more closely. For this purpose, add ->prepare_late and ->restore_early platform callbacks for suspend-to-idle, to be executed, respectively, after the "noirq" phase of suspending devices and before the "noirq" phase of resuming them and make ACPI use them for the invocation of LPS0 _DSM functions as appropriate. While at it, move the LPS0 entry requirements check to be made before invoking Functions 3 and 5 of the LPS0 _DSM (also once per cycle) as follows from the specification [1]. Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf # [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-07-30ACPI: PM: Set up EC GPE for system wakeup from drivers that need itRafael J. Wysocki1-0/+1
The EC GPE needs to be set up for system wakeup only if there is a driver depending on it, either intel-hid or intel-vbtn, bound to a button device that is expected to wake up the system from sleep (such as the power button on some Dell systems, like the XPS13 9360). It doesn't need to be set up for waking up the system from sleep in any other cases and whether or not it is expected to wake up the system from sleep doesn't depend on whether or not the LPS0 device is present in the ACPI namespace. For this reason, rearrange the ACPI suspend-to-idle code to make the drivers depending on the EC GPE wakeup take care of setting it up and decouple that from the LPS0 device handling. While at it, make intel-hid and intel-vbtn prepare for system wakeup only if they are allowed to wake up the system from sleep by user space (via sysfs). [Note that acpi_ec_mark_gpe_for_wake() and acpi_ec_set_gpe_wake_mask() are there to prevent the EC GPE from being disabled by the acpi_enable_all_wakeup_gpes() call in acpi_s2idle_prepare(), so on systems with either intel-hid or intel-vbtn this change doesn't affect any interactions with the hardware or platform firmware.] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-07-23PM: sleep: Simplify suspend-to-idle control flowRafael J. Wysocki1-1/+0
After commit 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle") the "noirq" phases of device suspend and resume may run for multiple times during suspend-to-idle, if there are spurious system wakeup events while suspended. However, this is complicated and fragile and actually unnecessary. The main reason for doing this is that on some systems the EC may signal system wakeup events (power button events, for example) as well as events that should not cause the system to resume (spurious system wakeup events). Thus, in order to determine whether or not a given event signaled by the EC while suspended is a proper system wakeup one, the EC GPE needs to be dispatched and to start with that was achieved by allowing the ACPI SCI action handler to run, which was only possible after calling resume_device_irqs(). However, dispatching the EC GPE this way turned out to take too much time in some cases and some EC events might be missed due to that, so commit 68e22011856f ("ACPI: EC: Dispatch the EC GPE directly on s2idle wake") started to dispatch the EC GPE right after a wakeup event has been detected, so in fact the full ACPI SCI action handler doesn't need to run any more to deal with the wakeups coming from the EC. Use this observation to simplify the suspend-to-idle control flow so that the "noirq" phases of device suspend and resume are each run only once in every suspend-to-idle cycle, which is reported to significantly reduce power drawn by some systems when suspended to idle (by allowing them to reach a deep platform-wide low-power state through the suspend-to-idle flow). [What appears to happen is that the "noirq" resume of devices after a spurious EC wakeup brings some devices into a state in which they prevent the platform from reaching the deep low-power state going forward, even after a subsequent "noirq" suspend phase, and on some systems the EC triggers such wakeups already when the "noirq" suspend of devices is running for the first time in the given suspend/resume cycle, so the platform cannot reach the deep low-power state at all.] First, make acpi_s2idle_wake() use the acpi_ec_dispatch_gpe() return value to determine whether or not the wakeup may have been triggered by the EC (in which case the system wakeup is canceled and ACPI events are processed in order to determine whether or not the event is a proper system wakeup one) and use rearm_wake_irq() (introduced by a previous change) in it to rearm the ACPI SCI for system wakeup detection in case the system will remain suspended. Second, drop acpi_s2idle_sync(), which is not needed any more, and the corresponding global platform suspend-to-idle callback. Next, drop the pm_wakeup_pending() check (which is an optimization only) from __device_suspend_noirq() to prevent it from returning errors on system wakeups occurring before the "noirq" phase of device suspend is complete (as in the case of suspend-to-idle it is not known whether or not these wakeups are suprious at that point), in order to avoid having to carry out a "noirq" resume of devices on a spurious system wakeup. Finally, change the code flow in s2idle_loop() to (1) run the "noirq" suspend of devices once before starting the loop, (2) check for spurious EC wakeups (via the platform ->wake callback) for the first time before calling s2idle_enter(), and (3) run the "noirq" resume of devices once after leaving the loop. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-08Merge branch 'pm-sleep'Rafael J. Wysocki1-2/+3
* pm-sleep: PM: sleep: Drop dev_pm_skip_next_resume_phases() ACPI: PM: Drop unused function and function header ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS ACPI: PM: Simplify and fix PM domain hibernation callbacks PCI: PM: Simplify bus-level hibernation callbacks PM: ACPI/PCI: Resume all devices during hibernation kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset() PM: sleep: Update struct wakeup_source documentation drivers: base: power: remove wakeup_sources_stats_dentry variable PM: suspend: Rename pm_suspend_via_s2idle() PM: sleep: Show how long dpm_suspend_start() and dpm_suspend_end() take PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
2019-06-26PCI: PM: Avoid skipping bus-level PM on platforms without ACPIRafael J. Wysocki1-2/+24
There are platforms that do not call pm_set_suspend_via_firmware(), so pm_suspend_via_firmware() returns 'false' on them, but the power states of PCI devices (PCIe ports in particular) are changed as a result of powering down core platform components during system-wide suspend. Thus the pm_suspend_via_firmware() checks in pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to- idle") are not sufficient to determine that devices left in D0 during suspend will remain in D0 during resume and so the bus-level power management can be skipped for them. For this reason, introduce a new global suspend flag, PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only and replace the pm_suspend_via_firmware() checks mentioned above with checks against this flag. Fixes: 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle") Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-19PM: suspend: Rename pm_suspend_via_s2idle()Rafael J. Wysocki1-2/+2
The name of pm_suspend_via_s2idle() is confusing, as it doesn't reflect the purpose of the function precisely enough and it is very similar to pm_suspend_via_firmware(), which has a different purpose, so rename it as pm_suspend_default_s2idle() and update its only caller, i8042_register_ports(), accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-06-14PM: hibernate: powerpc: Expose pfn_is_nosave() prototypeMathieu Malaterre1-0/+1
The declaration for pfn_is_nosave is only available in kernel/power/power.h. Since this function can be override in arch, expose it globally. Having a prototype will make sure to avoid warning (sometime treated as error with W=1) such as: arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 'pfn_is_nosave' [-Werror=missing-prototypes] This moves the declaration into a globally visible header file and add missing include to avoid a warning on powerpc. Also remove the duplicated prototypes since not required anymore. Signed-off-by: Mathieu Malaterre <malat@debian.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-03PM: sleep: Add kerneldoc comments to some functionsRafael J. Wysocki1-0/+31
Add kerneldoc comments to pm_suspend_via_firmware(), pm_resume_via_firmware() and pm_suspend_via_s2idle() to explain what they do. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-05-27ACPI: PM: Call pm_set_suspend_via_firmware() during hibernationRafael J. Wysocki1-1/+1
On systems with ACPI platform firmware the last stage of hibernation is analogous to system suspend to S3 (suspend-to-RAM), so it should be handled analogously. In particular, pm_suspend_via_firmware() should return 'true' in that stage to let the callers of it know that control will be passed to the platform firmware going forward, so pm_set_suspend_via_firmware() needs to be called then in analogy with acpi_suspend_begin(). However, the platform hibernation ->begin() callback is invoked during the "freeze" transition (before creating a snapshot image of system memory) as well as during the "hibernate" transition which is the last stage of it and pm_set_suspend_via_firmware() should be invoked by that callback in the latter stage only. In order to implement that redefine the hibernation ->begin() callback to take a pm_message_t argument to indicate which stage of hibernation is taking place and rework acpi_hibernation_begin() and acpi_hibernation_begin_old() to take it into account as needed. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-02PM / sleep: Refactor filesystems sync to reduce duplicationHarry Pan1-0/+3
Create a common helper to sync filesystems for system suspend and hibernation. Signed-off-by: Harry Pan <harry.pan@intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-12Merge branch 'for-linus' of ↵Greg Kroah-Hartman1-0/+2