aboutsummaryrefslogtreecommitdiff
path: root/init
AgeCommit message (Collapse)AuthorFilesLines
2026-04-18Merge tag 'memblock-v7.1-rc1' of ↵Linus Torvalds1-7/+0
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - improve debuggability of reserve_mem kernel parameter handling with print outs in case of a failure and debugfs info showing what was actually reserved - Make memblock_free_late() and free_reserved_area() use the same core logic for freeing the memory to buddy and ensure it takes care of updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled. * tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: x86/alternative: delay freeing of smp_locks section memblock: warn when freeing reserved memory before memory map is initialized memblock, treewide: make memblock_free() handle late freeing memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y memblock: extract page freeing from free_reserved_area() into a helper memblock: make free_reserved_area() more robust mm: move free_reserved_area() to mm/memblock.c powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact() powerpc: fadump: pair alloc_pages_exact() with free_pages_exact() memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name() memblock: move reserve_bootmem_range() to memblock.c and make it static memblock: Add reserve_mem debugfs info memblock: Print out errors on reserve_mem parser
2026-04-15Merge tag 'mm-stable-2026-04-13-21-45' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...
2026-04-15Merge tag 'sched_ext-for-7.1' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - cgroup sub-scheduler groundwork Multiple BPF schedulers can be attached to cgroups and the dispatch path is made hierarchical. This involves substantial restructuring of the core dispatch, bypass, watchdog, and dump paths to be per-scheduler, along with new infrastructure for scheduler ownership enforcement, lifecycle management, and cgroup subtree iteration The enqueue path is not yet updated and will follow in a later cycle - scx_bpf_dsq_reenq() generalized to support any DSQ including remote local DSQs and user DSQs Built on top of this, SCX_ENQ_IMMED guarantees that tasks dispatched to local DSQs either run immediately or get reenqueued back through ops.enqueue(), giving schedulers tighter control over queueing latency Also useful for opportunistic CPU sharing across sub-schedulers - ops.dequeue() was only invoked when the core knew a task was in BPF data structures, missing scheduling property change events and skipping callbacks for non-local DSQ dispatches from ops.select_cpu() Fixed to guarantee exactly one ops.dequeue() call when a task leaves BPF scheduler custody - Kfunc access validation moved from runtime to BPF verifier time, removing runtime mask enforcement - Idle SMT sibling prioritization in the idle CPU selection path - Documentation, selftest, and tooling updates. Misc bug fixes and cleanups * tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (134 commits) tools/sched_ext: Add explicit cast from void* in RESIZE_ARRAY() sched_ext: Make string params of __ENUM_set() const tools/sched_ext: Kick home CPU for stranded tasks in scx_qmap sched_ext: Drop spurious warning on kick during scheduler disable sched_ext: Warn on task-based SCX op recursion sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok() sched_ext: Remove runtime kfunc mask enforcement sched_ext: Add verifier-time kfunc context filter sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup() sched_ext: Decouple kfunc unlocked-context check from kf_mask sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked sched_ext: Drop TRACING access to select_cpu kfuncs selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code selftests/sched_ext: Improve runner error reporting for invalid arguments sched_ext: Documentation: Fix scx_bpf_move_to_local kfunc name sched_ext: Documentation: Add ops.dequeue() to task lifecycle tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing ...
2026-04-14Merge tag 'sched-core-2026-04-13' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Fair scheduling updates: - Skip SCHED_IDLE rq for SCHED_IDLE tasks (Christian Loehle) - Remove superfluous rcu_read_lock() in the wakeup path (K Prateek Nayak) - Simplify the entry condition for update_idle_cpu_scan() (K Prateek Nayak) - Simplify SIS_UTIL handling in select_idle_cpu() (K Prateek Nayak) - Avoid overflow in enqueue_entity() (K Prateek Nayak) - Update overutilized detection (Vincent Guittot) - Prevent negative lag increase during delayed dequeue (Vincent Guittot) - Clear buddies for preempt_short (Vincent Guittot) - Implement more complex proportional newidle balance (Peter Zijlstra) - Increase weight bits for avg_vruntime (Peter Zijlstra) - Use full weight to __calc_delta() (Peter Zijlstra) RT and DL scheduling updates: - Fix incorrect schedstats for rt and dl thread (Dengjun Su) - Skip group schedulable check with rt_group_sched=0 (Michal Koutný) - Move group schedulability check to sched_rt_global_validate() (Michal Koutný) - Add reporting of runtime left & abs deadline to sched_getattr() for DEADLINE tasks (Tommaso Cucinotta) Scheduling topology updates by K Prateek Nayak: - Compute sd_weight considering cpuset partitions - Extract "imb_numa_nr" calculation into a separate helper - Allocate per-CPU sched_domain_shared in s_data - Switch to assigning "sd->shared" from s_data - Remove sched_domain_shared allocation with sd_data Energy-aware scheduling updates: - Filter false overloaded_group case for EAS (Vincent Guittot) - PM: EM: Switch to rcu_dereference_all() in wakeup path (Dietmar Eggemann) Infrastructure updates: - Replace use of system_unbound_wq with system_dfl_wq (Marco Crivellari) Proxy scheduling updates by John Stultz: - Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr() - Minimise repeated sched_proxy_exec() checking - Fix potentially missing balancing with Proxy Exec - Fix and improve task::blocked_on et al handling - Add assert_balance_callbacks_empty() helper - Add logic to zap balancing callbacks if we pick again - Move attach_one_task() and attach_task() helpers to sched.h - Handle blocked-waiter migration (and return migration) - Add K Prateek Nayak to scheduler reviewers for proxy execution Misc cleanups and fixes by John Stultz, Joseph Salisbury, Peter Zijlstra, K Prateek Nayak, Michal Koutný, Randy Dunlap, Shrikanth Hegde, Vincent Guittot, Zhan Xusheng, Xie Yuanbin and Vincent Guittot" * tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) sched/eevdf: Clear buddies for preempt_short sched/rt: Cleanup global RT bandwidth functions sched/rt: Move group schedulability check to sched_rt_global_validate() sched/rt: Skip group schedulable check with rt_group_sched=0 sched/fair: Avoid overflow in enqueue_entity() sched: Use u64 for bandwidth ratio calculations sched/fair: Prevent negative lag increase during delayed dequeue sched/fair: Use sched_energy_enabled() sched: Handle blocked-waiter migration (and return migration) sched: Move attach_one_task and attach_task helpers to sched.h sched: Add logic to zap balance callbacks if we pick again sched: Add assert_balance_callbacks_empty helper sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration sched: Fix modifying donor->blocked on without proper locking locking: Add task::blocked_lock to serialize blocked_on state sched: Fix potentially missing balancing with Proxy Exec sched: Minimise repeated sched_proxy_exec() checking sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr() MAINTAINERS: Add K Prateek Nayak to scheduler reviewers sched/core: Get this cpu once in ttwu_queue_cond() ...
2026-04-14Merge tag 'timers-vdso-2026-04-12' of ↵Linus Torvalds2-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vdso updates from Thomas Gleixner: - Make the handling of compat functions consistent and more robust - Rework the underlying data store so that it is dynamically allocated, which allows the conversion of the last holdout SPARC64 to the generic VDSO implementation - Rework the SPARC64 VDSO to utilize the generic implementation - Mop up the left overs of the non-generic VDSO support in the core code - Expand the VDSO selftest and make them more robust - Allow time namespaces to be enabled independently of the generic VDSO support, which was not possible before due to SPARC64 not using it - Various cleanups and improvements in the related code * tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) timens: Use task_lock guard in timens_get*() timens: Use mutex guard in proc_timens_set_offset() timens: Simplify some calls to put_time_ns() timens: Add a __free() wrapper for put_time_ns() timens: Remove dependency on the vDSO vdso/timens: Move functions to new file selftests: vDSO: vdso_test_correctness: Add a test for time() selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c selftests: vDSO: vdso_test_correctness: Handle different tv_usec types selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers" random: vDSO: Remove ifdeffery random: vDSO: Trim vDSO includes vdso/datapage: Trim down unnecessary includes vdso/datapage: Remove inclusion of gettimeofday.h vdso/helpers: Explicitly include vdso/processor.h vdso/gettimeofday: Add explicit includes random: vDSO: Add explicit includes MIPS: vdso: Explicitly include asm/vdso/vdso.h ...
2026-04-14Merge tag 'kbuild-7.1-1' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nicolas Schier: "Kbuild: - reject unexpected values for LLVM= - uapi: remove usage of toolchain headers - switch from '-fms-extensions' to '-fms-anonymous-structs' when available (currently: clang >= 23.0.0) - reduce the number of compiler-generated suffixes for clang thin-lto build - reduce output spam ("GEN Makefile") when building out of tree - improve portability for testing headers - also test UAPI headers against C++ compilers - drop build ID architecture allow-list in vdso_install - only run checksyscalls when necessary - update the debug information notes in reproducible-builds.rst - expand inlining hints with -fdiagnostics-show-inlining-chain Kconfig: - forbid multiple entries with the same symbol in a choice - error out on duplicated kconfig inclusion" * tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (35 commits) kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain kconfig: forbid multiple entries with the same symbol in a choice Documentation: kbuild: Update the debug information notes in reproducible-builds.rst checksyscalls: move instance functionality into generic code checksyscalls: only run when necessary checksyscalls: fail on all intermediate errors checksyscalls: move path to reference table to a variable kbuild: vdso_install: drop build ID architecture allow-list kbuild: vdso_install: gracefully handle images without build ID kbuild: vdso_install: hide readelf warnings kbuild: vdso_install: split out the readelf invocation kbuild: uapi: also test UAPI headers against C++ compilers kbuild: uapi: provide a C++ compatible dummy definition of NULL kbuild: uapi: handle UML in architecture-specific exclusion lists kbuild: uapi: move all include path flags together kbuild: uapi: move some compiler arguments out of the command definition check-uapi: use dummy libc includes check-uapi: honor ${CROSS_COMPILE} setting check-uapi: link into shared objects kbuild: reduce output spam when building out of tree ...
2026-04-13Merge tag 'driver-core-7.1-rc1' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "debugfs: - Fix NULL pointer dereference in debugfs_create_str() - Fix misplaced EXPORT_SYMBOL_GPL for debugfs_create_str() - Fix soundwire debugfs NULL pointer dereference from uninitialized firmware_file device property: - Make fwnode flags modifications thread safe; widen the field to unsigned long and use set_bit() / clear_bit() based accessors - Document how to check for the property presence devres: - Separate struct devres_node from its "subclasses" (struct devres, struct devres_group); give struct devres_node its own release and free callbacks for per-type dispatch - Introduce struct devres_action for devres actions, avoiding the ARCH_DMA_MINALIGN alignment overhead of struct devres - Export struct devres_node and its init/add/remove/dbginfo primitives for use by Rust Devres<T> - Fix missing node debug info in devm_krealloc() - Use guard(spinlock_irqsave) where applicable; consolidate unlock paths in devres_release_group() driver_override: - Convert PCI, WMI, vdpa, s390/cio, s390/ap, and fsl-mc to the generic driver_override infrastructure, replacing per-bus driver_override strings, sysfs attributes, and match logic; fixes a potential UAF from unsynchronized access to driver_override in bus match() callbacks - Simplify __device_set_driver_override() logic kernfs: - Send IN_DELETE_SELF and IN_IGNORED inotify events on kernfs file and directory removal - Add corresponding selftests for memcg platform: - Allow attaching software nodes when creating platform devices via a new 'swnode' field in struct platform_device_info - Add kerneldoc for struct platform_device_info software node: - Move software node initialization from postcore_initcall() to driver_init(), making it available early in the boot process - Move kernel_kobj initialization (ksysfs_init) earlier to support the above - Remove software_node_exit(); dead code in a built-in unit SoC: - Introduce of_machine_read_compatible() and of_machine_read_model() OF helpers and export soc_attr_read_machine() to replace direct accesses to of_root from SoC drivers; also enables CONFIG_COMPILE_TEST coverage for these drivers sysfs: - Constify attribute group array pointers to 'const struct attribute_group *const *' in sysfs functions, device_add_groups() / device_remove_groups(), and struct class Rust: - Devres: - Embed struct devres_node directly in Devres<T> instead of going through devm_add_action(), avoiding the extra allocation and the unnecessary ARCH_DMA_MINALIGN alignment - I/O: - Turn IoCapable from a marker trait into a functional trait carrying the raw I/O accessor implementation (io_read / io_write), providing working defaults for the per-type Io methods - Add RelaxedMmio wrapper type, making relaxed accessors usable in code generic over the Io trait - Remove overloaded per-type Io methods and per-backend macros from Mmio and PCI ConfigSpace - I/O (Register): - Add IoLoc trait and generic read/write/update methods to the Io trait, making I/O operations parameterizable by typed locations - Add register! macro for defining hardware register types with typed bitfield accessors backed by Bounded values; supports direct, relative, and array register addressing - Add write_reg() / try_write_reg() and LocatedRegister trait - Update PCI sample driver to demonstrate the register! macro Example: ``` register! { /// UART control register. CTRL(u32) @ 0x18 { /// Receiver enable. 19:19 rx_enable => bool; /// Parity configuration. 14:13 parity ?=> Parity; } /// FIFO watermark and counter register. WATER(u32) @ 0x2c { /// Number of datawords in the receive FIFO. 26:24 rx_count; /// RX interrupt threshold. 17:16 rx_water; } } impl WATER { fn rx_above_watermark(&self) -> bool { self.rx_count() > self.rx_water() } } fn init(bar: &pci::Bar<BAR0_SIZE>) { let water = WATER::zeroed() .with_const_rx_water::<1>(); // > 3 would not compile bar.write_reg(water); let ctrl = CTRL::zeroed() .with_parity(Parity::Even) .with_rx_enable(true); bar.write_reg(ctrl); } fn handle_rx(bar: &pci::Bar<BAR0_SIZE>) { if bar.read(WATER).rx_above_watermark() { // drain the FIFO } } fn set_parity(bar: &pci::Bar<BAR0_SIZE>, parity: Parity) { bar.update(CTRL, |r| r.with_parity(parity)); } ``` - IRQ: - Move 'static bounds from where clauses to trait declarations for IRQ handler traits - Misc: - Enable the generic_arg_infer Rust feature - Extend Bounded with shift operations, single-bit bool conversion, and const get() Misc: - Make deferred_probe_timeout default a Kconfig option - Drop auxiliary_dev_pm_ops; the PM core falls back to driver PM callbacks when no bus type PM ops are set - Add conditional guard support for device_lock() - Add ksysfs.c to the DRIVER CORE MAINTAINERS entry - Fix kernel-doc warnings in base.h - Fix stale reference to memory_block_add_nid() in documentation" * tag 'driver-core-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (67 commits) bus: fsl-mc: use generic driver_override infrastructure s390/ap: use generic driver_override infrastructure s390/cio: use generic driver_override infrastructure vdpa: use generic driver_override infrastructure platform/wmi: use generic driver_override infrastructure PCI: use generic driver_override infrastructure driver core: make software nodes available earlier software node: remove software_node_exit() kernel: ksysfs: initialize kernel_kobj earlier MAINTAINERS: add ksysfs.c to the DRIVER CORE entry drivers/base/memory: fix stale reference to memory_block_add_nid() device property: Document how to check for the property presence soundwire: debugfs: initialize firmware_file to empty string debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str() debugfs: check for NULL pointer in debugfs_create_str() driver core: Make deferred_probe_timeout default a Kconfig option driver core: simplify __device_set_driver_override() clearing logic driver core: auxiliary bus: Drop auxiliary_dev_pm_ops device property: Make modifications of fwnode "flags" thread safe rust: devres: embed struct devres_node directly ...
2026-04-13Merge tag 'hardening-v7.1-rc1' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task
2026-04-13Merge tag 'vfs-7.1-rc1.misc' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - coredump: add tracepoint for coredump events - fs: hide file and bfile caches behind runtime const machinery Fixes: - fix architecture-specific compat_ftruncate64 implementations - dcache: Limit the minimal number of bucket to two - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START - fs/mbcache: cancel shrink work before destroying the cache - dcache: permit dynamic_dname()s up to NAME_MAX Cleanups: - remove or unexport unused fs_context infrastructure - trivial ->setattr cleanups - selftests/filesystems: Assume that TIOCGPTPEER is defined - writeback: fix kernel-doc function name mismatch for wb_put_many() - autofs: replace manual symlink buffer allocation in autofs_dir_symlink - init/initramfs.c: trivial fix: FSM -> Finite-state machine - fs: remove stale and duplicate forward declarations - readdir: Introduce dirent_size() - fs: Replace user_access_{begin/end} by scoped user access - kernel: acct: fix duplicate word in comment - fs: write a better comment in step_into() concerning .mnt assignment - fs: attr: fix comment formatting and spelling issues" * tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) dcache: permit dynamic_dname()s up to NAME_MAX fs: attr: fix comment formatting and spelling issues fs: hide file and bfile caches behind runtime const machinery fs: write a better comment in step_into() concerning .mnt assignment proc: rename proc_notify_change to proc_setattr proc: rename proc_setattr to proc_nochmod_setattr affs: rename affs_notify_change to affs_setattr adfs: rename adfs_notify_change to adfs_setattr hfs: update comments on hfs_inode_setattr kernel: acct: fix duplicate word in comment fs: Replace user_access_{begin/end} by scoped user access readdir: Introduce dirent_size() coredump: add tracepoint for coredump events fs: remove do_sys_truncate fs: pass on FTRUNCATE_* flags to do_truncate fs: fix archiecture-specific compat_ftruncate64 fs: remove stale and duplicate forward declarations init/initramfs.c: trivial fix: FSM -> Finite-state machine autofs: replace manual symlink buffer allocation in autofs_dir_symlink fs/mbcache: cancel shrink work before destroying the cache ...
2026-04-13Merge tag 'rust-7.1' of ↵Linus Torvalds1-14/+16
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1). As proposed in LPC 2025 and the Maintainers Summit [1], we are going to follow Debian Stable's Rust versions as our minimum versions. Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and 'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g. kernel developers to upgrade. Other major distributions support a Rust version that is high enough as well, including: + Arch Linux. + Fedora Linux. + Gentoo Linux. + Nix. + openSUSE Slowroll and openSUSE Tumbleweed. + Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using their versioned packages. The merged patch series comes with the associated cleanups and simplifications treewide that can be performed thanks to both bumps, as well as documentation updates. In addition, start using 'bindgen''s '--with-attribute-custom-enum' feature to set the 'cfi_encoding' attribute for the 'lru_status' enum used in Binder. Link: https://lwn.net/Articles/1050174/ [1] - Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that inlines C helpers into Rust. Essentially, it performs a step similar to LTO, but just for the helpers, i.e. very local and fast. It relies on 'llvm-link' and its '--internalize' flag, and requires a compatible LLVM between Clang and 'rustc' (i.e. same major version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled for two architectures for now. The result is a measurable speedup in different workloads that different users have tested. For instance, for the null block driver, it amounts to a 2%. - Support global per-version flags. While we already have per-version flags in many places, we didn't have a place to set global ones that depend on the compiler version, i.e. in 'rust_common_flags', which sometimes is needed to e.g. tweak the lints set per version. Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0, since it had a change in behavior. - Support overriding the crate name and apply it to Rust Binder, which wanted the module to be called 'rust_binder'. - Add the remaining '__rust_helper' annotations (started in the previous cycle). 'kernel' crate: - Introduce the 'const_assert!' macro: a more powerful version of 'static_assert!' that can refer to generics inside functions or implementation bodies, e.g.: fn f<const N: usize>() { const_assert!(N > 1); } fn g<T>() { const_assert!(size_of::<T>() > 0, "T cannot be ZST"); } In addition, reorganize our set of build-time assertion macros ('{build,const,static_assert}!') to live in the 'build_assert' module. Finally, improve the docs as well to clarify how these are different from one another and how to pick the right one to use, and their equivalence (if any) to the existing C ones for extra clarity. - 'sizes' module: add 'SizeConstants' trait. This gives us typed 'SZ_*' constants (avoiding casts) for use in device address spaces where the address width depends on the hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.), e.g.: let gpu_heap = 14 * u64::SZ_1M; let mmio_window = u32::SZ_16M; - 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus simplify the users in Tyr and PWM. - 'ptr' module: add 'const_align_up'. - 'str' module: improve the documentation of the 'c_str!' macro to explain that one should only use it for non-literal cases (for the other case we instead use C string literals, e.g. 'c"abc"'). - Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such use in the 'task' module. - 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted' outside of the 'types' module, i.e. update the last remaining instances and finally remove the re-exports. - 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)', including runtime-tested examples. The intention is to hopefully prevent UB that assumes the result of the function is not 'NULL' if successful. This originated from a case of UB I noticed in 'regulator' that created a 'NonNull' on it. Timekeeping: - Expand the example section in the 'HrTimer' documentation. - Mark the 'ClockSource' trait as unsafe to ensure valid values for 'ktime_get()'. - Add 'Delta::from_nanos()'. 'pin-init' crate: - Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of 'ZeroableOption' for 'NonZero*'. - Improve feature gate handling for unstable features. - Declutter the documentation of implementations of 'Zeroable' for tuples. - Replace uses of 'addr_of[_mut]!' with '&raw [mut]'. rust-analyzer: - Add type annotations to 'generate_rust_analyzer.py'. - Add support for scripts written in Rust ('generate_rust_target.rs', 'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs'). - Refactor 'generate_rust_analyzer.py' to explicitly identify host and target crates, improve readability, and reduce duplication. And some other fixes, cleanups and improvements" * tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits) rust: sizes: add SizeConstants trait for device address space constants rust: kernel: update `file_with_nul` comment rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0 rust: kbuild: support global per-version flags rust: declare cfi_encoding for lru_status docs: rust: general-information: use real example docs: rust: general-information: simplify Kconfig example docs: rust: quick-start: remove GDB/Binutils mention docs: rust: quick-start: remove Nix "unstable channel" note docs: rust: quick-start: remove Gentoo "testing" note docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title docs: rust: quick-start: update minimum Ubuntu version docs: rust: quick-start: update Ubuntu versioned packages docs: rust: quick-start: openSUSE provides `rust-src` package nowadays rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1 rust: kbuild: update `bindgen --rust-target` version and replace comment rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1 rust: rust_is_available: remove warning for `bindgen` 0.66.[01] rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie) rust: block: update `const_refs_to_static` MSRV TODO comment ...
2026-04-07rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1Miguel Ojeda1-6/+1
Until the version bump of `bindgen`, we needed to pass a dummy parameter to avoid failing the `--version` call. Thus remove it. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-22-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-07rust: remove `RUSTC_HAS_COERCE_POINTEE` and simplify codeMiguel Ojeda1-3/+0
With the Rust version bump in place, the `RUSTC_HAS_COERCE_POINTEE` Kconfig (automatic) option is always true. Thus remove the option and simplify the code. In particular, this includes removing our use of the predecessor unstable features we used with Rust < 1.84.0 (`coerce_unsized`, `dispatch_from_dyn` and `unsize`). Reviewed-by: Tamir Duberstein <tamird@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-11-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-07rust: remove `RUSTC_HAS_SLICE_AS_FLATTENED` and simplify codeMiguel Ojeda1-3/+0
With the Rust version bump in place, the `RUSTC_HAS_SLICE_AS_FLATTENED` Kconfig (automatic) option is always true. Thus remove the option and simplify the code. In particular, this includes removing the `slice` module which contained the temporary slice helpers, i.e. the `AsFlattened` extension trait and its `impl`s. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-10-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-07rust: simplify `RUSTC_VERSION` Kconfig conditionsMiguel Ojeda1-2/+0
With the Rust version bump in place, several Kconfig conditions based on `RUSTC_VERSION` are always true. Thus simplify them. The minimum supported major LLVM version by our new Rust minimum version is now LLVM 18, instead of LLVM 16. However, there are no possible cleanups for `RUSTC_LLVM_VERSION`. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-9-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-05mm: introduce CONFIG_NUMA_MIGRATION and simplify CONFIG_MIGRATIONDavid Hildenbrand (Arm)1-1/+1
CONFIG_MEMORY_HOTREMOVE, CONFIG_COMPACTION and CONFIG_CMA all select CONFIG_MIGRATION, because they require it to work (users). Only CONFIG_NUMA_BALANCING and CONFIG_BALLOON_MIGRATION depend on CONFIG_MIGRATION. CONFIG_BALLOON_MIGRATION is not an actual user, but an implementation of migration support, so the dependency is correct (CONFIG_BALLOON_MIGRATION does not make any sense without CONFIG_MIGRATION). However, kconfig-language.rst clearly states "In general use select only for non-visible symbols". So far CONFIG_MIGRATION is user-visible ... and the dependencies rather confusing. The whole reason why CONFIG_MIGRATION is user-visible is because of CONFIG_NUMA: some users might want CONFIG_NUMA but not page migration support. Let's clean all that up by introducing a dedicated CONFIG_NUMA_MIGRATION config option for that purpose only. Make CONFIG_NUMA_BALANCING that so far depended on CONFIG_NUMA && CONFIG_MIGRATION to depend on CONFIG_MIGRATION instead. CONFIG_NUMA_MIGRATION will depend on CONFIG_NUMA && CONFIG_MMU. CONFIG_NUMA_MIGRATION is user-visible and will default to "y". We use that default so new configs will automatically enable it, just like it was the case with CONFIG_MIGRATION. The downside is that some configs that used to have CONFIG_MIGRATION=n might get it re-enabled by CONFIG_NUMA_MIGRATION=y, which shouldn't be a problem. CONFIG_MIGRATION is now a non-visible config option. Any code that select CONFIG_MIGRATION (as before) must depend directly or indirectly on CONFIG_MMU. CONFIG_NUMA_MIGRATION is responsible for any NUMA migration code, which is mempolicy migration code, memory-tiering code, and move_pages() code in migrate.c. CONFIG_NUMA_BALANCING uses its functionality. Note that this implies that with CONFIG_NUMA_MIGRATION=n, move_pages() will not be available even though CONFIG_MIGRATION=y, which is an expected change. In migrate.c, we can remove the CONFIG_NUMA check as both CONFIG_NUMA_MIGRATION and CONFIG_NUMA_BALANCING depend on it. With this change, CONFIG_MIGRATION is an internal config, all users of migration selects CONFIG_MIGRATION, and only CONFIG_BALLOON_MIGRATION depends on it. Link: https://lkml.kernel.org/r/20260319-config_migration-v1-2-42270124966f@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Byungchul Park <byungchul@sk.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Gregory Price <gourry@gourry.net> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-03kernel: ksysfs: initialize kernel_kobj earlierBartosz Golaszewski1-0/+2
Software nodes depend on kernel_kobj which is initialized pretty late into the boot process - as a core_initcall(). Ahead of moving the software node initialization to driver_init() we must first make kernel_kobj available before it. Make ksysfs_init() visible in a new header - ksysfs.h - and call it in do_basic_setup() right before driver_init(). Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260402-nokia770-gpio-swnodes-v5-1-d730db3dd299@oss.qualcomm.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-03locking: Add task::blocked_lock to serialize blocked_on stateJohn Stultz1-0/+1
So far, we have been able to utilize the mutex::wait_lock for serializing the blocked_on state, but when we move to proxying across runqueues, we will need to add more state and a way to serialize changes to this state in contexts where we don't hold the mutex::wait_lock. So introduce the task::blocked_lock, which nests under the mutex::wait_lock in the locking order, and rework the locking to use it. Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260324191337.1841376-5-jstultz@google.com
2026-04-01memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=yMike Rapoport (Microsoft)1-7/+0
On architectures that keep memblock after boot, freeing of reserved memory with free_reserved_area() is paired with an update of memblock arrays, usually by a call to memblock_free(). Make free_reserved_area() directly update memblock.reserved when ARCH_KEEP_MEMBLOCK is enabled. Remove the now-redundant explicit memblock_free() call from arm64::free_initmem() and the #ifdef CONFIG_ARCH_KEEP_MEMBLOCK block from the generic free_initrd_mem(). Link: https://patch.msgid.link/20260323074836.3653702-8-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-03-30kbuild: rust: add `CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE`Gary Guo1-0/+15
This config detects if Rust and Clang have matching LLVM major version. All IR or bitcode operations (e.g. LTO) rely on LLVM major version to be matching, otherwise it may generate errors, or worse, miscompile silently due to change of IR semantics. It's usually suggested to use the exact same LLVM version, but this can be difficult to guarantee. Rust's suggestion [1] is also major-version only, so I think this check is sufficient for the kernel. Link: https://doc.rust-lang.org/rustc/linker-plugin-lto.html [1] Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nsc@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://patch.msgid.link/20260203-inline-helpers-v2-1-beb8547a03c9@google.com [ Fixed typo. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-03-26timens: Remove dependency on the vDSOThomas Weißschuh1-1/+3
Previously, missing time namespace support in the vDSO meant that time namespaces needed to be disabled globally. This was expressed in a hard dependency on the generic vDSO library. This also meant that architectures without any vDSO or only a stub vDSO could not enable time namespaces. Now that all architectures using a real vDSO are using the generic library, that dependency is not necessary anymore. Remove the dependency and let all architectures enable time namespaces. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260326-vdso-timens-decoupling-v2-2-c82693a7775f@linutronix.de
2026-03-25Merge tag 'hardening-v7.0-rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - fix required Clang version for CC_HAS_COUNTED_BY_PTR (Nathan Chancellor) - update Coccinelle script used for kmalloc_obj * tag 'hardening-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: init/Kconfig: Require a release version of clang-22 for CC_HAS_COUNTED_BY_PTR coccinelle: kmalloc_obj: Remove default GFP_KERNEL arg
2026-03-24randomize_kstack: Unify random source across archesRyan Roberts1-0/+8
Previously different architectures were using random sources of differing strength and cost to decide the random kstack offset. A number of architectures (loongarch, powerpc, s390, x86) were using their timestamp counter, at whatever the frequency happened to be. Other arches (arm64, riscv) were using entropy from the crng via get_random_u16(). There have been concerns that in some cases the timestamp counters may be too weak, because they can be easily guessed or influenced by user space. And get_random_u16() has been shown to be too costly for the level of protection kstack offset randomization provides. So let's use a common, architecture-agnostic source of entropy; a per-cpu prng, seeded at boot-time from the crng. This has a few benefits: - We can remove choose_random_kstack_offset(); That was only there to try to make the timestamp counter value a bit harder to influence from user space [*]. - The architecture code is simplified. All it has to do now is call add_random_kstack_offset() in the syscall path. - The strength of the randomness can be reasoned about independently of the architecture. - Arches previously using get_random_u16() now have much faster syscall paths, see below results. [*] Additionally, this gets rid of some redundant work on s390 and x86. Before this patch, those architectures called choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(), which is also called for exception returns to userspace which were *not* syscalls (e.g. regular interrupts). Getting rid of choose_random_kstack_offset() avoids a small amount of redundant work for the non-syscall cases. In some configurations, add_random_kstack_offset() will now call instrumentable code, so for a couple of arches, I have moved the call a bit later to the first point where instrumentation is allowed. This doesn't impact the efficacy of the mechanism. There have been some claims that a prng may be less strong than the timestamp counter if not regularly reseeded. But the prng has a period of about 2^113. So as long as the prng state remains secret, it should not be possible to guess. If the prng state can be accessed, we have bigger problems. Additionally, we are only consuming 6 bits to randomize the stack, so there are only 64 possible random offsets. I assert that it would be trivial for an attacker to brute force by repeating their attack and waiting for the random stack offset to be the desired one. The prng approach seems entirely proportional to this level of protection. Performance data are provided below. The baseline is v6.18 with rndstack on for each respective arch. (I)/(R) indicate statistically significant improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal). x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge): +-----------------+--------------+---------------+---------------+ | Benchmark | Result Class | per-cpu-prng | per-cpu-prng | | | | arm64 (metal) | x86_64 (VM) | +=================+==============+===============+===============+ | syscall/getpid | mean (ns) | (I) -9.50% | (I) -17.65% | | | p99 (ns) | (I) -59.24% | (I) -24.41% | | | p99.9 (ns) | (I) -59.52% | (I) -28.52% | +-----------------+--------------+---------------+---------------+ | syscall/getppid | mean (ns) | (I) -9.52% | (I) -19.24% | | | p99 (ns) | (I) -59.25% | (I) -25.03% | | | p99.9 (ns) | (I) -59.50% | (I) -28.17% | +-----------------+--------------+---------------+---------------+ | syscall/invalid | mean (ns) | (I) -10.31% | (I) -18.56% | | | p99 (ns) | (I) -60.79% | (I) -20.06% | | | p99.9 (ns) | (I) -61.04% | (I) -25.04% | +-----------------+--------------+---------------+---------------+ I tested an earlier version of this change on x86 bare metal and it showed a smalle