path: root/include/linux
2025-11-21  usb: ohci-da8xx: remove unused platform data  (Bartosz Golaszewski, 1 file, -22/+0)
We no longer support any board files for DaVinci in mainline and so struct da8xx_ohci_root_hub is no longer used. Remove it together with all the code that uses it. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://patch.msgid.link/20251114-davinci-usb-v1-1-737380353a74@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21  bug: Add report_bug_entry()  (Peter Zijlstra, 1 file, -0/+8)
Add a report_bug() variant where the bug_entry is already known. This is useful when the exception instruction is not instantiated per-site but instead has a single instance; in such a case the bug_entry address might be passed along in a known register or similar. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115757.575795595@infradead.org
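As a rough illustration of the intended use (the report_bug_entry() prototype shown here is an assumption, not taken from the patch), an architecture whose trap handler already knows which bug_entry fired could report it directly instead of searching the bug table by address:

```c
#include <linux/bug.h>
#include <linux/ptrace.h>

/*
 * Hypothetical call site: the faulting bug_entry was passed along in a
 * known register, so no bug-table lookup by instruction address is needed.
 */
static enum bug_trap_type handle_shared_bug_trap(struct bug_entry *entry,
						 struct pt_regs *regs)
{
	return report_bug_entry(entry, regs);
}
```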
2025-11-21  efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs  (Mauro Carvalho Chehab, 1 file, -5/+5)
Up to UEFI spec 2.9, the type byte of the CPER struct for ARM processors was defined simply as:

  Type at byte offset 4:
    - Cache error
    - TLB Error
    - Bus Error
    - Micro-architectural Error
  All other values are reserved

Yet, there was no information about how this would be encoded. The spec 2.9A errata corrected it by defining:

    - Bit 1 - Cache Error
    - Bit 2 - TLB Error
    - Bit 3 - Bus Error
    - Bit 4 - Micro-architectural Error
  All other values are reserved

That actually aligns with the values already defined on older versions at N.2.4.1. Generic Processor Error Section. Spec 2.10 also preserves the same encoding as 2.9A.

Adjust CPER and GHES handling code for both generic and ARM processors to properly handle UEFI 2.9A and 2.10 encoding.

Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
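For reference, the 2.9A/2.10 encoding described above maps onto bit positions like this (the macro names are illustrative only; the real definitions live in the CPER headers):

```c
#include <linux/bits.h>

/* Illustrative names; bit positions follow the UEFI 2.9A/2.10 errata text. */
#define EXAMPLE_ARM_ERR_TYPE_CACHE	BIT(1)	/* Cache Error */
#define EXAMPLE_ARM_ERR_TYPE_TLB	BIT(2)	/* TLB Error */
#define EXAMPLE_ARM_ERR_TYPE_BUS	BIT(3)	/* Bus Error */
#define EXAMPLE_ARM_ERR_TYPE_UARCH	BIT(4)	/* Micro-architectural Error */
```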
2025-11-21  efi/cper: Add a new helper function to print bitmasks  (Mauro Carvalho Chehab, 1 file, -0/+2)
Add a helper function to print a string with names associated with each bit field. A typical example is:

  const char * const bits[] = {
          "bit 3 name",
          "bit 4 name",
          "bit 5 name",
  };
  char str[120];
  unsigned int bitmask = BIT(3) | BIT(5);

  #define MASK GENMASK(5,3)

  cper_bits_to_str(str, sizeof(str), FIELD_GET(MASK, bitmask),
                   bits, ARRAY_SIZE(bits));

The above code fills string "str" with "bit 3 name|bit 5 name".

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-21  RAS: Report all ARM processor CPER information to userspace  (Jason Tian, 1 file, -3/+13)
The ARM processor CPER record was added in UEFI v2.6 and remained unchanged up to v2.10. Yet, the original arm_event trace code added by e9279e83ad1f ("trace, ras: add ARM processor error trace event") is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not exporting any information from tables N.17 to N.29 of the record. This is not enough for the user to be able to figure out what exactly has happened or to take appropriate action.

According to the UEFI v2.9 specification chapter N2.4.4, the ARM processor error section includes:

- several (ERR_INFO_NUM) ARM processor error information structures (Tables N.17 to N.20);
- several (CONTEXT_INFO_NUM) ARM processor context information structures (Tables N.21 to N.29);
- several vendor specific error information structures. The size is given by Section Length minus the size of the other fields.

In addition, it also exports two fields that are parsed by the GHES driver when firmware reports it, e.g.:

- error severity
- CPU logical index

Report all of this information to userspace via the ARM tracepoint so that userspace can properly record the error and take decisions related to CPU core isolation according to error severity and other info.

The updated ARM trace event now contains the following fields:

====================================== =============================
UEFI field on table N.16               ARM Processor trace fields
====================================== =============================
Validation                             handled when filling data for
                                       affinity MPIDR and running
                                       state
ERR_INFO_NUM                           pei_len
CONTEXT_INFO_NUM                       ctx_len
Section Length                         indirectly reported by
                                       pei_len, ctx_len and oem_len
Error affinity level                   affinity
MPIDR_EL1                              mpidr
MIDR_EL1                               midr
Running State                          running_state
PSCI State                             psci_state
Processor Error Information Structure  pei_err - count at pei_len
Processor Context                      ctx_err - count at ctx_len
Vendor Specific Error Info             oem - count at oem_len
====================================== =============================

It should be noted that decoding of tables N.17 to N.29, if needed, will be handled in userspace. That gives more flexibility, as there won't be any need to flood the kernel with micro-architecture specific error decoding. Also, decoding the other fields requires complex logic, and should be done for each of the several values inside the record field. So, let userspace daemons like rasdaemon decode them, parsing such tables and having vendor-specific micro-architecture-specific decoders.

[mchehab: modified description, solved merge conflicts and fixed coding style]
Signed-off-by: Jason Tian <jason@os.amperecomputing.com>
Co-developed-by: Shengwei Luo <luoshengwei@huawei.com>
Signed-off-by: Shengwei Luo <luoshengwei@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Daniel Ferguson <danielf@os.amperecomputing.com> # rebased
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-20  Merge tag 'vfio-v6.19-dma-buf-v9+' into v6.19/vfio/next  (Alex Williamson, 5 files, -47/+145)
[v9] vfio/pci: Allow MMIO regions to be exported through dma-buf https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-20  vfio/pci: Add dma-buf export support for MMIO regions  (Leon Romanovsky, 1 file, -0/+42)
Add support for exporting PCI device MMIO regions through dma-buf, enabling safe sharing of non-struct page memory with controlled lifetime management. This allows RDMA and other subsystems to import dma-buf FDs and build them into memory regions for PCI P2P operations.

The implementation provides a revocable attachment mechanism using dma-buf move operations. MMIO regions are normally pinned as BARs don't change physical addresses, but access is revoked when the VFIO device is closed or a PCI reset is issued. This ensures kernel self-defense against potentially hostile userspace.

Currently VFIO can take MMIO regions from the device's BAR and map them into a PFNMAP VMA with special PTEs. This mapping type ensures the memory cannot be used with things like pin_user_pages(), hmm, and so on. In practice only the user process CPU and KVM can safely make use of these VMAs. When VFIO shuts down, these VMAs are cleaned by unmap_mapping_range() to prevent any UAF of the MMIO beyond driver unbind.

However, VFIO type 1 has an insecure behavior where it uses follow_pfnmap_*() to fish a MMIO PFN out of a VMA and program it back into the IOMMU. This has a long history of enabling P2P DMA inside VMs, but has serious lifetime problems by allowing a UAF of the MMIO after the VFIO driver has been unbound.

Introduce DMABUF as a new safe way to export a FD based handle for the MMIO regions. This can be consumed by existing DMABUF importers like RDMA or DRM without opening a UAF. A following series will add an importer to iommufd to obsolete the type 1 code and allow safe UAF-free MMIO P2P in VM cases.

DMABUF has a built-in synchronous invalidation mechanism called move_notify. VFIO keeps track of all drivers importing its MMIO and can invoke a synchronous invalidation callback to tell the importing drivers to DMA unmap and forget about the MMIO pfns. This process is called revoke. This synchronous invalidation fully prevents any lifecycle problems. VFIO will do this before unbinding its driver, ensuring there is no UAF of the MMIO beyond the driver lifecycle.

Further, VFIO has additional behavior to block access to the MMIO during things like Function Level Reset. This is because some poor platforms may experience an MCE type crash when touching MMIO of a PCI device that is undergoing a reset. Today this is done by using unmap_mapping_range() on the VMAs. Extend that into the DMABUF world and temporarily revoke the MMIO from the DMABUF importers during FLR as well. This will more robustly prevent an errant P2P from possibly upsetting the platform.

A DMABUF FD is a preferred handle for MMIO compared to using something like a pgmap because:

- VFIO is supported, including its P2P feature, on archs that don't support pgmap
- PCI devices have all sorts of BAR sizes, including ones smaller than a section, so a pgmap cannot always be created
- It is undesirable to waste a lot of memory for struct pages, especially for a case like a GPU with ~100GB of BAR size
- We want a synchronous revoke semantic to support FLR with light hardware requirements

Use the P2P subsystem to help generate the DMA mapping. This is a significant upgrade over the abuse of dma_map_resource() that has historically been used by DMABUF exporters. Experience with an OOT version of this patch shows that real systems do need this. This approach deals with all the P2P scenarios:

- Non-zero PCI bus_offset
- ACS flags routing traffic to the IOMMU
- ACS flags that bypass the IOMMU - though vfio noiommu is required to hit this.

There will be further work to formalize the revoke semantic in DMABUF. For now this acts like a move_notify dynamic exporter where importer fault handling gets a failure when it attempts to map. This means that only fully restartable fault capable importers can import the VFIO DMABUFs. A future revoke semantic should open this up to more HW as the HW only needs to invalidate, not handle restartable faults.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-10-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
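A minimal sketch of what the importer side of this revoke flow looks like with the dynamic dma-buf attachment interface (the callback body and names are hypothetical; only the allow_peer2peer/move_notify fields come from the dma-buf API):

```c
#include <linux/dma-buf.h>

/* Sketch of a dynamic importer: when VFIO revokes the MMIO it invokes
 * move_notify(), and the importer must unmap and forget the pfns. */
static void example_move_notify(struct dma_buf_attachment *attach)
{
	/* tear down DMA mappings here; re-map (or fail) on the next access */
}

static const struct dma_buf_attach_ops example_attach_ops = {
	.allow_peer2peer = true,
	.move_notify	 = example_move_notify,
};
```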
2025-11-20  net: stmmac: remove axi_blen array  (Russell King (Oracle), 1 file, -1/+0)
Remove the axi_blen array from struct stmmac_axi as we set this array, and then immediately convert it to the register value, never looking at the array again. Thus, the array can be function local rather than part of a run-time allocated long-lived struct. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vLfLg-0000000FMbD-1vmh@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20  net: stmmac: move stmmac_axi_blen_to_mask() to stmmac_main.c  (Russell King (Oracle), 1 file, -0/+1)
Move the call to stmmac_axi_blen_to_mask() out of the individual MAC version drivers into the main code in stmmac_init_dma_engine(), passing the resulting value through a new member, axi_blen_regval, in the struct stmmac_axi structure. There is now no need for stmmac_axi_blen_to_dma_mask() to use u32p_replace_bits(), so use FIELD_PREP() instead. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vLfLW-0000000FMb1-0zKV@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20  net: stmmac: pass struct device to init()/exit() methods  (Russell King (Oracle), 1 file, -2/+2)
As struct plat_stmmacenet_data is not platform_device specific, pass a struct device into the init() and exit() methods to allow them to become independent of the underlying device. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Chen-Yu Tsai <wens@kernel.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1vLf2U-0000000FMN2-0SLg@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-21  Merge tag 'v6.18-rc6' into drm-next  (Dave Airlie, 28 files, -66/+153)
Linux 6.18-rc6 Backmerge in order to merge msm next Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-20  bpf: Document cfi_stubs and owner fields in struct bpf_struct_ops  (Nirbhay Sharma, 1 file, -4/+6)
Add missing kernel-doc documentation for the cfi_stubs and owner fields in struct bpf_struct_ops to fix the following warnings: Warning: include/linux/bpf.h:1931 struct member 'cfi_stubs' not described in 'bpf_struct_ops' Warning: include/linux/bpf.h:1931 struct member 'owner' not described in 'bpf_struct_ops' The cfi_stubs field was added in commit 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops CFI") to provide CFI stub functions for trampolines, and the owner field is used for module reference counting. Signed-off-by: Nirbhay Sharma <nirbhay.lkd@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251120204620.59571-2-nirbhay.lkd@gmail.com
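The added documentation presumably takes roughly this shape (wording illustrative only, based on the description above, not the exact hunk):

```c
/**
 * struct bpf_struct_ops - (excerpt; illustrative kernel-doc only)
 * @cfi_stubs:	CFI stub functions used when generating trampolines.
 * @owner:	Owning module, used for module reference counting.
 */
```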
2025-11-20  uaccess: gate _copy_[to|from]_user on !INLINE_COPY_FROM_USER  (Alice Ryhl, 1 file, -0/+4)
These methods only exist when INLINE_COPY_FROM_USER is disabled, so update the header file to reflect that. This fixes the following error on builds that enable both RUST and INLINE_COPY_FROM_USER. ERROR: modpost: "_copy_from_user" [samples/rust/rust_misc_device.ko] undefined! ERROR: modpost: "_copy_to_user" [samples/rust/rust_misc_device.ko] undefined! This error is triggered because when a method is available both as a rust_helper_* and normal method, Rust will call the normal method. [akpm@linux-foundation.org: s/INLINE_COPY_FROM_USER/INLINE_COPY_TO_USER/, per Alice] Link: https://lkml.kernel.org/r/20251118173250.2821388-1-aliceryhl@google.com Fixes: d99dc586ca7c ("uaccess: decouple INLINE_COPY_FROM_USER and CONFIG_RUST") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Trevor Gross <tmgross@umich.edu> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
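The shape of the fix is roughly the following (a simplified sketch, not the exact uaccess.h hunk): declare the out-of-line helpers only when the corresponding inline variant is disabled, so modules never reference symbols that were compiled out:

```c
/* Simplified sketch of the gating described above. */
#ifndef INLINE_COPY_FROM_USER
unsigned long _copy_from_user(void *to, const void __user *from,
			      unsigned long n);
#endif

#ifndef INLINE_COPY_TO_USER
unsigned long _copy_to_user(void __user *to, const void *from,
			    unsigned long n);
#endif
```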
2025-11-20  lib/base64: add support for multiple variants  (Kuan-Wei Chiu, 1 file, -2/+8)
Patch series " lib/base64: add generic encoder/decoder, migrate users", v5. This series introduces a generic Base64 encoder/decoder to the kernel library, eliminating duplicated implementations and delivering significant performance improvements. The Base64 API has been extended to support multiple variants (Standard, URL-safe, and IMAP) as defined in RFC 4648 and RFC 3501. The API now takes a variant parameter and an option to control padding. As part of this series, users are migrated to the new interface while preserving their specific formats: fscrypt now uses BASE64_URLSAFE, Ceph uses BASE64_IMAP, and NVMe is updated to BASE64_STD. On the encoder side, the implementation processes input in 3-byte blocks, mapping 24 bits directly to 4 output symbols. This avoids bit-by-bit streaming and reduces loop overhead, achieving about a 2.7x speedup compared to previous implementations. On the decoder side, replace strchr() lookups with per-variant reverse tables and process input in 4-character groups. Each group is mapped to numeric values and combined into 3 bytes. Padded and unpadded forms are validated explicitly, rejecting invalid '=' usage and enforcing tail rules. This improves throughput by ~43-52x. This patch (of 6): Extend the base64 API to support multiple variants (standard, URL-safe, and IMAP) as defined in RFC 4648 and RFC 3501. The API now takes a variant parameter and an option to control padding. Update NVMe auth code to use the new interface with BASE64_STD. Link: https://lkml.kernel.org/r/20251114055829.87814-1-409411716@gms.tku.edu.tw Link: https://lkml.kernel.org/r/20251114060045.88792-1-409411716@gms.tku.edu.tw Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Co-developed-by: Guan-Chun Wu <409411716@gms.tku.edu.tw> Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw> Reviewed-by: David Laight <david.laight.linux@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <kbusch@kernel.org> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: "Theodore Y. Ts'o" <tytso@mit.edu> Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Cc: Xiubo Li <xiubli@redhat.com> Cc: Yu-Sheng Huang <home7438072@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  include/linux/once_lite.h: fix judgment in WARN_ONCE with clang  (Xie Yuanbin, 1 file, -1/+1)
For C code:

```c
extern int xx;

void test(void)
{
	if (WARN_ONCE(xx, "x"))
		__asm__ volatile ("nop":::);
}
```

Clang will generate the following assembly code:

```assemble
test:
	movl	xx(%rip), %eax			// Assume xx == 0 (likely case)
	testl	%eax, %eax			// judge once
	je	.LBB0_3				// jump to .LBB0_3
	testb	$1, test.__already_done(%rip)
	je	.LBB0_2
.LBB0_3:
	testl	%eax, %eax			// judge again
	je	.LBB0_5				// jump to .LBB0_5
.LBB0_4:
	nop
.LBB0_5:
	retq
// omit
```

In the above code, `xx == 0` should be a likely case, but in this case, xx has been judged twice.

Test info:
1. kernel source: linux-next commit 9c0826a5d9aa4d52206d ("Add linux-next specific files for 20251107")
2. compiler: Debian clang version 21.1.4 (8) with Debian LLD 21.1.4 (compatible with GNU linkers)
3. config: based on the default x86_64_defconfig, with:
   CONFIG_MITIGATION_RETHUNK=n
   CONFIG_STACKPROTECTOR=n

Add unlikely to __ret_cond to help the compiler optimize correctly.

[akpm@linux-foundation.org: undo whitespace changes]
Link: https://lkml.kernel.org/r/20251109083715.24495-1-qq570070308@gmail.com
Signed-off-by: Xie Yuanbin <qq570070308@gmail.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Maninder Singh <maninder1.s@samsung.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
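The essence of the one-line fix, in simplified form (this is not the exact once_lite.h macro, which also carries section annotations): mark the condition itself as unlikely so the once-only bookkeeping stays off the fast path:

```c
/* Simplified sketch of the idea behind the fix described above. */
#define EXAMPLE_ONCE_LITE_IF(condition)				\
	({							\
		static bool __already_done;			\
		bool __ret_cond = unlikely(!!(condition));	\
		bool __ret_once = false;			\
								\
		if (__ret_cond && !__already_done) {		\
			__already_done = true;			\
			__ret_once = true;			\
		}						\
		__ret_once;					\
	})
```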
2025-11-20  math.h: amend abs() kernel-doc and add a note about signed type limits  (Andy Shevchenko, 1 file, -4/+9)
- amend the kernel-doc so the description is decoupled from the parameter descriptions.
- add a note to explain behaviour for the signed types when the supplied value is the minimum (e.g., INT_MIN for the int type).

Link: https://lkml.kernel.org/r/20251106152051.2361551-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
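A small illustration of the documented caveat (assuming the usual two's complement representation):

```c
#include <linux/limits.h>
#include <linux/math.h>

/* INT_MIN has no positive counterpart in an int, so abs(INT_MIN) cannot
 * produce a valid absolute value and stays negative in practice. */
static bool example_abs_result_is_valid(int v)
{
	return v != INT_MIN;
}
```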
2025-11-20  lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86  (David Laight, 1 file, -0/+11)
gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a + u32_b). As well as the extra instructions it can generate a lot of spills to stack (including spills of constant zeros and even multiplies by constant zero). mul_u32_u32() already exists to optimise the multiply. Add a similar add_u64_32() for the addition. Disable both for clang - it generates better code without them. Move the 64x64 => 128 multiply into a static inline helper function for code clarity. No need for the a/b_hi/lo variables, the implicit casts on the function calls do the work for us. Should have minimal effect on the generated code. Use mul_u32_u32() and add_u64_u32() in the 64x64 => 128 multiply in mul_u64_add_u64_div_u64(). Link: https://lkml.kernel.org/r/20251105201035.64043-8-david.laight.linux@gmail.com Signed-off-by: David Laight <david.laight.linux@gmail.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Borislav Betkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li RongQing <lirongqing@baidu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  lib: add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()  (David Laight, 1 file, -1/+47)
The existing mul_u64_u64_div_u64() rounds down, a 'rounding up' variant needs 'divisor - 1' adding in between the multiply and divide so cannot easily be done by a caller. Add mul_u64_add_u64_div_u64(a, b, c, d) that calculates (a * b + c)/d and implement the 'round down' and 'round up' using it. Update the x86-64 asm to optimise for 'c' being a constant zero. Add kerndoc definitions for all three functions. Link: https://lkml.kernel.org/r/20251105201035.64043-5-david.laight.linux@gmail.com Signed-off-by: David Laight <david.laight.linux@gmail.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Borislav Betkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li RongQing <lirongqing@baidu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
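A short sketch of how the round-down and round-up variants described above fall out of the new primitive, which computes (a * b + c) / d:

```c
#include <linux/math64.h>

static inline u64 example_mul_div_round_down(u64 a, u64 b, u64 d)
{
	return mul_u64_add_u64_div_u64(a, b, 0, d);	/* (a * b) / d */
}

static inline u64 example_mul_div_round_up(u64 a, u64 b, u64 d)
{
	return mul_u64_add_u64_div_u64(a, b, d - 1, d);	/* (a * b + d - 1) / d */
}
```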
2025-11-20  util_macros.h: fix kernel-doc for u64_to_user_ptr()  (Andy Shevchenko, 1 file, -2/+2)
The added documentation to u64_to_user_ptr() misspelled the function name. Fix it. Link: https://lkml.kernel.org/r/20251104183834.1046584-1-andriy.shevchenko@linux.intel.com Fixes: 029c896c4105 ("kernel.h: move PTR_IF() and u64_to_user_ptr() to util_macros.h") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Alexandru Ardelean <aardelean@baylibre.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  panic: sys_info: align constant definition names with parameters  (Andy Shevchenko, 1 file, -1/+1)
Align constant definition names with parameters to make it easier to map. It's also better to maintain and extend the names while keeping their uniqueness. Link: https://lkml.kernel.org/r/20251030132007.3742368-3-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Feng Tang <feng.tang@linux.alibaba.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm: propagate VM_SOFTDIRTY on merge  (Lorenzo Stoakes, 1 file, -8/+7)
Patch series "make VM_SOFTDIRTY a sticky VMA flag", v2. Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by establishing a new VMA, or via merge) as implemented in __mmap_complete() and do_brk_flags(). However, when performing a merge of existing mappings such as when performing mprotect(), we may lose the VM_SOFTDIRTY flag. Now we have the concept of making VMA flags 'sticky', that is that they both don't prevent merge and, importantly, are propagated to merged VMAs, this seems a sensible alternative to the existing special-casing of VM_SOFTDIRTY. We additionally add a self-test that demonstrates that this logic behaves as expected. This patch (of 2): Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by establishing a new VMA, or via merge) as implemented in __mmap_complete() and do_brk_flags(). However, when performing a merge of existing mappings such as when performing mprotect(), we may lose the VM_SOFTDIRTY flag. This is because currently we simply ignore VM_SOFTDIRTY for the purposes of merge, so one VMA may possess the flag and another not, and whichever happens to be the target VMA will be the one upon which the merge is performed which may or may not have VM_SOFTDIRTY set. Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one which solves this issue. Additionally update VMA userland tests to propagate changes. [akpm@linux-foundation.org: update comments, per Lorenzo] Link: https://lkml.kernel.org/r/0019e0b8-ee1e-4359-b5ee-94225cbe5588@lucifer.local Link: https://lkml.kernel.org/r/cover.1763399675.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/955478b5170715c895d1ef3b7f68e0cd77f76868.1763399675.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Andrey Vagin <avagin@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm/damon: rename damos->filters to damos->core_filters  (SeongJae Park, 1 file, -5/+5)
DAMOS filters that are handled by the ops layer are linked to damos->ops_filters. Owing to the ops_ prefix on the name, it is easy to understand it is for ops layer handled filters. The other types of filters, which are handled by the core layer, are linked to damos->filters. Because of the name, it is easy to confuse the list is there for not only core layer handled ones but all filters. Avoid such confusions by renaming the field to core_filters. Link: https://lkml.kernel.org/r/20251112154114.66053-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm/damon: rename damos core filter helpers to have word core  (SeongJae Park, 1 file, -2/+2)
Patch series "mm/damon: misc cleanups". Yet another batch of misc cleanups and refactoring for DAMON code, tests, and documents. First two patches (1and 2) rename DAMOS core filters related code for readability. Three following patches (3-5) refactor page table walk callback functions in DAMON, as suggested by Hugh and David, and I promised. Next two patches (6 and 7) refactor DAMON core layer kunit test and sysfs interface selftest to be simple and deduplicated. Final two patches (8 and 9) fix up sphinx and grammatical errors on documents. This patch (of 9): DAMOS filters handled by the core layer are called core filters, while those handled by the ops layer are called ops filters. They share the same type but are managed in different places since core filters are evaluated before the ops filters. They also have different helper functions that depend on their managed places. The helper functions for ops filters have '_ops_' keyword on their name, so it is easy to know they are for ops filters. Meanwhile, the helper functions for core filters are not having the 'core' keyword on their name. This makes it easy to be mistakenly used for ops filters. Actually there was such a bug. To avoid future mistakes from similar confusions, rename DAMOS core filters helper functions to have a keyword 'core' on their names. Link: https://lkml.kernel.org/r/20251112154114.66053-1-sj@kernel.org Link: https://lkml.kernel.org/r/20251112154114.66053-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm: add vma_start_write_killable()  (Matthew Wilcox (Oracle), 1 file, -2/+28)
Patch series "vma_start_write_killable"", v2. When we added the VMA lock, we made a major oversight in not adding a killable variant. That can run us into trouble where a thread takes the VMA lock for read (eg handling a page fault) and then goes out to lunch for an hour (eg doing reclaim). Another thread tries to modify the VMA, taking the mmap_lock for write, then attempts to lock the VMA for write. That blocks on the first thread, and ensures that every other page fault now tries to take the mmap_lock for read. Because everything's in an uninterruptible sleep, we can't kill the task, which makes me angry. This patchset just adds vma_start_write_killable() and converts one caller to use it. Most users are somewhat tricky to convert, so expect follow-up individual patches per call-site which need careful analysis to make sure we've done proper cleanup. This patch (of 2): The vma can be held read-locked for a substantial period of time, eg if memory allocation needs to go into reclaim. It's useful to be able to send fatal signals to threads which are waiting for the write lock. Link: https://lkml.kernel.org/r/20251110203204.1454057-1-willy@infradead.org Link: https://lkml.kernel.org/r/20251110203204.1454057-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Chris Li <chriscli@google.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm: memcg: dump memcg protection info on oom or alloc failures  (Shakeel Butt, 1 file, -0/+5)
Currently kernel dumps memory state on oom and allocation failures. One of the question usually raised on those dumps is why the kernel has not reclaimed the reclaimable memory instead of triggering oom. One potential reason is the usage of memory protection provided by memcg. So, let's also dump the memory protected by the memcg in such reports to ease the debugging. Link: https://lkml.kernel.org/r/20251107234041.3632644-1-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one  (Lorenzo Stoakes, 1 file, -0/+26)
Gather all the VMA flags whose presence implies that page tables must be copied on fork into a single bitmap - VM_COPY_ON_FORK - and use this rather than specifying individual flags in vma_needs_copy().

We also add VM_MAYBE_GUARD to this list, as it being set on a VMA implies that there may be metadata contained in the page tables (that is - guard markers) which will not and cannot be propagated upon fork. This was already being done manually previously in vma_needs_copy(), but this makes it very explicit, alongside VM_PFNMAP, VM_MIXEDMAP and VM_UFFD_WP all of which imply the same.

Note that VM_STICKY flags ought generally to be marked VM_COPY_ON_FORK too - because equally a flag being VM_STICKY indicates that the VMA contains metadata that is not propagated by being faulted in - i.e. that the VMA metadata does not fully describe the VMA alone, and thus we must propagate whatever metadata there is on a fork. However, for maximum flexibility, we do not make this necessarily the case here.

Link: https://lkml.kernel.org/r/5d41b24e7bc622cda0af92b6d558d7f4c0d1bc8c.1763460113.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
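Conceptually, the change amounts to something like the following (an illustrative sketch only; the actual mask definition and vma_needs_copy() live in mm code and may differ):

```c
/* Flags whose presence means the page tables carry state that faulting
 * alone cannot recreate in the child, per the description above. */
#define EXAMPLE_VM_COPY_ON_FORK \
	(VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)

static inline bool example_vma_needs_copy(struct vm_area_struct *vma)
{
	return vma->vm_flags & EXAMPLE_VM_COPY_ON_FORK;
}
```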
2025-11-20  mm: implement sticky VMA flags  (Lorenzo Stoakes, 1 file, -0/+28)
It is useful to be able to designate that certain flags are 'sticky', that is, if two VMAs are merged one with a flag of this nature and one without, the merged VMA sets this flag. As a result we ignore these flags for the purposes of determining VMA flag differences between VMAs being considered for merge. This patch therefore updates the VMA merge logic to perform this action, with flags possessing this property being described in the VM_STICKY bitmap. Those flags which ought to be ignored for the purposes of VMA merge are described in the VM_IGNORE_MERGE bitmap, which the VMA merge logic is also updated to use. As part of this change we place VM_SOFTDIRTY in VM_IGNORE_MERGE as it already had this behaviour, alongside VM_STICKY as sticky flags by implication must not disallow merge. Ultimately it seems that we should make VM_SOFTDIRTY a sticky flag in its own right, but this change is out of scope for this series. The only sticky flag designated as such is VM_MAYBE_GUARD, so as a result of this change, once the VMA flag is set upon guard region installation, VMAs with guard ranges will now not have their merge behaviour impacted as a result and can be freely merged with other VMAs without VM_MAYBE_GUARD set. Also update the comments for vma_modify_flags() to directly reference sticky flags now we have established the concept. We also update the VMA userland tests to account for the changes. Link: https://lkml.kernel.org/r/22ad5269f7669d62afb42ce0c79bad70b994c58d.1763460113.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrei Vagin <avagin@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
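The merge rule described above can be sketched as two small helpers (conceptual only, not the actual vma_merge() code): sticky and ignored flags are masked out when comparing candidates, and any sticky flag present on either side survives into the merged VMA:

```c
/* Conceptual sketch of the VM_STICKY / VM_IGNORE_MERGE handling. */
static bool example_flags_mergeable(unsigned long f1, unsigned long f2)
{
	/* Differences only in ignored (incl. sticky) flags do not block a merge. */
	return ((f1 ^ f2) & ~VM_IGNORE_MERGE) == 0;
}

static unsigned long example_merged_flags(unsigned long f1, unsigned long f2)
{
	/* The merged VMA keeps any sticky flag either side had. */
	return f1 | ((f1 | f2) & VM_STICKY);
}
```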
2025-11-20  mm: add atomic VMA flags and set VM_MAYBE_GUARD as such  (Lorenzo Stoakes, 1 file, -0/+44)
This patch adds the ability to atomically set VMA flags with only the mmap read/VMA read lock held. As this could be hugely problematic for VMA flags in general given that all other accesses are non-atomic and serialised by the mmap/VMA locks, we implement this with a strict allow-list - that is, only designated flags are allowed to do this. We make VM_MAYBE_GUARD one of these flags. Link: https://lkml.kernel.org/r/97e57abed09f2663077ed7a36fb8206e243171a9.1763460113.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lance Yang <lance.yang@linux.dev> Cc: Andrei Vagin <avagin@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps  (Lorenzo Stoakes, 1 file, -0/+3)
Patch series "introduce VM_MAYBE_GUARD and make it sticky", v4. Currently, guard regions are not visible to users except through /proc/$pid/pagemap, with no explicit visibility at the VMA level. This makes the feature less useful, as it isn't entirely apparent which VMAs may have these entries present, especially when performing actions which walk through memory regions such as those performed by CRIU. This series addresses this issue by introducing the VM_MAYBE_GUARD flag which fulfils this role, updating the smaps logic to display an entry for these. The semantics of this flag are that a guard region MAY be present if set (we cannot be sure, as we can't efficiently track whether an MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if not set the VMA definitely does NOT have any guard regions present. It's problematic to establish this flag without further action, because that means that VMAs with guard regions in them become non-mergeable with adjacent VMAs for no especially good reason. To work around this, this series also introduces the concept of 'sticky' VMA flags - that is flags which: a. if set in one VMA and not in another still permit those VMAs to be merged (if otherwise compatible). b. When they are merged, the resultant VMA must have the flag set. The VMA logic is updated to propagate these flags correctly. Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve an issue with file-backed guard regions - previously these established an anon_vma object for file-backed mappings solely to have vma_needs_copy() correctly propagate guard region mappings to child processes. We introduce a new flag alias VM_COPY_ON_FORK (which currently only specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly for this flag and to copy page tables if it is present, which resolves this issue. Additionally, we add the ability for allow-listed VMA flags to be atomically writable with only mmap/VMA read locks held. The only flag we allow so far is VM_MAYBE_GUARD, which we carefully ensure does not cause any races by being allowed to do so. This allows us to maintain guard region installation as a read-locked operation and not endure the overhead of obtaining a write lock here. Finally we introduce extensive VMA userland tests to assert that the sticky VMA logic behaves correctly as well as guard region self tests to assert that smaps visibility is correctly implemented. This patch (of 9): Currently, if a user needs to determine if guard regions are present in a range, they have to scan all VMAs (or have knowledge of which ones might have guard regions). Since commit 8e2f2aeb8b48 ("fs/proc/task_mmu: add guard region bit to pagemap") and the related commit a516403787e0 ("fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions"), users can use either /proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this operation at a virtual address level. This is not ideal, and it gives no visibility at a /proc/$pid/smaps level that guard regions exist in ranges. This patch remedies the situation by establishing a new VMA flag, VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is uncertain because we cannot reasonably determine whether a MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and additionally VMAs may change across merge/split). 
We utilise 0x800 for this flag which makes it available to 32-bit architectures also, a flag that was previously used by VM_DENYWRITE, which was removed in commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") and hasn't been reused yet. We also update the smaps logic and documentation to identify these VMAs. Another major use of this functionality is that we can use it to identify that we ought to copy page tables on fork. We do not actually implement usage of this flag in mm/madvise.c yet as we need to allow some VMA flags to be applied atomically under mmap/VMA read lock in order to avoid the need to acquire a write lock for this purpose. Link: https://lkml.kernel.org/r/cover.1763460113.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/cf8ef821eba29b6c5b5e138fffe95d6dcabdedb9.1763460113.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lance Yang <lance.yang@linux.dev> Cc: Andrei Vagin <avagin@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
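For illustration, the flag described above occupies the long-unused 0x800 bit (the exact spelling in mm.h may differ):

```c
/* Illustrative definition; 0x800 was freed by the VM_DENYWRITE removal
 * and is available on 32-bit architectures as well. */
#define EXAMPLE_VM_MAYBE_GUARD	0x00000800UL
```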
2025-11-20  mm: kill mm_wr_locked from unmap_vmas() and unmap_single_vma()  (Kefeng Wang, 1 file, -1/+1)
Kill mm_wr_locked since commit f8e97613fed2 ("mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack()") remove the user. Link: https://lkml.kernel.org/r/20251104085709.2688433-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20  lockd: don't allow locking on reexported NFSv2/3  (Jeff Layton, 1 file, -1/+8)
Since commit 9254c8ae9b81 ("nfsd: disallow file locking and delegations for NFSv4 reexport"), file locking when reexporting an NFS mount via NFSv4 is expressly prohibited by nfsd. Do the same in lockd: Add a new nlmsvc_file_cannot_lock() helper that will test whether file locking is allowed for a given file, and return nlm_lck_denied_nolocks if it isn't. Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
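A hedged sketch of the pattern this helper enables (the exact nlmsvc_file_cannot_lock() signature is an assumption; the caller shown is hypothetical):

```c
#include <linux/lockd/lockd.h>

/* Refuse byte-range locking when the underlying file disallows it,
 * e.g. a reexported NFS mount, per the description above. */
static __be32 example_nlm_try_lock(struct nlm_file *file)
{
	if (nlmsvc_file_cannot_lock(file))
		return nlm_lck_denied_nolocks;

	/* ... proceed with the usual lock handling ... */
	return nlm_granted;
}
```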
2025-11-20  Merge back material related to system sleep for 6.19  (Rafael J. Wysocki, 2 files, -6/+14)
2025-11-20  sched: Provide and use set_need_resched_current()  (Peter Zijlstra, 1 file, -0/+7)
set_tsk_need_resched(current) requires set_preempt_need_resched(current) to work correctly outside of the scheduler. Provide set_need_resched_current() which wraps this correctly and replace all the open coded instances. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251116174750.665769842@linutronix.de
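The helper described above essentially pairs the two calls (a simplified sketch; the real definition in the scheduler headers may carry additional assertions):

```c
#include <linux/sched.h>

/* Sketch: outside the scheduler, setting TIF_NEED_RESCHED on current must
 * be paired with folding the preempt hint, or the reschedule can be lost. */
static inline void example_set_need_resched_current(void)
{
	set_tsk_need_resched(current);
	set_preempt_need_resched();
}
```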
2025-11-20  io_uring/kbuf: remove obsolete buf_nr_pages and update comments  (Joanne Koong, 1 file, -2/+2)
The buf_nr_pages field in io_buffer_list was previously used to determine whether the buffer list uses ring-provided buffers or classic provided buffers. This is now determined by checking the IOBL_BUF_RING flag. Remove the buf_nr_pages field and update related comments. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-20  timers/migration: Exclude isolated cpus from hierarchy  (Gabriele Monaco, 1 file, -0/+9)
The timer migration mechanism allows active CPUs to pull timers from idle ones to improve the overall idle time. This is however undesired when CPU intensive workloads run on isolated cores, as the algorithm would move the timers from housekeeping to isolated cores, negatively affecting the isolation. Exclude isolated cores from the timer migration hierarchy.