path: root/include/linux
Age | Commit message | Author | Files | Lines
2026-02-23 | sched/fair: More complex proportional newidle balance | Peter Zijlstra | 1 | -0/+1
It turns out that a few workloads (easyWave, fio) have a fairly low success rate on newidle balance, but still benefit greatly from having it anyway. Luckily these workloads have a fairly low newidle rate, so the cost of doing the newidle is relatively low, even if unsuccessful. Add a simple rate based part to the newidle ratio compute, such that low rate newidle will still have a high newidle ratio. This cures the easyWave and fio workloads while not affecting the schbench numbers either (which have a very high newidle rate). Reported-by: Mario Roy <marioeroy@gmail.com> Reported-by: "Mohamed Abuelfotoh, Hazem" <abuehaze@amazon.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Mario Roy <marioeroy@gmail.com> Tested-by: "Mohamed Abuelfotoh, Hazem" <abuehaze@amazon.com> Link: https://patch.msgid.link/20260127151748.GA1079264@noisy.programming.kicks-ass.net
2026-02-23 | rseq: slice ext: Ensure rseq feature size differs from original rseq size | Mathieu Desnoyers | 1 | -0/+12
Before rseq became extensible, its original size was 32 bytes even though the active rseq area was only 20 bytes. This had the following impact in terms of userspace ecosystem evolution: * The GNU libc between 2.35 and 2.39 expose a __rseq_size symbol set to 32, even though the size of the active rseq area is really 20. * The GNU libc 2.40 changes this __rseq_size to 20, thus making it express the active rseq area. * Starting from glibc 2.41, __rseq_size corresponds to the AT_RSEQ_FEATURE_SIZE from getauxval(3). This means that users of __rseq_size can always expect it to correspond to the active rseq area, except for the value 32, for which the active rseq area is 20 bytes. Exposing a 32 bytes feature size would make life needlessly painful for userspace. Therefore, add a reserved field at the end of the rseq area to bump the feature size to 33 bytes. This reserved field is expected to be replaced with whatever field will come next, expecting that this field will be larger than 1 byte. The effect of this change is to increase the size from 32 to 64 bytes before we actually have fields using that memory. Clarify the allocation size and alignment requirements in the struct rseq uapi comment. Change the value returned by getauxval(AT_RSEQ_ALIGN) to return the value of the active rseq area size rounded up to next power of 2, which guarantees that the rseq structure will always be aligned on the nearest power of two large enough to contain it, even as it grows. Change the alignment check in the rseq registration accordingly. This will minimize the amount of ABI corner-cases we need to document and require userspace to play games with. 
The rule stays simple when __rseq_size != 32: #define rseq_field_available(field) (__rseq_size >= offsetofend(struct rseq_abi, field)) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260220200642.1317826-3-mathieu.desnoyers@efficios.com
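The size rule above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct rseq_abi_sketch` is a stand-in rather than the uapi definition, the `placeholder` and `reserved` fields are hypothetical, and `rseq_active_size()` and `roundup_pow2()` are helper names invented here. The macro takes the reported size as a parameter instead of reading `__rseq_size`, so it also covers the legacy value 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the rseq area; NOT the uapi definition. */
struct rseq_abi_sketch {
	uint32_t cpu_id_start;	/* bytes 0..3 */
	uint32_t cpu_id;	/* bytes 4..7 */
	uint64_t rseq_cs;	/* bytes 8..15 */
	uint32_t flags;		/* bytes 16..19, end of the original 20-byte area */
	uint32_t node_id;	/* bytes 20..23 */
	uint32_t mm_cid;	/* bytes 24..27 */
	uint32_t placeholder;	/* bytes 28..31, hypothetical field */
	uint8_t reserved;	/* byte 32, hypothetical 1-byte tail (size 33) */
};

#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* A reported size of 32 is the legacy value whose active area is 20 bytes. */
static unsigned int rseq_active_size(unsigned int rseq_size)
{
	return rseq_size == 32 ? 20 : rseq_size;
}

/* A field is usable only if it lies entirely inside the active area. */
#define rseq_field_available(rseq_size, field) \
	(rseq_active_size(rseq_size) >= \
	 offsetofend(struct rseq_abi_sketch, field))

/* Round up to the next power of two, as described for AT_RSEQ_ALIGN. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}
```

With these definitions, a reported size of 32 exposes `flags` but not `node_id`, while a feature size of 33 rounds up to an alignment of 64.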
2026-02-23 | rseq: Mark rseq_arm_slice_extension_timer() __always_inline | Arnd Bergmann | 1 | -4/+4
objtool warns about this function being called inside of a uaccess section: kernel/entry/common.o: warning: objtool: irqentry_exit+0x1dc: call to rseq_arm_slice_extension_timer() with UACCESS enabled Interestingly, this happens with CONFIG_RSEQ_SLICE_EXTENSION disabled, so this is an empty function, as the normal implementation is already marked __always_inline. I could reproduce this multiple times with gcc-11 but not with gcc-15, so the compiler probably got better at identifying the trivial function. Mark all the empty helpers for !RSEQ_SLICE_EXTENSION as __always_inline for consistency, avoiding this warning. Fixes: 0ac3b5c3dc45 ("rseq: Implement time slice extension enforcement timer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260206074122.709580-1-arnd@kernel.org
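The stub pattern this commit touches can be sketched as follows. All names are hypothetical (`SKETCH_FEATURE` stands in for the config option and `sketch_arm_timer()` for the helper), and the `__always_inline` fallback definition is an assumption for building outside the kernel tree.

```c
#include <assert.h>

/* Fallback for userspace builds; the kernel provides its own definition. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

static int sketch_timer_armed;

#ifdef SKETCH_FEATURE
/* Real implementation when the (hypothetical) feature is enabled. */
static __always_inline void sketch_arm_timer(void)
{
	sketch_timer_armed = 1;
}
#else
/*
 * Empty stub. Forcing it inline means no out-of-line call can be
 * emitted, whatever the compiler's own inlining heuristics decide.
 */
static __always_inline void sketch_arm_timer(void) { }
#endif

/* Caller standing in for an exit path that must not contain calls. */
static int sketch_exit_path(void)
{
	int before = sketch_timer_armed;

	sketch_arm_timer();
#ifndef SKETCH_FEATURE
	assert(sketch_timer_armed == before);	/* the stub has no effect */
#endif
	(void)before;
	return sketch_timer_armed;
}
```

Marking both variants the same way keeps the enabled and disabled builds consistent, which is the stated motivation above.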
2026-02-23 | sched/fair: Fix lag clamp | Peter Zijlstra | 1 | -0/+1
Vincent reported that he was seeing undue lag clamping in a mixed slice workload. Implement the max_slice tracking as per the todo comment. Fixes: 147f3efaa241 ("sched/fair: Implement an EEVDF-like scheduling policy") Reported-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Shubhang Kaushik <shubhang@os.amperecomputing.com> Link: https://patch.msgid.link/20250422101628.GA33555@noisy.programming.kicks-ass.net
2026-02-22 | Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux | Linus Torvalds | 1 | -8/+7
Pull fsverity fixes from Eric Biggers: - Fix a build error on parisc - Remove the non-large-folio-aware function fsverity_verify_page() * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: fix build error by adding fsverity_readahead() stub fsverity: remove fsverity_verify_page() f2fs: make f2fs_verify_cluster() partially large-folio-aware f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster()
2026-02-21 | Convert 'alloc_obj' family to use the new default GFP_KERNEL argument | Linus Torvalds | 5 | -5/+5
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 | add default_gfp() helper macro and use it in the new *alloc_obj() helpers | Linus Torvalds | 2 | -24/+28
Most simple allocations use GFP_KERNEL, and with the new allocation helpers being introduced, let's just take advantage of that to simplify that default case. It's a numbers game: git grep 'alloc_obj(' | sed 's/.*\(GFP_[_A-Z]*\).*/\1/' | sort | uniq -c | sort -n | tail shows that about 90% of all those new allocator instances just use that standard GFP_KERNEL. Those helpers are already macros, and we can easily just make it be the default case when the gfp argument is missing. And yes, we could do that for all the legacy interfaces too, but let's keep it to just the new ones at least for now, since those all got converted recently anyway, so this is not any "extra" noise outside of that limited conversion. And, in fact, I want to do this before doing the -rc1 release, exactly so that we don't get extra merge conflicts. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
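One portable way to give a macro a default trailing argument is sketched below. It is not the kernel's `default_gfp()` implementation. Every `sketch_`/`SKETCH_` name is invented for illustration, and `calloc()` stands in for the real allocator. Unlike the kernel helpers, this sketch accepts only a type name, not `*VAR`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values standing in for gfp flags. */
enum sketch_gfp { SKETCH_GFP_KERNEL = 1, SKETCH_GFP_ATOMIC = 2 };

/* Records the flags of the most recent call so the default is observable. */
static enum sketch_gfp sketch_last_gfp;

static void *sketch_alloc(size_t size, enum sketch_gfp gfp)
{
	assert(size > 0);
	sketch_last_gfp = gfp;
	return calloc(1, size);
}

/* Pick the first or second argument out of a variadic list. */
#define SKETCH_ARG1(first, ...) first
#define SKETCH_ARG2(first, second, ...) second

/*
 * sketch_alloc_obj(TYPE) or sketch_alloc_obj(TYPE, flags).
 * Appending the default after the caller's arguments means it lands in
 * the "second" slot only when the caller passed no flags of their own.
 */
#define sketch_alloc_obj(...) \
	((SKETCH_ARG1(__VA_ARGS__, 0) *)sketch_alloc( \
		sizeof(SKETCH_ARG1(__VA_ARGS__, 0)), \
		SKETCH_ARG2(__VA_ARGS__, SKETCH_GFP_KERNEL, 0)))

struct sketch_item {
	int id;
	long payload[4];
};
```

Calling `sketch_alloc_obj(struct sketch_item)` passes `SKETCH_GFP_KERNEL`, while `sketch_alloc_obj(struct sketch_item, SKETCH_GFP_ATOMIC)` passes the explicit flag. Both return a `struct sketch_item *` rather than `void *`.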
2026-02-21 | slab.h: disable completely broken overflow handling in flex allocations | Linus Torvalds | 2 | -6/+2
Commit 69050f8d6d07 ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types") started using the new allocation helpers, and in the process showed that they were completely non-working. The overflow logic in overflows_flex_counter_type() is completely the wrong way around, and that broke __alloc_flex() completely. By chance, the resulting code was then such a mess that clang generated sufficiently garbage code that objtool warned about it all. Which made it somewhat quicker to narrow things down. While fixing overflows_flex_counter_type() would presumably fix this all, I'm excising the whole broken overflow logic from __alloc_flex(), because we don't want that kind of code in basic allocation functions anyway. That (no longer) broken overflows_flex_counter_type() thing needs to be inserted into the actual __set_flex_counter() logic in the unlikely case that we ever want this at all. And made conditional. Fixes: 81cee9166a90 ("compiler_types: Introduce __flex_counter() and family") Fixes: 69050f8d6d07 ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types") Cc: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/all/CAHk-=whEd020BYzGTzYrENjD9Z5_82xx6h8HsQvH5xDSnv0=Hw@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 | Merge tag 'kmalloc_obj-treewide-v7.0-rc1' of ↵ | Linus Torvalds | 10 | -11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kmalloc_obj conversion from Kees Cook: "This does the tree-wide conversion to kmalloc_obj() and friends using coccinelle, with a subsequent small manual cleanup of whitespace alignment that coccinelle does not handle. This uncovered a clang bug in __builtin_counted_by_ref(), so the conversion is preceded by disabling that for current versions of clang. The imminent clang 22.1 release has the fix. I've done allmodconfig build tests for x86_64, arm64, i386, and arm. I did defconfig builds for alpha, m68k, mips, parisc, powerpc, riscv, s390, sparc, sh, arc, csky, xtensa, hexagon, and openrisc" * tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kmalloc_obj: Clean up after treewide replacements treewide: Replace kmalloc with kmalloc_obj for non-scalar types compiler_types: Disable __builtin_counted_by_ref for Clang
2026-02-21 | Merge tag 'ntb-7.0' of https://github.com/jonmason/ntb | Linus Torvalds | 1 | -14/+0
Pull NTB (PCIe non-transparent bridge) updates from Jon Mason: "NTB updates include debugfs improvements, correctness fixes, cleanups, and new hardware support: ntb_transport QP stats are converted to seq_file, a tx_memcpy_offload module parameter is introduced with associated ordering fixes, and a debugfs queue name truncation bug is corrected. Additional fixes address format specifier mismatches in ntb_tool and boundary conditions in the Switchtec driver, while unused MSI helpers are removed and the codebase migrates to dma_map_phys(). Intel Gen6 (Diamond Rapids) NTB support is also added" * tag 'ntb-7.0' of https://github.com/jonmason/ntb: NTB: ntb_transport: Use seq_file for QP stats debugfs NTB: ntb_transport: Fix too small buffer for debugfs_name ntb/ntb_tool: correct sscanf format for u64 and size_t in tool_peer_mw_trans_write ntb: intel: Add Intel Gen6 NTB support for DiamondRapids NTB/msi: Remove unused functions ntb: ntb_hw_switchtec: Increase MAX_MWS limit to 256 ntb: ntb_hw_switchtec: Fix array-index-out-of-bounds access ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut NTB: epf: allow built-in build ntb: migrate to dma_map_phys instead of map_page NTB: ntb_transport: Add 'tx_memcpy_offload' module option NTB: ntb_transport: Remove unused 'retries' field from ntb_queue_entry
2026-02-21 | Merge tag 'io_uring-20260221' of ↵ | Linus Torvalds | 1 | -4/+11
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - A fix for a missing URING_CMD128 opcode check, fixing an issue with the SQE mixed mode support introduced in 6.19. Merged late due to having multiple dependencies - Add sqe->cmd size checking for big SQEs, similar to what we have for normal sized SQEs - Fix a race condition in zcrx, that leads to a double free * tag 'io_uring-20260221' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring: Add size check for sqe->cmd io_uring: add IORING_OP_URING_CMD128 to opcode checks io_uring/zcrx: fix user_ref race between scrub and refill paths
2026-02-21 | kmalloc_obj: Clean up after treewide replacements | Kees Cook | 2 | -2/+2
Coccinelle doesn't handle re-indenting line escapes. Fix the 2 places where these got misaligned. Remove 2 now-redundant type casts, found with: $ git grep -P 'struct (\S+).*\)\s*k\S+alloc_(objs?|flex)\(struct \1' Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 | treewide: Replace kmalloc with kmalloc_obj for non-scalar types | Kees Cook | 9 | -10/+9
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
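The three call shapes listed above can be imitated in userspace to show why the typed forms are convenient. This is a sketch only: the `sketch_*` macros are hypothetical stand-ins for `kmalloc_obj()`, `kmalloc_objs()` and `kmalloc_flex()`, they accept only type names, they zero the memory via `calloc()`, and the overflow check is this sketch's own choice rather than anything the kernel helpers are claimed to do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* base + elem * count, or SIZE_MAX if that would overflow size_t. */
static size_t sketch_flex_size(size_t base, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + elem * count;
}

/* Zeroing allocator that refuses the overflow sentinel. */
static void *sketch_zalloc(size_t size)
{
	return size == SIZE_MAX ? NULL : calloc(1, size);
}

/* Single object: the result is already a TYPE pointer, no cast at the call site. */
#define sketch_obj(TYPE) \
	((TYPE *)sketch_zalloc(sizeof(TYPE)))

/* Array of COUNT objects. */
#define sketch_objs(TYPE, COUNT) \
	((TYPE *)sketch_zalloc(sketch_flex_size(0, sizeof(TYPE), (COUNT))))

/* Object with a trailing flexible array member FAM of COUNT elements. */
#define sketch_flex(TYPE, FAM, COUNT) \
	((TYPE *)sketch_zalloc(sketch_flex_size(sizeof(TYPE), \
		sizeof(((TYPE *)0)->FAM[0]), (COUNT))))

struct sketch_pair {
	int a;
	int b;
};

struct sketch_msg {
	uint32_t len;
	uint8_t data[];	/* flexible array member */
};
```

Because each macro returns `TYPE *`, assigning the result to a pointer of the wrong type draws a compiler diagnostic, which a `void *` return would silently accept.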
2026-02-21 | compiler_types: Disable __builtin_counted_by_ref for Clang | Kees Cook | 1 | -1/+2
Unfortunately, there is a corner case of __builtin_counted_by_ref() usage that crashes[1] Clang since support was introduced in Clang 19. Disable it prior to Clang 22. Found while testing the kmalloc_obj treewide refactoring (via kmalloc_flex() usage). Link: https://github.com/llvm/llvm-project/issues/182575 [1] Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-20 | Merge tag 'trace-v7.0-2' of ↵ | Linus Torvalds | 2 | -3/+15
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix possible dereference of uninitialized pointer When validating the persistent ring buffer on boot up, if the first validation fails, a reference to "head_page" is performed in the error path, but it skips over the initialization of that variable. Move the initialization before the first validation check. - Fix use of event length in validation of persistent ring buffer On boot up, the persistent ring buffer is checked to see if it is valid by several methods. One being to walk all the events in the memory location to make sure they are all valid. The length of the event is used to move to the next event. This length is determined by the data in the buffer. If that length is corrupted, it could possibly make the next event to check located at a bad memory location. Validate the length field of the event when doing the event walk. - Fix function graph on archs that do not support use of ftrace_ops When an architecture defines HAVE_DYNAMIC_FTRACE_WITH_ARGS, it means that its function graph tracer uses the ftrace_ops of the function tracer to call its callbacks. This allows a single registered callback to be called directly instead of checking the callback's meta data's hash entries against the function being traced. For architectures that do not support this feature, it must always call the loop function that tests each registered callback (even if there's only one). The loop function tests each callback's meta data against its hash of functions and will call its callback if the function being traced is in its hash map. The issue was that there was no check against this and the direct function was being called even if the architecture didn't support it. 
This meant that if function tracing was enabled at the same time as a callback was registered with the function graph tracer, its callback would be called for every function that the function tracer also traced, even if the callback's meta data only wanted to be called back for a small subset of functions. Prevent the direct calling for those architectures that do not support it. - Fix references to trace_event_file for hist files The hist files used event_file_data() to get a reference to the associated trace_event_file the histogram was attached to. This would return a pointer even if the trace_event_file is about to be freed (via RCU). Instead it should use the event_file_file() helper that returns NULL if the trace_event_file is marked to be freed so that no new references are added to it. - Wake up hist poll readers when an event is being freed When polling on a hist file, the task is only awoken when a hist trigger is triggered. This means that if an event is being freed while there's a task waiting on its hist file, it will need to wait until the hist trigger occurs to wake it up and allow the freeing to happen. Note, the event will not be completely freed until all references are removed, and a hist poller keeps a reference. But it should still be woken when the event is being freed. * tag 'trace-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Wake up poll waiters for hist files when removing an event tracing: Fix checking of freed trace_event_file for hist files fgraph: Do not call handlers direct when not using ftrace_ops tracing: ring-buffer: Fix to check event length before using ring-buffer: Fix possible dereference of uninitialized pointer
2026-02-20 | NTB/msi: Remove unused functions | Dr. David Alan Gilbert | 1 | -14/+0
ntbm_msi_free_irq() and ntb_msi_peer_addr() were both added in 2019's commit 26b3a37b9284 ("NTB: Introduce MSI library") but have remained unused. Remove them, and the ntbm_msi_callback_match() helper that was used by ntbm_msi_free_irq(). Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2026-02-19 | tracing: Wake up poll waiters for hist files when removing an event | Petr Pavlu | 1 | -0/+5
The event_hist_poll() function attempts to verify whether an event file is being removed, but this check may not occur or could be unnecessarily delayed. This happens because hist_poll_wakeup() is currently invoked only from event_hist_trigger() when a hist command is triggered. If the event file is being removed, no associated hist command will be triggered and a waiter will be woken up only after an unrelated hist command is triggered. Fix the issue by adding a call to hist_poll_wakeup() in remove_event_file_dir() after setting the EVENT_FILE_FL_FREED flag. This ensures that a task polling on a hist file is woken up and receives EPOLLERR. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://patch.msgid.link/20260219162737.314231-3-petr.pavlu@suse.com Fixes: 1bd13edbbed6 ("tracing/hist: Add poll(POLLIN) support on hist file") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
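From the userspace side, a waiter on a hist file might distinguish "new data" from "the event went away" roughly as sketched below. The function name and result enum are hypothetical, and `POLLERR` is used as the poll(2) counterpart of the `EPOLLERR` mentioned above. The sketch takes an already opened file descriptor and makes no assumption about tracefs paths.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum sketch_poll_result {
	SKETCH_FAILED = -1,	/* poll() itself failed */
	SKETCH_TIMEOUT = 0,	/* nothing happened within timeout_ms */
	SKETCH_READABLE = 1,	/* POLLIN: new data to read */
	SKETCH_GONE = 2,	/* error, hangup or invalid descriptor */
};

/* Wait on fd for up to timeout_ms milliseconds and classify the outcome. */
static enum sketch_poll_result sketch_wait_hist(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return SKETCH_FAILED;
	if (ret == 0)
		return SKETCH_TIMEOUT;
	/* Error conditions are reported even though only POLLIN was requested. */
	if (pfd.revents & (POLLERR | POLLNVAL))
		return SKETCH_GONE;
	if (pfd.revents & POLLIN)
		return SKETCH_READABLE;
	return SKETCH_GONE;	/* e.g. POLLHUP without readable data */
}
```

A caller would typically loop on `SKETCH_READABLE`, re-reading the file each time, and stop and close the descriptor on `SKETCH_GONE`.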
2026-02-19 | fgraph: Do not call handlers direct when not using ftrace_ops | Steven Rostedt | 1 | -3/+10
The function graph tracer was modified to use the ftrace_ops of the function tracer. This simplified the code as well as allowed more features of the function graph tracer. Not all architectures were converted over, as the conversion requires HAVE_DYNAMIC_FTRACE_WITH_ARGS to be implemented. For those architectures, it still did it the old way where the function graph tracer handler was called by the function tracer trampoline. The handler then had to check the hash to see if the registered handlers wanted to be called by that function or not. In order to speed up the function graph tracer that used ftrace_ops, if only one callback was registered with function graph, it would call its function directly via a static call. Now, if the architecture does not support the use of ftrace_ops and still has the ftrace function trampoline calling the function graph handler, then by doing a direct call it removes the check against the handler's hash (list of functions it wants callbacks to), and it may call that handler for functions that the handler did not request calls for. On 32bit x86, which does not support the ftrace_ops use with function graph tracer, it shows the issue: ~# trace-cmd start -p function -l schedule ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 2) * 11898.94 us | schedule(); 3) # 1783.041 us | schedule(); 1) | schedule() { ------------------------------------------ 1) bash-8369 => kworker-7669 ------------------------------------------ 1) | schedule() { ------------------------------------------ 1) kworker-7669 => bash-8369 ------------------------------------------ 1) + 97.004 us | } 1) | schedule() { [..] 
Now start the function tracer in another instance: ~# trace-cmd start -B foo -p function This causes the function graph tracer to trace all functions (because the function tracer calls the function graph tracer for each one, and the function graph tracer is doing a direct call): ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 1) 1.669 us | } /* preempt_count_sub */ 1) + 10.443 us | } /* _raw_spin_unlock_irqrestore */ 1) | tick_program_event() { 1) | clockevents_program_event() { 1) 1.044 us | ktime_get(); 1) 6.481 us | lapic_next_event(); 1) + 10.114 us | } 1) + 11.790 us | } 1) ! 181.223 us | } /* hrtimer_interrupt */ 1) ! 184.624 us | } /* __sysvec_apic_timer_interrupt */ 1) | irq_exit_rcu() { 1) 0.678 us | preempt_count_sub(); When it should still only be tracing the schedule() function. To fix this, add a macro FGRAPH_NO_DIRECT to be set to 0 when the architecture does not support function graph use of ftrace_ops, and set to 1 otherwise. Then use this macro to decide whether the function graph tracer may call the handlers directly. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://patch.msgid.link/20260218104244.5f14dade@gandalf.local.home Fixes: cc60ee813b503 ("function_graph: Use static_call and branch to optimize entry function") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
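The difference between the direct call and the filtering loop can be modeled in a few lines. This is a toy model, not kernel code: `SKETCH_CALLER_FILTERS` is a hypothetical switch (it is not FGRAPH_NO_DIRECT and makes no claim about that macro's polarity), and a string compare stands in for the per-handler hash lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical switch: 1 when the caller only ever invokes the dispatcher
 * for functions the single handler asked for, 0 when it may be invoked
 * for any traced function.
 */
#ifndef SKETCH_CALLER_FILTERS
#define SKETCH_CALLER_FILTERS 0
#endif

struct sketch_handler {
	const char *wanted;	/* the one function this handler asked for */
	int hits;		/* how many times the handler was called */
};

/* Stand-in for checking a function against the handler's hash. */
static bool sketch_wants(const struct sketch_handler *h, const char *func)
{
	return strcmp(h->wanted, func) == 0;
}

static void sketch_dispatch(struct sketch_handler *hs, size_t n,
			    const char *func)
{
	/* Fast path: skipping the check is only safe if the caller filtered. */
	if (SKETCH_CALLER_FILTERS && n == 1) {
		hs[0].hits++;
		return;
	}
	/* Slow path: every handler is tested, even when there is only one. */
	for (size_t i = 0; i < n; i++)
		if (sketch_wants(&hs[i], func))
			hs[i].hits++;
}
```

With the switch left at 0, a handler registered for `schedule` is not called for `ktime_get`. Building with the switch at 1 while the caller does not filter would reproduce the over-tracing described above.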
2026-02-19 | Merge tag 'net-7.0-rc1' of ↵ | Linus Torvalds | 6 | -7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter. Current release - new code bugs: - net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT - eth: mlx5e: XSK, Fix unintended ICOSQ change - phy_port: correctly recompute the port's linkmodes - vsock: prevent child netns mode switch from local to global - couple of kconfig fixes for new symbols Previous releases - regressions: - nfc: nci: fix false-positive parameter validation for packet data - net: do not delay zero-copy skbs in skb_attempt_defer_free() Previous releases - always broken: - mctp: ensure our nlmsg responses to user space are zero-initialised - ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data() - fixes for ICMP rate limiting Misc: - intel: fix PCI device ID conflict between i40e and ipw2200" * tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: nfc: nci: Fix parameter validation for packet data net/mlx5e: Use unsigned for mlx5e_get_max_num_channels net/mlx5e: Fix deadlocks between devlink and netdev instance locks net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event net/mlx5: Fix misidentification of write combining CQE during poll loop net/mlx5e: Fix misidentification of ASO CQE during poll loop net/mlx5: Fix multiport device check over light SFs bonding: alb: fix UAF in rlb_arp_recv during bond up/down bnge: fix reserving resources from FW eth: fbnic: Advertise supported XDP features. 
rds: tcp: fix uninit-value in __inet_bind net/rds: Fix NULL pointer dereference in rds_tcp_accept_one octeontx2-af: Fix default entries mcam entry action net/mlx5e: XSK, Fix unintended ICOSQ change ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow() inet: move icmp_global_{credit,stamp} to a separate cache line icmp: prevent possible overflow in icmp_global_allow() selftests/net: packetdrill: add ipv4-mapped-ipv6 tests ...
2026-02-19 | net/mlx5: Fix multiport device check over light SFs | Shay Drory | 1 | -2/+2
The driver uses the num_vhca_ports capability to distinguish between a multiport master device and a multiport slave device. num_vhca_ports is a capability the driver sets according to the MAX num_vhca_ports capability reported by FW. Light SFs, on the other hand, do not set this capability. This leads to wrong results whenever a light SF checks whether it is a multiport master or slave. Therefore, use the MAX capability to distinguish between master and slave devices. Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-19 | io_uring: Add size check for sqe->cmd | Govindarajulu Varadarajan | 1 | -4/+11
For SQE128, sqe->cmd provides 80 bytes for uring_cmd. Add macro to check if size of user struct does not exceed 80 bytes at compile time. User doesn't have to track this manually during development. Replace io_uring_sqe_cmd() inline func with macro and add io_uring_sqe128_cmd() which checks struct size for 16 bytes cmd and 80 bytes cmd respectively. Signed-off-by: Govindarajulu Varadarajan <govind.varadar@gmail.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
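A compile-time size check of this kind can be sketched with C11 `_Static_assert`. The structures and the macro below are hypothetical stand-ins, not the io_uring definitions: only the 16-byte and 80-byte command areas are taken from the text above, and the 48-byte header is this sketch's assumption. The sketch copies the payload with `memcpy()` instead of returning a typed pointer, and one macro serves both layouts because it compares against the size of the destination array.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for a normal and a big SQE; only the cmd sizes matter here. */
struct sketch_sqe {
	unsigned char hdr[48];	/* assumed header size for this sketch */
	unsigned char cmd[16];
};

struct sketch_sqe128 {
	unsigned char hdr[48];
	unsigned char cmd[80];
};

_Static_assert(sizeof(struct sketch_sqe) == 64, "normal SQE is 64 bytes");
_Static_assert(sizeof(struct sketch_sqe128) == 128, "big SQE is 128 bytes");

/*
 * Copy *valp into sqe->cmd. The sizeof() around an anonymous struct only
 * exists to host the _Static_assert inside an expression; it is never
 * evaluated. An oversized payload therefore fails to compile.
 */
#define SKETCH_CMD_STORE(sqe, valp) \
	((void)sizeof(struct { \
		_Static_assert(sizeof(*(valp)) <= sizeof((sqe)->cmd), \
			       "payload does not fit in sqe->cmd"); \
		char unused; \
	}), \
	memcpy((sqe)->cmd, (valp), sizeof(*(valp))))

/* Example payloads that exactly fill each command area. */
struct sketch_small_cmd {
	unsigned char bytes[16];
};

struct sketch_big_cmd {
	unsigned char bytes[80];
};
```

Storing a `struct sketch_big_cmd` into a `struct sketch_sqe` is rejected at build time, so the limit does not have to be tracked by hand.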
2026-02-18 | Merge tag 'mm-stable-2026-02-18-19-48' of ↵ | Linus Torvalds | 11 | -76/+331
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a couple of issues in the demotion code - pages were failing demotion and were finding themselves demoted into disallowed nodes (Bing Jiao) - "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare maple tree race and performs a number of cleanups (Liam Howlett) - "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use them" implements a lot of cleanups following on from the conversion of the VMA flags into a bitmap (Lorenzo Stoakes) - "support batch checking of references and unmapping for large folios" implements batching to greatly improve the performance of reclaiming clean file-backed large folios (Baolin Wang) - "selftests/mm: add memory failure selftests" does as claimed (Miaohe Lin) * tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits) mm/page_alloc: clear page->private in free_pages_prepare() selftests/mm: add memory failure dirty pagecache test selftests/mm: add memory failure clean pagecache test selftests/mm: add memory failure anonymous page test mm: rmap: support batched unmapping for file large folios arm64: mm: implement the architecture-specific clear_flush_young_ptes() arm64: mm: support batch clearing of the young flag for large folios arm64: mm: factor out the address and ptep alignment into a new helper mm: rmap: support batched checks of the references for large folios tools/testing/vma: add VMA userland tests for VMA flag functions tools/testing/vma: separate out vma_internal.h into logical headers tools/testing/vma: separate VMA userland tests into separate files mm: make vm_area_desc utilise vma_flags_t only mm: update all remaining mmap_prepare users to use vma_flags_t mm: update shmem_[kernel]_file_*() functions to use vma_flags_t mm: update secretmem to use VMA flags on mmap_prepare mm: update hugetlbfs to 
use VMA flags on mmap_prepare mm: add basic VMA flag operation helper functions tools: bitmap: add missing bitmap_[subset(), andnot()] mm: add mk_vma_flags() bitmap flag macro helper ...
2026-02-18 | Merge tag 'pm-7.0-rc1-2' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are mostly fixes on top of the power management updates merged recently in cpuidle governors, in the Intel RAPL power capping driver and in the wake IRQ management code: - Fix the handling of package-scope MSRs in the intel_rapl power capping driver when called from the PMU subsystem and make it add all package CPUs to the PMU cpumask to allow tools to read RAPL events from any CPU in the package (Kuppuswamy Satharayananyan) - Rework the invalid version check in the intel_rapl_tpmi power capping driver to account for the fact that on partitioned systems, multiple TPMI instances may exist per package, but RAPL registers are only valid on one instance (Kuppuswamy Satharayananyan) - Describe the new intel_idle.table command line option in the admin-guide intel_idle documentation (Artem Bityutskiy) - Fix a crash in the ladder cpuidle governor on systems with only one (polling) idle state available by making the cpuidle core bypass the governor in those cases and adjust the other existing governors to that change (Aboorva Devarajan, Christian Loehle) - Update kerneldoc comments for wake IRQ management functions that have not been matching the code (Wang Jiayue)" * tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: menu: Remove single state handling cpuidle: teo: Remove single state handling cpuidle: haltpoll: Remove single state handling cpuidle: Skip governor when only one idle state is available powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check PM: sleep: wakeirq: Update outdated documentation comments Documentation: PM: Document intel_idle.table command line option powercap: intel_rapl: Expose all package CPUs in PMU cpumask powercap: intel_rapl: Remove incorrect CPU check in PMU context
2026-02-18 | Merge tag 'sysctl-7.00-rc1' of ↵ | Linus Torvalds | 4 | -107/+17
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Remove macros from proc handler converters Replace the proc converter macros with "regular" functions. Though it is more verbose than the macro version, it helps when debugging and better aligns with coding-style.rst. - General cleanup Remove superfluous ctl_table forward declarations. Const qualify the memory_allocation_profiling_sysctl and loadpin_sysctl_table arrays. Add missing kernel doc to proc_dointvec_conv. - Testing This series was run through sysctl selftests/kunit test suite in x86_64. And went into linux-next after rc4, giving it a good 3 weeks of testing * tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: replace SYSCTL_INT_CONV_CUSTOM macro with functions sysctl: Replace unidirectional INT converter macros with functions sysctl: Add kernel doc to proc_douintvec_conv sysctl: Replace UINT converter macros with functions sysctl: Add CONFIG_PROC_SYSCTL guards for converter macros sysctl: clarify proc_douintvec_minmax doc sysctl: Return -ENOSYS from proc_douintvec_conv when CONFIG_PROC_SYSCTL=n sysctl: Remove unused ctl_table forward declarations loadpin: Implement custom proc_handler for enforce alloc_tag: move memory_allocation_profiling_sysctls into .rodata sysctl: Add missing kernel-doc for proc_dointvec_conv
2026-02-17 | fsverity: fix build error by adding fsverity_readahead() stub | Eric Biggers | 1 | -2/+7
hppa-linux-gcc 9.5.0 generates a call to fsverity_readahead() in f2fs_readahead() when CONFIG_FS_VERITY=n, because it fails to do the expected dead code elimination based on vi always being NULL. Fix the build error by adding an inline stub for fsverity_readahead(). Since it's just for opportunistic readahead, just make it a no-op. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602180838.pwICdY2r-lkp@intel.com/ Fixes: 45dcb3ac9832 ("f2fs: consolidate fsverity_info lookup") Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218012244.18536-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-17 | fsverity: remove fsverity_verify_page() | Eric Biggers | 1 | -6/+0
Now that fsverity_verify_page() has no callers, remove it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218010630.7407-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-17Merge tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds1-2/+3
Pull ceph updates from Ilya Dryomov: "This adds support for the upcoming aes256k key type in CephX that is based on Kerberos 5 and brings a bunch of assorted CephFS fixes from Ethan and Sam. One of Sam's patches in particular undoes a change in the fscrypt area that had an inadvertent side effect of making CephFS behave as if mounted with wsize=4096 and leading to the corresponding degradation in performance, especially for sequential writes" * tag 'ceph-for-7.0-rc1' of https://github.com/ceph/ceph-client: ceph: assert loop invariants in ceph_writepages_start() ceph: remove error return from ceph_process_folio_batch() ceph: fix write storm on fscrypted files ceph: do not propagate page array emplacement errors as batch errors ceph: supply snapshot context in ceph_uninline_data() ceph: supply snapshot context in ceph_zero_partial_object() libceph: adapt ceph_x_challenge_blob hashing and msgr1 message signing libceph: add support for CEPH_CRYPTO_AES256KRB5 libceph: introduce ceph_crypto_key_prepare() libceph: generalize ceph_x_encrypt_offset() and ceph_x_encrypt_buflen() libceph: define and enforce CEPH_MAX_KEY_LEN
2026-02-17Merge tag 'dmaengine-7.0-rc1' of ↵Linus Torvalds3-27/+27
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "Core: - Add Frank Li as subsystem reviewer to help with reviews New Support: - Mediatek support for Dimensity 6300 and 9200 controller - Qualcomm Kaanapali and Glymur GPI DMA engine - Synopsys DW AXI Agilex5 - Renesas RZ/V2N SoC - Atmel microchip lan9691-dma - Tegra ADMA tegra264 Updates: - sg_nents_for_dma() helper use in subsystem - pm_runtime_mark_last_busy() redundant call update for subsystem - Residue support for xilinx AXIDMA driver - Intel Max SGL Size Support and capabilities for DSA3.0 - AXI dma larger than 32bits address support" * tag 'dmaengine-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (64 commits) dmaengine: add Frank Li as reviewer dt-bindings: dma: qcom,gpi: Update max interrupts lines to 16 dmaengine: fsl-edma: don't explicitly disable clocks in .remove() dmaengine: xilinx: xdma: use sg_nents_for_dma() helper dmaengine: sh: use sg_nents_for_dma() helper dmaengine: sa11x0: use sg_nents_for_dma() helper dmaengine: qcom: bam_dma: use sg_nents_for_dma() helper dmaengine: qcom: adm: use sg_nents_for_dma() helper dmaengine: pxa-dma: use sg_nents_for_dma() helper dmaengine: lgm: use sg_nents_for_dma() helper dmaengine: k3dma: use sg_nents_for_dma() helper dmaengine: dw-axi-dmac: use sg_nents_for_dma() helper dmaengine: bcm2835-dma: use sg_nents_for_dma() helper dmaengine: axi-dmac: use sg_nents_for_dma() helper dmaengine: altera-msgdma: use sg_nents_for_dma() helper scatterlist: introduce sg_nents_for_dma() helper dmaengine: idxd: Add Max SGL Size Support for DSA3.0 dmaengine: idxd: Expose DSA3.0 capabilities through sysfs dmaengine: sh: rz-dmac: Make channel irq local dmaengine: pl08x: Fix comment stating the difference between PL080 and PL081 ...
2026-02-17Merge tag 'phy-for-7.0' of ↵Linus Torvalds4-7/+84
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "Core: - Add support for "rx-polarity" and "tx-polarity" device tree properties and phy common properties to manage this New Support: - Qualcomm Glymur PCIe Gen4 2-lanes PCIe phy, DP and edp phy, USB UNI PHY and SMB2370 eUSB2 repeater. SC8280xp QMP UFS PHY, Kaanapali PCIe phy and QMP PHY, QCS615 QMP USB3+DP PHY and driver support for that. - SpacemiT PCIe/combo PHY and K1 USB2 PHY driver. - HDMI 2.1 FRL configuration support and driver enabling for rockchip samsung-hdptx driver - TI TCAN1046 phy - Renesas RZ/V2H(P) and RZ/V2N usb3 - Mediatek MT8188 hdmi-phy - Google Tensor SoC USB PHY driver - Apple Type-C PHY Updates: - Subsystem conversion for clock round_rate() to determine_rate() - TI USB3 DT schema conversion - Samsung ExynosAutov920 usb3, combo hsphy and ssphy support" * tag 'phy-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (143 commits) phy: ti: phy-j721e-wiz: convert from divider_round_rate() to divider_determine_rate() dt-bindings: phy: ti,control-phy-otghs: convert to DT schema dt-bindings: phy: ti,phy-usb3: convert to DT schema phy: tegra: xusb: Remove unused powered_on variable phy: renesas: rcar-gen3-usb2: add regulator dependency phy: GOOGLE_USB: add TYPEC dependency phy: enter drivers/phy/Makefile even without CONFIG_GENERIC_PHY phy: renesas: rcar-gen3-usb2: Use mux-state for phyrst management phy: renesas: rcar-gen3-usb2: Add regulator for OTG VBUS control phy: renesas: rcar-gen3-usb2: Use devm_pm_runtime_enable() phy: renesas: rcar-gen3-usb2: Factor out VBUS control logic dt-bindings: phy: renesas,usb2-phy: Document RZ/G3E SoC dt-bindings: phy: renesas,usb2-phy: Document mux-states property dt-bindings: phy: renesas,usb2-phy: Document USB VBUS regulator phy: rockchip: samsung-hdptx: Add HDMI 2.1 FRL support phy: rockchip: samsung-hdptx: Extend rk_hdptx_phy_verify_hdmi_config() helper phy: rockchip: samsung-hdptx: Switch to driver specific HDMI config phy: rockchip: samsung-hdptx: Drop hw_rate driver data phy: rockchip: samsung-hdptx: Compute clk rate from PLL config phy: rockchip: samsung-hdptx: Cleanup *_cmn_init_seq lists ...
2026-02-17Merge tag 'soundwire-7.0-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: - support for Qualcomm v2.2.0 controllers - bus method updates for .probe(), .remove() and .shutdown() and remove function return value updates - Avell B.ON dmi-quirks mapping - mark cs42l45 codec as wake capable * tag 'soundwire-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: intel_ace2x: add SND_HDA_CORE dependency dt-bindings: soundwire: qcom: Add SoundWire v2.2.0 compatible soundwire: Use bus methods for .probe(), .remove() and .shutdown() soundwire: Make remove function return no value soundwire: dmi-quirks: add mapping for Avell B.ON (OEM rebranded of NUC15) soundwire: qcom: Use guard to avoid mixing cleanup and goto soundwire: intel_auxdevice: add cs42l45 codec to wake_capable_list
2026-02-17Merge tag 'spdx-7.0-rc1' of ↵Linus Torvalds2-7/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are two small changes that add some missing SPDX license lines to some core kernel files. These are: - adding SPDX license lines to kdb files - adding SPDX license lines to the remaining kernel/ files Both of these have been in linux-next for a while with no reported issues" * tag 'spdx-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: kernel: debug: Add SPDX license ids to kdb files kernel: add SPDX-License-Identifier lines
2026-02-17Merge tag 'usb-7.0-rc1' of ↵Linus Torvalds7-58/+69
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt updates from Greg KH: "Here is the "big" set of USB and Thunderbolt driver updates for 7.0-rc1. Overall more lines were removed than added, thanks to dropping the obsolete isp1362 USB host controller driver, always a nice change. Other than that, nothing major happening here, highlights are: - lots of dwc3 driver updates and new hardware support added - usb gadget function driver updates - usb phy driver updates - typec driver updates and additions - USB rust binding updates for syntax and formatting changes - more usb serial device ids added - other smaller USB core and driver updates and additions All of these have been in linux-next for a long time, with no reported problems" * tag 'usb-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (77 commits) usb: typec: ucsi: Add Thunderbolt alternate mode support usb: typec: hd3ss3220: Check if regulator needs to be switched usb: phy: tegra: parametrize PORTSC1 register offset usb: phy: tegra: parametrize HSIC PTS value usb: phy: tegra: return error value from utmi_wait_register usb: phy: tegra: cosmetic fixes dt-bindings: usb: renesas,usbhs: Add RZ/G3E SoC support usb: dwc2: fix resume failure if dr_mode is host usb: cdns3: fix role switching during resume usb: dwc3: gadget: Move vbus draw to workqueue context USB: serial: option: add Telit FN920C04 RNDIS compositions usb: dwc3: Log dwc3 address in traces usb: gadget: tegra-xudc: Add handling for BLCG_COREPLL_PWRDN usb: phy: tegra: add HSIC support usb: phy: tegra: use phy type directly usb: typec: ucsi: Enforce mode selection for cros_ec_ucsi usb: typec: ucsi: Support mode selection to activate altmodes usb: typec: Introduce mode_selection bit usb: typec: Implement mode selection usb: typec: Expose alternate mode priority via sysfs ...
2026-02-17Merge tag 'tty-7.0-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial driver updates from Greg KH: "Here is the small amount of tty and serial driver updates for 7.0-rc1. Nothing major in here at all, just some driver updates and minor tweaks and cleanups including: - sh-sci serial driver updates - 8250 driver updates - attempt to make the tty ports have their own workqueue, but was reverted after testing found it to have problems on some platforms. This will probably come back for 7.1 after it has been reworked and resubmitted - other tiny tty driver changes All of these have been in linux-next for a while with no reported problems" * tag 'tty-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (49 commits) Revert "tty: tty_port: add workqueue to flip TTY buffer" tty: tty_port: add workqueue to flip TTY buffer serial: 8250_pci: Remove custom deprecated baud setting routine serial: 8250_omap: Remove custom deprecated baud setting routine dt-bindings: serial: renesas,scif: Document RZ/G3L SoC serial: 8250: omap: set out-of-band wakeup if wakeup pinctrl exists tty: hvc-iucv: Remove KMSG_COMPONENT macro dt-bindings: serial: google,goldfish-tty: Convert to DT schema dt-bindings: serial: sh-sci: Fold single-entry compatibles into enum serial: 8250: 8250_omap.c: Clear DMA RX running status only after DMA termination is done serial: 8250: 8250_omap.c: Add support for handling UART error conditions serial: SH_SCI: improve "DMA support" prompt serial: Kconfig: fix ordering of entries for menu display serial: 8250: fix ordering of entries for menu display serial: imx: change SERIAL_IMX_CONSOLE to bool 8250_men_mcb: drop unneeded MODULE_ALIAS serial: men_z135_uart: drop unneeded MODULE_ALIAS dt-bindings: serial: renesas,rsci: Document RZ/V2H(P) and RZ/V2N SoCs serial: rsci: Convert to FIELD_MODIFY() dt-bindings: serial: 8250: add SpacemiT K3 UART compatible ...
2026-02-17Merge tag 'char-misc-7.0-rc1' of ↵Linus Torvalds12-74/+192
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc/IIO driver updates from Greg KH: "Here is the big set of char/misc/iio and other smaller driver subsystem changes for 7.0-rc1. Lots of little things in here, including: - Loads of iio driver changes and updates and additions - gpib driver updates - interconnect driver updates - i3c driver updates - hwtracing (coresight and intel) driver updates - deletion of the obsolete mwave driver - binder driver updates (rust and c versions) - mhi driver updates (causing a merge conflict, see below) - mei driver updates - fsi driver updates - eeprom driver updates - lots of other small char and misc driver updates and cleanups All of these have been in linux-next for a while, with no reported issues" * tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (297 commits) mux: mmio: fix regmap leak on probe failure rust_binder: return p from rust_binder_transaction_target_node() drivers: android: binder: Update ARef imports from sync::aref rust_binder: fix needless borrow in context.rs iio: magn: mmc5633: Fix Kconfig for combination of I3C as module and driver builtin iio: sca3000: Fix a resource leak in sca3000_probe() iio: proximity: rfd77402: Add interrupt handling support iio: proximity: rfd77402: Document device private data structure iio: proximity: rfd77402: Use devm-managed mutex initialization iio: proximity: rfd77402: Use kernel helper for result polling iio: proximity: rfd77402: Align polling timeout with datasheet iio: cros_ec: Allow enabling/disabling calibration mode iio: frequency: ad9523: correct kernel-doc bad line warning iio: buffer: buffer_impl.h: fix kernel-doc warnings iio: gyro: itg3200: Fix unchecked return value in read_raw MAINTAINERS: add entry for ADE9000 driver iio: accel: sca3000: remove unused last_timestamp field iio: accel: adxl372: remove unused int2_bitmask field iio: adc: ad7766: Use iio_trigger_generic_data_rdy_poll() iio: magnetometer: Remove IRQF_ONESHOT ...
2026-02-17Merge tag 'block-7.0-20260216' of ↵Linus Torvalds3-22/+42
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull more block updates from Jens Axboe: - Fix partial IOVA mapping cleanup in error handling - Minor prep series ignoring discard return value, as the inline value is always known - Ensure BLK_FEAT_STABLE_WRITES is set for drbd - Fix leak of folio in bio_iov_iter_bounce_read() - Allow IOC_PR_READ_* for read-only open - Another debugfs deadlock fix - A few doc updates * tag 'block-7.0-20260216' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: blk-mq: use NOIO context to prevent deadlock during debugfs creation blk-stat: convert struct blk_stat_callback to kernel-doc block: fix enum descriptions kernel-doc block: update docs for bio and bvec_iter block: change return type to void nvmet: