aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2025-10-30accel/ivpu: Add support for userptr buffer objectsJacek Lawrynowicz1-0/+52
Introduce a new ioctl `drm_ivpu_bo_create_from_userptr` that allows users to create GEM buffer objects from user pointers to memory regions. The user pointer must be page-aligned and the memory region must remain valid for the buffer object's lifetime. Userptr buffers enable direct use of mmapped files (e.g. inference weights) in NPU workloads without copying data to NPU buffer objects. This reduces memory usage and provides better flexibility for NPU applications. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20251029091752.203198-1-karol.wachowski@linux.intel.com
2025-10-30wifi: mac80211: add RX flag to report radiotap VHT informationBenjamin Berg2-1/+21
mac80211 already reports some basic information in the radiotap header with the known fields declared by the driver. However, drivers may want to report more accurate information and in that case the full VHT radiotap structure needs to be provided. Add a new RX_FLAG_RADIOTAP_VHT which is set when the VHT information should be pulled from the skb. Update the code to fill in the VHT fields to only do so when requested by the driver or if the information has not yet been set. This way the driver can fully control the information if it chooses so. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20251027142118.0bad1c307a21.I2cf285c20a822698039603f2af00ed9c548f2ee0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-29crypto: blake2b - Reimplement using library APIEric Biggers2-126/+0
Replace blake2b_generic.c with a new file blake2b.c which implements the BLAKE2b crypto_shash algorithms on top of the BLAKE2b library API. Change the driver name suffix from "-generic" to "-lib" to reflect that these algorithms now just use the (possibly arch-optimized) library. This closely mirrors crypto/{md5,sha1,sha256,sha512}.c. Remove include/crypto/internal/blake2b.h since it is no longer used. Likewise, remove struct blake2b_state from include/crypto/blake2b.h. Omit support for import_core and export_core, since there are no legacy drivers that need these for these algorithms. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-10-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29lib/crypto: blake2b: Add BLAKE2b library functionsEric Biggers2-13/+137
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API. This will allow in-kernel users such as btrfs to use BLAKE2b without going through the generic crypto layer. In addition, as usual the BLAKE2b crypto_shash algorithms will be reimplemented on top of this. Note: to create lib/crypto/blake2b.c I made a copy of lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b. This way, the BLAKE2s and BLAKE2b code is kept consistent. Therefore, it borrows the SPDX-License-Identifier and Copyright from lib/crypto/blake2s.c rather than crypto/blake2b_generic.c. The library API uses 'struct blake2b_ctx', consistent with other lib/crypto/ APIs. The existing 'struct blake2b_state' will be removed once the blake2b crypto_shash algorithms are updated to stop using it. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29byteorder: Add le64_to_cpu_array() and cpu_to_le64_array()Eric Biggers1-0/+16
Add le64_to_cpu_array() and cpu_to_le64_array(). These mirror the corresponding 32-bit functions. These will be used by the BLAKE2b code. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29lib/crypto: blake2s: Document the BLAKE2s library APIEric Biggers1-0/+58
Add kerneldoc for the BLAKE2s library API. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29lib/crypto: blake2s: Drop excessive const & rename block => dataEric Biggers1-7/+6
A couple more small cleanups to the BLAKE2s code before these things get propagated into the BLAKE2b code: - Drop 'const' from some non-pointer function parameters. It was a bit excessive and not conventional. - Rename 'block' argument of blake2s_compress*() to 'data'. This is for consistency with the SHA-* code, and also to avoid the implication that it points to a singular "block". No functional changes. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29lib/crypto: blake2s: Rename blake2s_state to blake2s_ctxEric Biggers1-30/+29
For consistency with the SHA-1, SHA-2, SHA-3 (in development), and MD5 library APIs, rename blake2s_state to blake2s_ctx. As a refresher, the ctx name: - Is a bit shorter. - Avoids confusion with the compression function state, which is also often called the state (but is just part of the full context). - Is consistent with OpenSSL. Not a big deal, of course. But consistency is nice. With a BLAKE2b library API about to be added, this is a convenient time to update this. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29lib/crypto: blake2s: Adjust parameter order of blake2s()Eric Biggers1-3/+3
Reorder the parameters of blake2s() from (out, in, key, outlen, inlen, keylen) to (key, keylen, in, inlen, out, outlen). This aligns BLAKE2s with the common conventions of pairing buffers and their lengths, and having outputs follow inputs. This is widely used elsewhere in lib/crypto/ and crypto/, and even elsewhere in the BLAKE2s code itself such as blake2s_init_key() and blake2s_final(). So blake2s() was a bit of an exception. Notably, this results in the same order as hmac_*_usingrawkey(). Note that since the type signature changed, it's not possible for a blake2s() call site to be silently missed. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251018043106.375964-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29scsi: ufs: core: Add a quirk to suppress link_startup_againAdrian Hunter1-0/+7
ufshcd_link_startup() has a facility (link_startup_again) to issue DME_LINKSTARTUP a 2nd time even though the 1st time was successful. Some older hardware benefits from that, however the behaviour is non-standard, and has been found to cause link startup to be unreliable for some Intel Alder Lake based host controllers. Add UFSHCD_QUIRK_PERFORM_LINK_STARTUP_ONCE to suppress link_startup_again, in preparation for setting the quirk for affected controllers. Fixes: 7dc9fb47bc9a ("scsi: ufs: ufs-pci: Add support for Intel ADL") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251024085918.31825-3-adrian.hunter@intel.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-29libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk()Nathan Chancellor1-1/+1
When building drivers/net/ethernet/intel/idpf/xsk.c for ARCH=arm with CONFIG_CFI=y using a version of LLVM prior to 22.0.0, there is a BUILD_BUG_ON failure: $ cat arch/arm/configs/repro.config CONFIG_BPF_SYSCALL=y CONFIG_CFI=y CONFIG_IDPF=y CONFIG_XDP_SOCKETS=y $ make -skj"$(nproc)" ARCH=arm LLVM=1 clean defconfig repro.config drivers/net/ethernet/intel/idpf/xsk.o In file included from drivers/net/ethernet/intel/idpf/xsk.c:4: include/net/libeth/xsk.h:205:2: error: call to '__compiletime_assert_728' declared with 'error' attribute: BUILD_BUG_ON failed: !__builtin_constant_p(tmo == libeth_xsktmo) 205 | BUILD_BUG_ON(!__builtin_constant_p(tmo == libeth_xsktmo)); | ^ ... libeth_xdp_tx_xmit_bulk() indirectly calls libeth_xsk_xmit_fill_buf() but these functions are marked as __always_inline so that the compiler can turn these indirect calls into direct ones and see that the tmo parameter to __libeth_xsk_xmit_fill_buf_md() is ultimately libeth_xsktmo from idpf_xsk_xmit(). Unfortunately, the generic kCFI pass in LLVM expands the kCFI bundles from the indirect calls in libeth_xdp_tx_xmit_bulk() in such a way that later optimizations cannot turn these calls into direct ones, making the BUILD_BUG_ON fail because it cannot be proved at compile time that tmo is libeth_xsktmo. Disable the generic kCFI pass for libeth_xdp_tx_xmit_bulk() to ensure these indirect calls can always be turned into direct calls to avoid this error. Closes: https://github.com/ClangBuiltLinux/linux/issues/2124 Fixes: 9705d6552f58 ("idpf: implement Rx path for AF_XDP") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20251025-idpf-fix-arm-kcfi-build-error-v1-3-ec57221153ae@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-10-29compiler_types: Introduce __nocfi_genericNathan Chancellor1-0/+6
There are two different ways that LLVM can expand kCFI operand bundles in LLVM IR: generically in the middle end or using an architecture specific sequence when lowering LLVM IR to machine code in the backend. The generic pass allows any architecture to take advantage of kCFI but the expansion of these bundles in the middle end can mess with optimizations that may turn indirect calls into direct calls when the call target is known at compile time, such as after inlining. Add __nocfi_generic, dependent on an architecture selecting CONFIG_ARCH_USES_CFI_GENERIC_LLVM_PASS, to disable kCFI bundle generation in functions where only the generic kCFI pass may cause problems. Link: https://github.com/ClangBuiltLinux/linux/issues/2124 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20251025-idpf-fix-arm-kcfi-build-error-v1-1-ec57221153ae@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-10-29Merge patch series "ufs: Add support for AMD Versal Gen2 UFS"Martin K. Petersen3-0/+55
Ajay Neeli <ajay.neeli@amd.com> says: This patch series adds support for the UFS driver on the AMD Versal Gen 2 SoC. It includes: - Device tree bindings and driver implementation. - Secure read support for the secure retrieval of UFS calibration values. The UFS host driver is based upon the Synopsis DesignWare (DWC) UFS architecture, utilizing the existing UFSHCD_DWC and UFSHCD_PLATFORM drivers. Link: https://patch.msgid.link/20251021113003.13650-1-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-29scsi: ufs: amd-versal2: Add UFS support for AMD Versal Gen 2 SoCSai Krishna Potthuri1-0/+1
Add support for the UFS host controller on the AMD Versal Gen 2 SoC, built on the Synopsys DWC UFS architecture, using the UFSHCD DWC and UFSHCD platform driver. This controller requires specific configurations like M-PHY/RMMI/UniPro and vendor specific registers programming before doing the UIC_LINKSTARTUP. Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Signed-off-by: Ajay Neeli <ajay.neeli@amd.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251021113003.13650-5-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-29scsi: firmware: xilinx: Add APIs for UFS PHY initializationAjay Neeli2-0/+39
- Add APIs for UFS PHY initialization. - Verify M-PHY TX-RX configuration readiness. - Confirm SRAM initialization and Set SRAM bypass. - Retrieve UFS calibration values. Signed-off-by: Ajay Neeli <ajay.neeli@amd.com> Acked-by: Senthil Nathan Thangaraj <senthilnathan.thangaraj@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251021113003.13650-4-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-29scsi: firmware: xilinx: Add support for secure read/write ioctl interfaceIzhar Ameer Shaikh1-0/+15
Add support for a generic ioctl read/write interface using which users can request firmware to perform read/write operations on a protected and secure address space. The functionality is introduced through the means of two new IOCTL IDs which extend the existing PM_IOCTL EEMI API: - IOCTL_READ_REG - IOCTL_MASK_WRITE_REG The caller only passes the node id of the given device and an offset. The base address is not exposed to the caller and internally retrieved by the firmware. Firmware will enforce an access policy on the incoming read/write request. Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@amd.com> Reviewed-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Signed-off-by: Ajay Neeli <ajay.neeli@amd.com> Acked-by: Senthil Nathan Thangaraj <senthilnathan.thangaraj@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251021113003.13650-3-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-29net: phy: add iterator mdiobus_for_each_phyHeiner Kallweit1-1/+10
Add an iterator for all PHY's on a MII bus, and phy_find_next() as a prerequisite. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/cd112f15-401a-43d9-8525-9ff0965a68cd@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29net: tls: Cancel RX async resync request on rcd_delta overflowShahar Shitrit1-0/+6
When a netdev issues a RX async resync request for a TLS connection, the TLS module handles it by logging record headers and attempting to match them to the tcp_sn provided by the device. If a match is found, the TLS module approves the tcp_sn for resynchronization. While waiting for a device response, the TLS module also increments rcd_delta each time a new TLS record is received, tracking the distance from the original resync request. However, if the device response is delayed or fails (e.g due to unstable connection and device getting out of tracking, hardware errors, resource exhaustion etc.), the TLS module keeps logging and incrementing, which can lead to a WARN() when rcd_delta exceeds the threshold. To address this, introduce tls_offload_rx_resync_async_request_cancel() to explicitly cancel resync requests when a device response failure is detected. Call this helper also as a final safeguard when rcd_delta crosses its threshold, as reaching this point implies that earlier cancellation did not occur. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761508983-937977-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29net: tls: Change async resync helpers argumentShahar Shitrit1-13/+8
Update tls_offload_rx_resync_async_request_start() and tls_offload_rx_resync_async_request_end() to get a struct tls_offload_resync_async parameter directly, rather than extracting it from struct sock. This change aligns the function signatures with the upcoming tls_offload_rx_resync_async_request_cancel() helper, which will be introduced in a subsequent patch. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761508983-937977-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29ipv6: icmp: Add RFC 5837 supportIdo Schimmel1-0/+1
Add the ability to append the incoming IP interface information to ICMPv6 error messages in accordance with RFC 5837 and RFC 4884. This is required for more meaningful traceroute results in unnumbered networks. The feature is disabled by default and controlled via a new sysctl ("net.ipv6.icmp.errors_extension_mask") which accepts a bitmask of ICMP extensions to append to ICMP error messages. Currently, only a single value is supported, but the interface and the implementation should be able to support more extensions, if needed. Clone the skb and copy the relevant data portions before modifying the skb as the caller of icmp6_send() still owns the skb after the function returns. This should be fine since by default ICMP error messages are rate limited to 1000 per second and no more than 1 per second per specific host. Trim or pad the packet to 128 bytes before appending the ICMP extension structure in order to be compatible with legacy applications that assume that the ICMP extension structure always starts at this offset (the minimum length specified by RFC 4884). Since commit 20e1954fe238 ("ipv6: RFC 4884 partial support for SIT/GRE tunnels") it is possible for icmp6_send() to be called with an skb that already contains ICMP extensions. This can happen when we receive an ICMPv4 message with extensions from a tunnel and translate it to an ICMPv6 message towards an IPv6 host in the overlay network. I could not find an RFC that supports this behavior, but it makes sense to not overwrite the original extensions that were appended to the packet. Therefore, avoid appending extensions if the length field in the provided ICMPv6 header is already filled. Export netdev_copy_name() using EXPORT_IPV6_MOD_GPL() to make it available to IPv6 when it is built as a module. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251027082232.232571-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29ipv4: icmp: Add RFC 5837 supportIdo Schimmel2-0/+33
Add the ability to append the incoming IP interface information to ICMPv4 error messages in accordance with RFC 5837 and RFC 4884. This is required for more meaningful traceroute results in unnumbered networks. The feature is disabled by default and controlled via a new sysctl ("net.ipv4.icmp_errors_extension_mask") which accepts a bitmask of ICMP extensions to append to ICMP error messages. Currently, only a single value is supported, but the interface and the implementation should be able to support more extensions, if needed. Clone the skb and copy the relevant data portions before modifying the skb as the caller of __icmp_send() still owns the skb after the function returns. This should be fine since by default ICMP error messages are rate limited to 1000 per second and no more than 1 per second per specific host. Trim or pad the packet to 128 bytes before appending the ICMP extension structure in order to be compatible with legacy applications that assume that the ICMP extension structure always starts at this offset (the minimum length specified by RFC 4884). Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251027082232.232571-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29tcp: add newval parameter to tcp_rcvbuf_grow()Eric Dumazet1-1/+1
This patch has no functional change, and prepares the following one. tcp_rcvbuf_grow() will need to have access to tp->rcvq_space.space old and new values. Change mptcp_rcvbuf_grow() in a similar way. Signed-off-by: Eric Dumazet <edumazet@google.com> [ Moved 'oldval' declaration to the next patch to avoid warnings at build time. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20251028-net-tcp-recv-autotune-v3-3-74b43ba4c84c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()Eric Dumazet1-0/+9
While chasing yet another receive autotuning bug, I found useful to add rcv_ssthresh, window_clamp and rcv_wnd. tcp_stream 40597 [068] 2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592 tcp_stream 40597 [068] 2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776 tcp_stream 40597 [068] 2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176 tcp_stream 40597 [068] 2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528 tcp_stream 40597 [068] 2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20251028-net-tcp-recv-autotune-v3-2-74b43ba4c84c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29fs: Make wbc_to_tag() inline and use it in fs.Julian Sun1-0/+7
The logic in wbc_to_tag() is widely used in file systems, so modify this function to be inline and use it in file systems. This patch has only passed compilation tests, but it should be fine. Signed-off-by: Julian Sun <sunjunchao@bytedance.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29writeback: allow the file system to override MIN_WRITEBACK_PAGESChristoph Hellwig2-0/+6
The relatively low minimal writeback size of 4MiB means that written back inodes on rotational media are switched a lot. Besides introducing additional seeks, this also can lead to extreme file fragmentation on zoned devices when a lot of files are cached relative to the available writeback bandwidth. Add a superblock field that allows the file system to override the default size. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251017034611.651385-3-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29mm: rename filemap_fdatawrite_range_kick to filemap_flush_rangeChristoph Hellwig1-3/+3
Rename filemap_fdatawrite_range_kick to filemap_flush_range because it is the ranged version of filemap_flush. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251024080431.324236-11-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29mm: remove __filemap_fdatawrite_rangeChristoph Hellwig1-2/+0
Use filemap_fdatawrite_range and filemap_fdatawrite_range_kick instead of the low-level __filemap_fdatawrite_range that requires the caller to know the internals of the writeback_control structure and remove __filemap_fdatawrite_range now that it is trivial and only two callers would be left. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251024080431.324236-10-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29mm: remove filemap_fdatawrite_wbcChristoph Hellwig1-2/+0
Replace filemap_fdatawrite_wbc, which exposes a writeback_control to the callers with a filemap_writeback helper that takes all the possible arguments and declares the writeback_control itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251024080431.324236-9-hch@lst.de Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29mm,btrfs: add a filemap_flush_nr helperChristoph Hellwig1-0/+1
Abstract out the btrfs-specific behavior of kicking off I/O on a number of pages on an address_space into a well-defined helper. Note: there is no kerneldoc comment for the new function because it is not part of the public API. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251024080431.324236-7-hch@lst.de Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-29regmap: add flat cache with sparse validitySander Vanheule1-6/+11
The flat regcache will always assume the data in the cache is valid. Since the cache is preferred over hardware access, this may shadow the actual state of the device. Add a new containing cache structure with the flat data table and a bitmap indicating cache validity. REGCACHE_FLAT will still behave as before, as the validity is ignored. Define new cache type REGCACHE_FLAT_S: a flat cache with sparse validity. The sparse validity is used to determine if a hardware access should occur to initialize the cache on the fly, vs. at regmap init for REGCACHE_FLAT. Contrary to REGCACHE_FLAT, this allows us to implement regcache_ops.drop. Signed-off-by: Sander Vanheule <sander@svanheule.net> Link: https://patch.msgid.link/20251029081248.52607-2-sander@svanheule.net Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-29perf: Support deferred user unwindPeter Zijlstra4-14/+34
Add support for deferred userspace unwind to perf. Where perf currently relies on in-place stack unwinding; from NMI context and all that. This moves the userspace part of the unwind to right before the return-to-userspace. This has two distinct benefits, the biggest is that it moves the unwind to a faultable context. It becomes possible to fault in debug info (.eh_frame, SFrame etc.) that might not otherwise be readily available. And secondly, it de-duplicates the user callchain where multiple samples happen during the same kernel entry. To facilitate this the perf interface is extended with a new record type: PERF_RECORD_CALLCHAIN_DEFERRED and two new attribute flags: perf_event_attr::defer_callchain - to request the user unwind be deferred perf_event_attr::defer_output - to request PERF_RECORD_CALLCHAIN_DEFERRED records The existing PERF_RECORD_SAMPLE callchain section gets a new context type: PERF_CONTEXT_USER_DEFERRED After which will come a single entry, denoting the 'cookie' of the deferred callchain that should be attached here, matching the 'cookie' field of the above mentioned PERF_RECORD_CALLCHAIN_DEFERRED. The 'defer_callchain' flag is expected on all events with PERF_SAMPLE_CALLCHAIN. The 'defer_output' flag is expect on the event responsible for collecting side-band events (like mmap, comm etc.). Setting 'defer_output' on multiple events will get you duplicated PERF_RECORD_CALLCHAIN_DEFERRED records. Based on earlier patches by Josh and Steven. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251023150002.GR4067720@noisy.programming.kicks-ass.net
2025-10-29unwind_user/x86: Teach FP unwind about start of functionPeter Zijlstra1-0/+1
When userspace is interrupted at the start of a function, before we get a chance to complete the frame, unwind will miss one caller. X86 has a uprobe specific fixup for this, add bits to the generic unwinder to support this. Suggested-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251024145156.GM4068168@noisy.programming.kicks-ass.net
2025-10-29unwind: Implement compat fp unwindPeter Zijlstra1-0/+1
It is important to be able to unwind compat tasks too. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20250924080119.613695709@infradead.org
2025-10-29unwind: Make unwind_task_info::unwind_mask consistentPeter Zijlstra2-3/+4
The unwind_task_info::unwind_mask was manipulated using a mixture of: regular store WRITE_ONCE() try_cmpxchg() set_bit() atomic_long_*() Clean up and make it consistently atomic_long_t. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20250924080119.384384486@infradead.org
2025-10-29unwind: Simplify unwind_reset_info()Peter Zijlstra1-14/+14
Invert the condition of the first if and make it an early exit to reduce an indent level for the rest fo the function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20250924080118.777916262@infradead.org
2025-10-29unwind: Add required include filesPeter Zijlstra1-0/+2
To be self sufficient, the file needs to include linux/types.h. This provides things like u32/u64 and struct callback_head. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20250924080118.665787071@infradead.org
2025-10-29unwind: Shorten linesPeter Zijlstra1-4/+14
There are some exceptionally long lines that cause ugly wrapping. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20250924080118.545274393@infradead.org
2025-10-29dma-mapping: remove unused map_page callbackLeon Romanovsky1-7/+0
After conversion of arch code to use physical address mapping, there are no users of .map_page() and .unmap_page() callbacks, so let's remove them. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251015-remove-map-page-v5-14-3bbfe3a25cdf@kernel.org
2025-10-29dma-mapping: remove unused mapping resource callbacksLeon Romanovsky1-6/+0
After ARM and XEN conversions to use physical addresses for the mapping, there are no in-kernel users for map_resource/unmap_resource callbacks, so remove them. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251015-remove-map-page-v5-6-3bbfe3a25cdf@kernel.org
2025-10-29dma-mapping: prepare dma_map_ops to conversion to physical addressLeon Romanovsky1-0/+7
Add new .map_phys() and .unmap_phys() callbacks to dma_map_ops as a preparation to replace .map_page() and .unmap_page() respectively. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251015-remove-map-page-v5-1-3bbfe3a25cdf@kernel.org
2025-10-29dma-mapping: benchmark: Restore padding to ensure uABI remained consistentQinxin Xia1-0/+1
The padding field in the structure was previously reserved to maintain a stable interface for potential new fields, ensuring compatibility with user-space shared data structures. However,it was accidentally removed by tiantao in a prior commit, which may lead to incompatibility between user space and the kernel. This patch reinstates the padding to restore the original structure layout and preserve compatibility. Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition") Cc: stable@vger.kernel.org Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com> Reported-by: Barry Song <baohua@kernel.org> Closes: https://lore.kernel.org/lkml/CAGsJ_4waiZ2+NBJG+SCnbNk+nQ_ZF13_Q5FHJqZyxyJTcEop2A@mail.gmail.com/ Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251028120900.2265511-2-xiaqinxin@huawei.com
2025-10-29tools/dma: move dma_map_benchmark from selftests to tools/dmaQinxin Xia1-5/+8
dma_map_benchmark is a standalone developer tool rather than an automated selftest. It has no pass/fail criteria, expects manual invocation, and is built as a normal userspace binary. Move it to tools/dma/ and add a minimal Makefile. Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com> Suggested-by: Barry Song <baohua@kernel.org> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com> Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251028120900.2265511-3-xiaqinxin@huawei.com
2025-10-29Merge branch 'linus/master' into sched/core, to resolve conflictPeter Zijlstra24-48/+180
Conflicts: kernel/sched/ext.c Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-10-28sctp: Constify struct sctp_sched_opsChristophe JAILLET2-3/+3
'struct sctp_sched_ops' is not modified in these drivers. Constifying this structure moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 8019 568 0 8587 218b net/sctp/stream_sched_fc.o After: ===== text data bss dec hex filename 8275 312 0 8587 218b net/sctp/stream_sched_fc.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/dce03527eb7b7cc8a3c26d5cdac12bafe3350135.1761377890.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28net: netmem: remove NET_IOV_MAX from net_iov_type enumBobby Eshleman1-4/+0
Remove the NET_IOV_MAX workaround from the net_iov_type enum. This entry was previously added to force the enum size to unsigned long to satisfy the NET_IOV_ASSERT_OFFSET static assertions. After commit f3d85c9ee510 ("netmem: introduce struct netmem_desc mirroring struct page") this approach became unnecessary by placing the net_iov_type after the netmem_desc. Placing the net_iov_type after netmem_desc results in the net_iov_type size having no effect on the position or layout of the fields that mirror the struct page. The layout before this patch: struct net_iov { union { struct netmem_desc desc; /* 0 48 */ struct { long unsigned int _flags; /* 0 8 */ long unsigned int pp_magic; /* 8 8 */ struct page_pool * pp; /* 16 8 */ long unsigned int _pp_mapping_pad; /* 24 8 */ long unsigned int dma_addr; /* 32 8 */ atomic_long_t pp_ref_count; /* 40 8 */ }; /* 0 48 */ }; /* 0 48 */ struct net_iov_area * owner; /* 48 8 */ enum net_iov_type type; /* 56 8 */ /* size: 64, cachelines: 1, members: 3 */ }; The layout after this patch: struct net_iov { union { struct netmem_desc desc; /* 0 48 */ struct { long unsigned int _flags; /* 0 8 */ long unsigned int pp_magic; /* 8 8 */ struct page_pool * pp; /* 16 8 */ long unsigned int _pp_mapping_pad; /* 24 8 */ long unsigned int dma_addr; /* 32 8 */ atomic_long_t pp_ref_count; /* 40 8 */ }; /* 0 48 */ }; /* 0 48 */ struct net_iov_area * owner; /* 48 8 */ enum net_iov_type type; /* 56 4 */ /* size: 64, cachelines: 1, members: 3 */ /* padding: 4 */ }; Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20251024-b4-devmem-remove-niov-max-v1-1-ba72c68bc869@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28net: rps: softnet_data reorg to make enqueue_to_backlog() fastEric Dumazet1-1/+10
enqueue_to_backlog() is showing up in kernel profiles on hosts with many cores, when RFS/RPS is used. The following softnet_data fields need to be updated: - input_queue_tail - input_pkt_queue (next, prev, qlen, lock) - backlog.state (if input_pkt_queue was empty) Unfortunately they are currenly using two cache lines: /* --- cacheline 3 boundary (192 bytes) --- */ call_single_data_t csd __attribute__((__aligned__(64))); /* 0xc0 0x20 */ struct softnet_data * rps_ipi_next; /* 0xe0 0x8 */ unsigned int cpu; /* 0xe8 0x4 */ unsigned int input_queue_tail; /* 0xec 0x4 */ struct sk_buff_head input_pkt_queue; /* 0xf0 0x18 */ /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */ struct napi_struct backlog __attribute__((__aligned__(8))); /* 0x108 0x1f0 */ Add one ____cacheline_aligned_in_smp to make sure they now are using a single cache line. Also, because napi_struct has written fields, make @state its first field. We want to make sure that cpus adding packets to sd->input_pkt_queue are not slowing down cpus processing their backlog because of false sharing. After this patch new layout is: /* --- cacheline 5 boundary (320 bytes) --- */ long int pad[3] __attribute__((__aligned__(64))); /* 0x140 0x18 */ unsigned int input_queue_tail; /* 0x158 0x4 */ /* XXX 4 bytes hole, try to pack */ struct sk_buff_head input_pkt_queue; /* 0x160 0x18 */ struct napi_struct backlog __attribute__((__aligned__(8))); /* 0x178 0x1f0 */ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251024091240.3292546-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28tracing: Add trace_seq_pop() and seq_buf_pop()Steven Rostedt2-0/+30
In order to allow an interface to remove an added character from the trace_seq and seq_buf descriptors, add helper functions trace_seq_pop() and seq_buf_pop(). Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Takaya Saeki <takayas@google.com> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ian Rogers <irogers@google.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/20251028231148.594898736@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-10-28tracing: Display some syscall arrays as stringsSteven Rostedt1-1/+3
Some of the system calls that read a fixed length of memory from the user space address are not arrays but strings. Take a bit away from the nb_args field in the syscall meta data to use as a flag to denote that the system call's user_arg_size is being used as a string. The nb_args should never be more than 6, so 7 bits is plenty to hold that number. When the user_arg_is_str flag that, when set, will display the data array from the user space address as a string and not an array. This will allow the output to look like this: sys_sethostname(name: 0x5584310eb2a0 "debian", len: 6) Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Takaya Saeki <takayas@google.com> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ian Rogers <irogers@google.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/20251028231147.930550359@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-10-28tracing: Have system call events record user array dataSteven Rostedt1-1/+3
For system call events that have a length field, add a "user_arg_size" parameter to the system call meta data that denotes the index of the args array that holds the size of arg that the user_mask field has a bit set for. The "user_mask" has a bit set that denotes the arg that points to an array in the user space address space and if a system call event has the user_mask field set and the user_arg_size set, it will then record the content of that address into the trace event, up to the size defined by SYSCALL_FAULT_BUF_SZ - 1. This allows the output to look like: sys_write(fd: 0xa, buf: 0x5646978d13c0 (01:00:05:00:00:00:00:00:01:87:55:89:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00), count: 0x20) Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mat