linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2026-06-08	x86/xen: Cleanup Xen related trace points	Juergen Gross	1	-59/+3
	Since dropping Xen-PV support for 32-bit, include/trace/events/xen.h contains several stale trace point definitions. Remove them. Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260522152114.77319-3-jgross@suse.com>
2026-06-04	mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking	Jeff Layton	1	-1/+2
	The IOCB_DONTCACHE writeback path in generic_write_sync() calls filemap_flush_range() on every write, submitting writeback inline in the writer's context. Perf lock contention profiling shows the performance problem is not lock contention but the writeback submission work itself — walking the page tree and submitting I/O blocks the writer for milliseconds, inflating p99.9 latency from 23ms (buffered) to 93ms (dontcache). Replace the inline filemap_flush_range() call with a flusher kick that drains dirty pages in the background. This moves writeback submission completely off the writer's hot path. To avoid flushing unrelated buffered dirty data, add a dedicated WB_start_dontcache bit and wb_check_start_dontcache() handler that uses the per-wb WB_DONTCACHE_DIRTY counter to determine how many pages to write back. The flusher writes back that many pages from the oldest dirty inodes (not restricted to dontcache-specific inodes). This helps preserve I/O batching while limiting the scope of expedited writeback. Like WB_start_all, the WB_start_dontcache bit coalesces multiple DONTCACHE writes into a single flusher wakeup without per-write allocations. Use test_and_clear_bit to atomically consume the kick request before reading the dirty counter and starting writeback, so that concurrent DONTCACHE writes during writeback can re-set the bit and schedule a follow-up flusher run. Read the dirty counter with wb_stat_sum() (aggregating per-CPU batches) rather than wb_stat() (which reads only the global counter) to ensure small writes below the percpu batch threshold are visible to the flusher. In filemap_dontcache_kick_writeback(), set the WB_start_dontcache bit inside the unlocked_inode_to_wb_begin/end section for correct cgroup writeback domain targeting, but defer the wb_wakeup() call until after the section ends, since wb_wakeup() uses spin_unlock_irq() which would unconditionally re-enable interrupts while the i_pages xa_lock may still be held under irqsave during a cgroup writeback switch. Pin the wb with wb_get() inside the RCU critical section before calling wb_wakeup() outside it, since cgroup bdi_writeback structures are RCU-freed and the wb pointer could become invalid after unlocked_inode_to_wb_end() drops the RCU read lock. Also add WB_REASON_DONTCACHE as a new writeback reason for tracing visibility. dontcache-bench results (same host, T6F_SKL_1920GBF, 251 GiB RAM, xfs on NVMe, fio io_uring): Buffered and direct I/O paths are unaffected by this patchset. All improvements are confined to the dontcache path: Single-stream throughput (MB/s): Before After Change seq-write/dontcache 298 897 +201% rand-write/dontcache 131 236 +80% Tail latency improvements (seq-write/dontcache): p99: 135,266 us -> 23,986 us (-82%) p99.9: 8,925,479 us -> 28,443 us (-99.7%) Multi-writer (4 jobs, sequential write): Before After Change dontcache aggregate (MB/s) 2,529 4,532 +79% dontcache p99 (us) 8,553 1,002 -88% dontcache p99.9 (us) 109,314 1,057 -99% Dontcache multi-writer throughput now matches buffered (4,532 vs 4,616 MB/s). 32-file write (Axboe test): Before After Change dontcache aggregate (MB/s) 1,548 3,499 +126% dontcache p99 (us) 10,170 602 -94% Peak dirty pages (MB) 1,837 213 -88% Dontcache now reaches 81% of buffered throughput (was 35%). Competing writers (dontcache vs buffered, separate files): Before After buffered writer 868 433 MB/s dontcache writer 415 433 MB/s Aggregate 1,284 866 MB/s Previously the buffered writer starved the dontcache writer 2:1. With per-bdi_writeback tracking, both writers now receive equal bandwidth. The aggregate matches the buffered-vs-buffered baseline (863 MB/s), indicating fair sharing regardless of I/O mode. The dontcache writer's p99.9 latency collapsed from 119 ms to 33 ms (-73%), eliminating the severe periodic stalls seen in the baseline. Both writers now share identical latency profiles, matching the buffered-vs-buffered pattern. The per-bdi_writeback dirty tracking dramatically reduces peak dirty pages in dontcache workloads, with the 32-file test dropping from 1.8 GB to 213 MB. Dontcache sequential write throughput triples and multi-writer throughput reaches parity with buffered I/O, with tail latencies collapsing by 1-2 orders of magnitude. Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260511-dontcache-v7-3-2848ddce8090@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-03	ext4: fast commit: add lock_updates tracepoint	Li Chen	1	-0/+61
	Commit-time fast commit snapshots run under jbd2_journal_lock_updates(), so it is useful to quantify the time spent with updates locked and to understand why snapshotting can fail. Add a new tracepoint, ext4_fc_lock_updates, reporting the time spent in the updates-locked window along with the number of snapshotted inodes and ranges. Record the first snapshot failure reason in a stable snap_err field for tooling. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20260515091829.194810-7-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-06-02	mm/damon: trace probe_hits	SeongJae Park	1	-0/+38
	Introduce a new tracepoint for exposing the per-region per-probe positive sample count via tracefs. Link: https://lore.kernel.org/20260518234119.97569-19-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam R. Howlett <liam@infradead.org> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-06-01	Merge tag 'v7.1-rc6' into tty-next	Greg Kroah-Hartman	4	-4/+11
	We need the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-06-01	spi: fsl-lpspi: fix DMA termination issues	Mark Brown	3	-1/+10
	Carlos Song (OSS) <carlos.song@oss.nxp.com> says: This series fixes two issues in the fsl-lpspi DMA transfer error paths. Patch 1 replaces the deprecated dmaengine_terminate_all() with dmaengine_terminate_sync() across all error paths in fsl_lpspi_dma_transfer(). Patch 2 fixes a missing RX DMA channel termination when TX descriptor preparation fails. Since the RX channel is already submitted and issued before the TX descriptor is prepared, returning -EINVAL without terminating the RX channel leaves it running against buffers that the SPI core will unmap, potentially causing memory corruption. Link: https://patch.msgid.link/20260525062357.3191349-1-carlos.song@oss.nxp.com
2026-05-28	mm/vmscan: add balance_pgdat begin/end tracepoints	Bunyod Suvonov	1	-0/+52
	Vmscan has six main reclaim entry points: try_to_free_pages() for direct reclaim, try_to_free_mem_cgroup_pages() for memcg reclaim, mem_cgroup_shrink_node() for memcg soft limit reclaim, node_reclaim() for node reclaim, shrink_all_memory() for hibernation reclaim, and balance_pgdat() for kswapd reclaim. All of them, except for shrink_all_memory() and balance_pgdat(), already have begin/end tracepoints. This makes it harder to trace which reclaim path is responsible for memory reclaim activity, because kswapd reclaim cannot be identified as cleanly as other reclaim entry points, even though it is the main background reclaim path under memory pressure. There may be no need to trace shrink_all_memory() as it is primarily used during hibernation. So this patch adds the missing tracepoint pair for balance_pgdat(). The begin tracepoint records the node id, requested reclaim order, and the requested classzone bound (highest_zoneidx). The end tracepoint records the node id, the reclaim order that balance_pgdat() finished with, the requested classzone bound, and nr_reclaimed. Together, they show the requested reclaim order and classzone bound, whether reclaim fell back to a lower order, and how much reclaim work was done. The end tracepoint also records highest_zoneidx even though it does not change within a balance_pgdat() invocation. This keeps the end event self-contained, so users can analyze reclaim results directly from end events without depending on begin/end correlation, which is less convenient when tracing is filtered or records are dropped. It also makes it straightforward to relate nr_reclaimed and the final reclaim order to the requested classzone bound. Link: https://lore.kernel.org/20260424031418.174597-1-b.suvonov@sjtu.edu.cn Link: https://lore.kernel.org/20260423103753.546582-1-b.suvonov@sjtu.edu.cn Signed-off-by: Bunyod Suvonov <b.suvonov@sjtu.edu.cn> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kairui Song <kasong@tencent.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-05-28	gpu: host1x: trace: fix string fields in host1x traces	Artur Kowalski	1	-24/+26
	Use __assign_str and __get_str as required by tracing subsystem. Fixes string fields being rejected by the verifier and unreadable from userspace. Tested on v6.18.21. Signed-off-by: Artur Kowalski <arturkow2000@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260519-host1x-tracing-v1-1-55afb8cbd186@gmail.com
2026-05-28	Merge v7.1-rc5 into drm-next	Simona Vetter	4	-4/+11
	Boris Brezillion needs the gem lru fixes 379e8f1ca5e9 ("drm/gem: Make the GEM LRU lock part of drm_device") backmerged for drm-misc-next. That also means we need to sort out the rename conflict in panthor with the fixup patch from Boris from drm-tip. Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
2026-05-26	blk-mq: add tracepoint block_rq_tag_wait	Aaron Tomlin	1	-0/+59
	In high-performance storage environments, particularly when utilising RAID controllers with shared tag sets (BLK_MQ_F_TAG_HCTX_SHARED), severe latency spikes can occur when fast devices (SSDs) are starved of hardware tags when sharing the same blk_mq_tag_set. Currently, diagnosing this specific hardware queue contention is difficult. When a CPU thread exhausts the tag pool, blk_mq_get_tag() forces the current thread to block uninterruptible via io_schedule(). While this can be inferred via sched:sched_switch or dynamically traced by attaching a kprobe to blk_mq_mark_tag_wait(), there is no dedicated, out-of-the-box observability for this event. This patch introduces the block_rq_tag_wait tracepoint in the tag allocation slow-path. It triggers immediately before the task state is altered to TASK_UNINTERRUPTIBLE (ensuring safety for PREEMPT_RT locks). It exposes the exact hardware context (hctx) that is starved, the specific pool experiencing starvation (driver, software scheduler, or reserved), and the exact pool depth. This provides storage engineers with a zero-configuration, low-overhead mechanism to definitively identify shared-tag bottlenecks. For example, userspace can trivially replicate tag starvation counters using bpftrace: # bpftrace -e 'tracepoint:block:block_rq_tag_wait { @tag_waits[cpu] = count(); }' Attaching 1 probe... ^C @tag_waits[4]: 12 @tag_waits[12]: 87 Signed-off-by: Aaron Tomlin <atomlin@atomlin.com> Link: https://patch.msgid.link/20260525005123.722277-1-atomlin@atomlin.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-05-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.1-rc5	Alexei Starovoitov	4	-4/+11
	Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-05-22	serial: qcom-geni: trace: Add tracepoint support for Qualcomm GENI serial	Praveen Talari	1	-0/+164
	Add tracepoint support to the Qualcomm GENI serial driver to provide runtime visibility into driver behavior without requiring invasive debug patches. The trace events cover UART termios configuration, clock setup, modem control state, interrupt status, and TX/RX data, making it easier to diagnose communication issues in the field. Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com> Link: https://patch.msgid.link/20260518-add-tracepoints-for-qcom-geni-serial-v3-1-b4addb151376@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-21	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	4	-4/+11
	Cross-merge networking fixes after downstream PR (net-7.1-rc5). No conflicts, adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c cc199cd1b912 ("net/mlx5e: Reduce branches in napi poll") c326f9c68921 ("net/mlx5e: xsk: Fix unlocked writing to ICOSQ") drivers/net/ethernet/mellanox/mlx5/core/eswitch.c c6df9a65cbb0 ("net/mlx5: Skip disabled vports when setting max TX speed") 1fba57c91416 ("net/mlx5: Add VHCA_ID page management mode support") net/mac80211/mlme.c a6e6ccd5bd07 ("wifi: mac80211: consume only present negotiated TTLM maps") 49e62ec6eb06 ("wifi: mac80211: move frame RX handling to type files") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-21	Merge tag 'net-7.1-rc5' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, wireless and netfilter. Craziness continues with no end in sight. Even discounting the driver revert this is a pretty huge PR for standards of the previous era. I'd speculate - we haven't seen the worst of it, yet. Good news, I guess, is that so far we haven't seen many (any?) cases of "AI reported a bug, we fixed it and a real user regressed". Current release - fix to a fix: - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events - vsock/virtio: relax the recently added memory limit a little Current release - regressions: - IB/IPoIB: make sure IB drivers always use async set_rx_mode since some (mlx5) are now required to use it due to locking changes Previous releases - regressions: - udp: fix UDP length on last GSO_PARTIAL segment - af_unix: fix UAF read of tail->len in unix_stream_data_wait() - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP Previous releases - always broken: - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR - ipv4: raw: reject IP_HDRINCL packets with ihl < 5 - Bluetooth: a lot of locking and concurrency fixes (as always) - batman-adv (mesh wireless networking): a lot of random fixes for issues reported by security researchers and Sashiko - netfilter: same thing, a lot of small security-ish fixes all over the place, nothing really stands out Misc: - bring back the old 3c509 driver, Maciej wants to maintain it" * tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (187 commits) net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown net: enetc: fix init and teardown order to prevent use of unsafe resources net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx() net: enetc: fix race condition in VF MAC address configuration net: enetc: fix TOCTOU race and validate VF MAC address net: enetc: add ratelimiting to VF mailbox error messages net: enetc: fix missing error code when pf->vf_state allocation fails net: enetc: fix incorrect mailbox message status returned to VFs net: bridge: prevent too big nested attributes in br_fill_linkxstats() l2tp: use list_del_rcu in l2tp_session_unhash net: bcmgenet: keep RBUF EEE/PM disabled ethernet: 3c509: Fix most coding style issues ethernet: 3c509: Update documentation to match MAINTAINERS ethernet: 3c509: Add GPL 2.0 SPDX license identifier ethernet: 3c509: Fix AUI transceiver type selection Revert "drivers: net: 3com: 3c509: Remove this driver" tools: ynl: support listening on all nsids net: gro: don't merge zcopy skbs pds_core: ensure null-termination for firmware version strings ...
2026-05-20	crypto/krb5, rxrpc: Fix lack of pre-decrypt/pre-verify length checks	David Howells	1	-0/+1
	Change the krb5 crypto library to provide facilities to precheck the length of the message about to be decrypted or verified. Fix AF_RXRPC to make use of this to validate DATA packets secured with RxGK. Fixes: 9d1d2b59341f ("rxrpc: rxgk: Implement the yfs-rxgk security class (GSSAPI)") Closes: https://sashiko.dev/#/patchset/20260511160753.607296-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Simon Horman <horms@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: linux-afs@lists.infradead.org Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Link: https://patch.msgid.link/20260515230516.2718212-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-19	Merge tag 'mm-hotfixes-stable-2026-05-18-21-07' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "14 hotfixes. 9 are for MM. 10 are cc:stable and the remainder are for post-7.1 issues or aren't deemed suitable for backporting. There's a two-patch MAINTAINERS series from Mike Rapoport which updates us for the new KEXEC/KDUMP/crash/LUO/etc arrangements. And another two-patch series from Muchun Song to fix a couple of memory-hotplug issues. Otherwise singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2026-05-18-21-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/memory: fix spurious warning when unmapping device-private/exclusive pages mm: fix __vm_normal_page() to handle missing support for pmd_special()/pud_special() drivers/base/memory: fix memory block reference leak in poison accounting mm/memory_hotplug: fix memory block reference leak on remove lib: kunit_iov_iter: fix test fail on powerpc mm/page_alloc: fix initialization of tags of the huge zero folio with init_on_free MAINTAINERS: add kexec@ list to LIVE UPDATE ENTRY MAINTAINERS: add tree for KDUMP and KEXEC selftests/mm: run_vmtests.sh: fix destructive tests invocation scripts/gdb: slab: update field names of struct kmem_cache scripts/gdb: mm: cast untyped symbols in x86_page_ops mm/damon: fix damos_stat tracepoint format for sz_applied mm/damon/sysfs-schemes: call missing mem_cgroup_iter_break() mm/migrate_device: fix spinlock leak in migrate_vma_insert_huge_pmd_page
2026-05-19	spi: qcom-geni: trace: Add trace events for Qualcomm GENI SPI	Praveen Talari	1	-0/+103
	Add tracepoint support to the Qualcomm GENI SPI driver to provide runtime visibility into driver behavior without requiring invasive debug patches. The trace events cover clock and setup parameter configuration, transfer metadata, interrupt status to be making it easier to diagnose communication issues in the field.. Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com> Link: https://patch.msgid.link/20260518-add-tracepoints-for-qcom-geni-spi-v3-1-7928f6810a79@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-05-18	Merge tag 'vfs-7.1-rc5.fixes' of ↵	Linus Torvalds	1	-0/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "This contains a fixes for the current development cycle. Note that AI related review sometimes delays fixes a bit because we find more fixes for the fixes. I might try and send smaller but more fixes PRs if this trend keeps up. - Fix various netfslib bugs - Fix an out-of-bounds write when listing idmappings - Fix the return values in jfs_mkdir() and orangefs_mkdir() - Fix a writeback writeback array overflow in fuse - Fix a forced iversion increment on lazytime timestamp updates - Reject a negative timeval component in kern_select() - Fix error return when vfs_mkdir() fails in the cachefiles code - Fix wrong error code returned for pidns ioctls" * tag 'vfs-7.1-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) cachefiles: Fix error return when vfs_mkdir() fails afs: Fix the locking used by afs_get_link() netfs, afs: Fix write skipping in dir/link writepages netfs: Fix netfs_read_folio() to wait on writeback netfs: Fix folio->private handling in netfs_perform_write() netfs: Fix partial invalidation of streaming-write folio netfs: Fix potential UAF in netfs_unlock_abandoned_read_pages() netfs: Fix leak of request in netfs_write_begin() error handling netfs: Fix early put of sink folio in netfs_read_gaps() netfs: Fix write streaming disablement if fd open O_RDWR netfs: Fix read-gaps to remove netfs_folio from filled folio netfs: Fix potential deadlock in write-through mode netfs: Fix streaming write being overwritten netfs: Defer the emission of trace_netfs_folio() netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone netfs: Fix overrun check in netfs_extract_user_iter() netfs: fix error handling in netfs_extract_user_iter() netfs: Fix potential uninitialised var in netfs_extract_user_iter() netfs: fix VM_BUG_ON_FOLIO() issue in netfs_write_begin() call netfs: Fix zeropoint update where i_size > remote_i_size ...
2026-05-15	Merge tag 'for-7.1-rc3-tag' of ↵	Linus Torvalds	1	-3/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fixup warning when allocating memory for readahead, __GFP_NOWARN was accidentally dropped when setting mapping constraints - in tracepoint of file sync, fix sleeping in atomic context when handling dentries - harden initial loading of block group on crafted/fuzzed images, iterate all chunk mapping entries unconditionally - fix freeing pages of submitted io after checking for errors - fix incorrect inode size after remount when using fallocate KEEP_SIZE mode (also requires disabled 'no-holes' feature) * tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap btrfs: only release the dirty pages io tree after successful writes btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file() btrfs: always pass __GFP_NOWARN from add_ra_bio_pages() btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
2026-05-15	fsnotify: new tracepoint in fsnotify()	Jeff Layton	2	-0/+86
	Add a tracepoint so we can see exactly how this is being called. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260428-dir-deleg-v3-5-5a0780ba9def@kernel.org Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-15	filelock: add a tracepoint to start of break_lease()	Jeff Layton	1	-0/+33
	...mostly to show the LEASE_BREAK_* flags. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260428-dir-deleg-v3-3-5a0780ba9def@kernel.org Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-15	filelock: add support for ignoring deleg breaks for dir change events	Jeff Layton	1	-1/+4
	If a NFS client requests a directory delegation with a notification bitmask covering directory change events, the server shouldn't recall the delegation. Instead the client will be notified of the change after the fact. Add support for ignoring lease breaks on directory changes. Add a new flags parameter to try_break_deleg() and teach __break_lease how to ignore certain types of delegation break events. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260428-dir-deleg-v3-2-5a0780ba9def@kernel.org Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-13	mm/damon: fix damos_stat tracepoint format for sz_applied	SeongJae Park	1	-1/+1
	The print format is wrongly marking sz_applied as sz_tried. Fix it. Link: https://lore.kernel.org/20260426193119.88095-1-sj@kernel.org Fixes: 804c26b961da ("mm/damon/core: add trace point for damos stat per apply interval") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> # 7.0.x Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-05-12	net: constify sk_skb_reason_drop() sock parameter	Eric Dumazet	1	-2/+2
	sk_skb_reason_drop() does not change sock parameter, make it const so that we can call it from TCP stack without a cast on a (const) listener socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260511072310.1094859-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-12	netfs: Fix folio->private handling in netfs_perform_write()	David Howells	1	-0/+1
	Under some circumstances, netfs_perform_write() doesn't correctly manipulate folio->private between NULL, NETFS_FOLIO_COPY_TO_CACHE, pointing to a group and pointing to a netfs_folio struct, leading to potential multiple attachments of private data with associated folio ref leaks and also leaks of netfs_folio structs or netfs_group refs. Fix this by consolidating the place at which a folio is marked uptodate in one place and having that look at what's attached to folio->private and decide how to clean it up and then set the new group. Also, the content shouldn't be flushed if group is NULL, even if a group is specified in the netfs_group parameter, as that would be the case for a new folio. A filesystem should always specify netfs_group or never specify netfs_group. The Sashiko auto-review tool noted that it was theoretically possible that the fpos >= ctx->zero_point section might leak if it modified a streaming write folio. This is unlikely, but with a network filesystem, third party changes can happen. It also pointed out that __netfs_set_group() would leak if called multiple times on the same folio from the "whole folio modify section". Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-22-dhowells@redhat.com cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-12	netfs: Fix streaming write being overwritten	David Howells	1	-0/+3
	In order to avoid reading whilst writing, netfslib will allow "streaming writes" in which dirty data is stored directly into folios without reading them first. Such folios are marked dirty but may not be marked uptodate. If a folio is entirely written by a streaming write, uptodate will be set, otherwise it will have a netfs_folio struct attached to ->private recording the dirty region. In the event that a partially written streaming write page is to be overwritten entirely by a single write(), netfs_perform_write() will try to copy over it, but doesn't discard the netfs_folio if it succeeds; further, it doesn't correctly handle a partial copy that overwrites some of the dirty data. Fix this by the following: (1) If the folio is successfully overwritten, free the netfs_folio struct before marking the page uptodate. (2) If the copy to the folio partially fails, but short of the dirty data, just ignore the copy. (3) If the copy partially fails and overwrites some of the dirty data, accept the copy, update the netfs_folio struct to record the new data. If the folio is now filled, free the netfs_folio and set uptodate, otherwise return a partial write. Found with: fsx -q -N 1000000 -p 10000 -o 128000 -l 600000 \ /xfstest.test/junk --replay-ops=junk.fsxops using the following as junk.fsxops: truncate 0x0 0 0x927c0 write 0x63fb8 0x53c8 0 copy_range 0xb704 0x19b9 0x24429 0x79380 write 0x2402b 0x144a2 0x90660 * write 0x204d5 0x140a0 0x927c0 * copy_range 0x1f72c 0x137d0 0x7a906 0x927c0 * read 0x00000 0x20000 0x9157c read 0x20000 0x20000 0x9157c read 0x40000 0x20000 0x9157c read 0x60000 0x20000 0x9157c read 0x7e1a0 0xcfb9 0x9157c on cifs with the default cache option. It shows folio 0x24 misbehaving if the FMODE_READ check is commented out in netfs_perform_write(): if (//(file->f_mode & FMODE_READ) \|\| netfs_is_cache_enabled(ctx)) { and no fscache. This was initially found with the generic/522 xfstest. Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-14-dhowells@redhat.com cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-12	netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone	David Howells	1	-0/+4
	If a streaming write is made, this will leave the relevant modified folio in a not-uptodate, but dirty state with a netfs_folio struct hung off of folio->private indicating the dirty range. Subsequently truncating the file such that the dirty data in the folio is removed, but the first part of the folio theoretically remains will cause the netfs_folio struct to be discarded... but will leave the dirty flag set. If the folio is then read via mmap(), netfs_read_folio() will see that the page is dirty and jump to netfs_read_gaps() to fill in the missing bits. netfs_read_gaps(), however, expects there to be a netfs_folio struct present and can oops because truncate removed it. Fix this by calling folio_cancel_dirty() in netfs_invalidate_folio() in the event that all the dirty data in the folio is erased (as nfs does). Also add some tracepoints to log modifications to a dirty page. This can be reproduced with something like: dd if=/dev/zero of=/xfstest.test/foo bs=1M count=1 umount /xfstest.test mount /xfstest.test xfs_io -c "w 0xbbbf 0xf96c" \ -c "truncate 0xbbbf" \ -c "mmap -r 0xb000 0x11000" \ -c "mr 0xb000 0x11000" \ /xfstest.test/foo with fscaching disabled (otherwise streaming writes are suppressed) and a change to netfs_perform_write() to disallow streaming writes if the fd is open O_RDWR: if (//(file->f_mode & FMODE_READ) \|\| <--- comment this out netfs_is_cache_enabled(ctx)) { It should be reproducible even without this change, but if prevents the above trivial xfs_io command from reproducing it. Note that the initial dd is important: the file must start out sufficiently large that the zero-point logic doesn't just clear the gaps because it knows there's nothing in the file to read yet. Unmounting and mounting is needed to clear the pagecache (there are other ways to do that that may also work). This was initially reproduced with the generic/522 xfstest on some patches that remove the FMODE_READ restriction. Fixes: 9ebff83e6481 ("netfs: Prep to use folio->private for write grouping and streaming write") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-12-dhowells@redhat.com cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-11	fs: add icount_read_once() and stop open-coding ->i_count loads	Mateusz Guzik	1	-1/+1
	Similarly to inode_state_read_once(), it makes the caller spell out they acknowledge instability of the returned value. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20260421182538.1215894-2-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.1-rc3	Alexei Starovoitov	7	-28/+48
	Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-05-08	btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()	Filipe Manana	1	-3/+1
	The trace event btrfs_sync_file() is called in an atomic context (all trace events are) and its call to dput(), which is needed due to the call to dget_parent(), can sleep, triggering a kernel splat. This can be reproduced by enabling the trace event and running btrfs/056 from fstests for example. The splat shown in dmesg is the following: [53.919] BUG: sleeping function called from invalid context at fs/dcache.c:970 [53.947] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 32773, name: xfs_io [53.988] preempt_count: 2, expected: 0 [53.967] RCU nest depth: 0, expected: 0 [53.943] Preemption disabled at: [53.944] [<0000000000000000>] 0x0 [54.078] CPU: 0 UID: 0 PID: 32773 Comm: xfs_io Tainted: G W 7.1.0-rc1-btrfs-next-232+ #1 PREEMPT(full) [54.070] Tainted: [W]=WARN [54.071] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [54.072] Call Trace: [54.074] <TASK> [54.076] dump_stack_lvl+0x56/0x80 [54.079] __might_resched.cold+0xd6/0x10f [54.072] dput.part.0+0x24/0x110 [54.078] trace_event_raw_event_btrfs_sync_file+0x75/0x140 [btrfs] [54.089] btrfs_sync_file+0x1ed/0x530 [btrfs] [54.087] ? __handle_mm_fault+0x8ae/0xed0 [54.089] btrfs_do_write_iter+0x172/0x210 [btrfs] [54.091] vfs_write+0x21f/0x450 [54.094] __x64_sys_pwrite64+0x8d/0xc0 [54.096] ? do_user_addr_fault+0x20c/0x670 [54.099] do_syscall_64+0x60/0xf20 [54.092] ? clear_bhb_loop+0x60/0xb0 [54.094] entry_SYSCALL_64_after_hwframe+0x76/0x7e So stop using dget_parent() and dput() and access the parent dentry directly as dentry->d_parent. This is also what ext4 is doing in its equivalent trace event ext4_sync_file_enter(). Fixes: a85b46db143f ("btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-06	timers/migration: Handle capacity in connect tracepoints	Frederic Weisbecker	1	-10/+14
	This let tracers know to which hierarchy a CPU belongs to. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260423165354.95152-6-frederic@kernel.org
2026-05-01	hrtimer: Provide hrtimer_start_range_ns_user()	Thomas Gleixner	1	-0/+13
	Calvin reported an odd NMI watchdog lockup which claims that the CPU locked up in user space. He provided a reproducer, which set's up a timerfd based timer and then rearms it in a loop with an absolute expiry time of 1ns. As the expiry time is in the past, the timer ends up as the first expiring timer in the per CPU hrtimer base and the clockevent device is programmed with the minimum delta value. If the machine is fast enough, this ends up in a endless loop of programming the delta value to the minimum value defined by the clock event device, before the timer interrupt can fire, which starves the interrupt and consequently triggers the lockup detector because the hrtimer callback of the lockup mechanism is never invoked. The clockevents code already has a last resort mechanism to prevent that, but it's sensible to catch such issues before trying to reprogram the clock event device. Provide a variant of hrtimer_start_range_ns(), which sanity checks the timer after queueing it. It does not so before because the timer might be armed and therefore needs to be dequeued. also we optimize for the latest possible point to check, so that the clock event prevention is avoided as much as possible. If the timer is already expired _before_ the clock event is reprogrammed, remove the timer from the queue and signal to the caller that the operation failed by returning false. That allows the caller to take immediate action without going through the loops and hoops of the hrtimer interrupt. The queueing code can't invoke the timer callback as the caller might hold a lock which is taken in the callback. Add a tracepoint which allows to analyze the expired at start situation. Reported-by: Calvin Owens <calvin@wbinvd.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260408114951.995031895@kernel.org
2026-04-27	Merge drm/drm-next into drm-misc-next	Thomas Zimmermann	34	-704/+1103
	Getting fixes and updates from v7.1-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2026-04-24	Merge tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds	1	-14/+14
	Pull NFS client updates from Trond Myklebust: "Bugfixes: - Fix handling of ENOSPC so that if we have to resend writes, they are written synchronously - SUNRPC RDMA transport fixes from Chuck - Several fixes for delegated timestamps in NFSv4.2 - Failure to obtain a directory delegation should not cause stat() to fail with NFSv4 - Rename was failing to update timestamps when a directory delegation is held on NFSv4 - Ensure we check rsize/wsize after crossing a NFSv4 filesystem boundary - NFSv4/pnfs: - If the server is down, retry the layout returns on reboot - Fallback to MDS could result in a short write being incorrectly logged Cleanups: - Use memcpy_and_pad in decode_fh" * tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits) NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address NFS: remove redundant __private attribute from nfs_page_class NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes NFS: fix writeback in presence of errors nfs: use memcpy_and_pad in decode_fh NFSv4.1: Apply session size limits on clone path NFSv4: retry GETATTR if GET_DIR_DELEGATION failed NFS: fix RENAME attr in presence of directory delegations pnfs/flexfiles: validate ds_versions_cnt is non-zero NFS/blocklayout: print each device used for SCSI layouts xprtrdma: Post receive buffers after RPC completion xprtrdma: Scale receive batch size with credit window xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor xprtrdma: Decouple frwr_wp_create from frwr_map xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot xprtrdma: Avoid 250 ms delay on backlog wakeup xprtrdma: Close sendctx get/put race that can block a transport nfs: update inode ctime after removexattr operation nfs: fix utimensat() for atime with delegated timestamps NFS: improve "Server wrote zero bytes" error ...
2026-04-23	Merge tag 'net-7.1-rc1' of ↵	Linus Torvalds	2	-4/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter. Steady stream of fixes. Last two weeks feel comparable to the two weeks before the merge window. Lots of AI-aided bug discovery. A newer big source is Sashiko/Gemini (Roman Gushchin's system), which points out issues in existing code during patch review (maybe 25% of fixes here likely originating from Sashiko). Nice thing is these are often fixed by the respective maintainers, not drive-bys. Current release - new code bugs: - kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP Previous releases - regressions: - add async ndo_set_rx_mode and switch drivers which we promised to be called under the per-netdev mutex to it - dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops - hv_sock: report EOF instead of -EIO for FIN - vsock/virtio: fix MSG_PEEK calculation on bytes to copy Previous releases - always broken: - ipv6: fix possible UAF in icmpv6_rcv() - icmp: validate reply type before using icmp_pointers - af_unix: drop all SCM attributes for SOCKMAP - netfilter: fix a number of bugs in the osf (OS fingerprinting) - eth: intel: fix timestamp interrupt configuration for E825C Misc: - bunch of data-race annotations" * tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (148 commits) rxrpc: Fix error handling in rxgk_extract_token() rxrpc: Fix re-decryption of RESPONSE packets rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets rxrpc: Fix missing validation of ticket length in non-XDR key preparsing rxgk: Fix potential integer overflow in length check rxrpc: Fix conn-level packet handling to unshare RESPONSE packets rxrpc: Fix potential UAF after skb_unshare() failure rxrpc: Fix rxkad crypto unalignment handling rxrpc: Fix memory leaks in rxkad_verify_response() net: rds: fix MR cleanup on copy error m68k: mvme147: Make me the maintainer net: txgbe: fix firmware version check selftests/bpf: check epoll readiness during reuseport migration tcp: call sk_data_ready() after listener migration vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll() ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim tipc: fix double-free in tipc_buf_append() llc: Return -EINPROGRESS from llc_ui_connect() ipv4: icmp: validate reply type before using icmp_pointers selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges ...
2026-04-23	rxrpc: Fix re-decryption of RESPONSE packets	David Howells	1	-1/+0
	If a RESPONSE packet gets a temporary failure during processing, it may end up in a partially decrypted state - and then get requeued for a retry. Fix this by just discarding the packet; we will send another CHALLENGE packet and thereby elicit a further response. Similarly, discard an incoming CHALLENGE packet if we get an error whilst generating a RESPONSE; the server will send another CHALLENGE. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23	rxrpc: Fix potential UAF after skb_unshare() failure	David Howells	1	-2/+2
	If skb_unshare() fails to unshare a packet due to allocation failure in rxrpc_input_packet(), the skb pointer in the parent (rxrpc_io_thread()) will be NULL'd out. This will likely cause the call to trace_rxrpc_rx_done() to oops. Fix this by moving the unsharing down to where rxrpc_input_call_event() calls rxrpc_input_call_packet(). There are a number of places prior to that where we ignore DATA packets for a variety of reasons (such as the call already being complete) for which an unshare is then avoided. And with that, rxrpc_input_packet() doesn't need to take a pointer to the pointer to the packet, so change that to just a pointer. Fixes: 2d1faf7a0ca3 ("rxrpc: Simplify skbuff accounting in receive path") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23	rxrpc: Fix rxkad crypto unalignment handling	David Howells	1	-0/+1
	Fix handling of a packet with a misaligned crypto length. Also handle non-ENOMEM errors from decryption by aborting. Further, remove the WARN_ON_ONCE() so that it can't be remotely triggered (a trace line can still be emitted). Fixes: f93af41b9f5f ("rxrpc: Fix missing error checks for rxkad encryption/decryption failure") Closes: https://sashiko.de