aboutsummaryrefslogtreecommitdiff
path: root/include/trace
AgeCommit message (Collapse)AuthorFilesLines
2026-04-05mm: khugepaged: add trace_mm_khugepaged_scan eventVernon Yang1-0/+25
Patch series "Improve khugepaged scan logic", v8. This series improves the khugepaged scan logic and reduces CPU consumption by prioritizing scanning tasks that access memory frequently. The following data is traced by bpftrace[1] on a desktop system. After the system has been left idle for 10 minutes upon booting, a lot of SCAN_PMD_MAPPED or SCAN_NO_PTE_TABLE are observed during a full scan by khugepaged. @scan_pmd_status[1]: 1 ## SCAN_SUCCEED @scan_pmd_status[6]: 2 ## SCAN_EXCEED_SHARED_PTE @scan_pmd_status[3]: 142 ## SCAN_PMD_MAPPED @scan_pmd_status[2]: 178 ## SCAN_NO_PTE_TABLE total progress size: 674 MB Total time : 419 seconds ## include khugepaged_scan_sleep_millisecs The khugepaged has below phenomenon: the khugepaged list is scanned in a FIFO manner, as long as the task is not destroyed, 1. the task no longer has memory that can be collapsed into hugepage, continues scan it always. 2. the task at the front of the khugepaged scan list is cold, they are still scanned first. 3. everyone scan at intervals of khugepaged_scan_sleep_millisecs (default 10s). If we always scan the above two cases first, the valid scan will have to wait for a long time. For the first case, when the memory is either SCAN_PMD_MAPPED or SCAN_NO_PTE_TABLE or SCAN_PTE_MAPPED_HUGEPAGE [5], just skip it. For the second case, if the user has explicitly informed us via MADV_FREE that these folios will be freed, just skip it only. The below is some performance test results. kernbench results (testing on x86_64 machine): baseline w/o patches test w/ patches Amean user-32 18522.51 ( 0.00%) 18333.64 * 1.02%* Amean syst-32 1137.96 ( 0.00%) 1113.79 * 2.12%* Amean elsp-32 666.04 ( 0.00%) 659.44 * 0.99%* BAmean-95 user-32 18520.01 ( 0.00%) 18323.57 ( 1.06%) BAmean-95 syst-32 1137.68 ( 0.00%) 1110.50 ( 2.39%) BAmean-95 elsp-32 665.92 ( 0.00%) 659.06 ( 1.03%) BAmean-99 user-32 18520.01 ( 0.00%) 18323.57 ( 1.06%) BAmean-99 syst-32 1137.68 ( 0.00%) 1110.50 ( 2.39%) BAmean-99 elsp-32 665.92 ( 0.00%) 659.06 ( 1.03%) Create three task[2]: hot1 -> cold -> hot2. After all three task are created, each allocate memory 128MB. the hot1/hot2 task continuously access 128 MB memory, while the cold task only accesses its memory briefly andthen call madvise(MADV_FREE). Here are the performance test results: (Throughput bigger is better, other smaller is better) Testing on x86_64 machine: | task hot2 | without patch | with patch | delta | |---------------------|---------------|---------------|---------| | total accesses time | 3.14 sec | 2.93 sec | -6.69% | | cycles per access | 4.96 | 2.21 | -55.44% | | Throughput | 104.38 M/sec | 111.89 M/sec | +7.19% | | dTLB-load-misses | 284814532 | 69597236 | -75.56% | Testing on qemu-system-x86_64 -enable-kvm: | task hot2 | without patch | with patch | delta | |---------------------|---------------|---------------|---------| | total accesses time | 3.35 sec | 2.96 sec | -11.64% | | cycles per access | 7.29 | 2.07 | -71.60% | | Throughput | 97.67 M/sec | 110.77 M/sec | +13.41% | | dTLB-load-misses | 241600871 | 3216108 | -98.67% | This patch (of 4): Add mm_khugepaged_scan event to track the total time for full scan and the total number of pages scanned of khugepaged. Link: https://lkml.kernel.org/r/20260221093918.1456187-2-vernon2gm@gmail.com Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-8/+11
Cross-merge networking fixes after downstream PR (net-7.0-rc7). Conflicts: net/vmw_vsock/af_vsock.c b18c83388874 ("vsock: initialize child_ns_mode_locked in vsock_net_init()") 0de607dc4fd8 ("vsock: add G2H fallback for CIDs not owned by H2G transport") Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c ceee35e5674a ("bnxt_en: Refactor some basic ring setup and adjustment logic") 57cdfe0dc70b ("bnxt_en: Resize RSS contexts on channel count change") drivers/net/wireless/intel/iwlwifi/mld/mac80211.c 4d56037a02bd ("wifi: iwlwifi: mld: block EMLSR during TDLS connections") 687a95d204e7 ("wifi: iwlwifi: mld: correctly set wifi generation data") drivers/net/wireless/intel/iwlwifi/mld/scan.h b6045c899e37 ("wifi: iwlwifi: mld: Refactor scan command handling") ec66ec6a5a8f ("wifi: iwlwifi: mld: Fix MLO scan timing") drivers/net/wireless/intel/iwlwifi/mvm/fw.c 078df640ef05 ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v 2") 323156c3541e ("wifi: iwlwifi: mvm: don't send a 6E related command when not supported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memoryJiri Pirko1-1/+2
Current CC designs don't place a vIOMMU in front of untrusted devices. Instead, the DMA API forces all untrusted device DMA through swiotlb bounce buffers (is_swiotlb_force_bounce()) which copies data into shared memory on behalf of the device. When a caller has already arranged for the memory to be shared via set_memory_decrypted(), the DMA API needs to know so it can map directly using the unencrypted physical address rather than bounce buffering. Following the pattern of DMA_ATTR_MMIO, add DMA_ATTR_CC_SHARED for this purpose. Like the MMIO case, only the caller knows what kind of memory it has and must inform the DMA API for it to work correctly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260325192352.437608-2-jiri@resnulli.us
2026-03-31sched: Add deadline tracepointsGabriele Monaco1-0/+26
Add the following tracepoints: * sched_dl_throttle(dl_se, cpu, type): Called when a deadline entity is throttled * sched_dl_replenish(dl_se, cpu, type): Called when a deadline entity's runtime is replenished * sched_dl_update(dl_se, cpu, type): Called when a deadline entity updates without throttle or replenish * sched_dl_server_start(dl_se, cpu, type): Called when a deadline server is started * sched_dl_server_stop(dl_se, cpu, type): Called when a deadline server is stopped Those tracepoints can be useful to validate the deadline scheduler with RV and are not exported to tracefs. Reviewed-by: Phil Auld <pauld@redhat.com> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20260330111010.153663-11-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31BackMerge tag 'v7.0-rc6' into drm-nextDave Airlie4-11/+19
Linux 7.0-rc6 Requested by a few people on irc to resolve conflicts in other tress. Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-03-29sunrpc: Add XPT flags missing from SVC_XPRT_FLAG_LISTChuck Lever1-1/+3
Commit eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.") and commit 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") added new XPT flags but neglected to update the show_svc_xprt_flags() macro. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-29Merge tag 'vfs-7.0-rc6.fixes' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator reaches it via core dump writes to 9P filesystems. Add ITER_KVEC handling following the same pattern as the existing ITER_BVEC code. - Fix a NULL pointer dereference in the netfs unbuffered write retry path when the filesystem (e.g., 9P) doesn't set the prepare_write operation. - Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing ->sync_lazytime. Without this the flag stays set and may cause additional unnecessary calls during inode deactivation. - Increase tmpfs size in mount_setattr selftests. A recent commit bumped the ext4 image size to 2 GB but didn't adjust the tmpfs backing store, so mkfs.ext4 fails with ENOSPC writing metadata. - Fix an invalid folio access in iomap when i_blkbits matches the folio size but differs from the I/O granularity. The cur_folio pointer would not get invalidated and iomap_read_end() would still be called on it despite the IO helper owning it. - Fix hash_name() docstring. - Fix read abandonment during netfs retry where the subreq variable used for abandonment could be uninitialized on the first pass or point to a deleted subrequest on later passes. - Don't block sync for filesystems with no data integrity guarantees. Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on fuse-overlayfs where the flusher thread blocks when the fuse daemon is frozen. - Fix a lockdep splat in iomap when reads fail. iomap_read_end_io() invokes fserror_report() which calls igrab() taking i_lock in hardirq context while i_lock is normally held with interrupts enabled. Kick failed read handling to a workqueue. - Remove the redundant netfs_io_stream::front member and use stream->subrequests.next instead, fixing a potential issue in the direct write code path. * tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: Fix the handling of stream->front by removing it iomap: fix lockdep complaint when reads fail writeback: don't block sync for filesystems with no data integrity guarantees netfs: Fix read abandonment during retry vfs: fix docstring of hash_name() iomap: fix invalid folio access when i_blkbits differs from I/O granularity selftests/mount_setattr: increase tmpfs size for idmapped mount tests fs: clear I_DIRTY_TIME in sync_lazytime netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
2026-03-28Merge tag 'for-7.0-rc5-tag' of ↵Linus Torvalds1-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes. There's one that stands out in size as it fixes an edge case in fsync. - fix issue on fsync where file with zero size appears as a non-zero after log replay - in zlib compression, handle a crash when data alignment causes folio reference issues - fix possible crash with enabled tracepoints on a overlayfs mount - handle device stats update error - on zoned filesystems, fix kobject leak on sub-block groups - fix super block offset in an error message in validation" * tag 'for-7.0-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix lost error when running device stats on multiple devices fs btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file() btrfs: zlib: handle page aligned compressed size correctly btrfs: fix leak of kobject name for sub-group space_info btrfs: fix zero size inode with non-zero size after log replay btrfs: fix super block offset in error message in btrfs_validate_super()
2026-03-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-3/+8
Cross-merge networking fixes after downstream PR (net-7.0-rc6). No conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26Merge tag 'dma-mapping-7.0-2026-03-25' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping fixes from Marek Szyprowski: "A set of fixes for DMA-mapping subsystem, which resolve false- positive warnings from KMSAN and DMA-API debug (Shigeru Yoshida and Leon Romanovsky) as well as a simple build fix (Miguel Ojeda)" * tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-mapping: add missing `inline` for `dma_free_attrs` mm/hmm: Indicate that HMM requires DMA coherency RDMA/umem: Tell DMA mapping that UMEM requires coherency iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute dma-direct: prevent SWIOTLB path when DMA_ATTR_REQUIRE_COHERENT is set dma-mapping: Introduce DMA require coherency attribute dma-mapping: Clarify valid conditions for CPU cache line overlap dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output dma-debug: Allow multiple invocations of overlapping entries dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
2026-03-26netfs: Fix the handling of stream->front by removing itDavid Howells1-4/+4
The netfs_io_stream::front member is meant to point to the subrequest currently being collected on a stream, but it isn't actually used this way by direct write (which mostly ignores it). However, there's a tracepoint which looks at it. Further, stream->front is actually redundant with stream->subrequests.next. Fix the potential problem in the direct code by just removing the member and using stream->subrequests.next instead, thereby also simplifying the code. Fixes: a0b4c7a49137 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence") Reported-by: Paulo Alcantara <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/4158599.1774426817@warthog.procyon.org.uk Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-24Merge commit 'f35dbac6942171dc4ce9398d1d216a59224590a9' into ↵Steven Rostedt1-2/+5
trace/ring-buffer/core The commit f35dbac69421 ("ring-buffer: Fix to update per-subbuf entries of persistent ring buffer") was a fix and merged upstream. It is needed for some other work in the ring buffer. The current branch has the remote buffer code that is shared with the Arm64 subsystem and can't be rebased. Merge in the upstream commit to allow continuing of the ring buffer work. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-23btrfs: tracepoints: get correct superblock from dentry in event ↵Goldwyn Rodrigues1-4/+7
btrfs_sync_file() If overlay is used on top of btrfs, dentry->d_sb translates to overlay's super block and fsid assignment will lead to a crash. Use file_inode(file)->i_sb to always get btrfs_sb. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23coredump: add tracepoint for coredump eventsBreno Leitao1-0/+45
Coredump is a generally useful and interesting event in the lifetime of a process. Add a tracepoint so it can be monitored through the standard kernel tracing infrastructure. BPF-based crash monitoring is an advanced approach that allows real-time crash interception: by attaching a BPF program at this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to capture the user-space stack trace at the exact moment of the crash, before the process is fully terminated, without waiting for a coredump file to be written and parsed. However, there is currently no stable kernel API for this use case. Existing tools rely on attaching fentry probes to do_coredump(), which is an internal function whose signature changes across kernel versions, breaking these tools. Add a stable tracepoint that fires at the beginning of do_coredump(), providing BPF programs a reliable attachment point. At tracepoint time, the crashing process context is still live, so BPF programs can call bpf_get_stack() with BPF_F_USER_STACK to extract the user-space backtrace. The tracepoint records: - sig: signal number that triggered the coredump - comm: process name Example output: $ echo 1 > /sys/kernel/tracing/events/coredump/coredump/enable $ sleep 999 & $ kill -SEGV $! $ cat /sys/kernel/tracing/trace # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | sleep-634 [036] ..... 145.222206: coredump: sig=11 comm=sleep Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260323-coredump_tracepoint-v2-1-afced083b38d@debian.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-21tracing: Revert "tracing: Remove pid in task_rename tracing output"Xuewen Yan1-2/+5
This reverts commit e3f6a42272e028c46695acc83fc7d7c42f2750ad. The commit says that the tracepoint only deals with the current task, however the following case is not current task: comm_write() { p = get_proc_task(inode); if (!p) return -ESRCH; if (same_thread_group(current, p)) set_task_comm(p, buffer); } where set_task_comm() calls __set_task_comm() which records the update of p and not current. So revert the patch to show pid. Cc: <mhiramat@kernel.org> Cc: <mathieu.desnoyers@efficios.com> Cc: <elver@google.com> Cc: <kees@kernel.org> Link: https://patch.msgid.link/20260306075954.4533-1-xuewen.yan@unisoc.com Fixes: e3f6a42272e0 ("tracing: Remove pid in task_rename tracing output") Reported-by: Guohua Yan <guohua.yan@unisoc.com> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-21Merge tag 'v7.0-rc4' into timers/core, to resolve conflictIngo Molnar2-2/+10
Resolve conflict between this change in the upstream kernel: 4c652a47722f ("rseq: Mark rseq_arm_slice_extension_timer() __always_inline") ... and this pending change in timers/core: 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2026-03-20dma-mapping: Introduce DMA require coherency attributeLeon Romanovsky1-1/+2
The mapping buffers which carry this attribute require DMA coherent system. This means that they can't take SWIOTLB path, can perform CPU cache overlap and doesn't perform cache flushing. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-4-1dde90a7f08b@nvidia.com
2026-03-20dma-mapping: Clarify valid conditions for CPU cache line overlapLeon Romanovsky1-1/+1
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to better reflect that it is debugging aid to inform DMA core code that CPU cache line overlaps are allowed, and refine the documentation describing its use. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-3-1dde90a7f08b@nvidia.com
2026-03-20dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace outputLeon Romanovsky1-1/+2
Tracing prints decoded DMA attribute flags, but it does not yet include the recently added DMA_ATTR_CPU_CACHE_CLEAN. Add support for decoding and displaying this attribute in the trace output. Fixes: 61868dc55a11 ("dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-2-1dde90a7f08b@nvidia.com
2026-03-14devlink: add devlink_dev_driver_name() helper and use it in trace eventsJiri Pirko1-6/+6
In preparation to dev-less devlinks, add devlink_dev_driver_name() that returns the driver name stored in devlink struct, and use it in all trace events. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260312100407.551173-9-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14devlink: add helpers to get bus_name/dev_nameJiri Pirko1-12/+12
Introduce devlink_bus_name() and devlink_dev_name() helpers and convert all direct accesses to devlink->dev->bus->name and dev_name(devlink->dev) to use them. This prepares for dev-less devlink instances where these helpers will be extended to handle the missing device. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260312100407.551173-3-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry eventThomas Weißschuh (Schneider Electric)1-4/+3
This pointer indirection is a remnant from when ktime_t was a struct, today it is pointless. Drop the pointer indirection. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-9-095357392669@linutronix.de
2026-03-12tracing: Use explicit array size instead of sentinel elements in symbol printingThomas Weißschuh (Schneider Electric)1-20/+20
The sentinel value added by the wrapper macros __print_symbolic() et al prevents the callers from adding their own trailing comma. This makes constructing symbol list dynamically based on kconfig values tedious. Drop the sentinel elements, so callers can either specify the trailing comma or not, just like in regular array initializers. Signed-off-by: Thomas Weißschuh (Schneider Electric) <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260311-hrtimer-cleanups-v1-2-095357392669@linutronix.de
2026-03-11Merge v7.0-rc3 into drm-nextSimona Vetter1-1/+3
Requested by Maxime Ripard for drm-misc-next because renesas people need fb797a70108f ("drm: renesas: rz-du: mipi_dsi: Set DSI divider"). Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
2026-03-09tracing: Add helpers to create trace remote eventsVincent Donnefort1-0/+73
Declaring remote events can be cumbersome let's add a set of macros to simplify developers life. The declaration of a remote event is very similar to kernel's events: REMOTE_EVENT(name, id, RE_STRUCT( re_field(u64 foo) ), RE_PRINTK("foo=%llu", __entry->foo) ) Link: https://patch.msgid.link/20260309162516.2623589-12-vdonnefort@google.com Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-06ext4: fix signed format specifier in ext4_load_inode trace eventChristian Brauner1-1/+1
The ext4_load_inode trace event uses %lld (signed) to print the ino field which is u64 (unsigned). Use %llu instead. Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06nilfs2: widen trace event i_ino fields to u64Jeff Layton1-6/+6
In trace events, change __field(unsigned long, ...) to __field(u64, ...) and update TP_PROTO parameters and TP_printk format strings to match the widened field type. Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-11-2257ad83d372@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06f2fs: widen trace event i_ino fields to u64Jeff Layton1-121/+121
In trace events, change __field(ino_t, ...) to __field(u64, ...) and update TP_printk format strings to %llu/%llx to match the widened field type. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-10-2257ad83d372@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06ext4: widen trace event i_ino fields to u64Jeff Layton1-272/+272
In trace events, change __field(ino_t, ...) to __field(u64, ...) and update TP_printk format strings to %llu/%llx to match the widened field type. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-9-2257ad83d372@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06hugetlbfs: widen trace event i_ino fields to u64Jeff Layton1-21/+21
Update hugetlbfs trace event definitions to use u64 instead of ino_t/unsigned long for inode number fields. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-7-2257ad83d372@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06cachefiles: widen trace event i_ino fields to u64Jeff Layton1-9/+9
Update cachefiles trace event definitions to use u64 instead of ino_t/unsigned long for inode number fields. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-5-2257ad83d372@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-06vfs: widen trace event i_ino fields to u64Jeff Layton8-154/+154
Update VFS-layer trace event definitions to use u64 instead of ino_t/unsigned long for inode number fields. Update TP_printk format strings to use %llu/%llx to match the widened field type. Remove now-unnecessary (unsigned long) casts since __entry->ino is already u64. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-4-2257ad83d372@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-2/+10
Cross-merge networking fixes after downstream PR (net-7.0-rc3). No conflicts. Adjacent changes: net/netfilter/nft_set_rbtree.c fb7fb4016300 ("netfilter: nf_tables: clone set on flush only") 3aea466a4399 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04Merge tag 'vfs-7.0-rc3.fixes' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - kthread: consolidate kthread exit paths to prevent use-after-free - iomap: - don't mark folio uptodate if read IO has bytes pending - don't report direct-io retries to fserror - reject delalloc mappings during writeback - ns: tighten visibility checks - netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence * tag 'vfs-7.0-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: reject delalloc mappings during writeback iomap: don't mark folio uptodate if read IO has bytes pending selftests: fix mntns iteration selftests nstree: tighten permission checks for listing nsfs: tighten permission checks for handle opening nsfs: tighten permission checks for ns iteration ioctls netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence kthread: consolidate kthread exit paths to prevent use-after-free iomap: don't report direct-io retries to fserror
2026-03-02Merge tag 'drm-misc-next-2026-02-26' of ↵Dave Airlie1-30/+5
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v7.1: UAPI Changes: connector: - Add panel_type property fourcc: - Add ARM interleaved 64k modifier nouveau: - Query Z-Cull info with DRM_IOCTL_NOUVEAU_GET_ZCULL_INFO Cross-subsystem Changes: coreboot: - Clean up coreboot framebuffer support dma-buf: - Provide revoke mechanism for shared buffers - Rename move_notify callback to invalidate_mappings and update users. - Always enable move_notify - Support dma_fence_was_initialized() test - Protect dma_fence_ops by RCU and improve locking - Fix sparse warnings Core Changes: atomic: - Allocate drm_private_state via callback and convert drivers atomic-helper: - Use system_percpu_wq buddy: - Make buddy allocator available to all DRM drivers - Document flags and structures colorop: - Add destroy helper and convert drivers fbdev-emulation: - Clean up gem: - Fix drm_gem_objects_lookup() error cleanup Driver Changes: amdgpu: - Set panel_type to OELD for eDP atmel-hlcdc: - Support sana5d65 LCD controller bridge: - anx7625: Support USB-C plus DT bindings - connector: Fix EDID detection - dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve others - fsl-ldb: Fix visual artifacts plus related DT property 'enable-termination-resistor' - imx8qxp-pixel-link: Improve bridge reference handling - lt9611: Support Port-B-only input plus DT bindings - tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up - Support TH1520 HDMI plus DT bindings - Clean up imagination: - Clean up komeda: - Fix integer overflow in AFBC checks mcde: - Improve bridge handling nouveau: - Provide Z-cull info to user space - gsp: Support GA100 - Shutdown on PCI device shutdown - Clean up panel: - panel-jdi-lt070me05000: Use mipi-dsi multi functions - panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN N116BCL-EAK (C2); Support FriendlyELEC plus DT changes - Fix Kconfig dependencies panthor: - Add tracepoints for power and IRQs rcar-du: - dsi: fix VCLK calculation rockchip: - vop2: Use drm_ logging functions - Support DisplayPort on RK3576 sysfb: - corebootdrm: Support system framebuffer on coreboot firmware; detect orientation - Clean up pixel-format lookup sun4i: - Clean up tilcdc: - Use DT bindings scheme - Use managed DRM interfaces - Support DRM_BRIDGE_ATTACH_NO_CONNECTOR - Clean up a lot of obsolete code v3d: - Clean up vc4: - Use system_percpu_wq - Clean up verisilicon: - Support DC8200 plus DT bindings virtgpu: - Support PRIME imports with enabled 3D Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260226143615.GA47200@linux.fritz.box
2026-02-28net: sched: introduce qdisc-specific drop reason tracingJesper Dangaard Brouer1-0/+51
Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint for qdisc layer drop diagnostics with direct qdisc context visibility. The new tracepoint includes qdisc handle, parent, kind (name), and device information. Existing SKB_DROP_REASON_QDISC_DROP is retained for backwards compatibility via kfree_skb_reason(). Convert qdiscs with drop reasons to use the new infrastructure. Change CAKE's cobalt_should_drop() return type from enum skb_drop_reason to enum qdisc_drop_reason to fix implicit enum conversion warnings. Use QDISC_DROP_UNSPEC as the 'not dropped' sentinel instead of SKB_NOT_DROPPED_YET. Both have the same compiled value (0), so the comparison logic remains semantically equivalent. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211345275.3011628.1974310302645218067.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-27hrtimer: Add hrtimer_rearm tracepointThomas Gleixner1-0/+24
Analyzing the reprogramming of the clock event device is essential to debug the behaviour of the hrtimer subsystem especially with the upcoming deferred rearming scheme. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260224163430.803669745@kernel.org
2026-02-27hrtimer: Reduce trace noise in hrtimer_start()Thomas Gleixner1-4/+7
hrtimer_start() when invoked with an already armed timer traces like: <comm>-.. [032] d.h2. 5.002263: hrtimer_cancel: hrtimer= .... <comm>-.. [032] d.h1. 5.002263: hrtimer_start: hrtimer= .... Which is incorrect as the timer doesn't get canceled. Just the expiry time changes. The internal dequeue operation which is required for that is not really interesting for trace analysis. But it makes it tedious to keep real cancellations and the above case apart. Remove the cancel tracing in hrtimer_start() and add a 'was_armed' indicator to the hrtimer start tracepoint, which clearly indicates what the state of the hrtimer is when hrtimer_start() is invoked: <comm>-.. [032] d.h1. 6.200103: hrtimer_start: hrtimer= .... was_armed=0 <comm>-.. [032] d.h1. 6.200558: hrtimer_start: hrtimer= .... was_armed=1 Fixes: c6a2a1770245 ("hrtimer: Add tracepoint for hrtimers") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260224163430.208491877@kernel.org
2026-02-26netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequenceDavid Howells1-1/+3
Fix netfslib such that when it's making an unbuffered or DIO write, to make sure that it sends each subrequest strictly sequentially, waiting till the previous one is 'committed' before sending the next so that we don't have pieces landing out of order and potentially leaving a hole if an error occurs (ENOSPC for example). This is done by copying in just those bits of issuing, collecting and retrying subrequests that are necessary to do one subrequest at a time. Retrying, in particular, is simpler because if the current subrequest needs retrying, the source iterator can just be copied again and the subrequest prepped and issued again without needing to be concerned about whether it needs merging with the previous or next in the sequence. Note that the issuing loop waits for a subrequest to complete right after issuing it, but this wait could be moved elsewhere allowing preparatory steps to be performed whilst the subrequest is in progress. In particular, once content encryption is available in netfslib, that could be done whilst waiting, as could cleanup of buffers that have been completed. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/58526.1772112753@warthog.procyon.org.uk Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-24mm/tracing: rss_stat: ensure curr is false from kthread contextKalesh Singh1-1/+7
The rss_stat trace event allows userspace tools, like Perfetto [1], to inspect per-process RSS metric changes over time. The curr field was introduced to rss_stat in commit e4dcad204d3a ("rss_stat: add support to detect RSS updates of external mm"). Its intent is to indicate whether the RSS update is for the mm_struct of the current execution context; and is set to false when operating on a remote mm_struct (e.g., via kswapd or a direct reclaimer). However, an issue arises when a kernel thread temporarily adopts a user process's mm_struct. Kernel threads do not have their own mm_struct and normally have current->mm set to NULL. To operate on user memory, they can "borrow" a memory context using kthread_use_mm(), which sets current->mm to the user process's mm. This can be observed, for example, in the USB Function Filesystem (FFS) driver. The ffs_user_copy_worker() handles AIO completions and uses kthread_use_mm() to copy data to a user-space buffer. If a page fault occurs during this copy, the fault handler executes in the kthread's context. At this point, current is the kthread, but current->mm points to the user process's mm. Since the rss_stat event (from the page fault) is for that same mm, the condition current->mm == mm becomes true, causing curr to be incorrectly set to true when the trace event is emitted. This is misleading because it suggests the mm belongs to the kthread, confusing userspace tools that track per-process RSS changes and corrupting their mm_id-to-process association. Fix this by ensuring curr is always false when the trace event is emitted from a kthread context by checking for the PF_KTHREAD flag. Link: https://lkml.kernel.org/r/20260219233708.1971199-1-kaleshsingh@google.com Link: https://perfetto.dev/ [1] Fixes: e4dcad204d3a ("rss_stat: add support to detect RSS updates of external mm") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: SeongJae Park <sj@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: "David Hildenbrand (Arm)" <david@kernel.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> [5.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-23Merge drm/drm-next into drm-misc-nextMaxime Ripard16-29/+629
Let's merge 7.0-rc1 to start the new drm-misc-next window Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-02-16Merge tag 'vfs-7.0-rc1.misc.2' of ↵Linus Torvalds1-0/+146
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull more misc vfs updates from Christian Brauner: "Features: - Optimize close_range() from O(range size) to O(active FDs) by using find_next_bit() on the open_fds bitmap instead of linearly scanning the entire requested range. This is a significant improvement for large-range close operations on sparse file descriptor tables. - Add FS_XFLAG_VERITY file attribute for fs-verity files, retrievable via FS_IOC_FSGETXATTR and file_getattr(). The flag is read-only. Add tracepoints for fs-verity enable and verify operations, replacing the previously removed debug printk's. - Prevent nfsd from exporting special kernel filesystems like pidfs and nsfs. These filesystems have custom ->open() and ->permission() export methods that are designed for open_by_handle_at(2) only and are incompatible with nfsd. Update the exportfs documentation accordingly. Fixes: - Fix KMSAN uninit-value in ovl_fill_real() where strcmp() was used on a non-null-terminated decrypted directory entry name from fscrypt. This triggered on encrypted lower layers when the decrypted name buffer contained uninitialized tail data. The fix also adds VFS-level name_is_dot(), name_is_dotdot(), and name_is_dot_dotdot() helpers, replacing various open-coded "." and ".." checks across the tree. - Fix read-only fsflags not being reset together with xflags in vfs_fileattr_set(). Currently harmless since no read-only xflags overlap with flags, but this would cause inconsistencies for any future shared read-only flag - Return -EREMOTE instead of -ESRCH from PIDFD_GET_INFO when the target process is in a different pid namespace. This lets userspace distinguish "process exited" from "process in another namespace", matching glibc's pidfd_getpid() behavior Cleanups: - Use C-string literals in the Rust seq_file bindings, replacing the kernel::c_str!() macro (available since Rust 1.77) - Fix typo in d_walk_ret enum comment, add porting notes for the readlink_copy() calling convention change" * tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: add porting notes about readlink_copy() pidfs: return -EREMOTE when PIDFD_GET_INFO is called on another ns nfsd: do not allow exporting of special kernel filesystems exportfs: clarify the documentation of open()/permission() expotrfs ops fsverity: add tracepoints fs: add FS_XFLAG_VERITY for fs-verity files rust: seq_file: replace `kernel::c_str!` with C-Strings fs: dcache: fix typo in enum d_walk_ret comment ovl: use name_is_dot* helpers in readdir code fs: add helpers name_is_dot{,dot,_dotdot} ovl: Fix uninit-value in ovl_fill_real fs: reset read-only fsflags together with xflags fs/file: optimize close_range() complexity from O(N) to O(Sparse)
2026-02-14Merge tag 'f2fs-for-7.0-rc1' of ↵Linus Torvalds1-1/+141
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this development cycle, we focused on several key performance optimizations: - introducing large folio support to enhance read speeds for immutable files - reducing checkpoint=enable latency by flushing only committed dirty pages - implementing tracepoints to diagnose and resolve lock priority inversion. Additionally, we introduced the packed_ssa feature to optimize the SSA footprint when utilizing large block sizes. Detail summary: Enhancements: - support large folio for immutable non-compressed case - support non-4KB block size without packed_ssa feature - optimize f2fs_enable_checkpoint() to avoid long delay - optimize f2fs_overwrite_io() for f2fs_iomap_begin - optimize NAT block loading during checkpoint write - add write latency stats for NAT and SIT blocks in f2fs_write_checkpoint - pin files do not require sbi->writepages lock for ordering - avoid f2fs_map_blocks() for consecutive holes in readpages - flush plug periodically during GC to maximize readahead effect - add tracepoints to catch lock overheads - add several sysfs entries to tune internal lock priorities Fixes: - fix lock priority inversion issue - fix incomplete block usage in compact SSA summaries - fix to show simulate_lock_timeout correctly - fix to avoid mapping wrong physical block for swapfile - fix IS_CHECKPOINTED flag inconsistency issue caused by concurrent atomic commit and checkpoint writes - fix to avoid UAF in f2fs_write_end_io()" * tag 'f2fs-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (61 commits) f2fs: sysfs: introduce critical_task_priority f2fs: introduce trace_f2fs_priority_update f2fs: fix lock priority inversion issue f2fs: optimize f2fs_overwrite_io() for f2fs_iomap_begin f2fs: fix incomplete block usage in compact SSA summaries f2fs: decrease maximum flush retry count in f2fs_enable_checkpoint() f2fs: optimize NAT block loading during checkpoint write f2fs: change size parameter of __has_cursum_space() to unsigned int f2fs: add write latency stats for NAT and SIT blocks in f2fs_write_checkpoint f2fs: pin files do not require sbi->writepages lock for ordering f2fs: fix to show simulate_lock_timeout correctly f2fs: introduce FAULT_SKIP_WRITE f2fs: check skipped write in f2fs_enable_checkpoint() Revert "f2fs: add timeout in f2fs_enable_checkpoint()" f2fs: fix to unlock folio in f2fs_read_data_large_folio() f2fs: fix error path handling in f2fs_read_data_large_folio() f2fs: use folio_end_read f2fs: fix to avoid mapping wrong physical block for swapfile f2fs: avoid f2fs_map_blocks() for consecutive holes in readpages f2fs: advance index and offset after zeroing in large folio read ...
2026-02-13Merge tag 'trace-v7.0' of ↵Linus Torvalds3-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: "User visible changes: - Add an entry into MAINTAINERS file for RUST versions of code There's now RUST code for tracing and static branches. To differentiate that code from the C code, add entries in for the RUST version (with "[RUST]" around it) so that the right maintainers get notified on changes. - New bitmask-list option added to tracefs When this is set, bitmasks in trace event are not displayed as hex numbers, but instead as lists: e.g. 0-5,7,9 instead of 0000015f - New show_event_filters file in tracefs Instead of having to search all events/*/*/filter for any active filters enabled in the trace instance, the file show_event_filters will list them so that there's only one file that needs to be examined to