aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2026-01-12ocfs2: add setlease file operationJeff Layton1-0/+5
Add the setlease file_operation to ocfs2_fops, ocfs2_dops, ocfs2_fops_no_plocks, and ocfs2_dops_no_plocks, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-15-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12ntfs3: add setlease file operationJeff Layton2-0/+6
Add the setlease file_operation to ntfs_file_operations, ntfs_legacy_file_operations, ntfs_dir_operations, and ntfs_legacy_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-14-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12nilfs2: add setlease file operationJeff Layton2-1/+4
Add the setlease file_operation to nilfs_file_operations and nilfs_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-13-ea4dec9b67fa@kernel.org Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12jfs: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to jfs_file_operations and jfs_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-12-ea4dec9b67fa@kernel.org Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12jffs2: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to jffs2_file_operations and jffs2_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-11-ea4dec9b67fa@kernel.org Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12gfs2: add a setlease file operationJeff Layton1-0/+1
gfs2_file_fops_nolock() already has this explicitly set, so it's only necessary to set this in gfs2_dir_fops_nolock(). A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-10-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12fat: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to fat_file_operations and fat_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-9-ea4dec9b67fa@kernel.org Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12f2fs: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to f2fs_file_operations and f2fs_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-8-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12exfat: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to exfat_file_operations and exfat_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-7-ea4dec9b67fa@kernel.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12ext4: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to ext4_file_operations and ext4_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-6-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12ext2: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to ext2_file_operations and ext2_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-5-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12erofs: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to erofs_file_fops and erofs_dir_fops, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-4-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12btrfs: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to btrfs_file_operations and btrfs_dir_file_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-3-ea4dec9b67fa@kernel.org Acked-by: David Sterba <dsterba@suse.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12affs: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to affs_file_operations and affs_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-2-ea4dec9b67fa@kernel.org Acked-by: David Sterba <dsterba@suse.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12fs: add setlease to generic_ro_fops and read-only filesystem directory ↵Jeff Layton8-0/+16
operations Add the setlease file_operation to generic_ro_fops, which covers file operations for several read-only filesystems (BEFS, EFS, ISOFS, QNX4, QNX6, CRAMFS, FREEVXFS). Also add setlease to the directory file_operations for these filesystems. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on these filesystems. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-1-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12vboxsf: don't allow delegations to be set on directoriesJeff Layton1-0/+1
With the advent of directory leases, it's necessary to set the ->setlease() handler in directory file_operations to properly deny them. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-6-85f034abcc57@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12ceph: don't allow delegations to be set on directoriesJeff Layton1-0/+2
With the advent of directory leases, it's necessary to set the ->setlease() handler in directory file_operations to properly deny them. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-5-85f034abcc57@kernel.org Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12gfs2: don't allow delegations to be set on directoriesJeff Layton1-0/+1
With the advent of directory leases, it's necessary to set the ->setlease() handler in directory file_operations to properly deny them. In the "nolock" case however, there is no need to deny them. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-4-85f034abcc57@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-129p: don't allow delegations to be set on directoriesJeff Layton1-0/+2
With the advent of directory leases, it's necessary to set the ->setlease() handler in directory file_operations to properly deny them. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-3-85f034abcc57@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12smb/client: properly disallow delegations on directoriesJeff Layton1-3/+1
The check for S_ISREG() in cifs_setlease() is incorrect since that operation doesn't get called for directories. The correct way to prevent delegations on directories is to set the ->setlease() method in directory file_operations to simple_nosetlease(). Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-2-85f034abcc57@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12nfs: properly disallow delegation requests on directoriesJeff Layton2-2/+1
Checking for S_ISREG() in nfs4_setlease() is incorrect, since that op is never called for directories. The right way to deny lease requests on directories is to set the ->setlease() operation to simple_nosetlease() in the directory file_operations. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Reported-by: Christoph Hellwig <hch@infradead.org> Closes: https://lore.kernel.org/linux-fsdevel/aV316LhsVSl0n9-E@infradead.org/ Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-1-85f034abcc57@kernel.org Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12fuse: fix conversion of fuse_reverse_inval_entry() to start_removing()NeilBrown1-7/+16
The recent conversion of fuse_reverse_inval_entry() to use start_removing() was wrong. As Val Packett points out the original code did not call ->lookup while the new code does. This can lead to a deadlock. Rather than using full_name_hash() and d_lookup() as the old code did, we can use try_lookup_noperm() which combines these. Then the result can be given to start_removing_dentry() to get the required locks for removal. We then double check that the name hasn't changed. As 'dir' needs to be used several times now, we load the dput() until the end, and initialise to NULL so dput() is always safe. Reported-by: Val Packett <val@packett.cool> Closes: https://lore.kernel.org/all/6713ea38-b583-4c86-b74a-bea55652851d@packett.cool Fixes: c9ba789dad15 ("VFS: introduce start_creating_noperm() and start_removing_noperm()") Signed-off-by: NeilBrown <neil@brown.name> Link: https://patch.msgid.link/176454037897.634289.3566631742434963788@noble.neil.brown.name Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-11blk-crypto: handle the fallback above the block layerChristoph Hellwig7-11/+16
Add a blk_crypto_submit_bio helper that either submits the bio when it is not encrypted or inline encryption is provided, but otherwise handles the encryption before going down into the low-level driver. This reduces the risk from bio reordering and keeps memory allocation as high up in the stack as possible. Note that if the submitter knows that inline enctryption is known to be supported by the underyling driver, it can still use plain submit_bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-11fscrypt: keep multiple bios in flight in fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-32/+54
This should slightly improve performance for large zeroing operations, but more importantly prepares for blk-crypto refactoring that requires all fscrypt users to call submit_bio directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-11fscrypt: pass a real sector_t to fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-5/+4
While the pblk argument to fscrypt_zeroout_range_inline_crypt is declared as a sector_t it actually is interpreted as a logical block size unit, which is highly unusual. Switch to passing the 512 byte units that sector_t is defined for. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-11treewide: Update email addressThomas Gleixner1-2/+2
In a vain attempt to consolidate the email zoo switch everything to the kernel.org account. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-01-10erofs: fix file-backed mounts no longer working on EROFS partitionsGao Xiang1-1/+2
Sheng Yong reported [1] that Android APEX images didn't work with commit 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for now") because "EROFS-formatted APEX file images can be stored within an EROFS-formatted Android system partition." In response, I sent a quick fat-fingered [PATCH v3] to address the report. Unfortunately, the updated condition was incorrect: if (erofs_is_fileio_mode(sbi)) { - sb->s_stack_depth = - file_inode(sbi->dif0.file)->i_sb->s_stack_depth + 1; - if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) { - erofs_err(sb, "maximum fs stacking depth exceeded"); + inode = file_inode(sbi->dif0.file); + if ((inode->i_sb->s_op == &erofs_sops && !sb->s_bdev) || + inode->i_sb->s_stack_depth) { The condition `!sb->s_bdev` is always true for all file-backed EROFS mounts, making the check effectively a no-op. The real fix tested and confirmed by Sheng Yong [2] at that time was [PATCH v3 RESEND], which correctly ensures the following EROFS^2 setup works: EROFS (on a block device) + EROFS (file-backed mount) But sadly I screwed it up again by upstreaming the outdated [PATCH v3]. This patch applies the same logic as the delta between the upstream [PATCH v3] and the real fix [PATCH v3 RESEND]. Reported-by: Sheng Yong <shengyong1@xiaomi.com> Closes: https://lore.kernel.org/r/3acec686-4020-4609-aee4-5dae7b9b0093@gmail.com [1] Fixes: 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for now") Link: https://lore.kernel.org/r/243f57b8-246f-47e7-9fb1-27a771e8e9e8@gmail.com [2] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-01-10fs/resctrl: Move RMID initialization to first mountTony Luck3-29/+34
L3 monitor features are enumerated during resctrl initialization and rmid_ptrs[] that tracks all RMIDs and depends on the number of supported RMIDs is allocated during this time. Telemetry monitor features are enumerated during first resctrl mount and may support a different number of RMIDs compared to L3 monitor features. Delay allocation and initialization of rmid_ptrs[] until first mount. Since the number of RMIDs cannot change on later mounts, keep the same set of rmid_ptrs[] until resctrl_exit(). This is required because the limbo handler keeps running after resctrl is unmounted and needs to access rmid_ptrs[] as it keeps tracking busy RMIDs after unmount. Rename routines to match what they now do: dom_data_init() -> setup_rmid_lru_list() dom_data_exit() -> free_rmid_lru_list() Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-10x86,fs/resctrl: Compute number of RMIDs as minimum across resourcesTony Luck1-0/+6
resctrl assumes that only the L3 resource supports monitor events, so it simply takes the rdt_resource::num_rmid from RDT_RESOURCE_L3 as the system's number of RMIDs. The addition of telemetry events in a different resource breaks that assumption. Compute the number of available RMIDs as the minimum value across all mon_capable resources (analogous to how the number of CLOSIDs is computed across alloc_capable resources). Note that mount time enumeration of the telemetry resource means that this number can be reduced. If this happens, then some memory will be wasted as the allocations for rdt_l3_mon_domain::mbm_states[] and rdt_l3_mon_domain::rmid_busy_llc created during resctrl initialization will be larger than needed. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-10fs/resctrl: Move allocation/free of closid_num_dirty_rmid[]Tony Luck1-28/+51
closid_num_dirty_rmid[] and rmid_ptrs[] are allocated together during resctrl initialization and freed together during resctrl exit. Telemetry events are enumerated on resctrl mount so only at resctrl mount will the number of RMID supported by all monitoring resources and needed as size for rmid_ptrs[] be known. Separate closid_num_dirty_rmid[] and rmid_ptrs[] allocation and free in preparation for rmid_ptrs[] to be allocated on resctrl mount. Keep the rdtgroup_mutex protection around the allocation and free of closid_num_dirty_rmid[] as ARM needs this to guarantee memory ordering. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-10x86/resctrl: Handle number of RMIDs supported by RDT_RESOURCE_PERF_PKGTony Luck1-1/+1
There are now three meanings for "number of RMIDs": 1) The number for legacy features enumerated by CPUID leaf 0xF. This is the maximum number of distinct values that can be loaded into MSR_IA32_PQR_ASSOC. Note that systems with Sub-NUMA Cluster mode enabled will force scaling down the CPUID enumerated value by the number of SNC nodes per L3-cache. 2) The number of registers in MMIO space for each event. This is enumerated in the XML files and is the value initialized into event_group::num_rmid. 3) The number of "hardware counters" (this isn't a strictly accurate description of how things work, but serves as a useful analogy that does describe the limitations) feeding to those MMIO registers. This is enumerated in telemetry_region::num_rmids returned by intel_pmt_get_regions_by_feature(). Event groups with insufficient "hardware counters" to track all RMIDs are difficult for users to use, since the system may reassign "hardware counters" at any time. This means that users cannot reliably collect two consecutive event counts to compute the rate at which events are occurring. Disable such event groups by default. The user may override this with a command line "rdt=" option. In this case limit an under-resourced event group's number of possible monitor resource groups to the lowest number of "hardware counters". Scan all enabled event groups and assign the RDT_RESOURCE_PERF_PKG resource "num_rmid" value to the smallest of these values as this value will be used later to compare against the number of RMIDs supported by other resources to determine how many monitoring resource groups are supported. N.B. Change type of resctrl_mon::num_rmid to u32 to match its usage and the type of event_group::num_rmid so that min(r->num_rmid, e->num_rmid) won't complain about mixing signed and unsigned types. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09Merge tag 'erofs-for-6.19-rc5-fixes' of ↵Linus Torvalds1-6/+12
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fix from Gao Xiang: - Don't increase s_stack_depth which caused regressions in some composefs mount setups (EROFS + ovl^2) Instead just allow one extra unaccounted fs stacking level for straightforward cases. * tag 'erofs-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: don't bother with s_stack_depth increasing for now
2026-01-10erofs: don't bother with s_stack_depth increasing for nowGao Xiang1-6/+12
Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel stack overflow when stacking an unlimited number of EROFS on top of each other. This fix breaks composefs mounts, which need EROFS+ovl^2 sometimes (and such setups are already used in production for quite a long time). One way to fix this regression is to bump FILESYSTEM_MAX_STACK_DEPTH from 2 to 3, but proving that this is safe in general is a high bar. After a long discussion on GitHub issues [1] about possible solutions, one conclusion is that there is no need to support nesting file-backed EROFS mounts on stacked filesystems, because there is always the option to use loopback devices as a fallback. As a quick fix for the composefs regression for this cycle, instead of bumping `s_stack_depth` for file backed EROFS mounts, we disallow nesting file-backed EROFS over EROFS and over filesystems with `s_stack_depth` > 0. This works for all known file-backed mount use cases (composefs, containerd, and Android APEX for some Android vendors), and the fix is self-contained. Essentially, we are allowing one extra unaccounted fs stacking level of EROFS below stacking filesystems, but EROFS can only be used in the read path (i.e. overlayfs lower layers), which typically has much lower stack usage than the write path. We can consider increasing FILESYSTEM_MAX_STACK_DEPTH later, after more stack usage analysis or using alternative approaches, such as splitting the `s_stack_depth` limitation according to different combinations of stacking. Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts") Reported-and-tested-by: Dusty Mabe <dusty@dustymabe.com> Reported-by: Timothée Ravier <tim@siosm.fr> Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1] Reported-by: "Alekséi Naidénov" <an@digitaltide.io> Closes: https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0QXwCpec9sXtg@mail.gmail.com Acked-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Alexander Larsson <alexl@redhat.com> Reviewed-and-tested-by: Sheng Yong <shengyong1@xiaomi.com> Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2026-01-09x86,fs/resctrl: Handle domain creation/deletion for RDT_RESOURCE_PERF_PKGTony Luck1-5/+12
The L3 resource has several requirements for domains. There are per-domain structures that hold the 64-bit values of counters, and elements to keep track of the overflow and limbo threads. None of these are needed for the PERF_PKG resource. The hardware counters are wide enough that they do not wrap around for decades. Define a new rdt_perf_pkg_mon_domain structure which just consists of the standard rdt_domain_hdr to keep track of domain id and CPU mask. Update resctrl_online_mon_domain() for RDT_RESOURCE_PERF_PKG. The only action needed for this resource is to create and populate domain directories if a domain is added while resctrl is mounted. Similarly resctrl_offline_mon_domain() only needs to remove domain directories. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09fs/resctrl: Refactor rmdir_mondata_subdir_allrdtgrp()Tony Luck1-11/+31
Clearing a monitor group's mon_data directory is complicated because of the support for Sub-NUMA Cluster (SNC) mode. Refactor the SNC case into a helper function to make it easier to add support for a new telemetry resource. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09fs/resctrl: Refactor mkdir_mondata_subdir()Tony Luck1-50/+58
Population of a monitor group's mon_data directory is unreasonably complicated because of the support for Sub-NUMA Cluster (SNC) mode. Split out the SNC code into a helper function to make it easier to add support for a new telemetry resource. Move all the duplicated code to make and set owner of domain directories into the mon_add_all_files() helper and rename to _mkdir_mondata_subdir(). Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09x86/resctrl: Read telemetry eventsTony Luck1-0/+14
Introduce intel_aet_read_event() to read telemetry events for resource RDT_RESOURCE_PERF_PKG. There may be multiple aggregators tracking each package, so scan all of them and add up all counters. Aggregators may return an invalid data indication if they have received no records for a given RMID. The user will see "Unavailable" if none of the aggregators on a package provide valid counts. Resctrl now uses readq() so depends on X86_64. Update Kconfig. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09x86/resctrl: Find and enable usable telemetry eventsTony Luck1-4/+6
Every event group has a private copy of the data of all telemetry event aggregators (aka "telemetry regions") tracking its feature type. Included may be regions that have the same feature type but tracking different GUID from the event group's. Traverse the event group's telemetry region data and mark all regions that are not usable by the event group as unusable by clearing those regions' MMIO addresses. A region is considered unusable if: 1) GUID does not match the GUID of the event group. 2) Package ID is invalid. 3) The enumerated size of the MMIO region does not match the expected value from the XML description file. Hereafter any telemetry region with an MMIO address is considered valid for the event group it is associated with. Enable all the event group's events as long as there is at least one usable region from where data for its events can be read. Enabling of an event can fail if the same event has already been enabled as part of another event group. It should never happen that the same event is described by different GUID supported by the same system so just WARN (via resctrl_enable_mon_event()) and skip the event. Note that it is architecturally possible that some telemetry events are only supported by a subset of the packages in the system. It is not expected that systems will ever do this. If they do the user will see event files in resctrl that always return "Unavailable". Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-09Merge tag 'for-6.19-rc4-tag' of ↵Linus Torvalds4-17/+39
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix potential NULL pointer dereference when replaying tree log after an error - release path before initializing extent tree to avoid potential deadlock when allocating new inode - on filesystems with block size > page size - fix potential read out of bounds during encoded read of an inline extent - only enforce free space tree if v1 cache is required - print correct tree id in error message * tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: show correct warning if can't read data reloc tree btrfs: fix NULL pointer dereference in do_abort_log_replay() btrfs: force free space tree for bs > ps cases btrfs: only enforce free space tree if v1 cache is required for bs < ps cases btrfs: release path before initializing extent tree in btrfs_read_locked_inode() btrfs: avoid access-beyond-folio for bs > ps encoded writes
2026-01-09btrfs: send: check for inline extents in range_is_hole_in_parent()Qu Wenruo1-0/+2
Before accessing the disk_bytenr field of a file extent item we need to check if we are dealing with an inline extent. This is because for inline extents their data starts at the offset of the disk_bytenr field. So accessing the disk_bytenr means we are accessing inline data or in case the inline data is less than 8 bytes we can actually cause an invalid memory access if this inline extent item is the first item in the leaf or access metadata from other items. Fixes: 82bfb2e7b645 ("Btrfs: incremental send, fix unnecessary hole writes for sparse files") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-09btrfs: tests: fix return 0 on rmap test failureNaohiro Aota1-0/+3
In test_rmap_blocks(), we have ret = 0 before checking the results. We need to set it to -EINVAL, so that a mismatching result will return -EINVAL not 0. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-09btrfs: tests: fix root tree leak in btrfs_test_qgroups()Zilin Guan1-3/+3
If btrfs_insert_fs_root() fails, the tmp_root allocated by btrfs_alloc_dummy_root() is leaked because its initial reference count is not decremented. Fix this by calling btrfs_put_root() unconditionally after btrfs_insert_fs_root(). This ensures the local reference is always dropped. Also fix a copy-paste error in the error message where the subvolume root insertion failure was incorrectly logged as "fs root". Co-developed-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-09btrfs: release path before iget_failed() in btrfs_read_locked_inode()Filipe Manana1-0/+9
In btrfs_read_locked_inode() if we fail to lookup the inode, we jump to the 'out' label with a path that has a read locked leaf and then we call iget_failed(). This can result in a ABBA deadlock, since iget_failed() triggers inode eviction and that causes the release of the delayed inode, which must lock the delayed inode's mutex, and a task updating a delayed inode starts by taking the node's mutex and then modifying the inode's subvolume btree. Syzbot reported the following lockdep splat for this: ====================================================== WARNING: possible circular locking dependency detected syzkaller #0 Not tainted ------------------------------------------------------ btrfs-cleaner/8725 is trying to acquire lock: ffff0000d6826a48 (&delayed_node->mutex){+.+.}-{4:4}, at: __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 but task is already holding lock: ffff0000dbeba878 (btrfs-tree-00){++++}-{4:4}, at: btrfs_tree_read_lock_nested+0x44/0x2ec fs/btrfs/locking.c:145 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (btrfs-tree-00){++++}-{4:4}: __lock_release kernel/locking/lockdep.c:5574 [inline] lock_release+0x198/0x39c kernel/locking/lockdep.c:5889 up_read+0x24/0x3c kernel/locking/rwsem.c:1632 btrfs_tree_read_unlock+0xdc/0x298 fs/btrfs/locking.c:169 btrfs_tree_unlock_rw fs/btrfs/locking.h:218 [inline] btrfs_search_slot+0xa6c/0x223c fs/btrfs/ctree.c:2133 btrfs_lookup_inode+0xd8/0x38c fs/btrfs/inode-item.c:395 __btrfs_update_delayed_inode+0x124/0xed0 fs/btrfs/delayed-inode.c:1032 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1118 [inline] __btrfs_commit_inode_delayed_items+0x15f8/0x1748 fs/btrfs/delayed-inode.c:1141 __btrfs_run_delayed_items+0x1ac/0x514 fs/btrfs/delayed-inode.c:1176 btrfs_run_delayed_items_nr+0x28/0x38 fs/btrfs/delayed-inode.c:1219 flush_space+0x26c/0xb68 fs/btrfs/space-info.c:828 do_async_reclaim_metadata_space+0x110/0x364 fs/btrfs/space-info.c:1158 btrfs_async_reclaim_metadata_space+0x90/0xd8 fs/btrfs/space-info.c:1226 process_one_work+0x7e8/0x155c kernel/workqueue.c:3263 process_scheduled_works kernel/workqueue.c:3346 [inline] worker_thread+0x958/0xed8 kernel/workqueue.c:3427 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 -> #0 (&delayed_node->mutex){+.+.}-{4:4}: check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237 lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868 __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598 __mutex_lock kernel/locking/mutex.c:760 [inline] mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812 __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:315 [inline] btrfs_remove_delayed_node+0x68/0x84 fs/btrfs/delayed-inode.c:1326 btrfs_evict_inode+0x578/0xe28 fs/btrfs/inode.c:5587 evict+0x414/0x928 fs/inode.c:810 iput_final fs/inode.c:1914 [inline] iput+0x95c/0xad4 fs/inode.c:1966 iget_failed+0xec/0x134 fs/bad_inode.c:248 btrfs_read_locked_inode+0xe1c/0x1234 fs/btrfs/inode.c:4101 btrfs_iget+0x1b0/0x264 fs/btrfs/inode.c:5837 btrfs_run_defrag_inode fs/btrfs/defrag.c:237 [inline] btrfs_run_defrag_inodes+0x520/0xdc4 fs/btrfs/defrag.c:309 cleaner_kthread+0x21c/0x418 fs/btrfs/disk-io.c:1516 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(btrfs-tree-00); lock(&delayed_node->mutex); lock(btrfs-tree-00); lock(&delayed_node->mutex); *** DEADLOCK *** 1 lock held by btrfs-cleaner/8725: #0: ffff0000dbeba878 (btrfs-tree-00){++++}-{4:4}, at: btrfs_tree_read_lock_nested+0x44/0x2ec fs/btrfs/locking.c:145 stack backtrace: CPU: 0 UID: 0 PID: 8725 Comm: btrfs-cleaner Not tainted syzkaller #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/03/2025 Call trace: show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C) __dump_stack+0x30/0x40 lib/dump_stack.c:94 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120 dump_stack+0x1c/0x28 lib/dump_stack.c:129 print_circular_bug+0x324/0x32c kernel/locking/lockdep.c:2043 check_noncircular+0x154/0x174 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237 lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868 __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598 __mutex_lock kernel/locking/mutex.c:760 [inline] mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812 __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:315 [inline] btrfs_remove_delayed_node+0x68/0x84 fs/btrfs/delayed-inode.c:1326 btrfs_evict_inode+0x578/0xe28 fs/btrfs/inode.c:5587 evict+0x414/0x928 fs/inode.c:810 iput_final fs/inode.c:1914 [inline] iput+0x95c/0xad4 fs/inode.c:1966 iget_failed+0xec/0x134 fs/bad_inode.c:248 btrfs_read_locked_inode+0xe1c/0x1234 fs/btrfs/inode.c:4101 btrfs_iget+0x1b0/0x264 fs/btrfs/inode.c:5837 btrfs_run_defrag_inode fs/btrfs/defrag.c:237 [inline] btrfs_run_defrag_inodes+0x520/0xdc4 fs/btrfs/defrag.c:309 cleaner_kthread+0x21c/0x418 fs/btrfs/disk-io.c:1516 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 Fix this by releasing the path before calling iget_failed(). Reported-by: syzbot+c1c6edb02bea1da754d8@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/694530c2.a70a0220.207337.010d.GAE@google.com/ Fixes: 69673992b1ae ("btrfs: push cleanup into btrfs_read_locked_inode()") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-09Merge tag 'vfs-6.19-rc5.fixes' of ↵Linus Torvalds11-91/+184
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Remove incorrect __user annotation from struct xattr_args::value - Documentation fix: Add missing kernel-doc description for the @isnew parameter in ilookup5_nowait() to silence Sphinx warnings - Documentation fix: Fix kernel-doc comment for __start_dirop() - the function name in the comment was wrong and the @state parameter was undocumented - Replace dynamic folio_batch allocation with stack allocation in iomap_zero_range(). The dynamic allocation was problematic for ext4-on-iomap work (didn't handle allocation failure properly) and triggered lockdep complaints. Uses a flag instead to control batch usage - Re-add #ifdef guards around PIDFD_GET_<ns-type>_NAMESPACE ioctls. When a namespace type is disabled, ns->ops is NULL, causes crashes during inode eviction when closing the fd. The ifdefs were removed in a recent simplification but are still needed - Fixe a race where a folio could be unlocked before the trailing zeros (for EOF within the page) were written - Split out a dedicated lease_dispose_list() helper since lease code paths always know they're disposing of leases. Removes unnecessary runtime flag checks and prepares for upcoming lease_manager enhancements - Fix userland delegation requests succeeding despite conflicting opens. Previously, FL_LAYOUT and FL_DELEG leases bypassed conflict checks (a hack for nfsd). Adds new ->lm_open_conflict() lease_manager operation so userland delegations get proper conflict checking while nfsd can continue its own conflict handling - Fix LOOKUP_CACHED path lookups incorrectly falling through to the slow path. After legitimize_links() calls were conditionally elided, the routine would always fail with LOOKUP_CACHED regardless of whether there were any links. Now the flag is checked at the two callsites before calling legitimize_links() - Fix bug in media fd allocation in media_request_alloc() - Fix mismatched API calls in ecryptfs_mknod(): was calling end_removing() instead of end_creating() after ecryptfs_start_creating_dentry() - Fix dentry reference count leak in ecryptfs_mkdir(): a dget() of the lower parent dir was added but never dput()'d, causing BUG during lower filesystem unmount due to the still-in-use dentry * tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: pidfs