linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
24 hours	Merge tag '9p-for-7.2-rc1' of https://github.com/martinetd/linux	Linus Torvalds	11	-62/+283
	Pull 9p updates from Dominique Martinet: "Asides of the avalanche of LLM-driven fixes, there are a couple of big changes this cycle: - negative dentry and symlink cache - a way out of the unkillable "io_wait_event_killable" (because it looped around waiting for the request flush to come back from server; this has been bugging syzcaller folks since forever): I'm still not 100% sure about this patch, but I think it's as good as we'll ever get, and will keep testing a bit further in the coming weeks The rest is more noisy than usual, but shouldn't cause any trouble" * tag '9p-for-7.2-rc1' of https://github.com/martinetd/linux: 9p: Add missing read barrier in virtio zero-copy path net/9p: Replace strlen() strcpy() pair with strscpy() 9p: skip nlink update in cacheless mode to fix WARN_ON net/9p: fix race condition on rdma->state in trans_rdma.c 9p: v9fs_file_do_lock: replace WARN_ONCE with p9_debug 9p: Enable symlink caching in page cache 9p: Set default negative dentry retention time for cache=loose 9p: Add mount option for negative dentry cache retention 9p: Cache negative dentries for lookup performance 9p: avoid returning ERR_PTR(0) from mkdir operations 9p: avoid putting oldfid in p9_client_walk() error path net/9p: fix infinite loop in p9_client_rpc on fatal signal docs/filesystems/9p: fix broken external links 9p: invalidate readdir buffer on seek 9p: use kvzalloc for readdir buffer net/9p/usbg: Constify struct configfs_item_operations
36 hours	9p: skip nlink update in cacheless mode to fix WARN_ON	Breno Leitao	1	-0/+9
	v9fs_dec_count() unconditionally calls drop_nlink() on regular files, even when the inode's nlink is already zero. In cacheless mode the client refetches inode metadata from the server (the source of truth) on every operation, so by the time v9fs_remove() returns, the locally cached nlink may already reflect the post-unlink value: 1. Client initiates unlink, server processes it and sets nlink to 0 2. Client refetches inode metadata (nlink=0) before unlink returns 3. Client's v9fs_remove() completes successfully 4. Client calls v9fs_dec_count() which calls drop_nlink() on nlink=0 This race is easily triggered under heavy unlink workloads, such as stress-ng's unlink stressor, producing the following warning: WARNING: fs/inode.c:417 at drop_nlink+0x4c/0xc8 Call trace: drop_nlink+0x4c/0xc8 v9fs_remove+0x1e0/0x250 [9p] v9fs_vfs_unlink+0x20/0x38 [9p] vfs_unlink+0x13c/0x258 ... In cacheless mode the server is authoritative and the inode is on its way out, so locally adjusting nlink buys nothing. Skip v9fs_dec_count() entirely when neither CACHE_META nor CACHE_LOOSE is set, which both avoids the warning and removes a class of nlink races (two concurrent unlinkers observing nlink > 0 and both calling drop_nlink()) that an nlink == 0 guard alone would only narrow rather than close. Fixes: ac89b2ef9b55 ("9p: don't maintain dir i_nlink if the exported fs doesn't either") Cc: stable@vger.kernel.org Suggested-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Breno Leitao <leitao@debian.org> Message-ID: <20260421-9p-v2-1-48762d294fad@debian.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
36 hours	9p: v9fs_file_do_lock: replace WARN_ONCE with p9_debug	Dominique Martinet	1	-1/+1
	This warning depends on server-provided data, we should not use WARN here Reported-by: Yifei Chu <yifeichu24@gmail.com> Closes: https://lore.kernel.org/r/CAPJnbgJ7ZK7DCjCfG56hd_iKGePmAzudb4hOWd4=9r32nM+KcA@mail.gmail.com Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Message-ID: <20260529-lock-warn-v1-1-20c29580d61d@codewreck.org>
36 hours	9p: Enable symlink caching in page cache	Remi Pommarel	3	-10/+88
	Currently, when cache=loose is enabled, file reads are cached in the page cache, but symlink reads are not. This patch allows the results of p9_client_readlink() to be stored in the page cache, eliminating the need for repeated 9P transactions on subsequent symlink accesses. This change improves performance for workloads that involve frequent symlink resolution. Signed-off-by: Remi Pommarel <repk@triplefau.lt> Message-ID: <982462d17c0c0d2856763266a25eb04d080c1dbb.1779355927.git.repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
36 hours	9p: Set default negative dentry retention time for cache=loose	Remi Pommarel	1	-0/+10
	For cache=loose mounts, set the default negative dentry cache retention time to 24 hours. Signed-off-by: Remi Pommarel <repk@triplefau.lt> Message-ID: <b5beca3e70890ab8a4f0b9e99bd69cb97f5cb9eb.1779355927.git.repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
36 hours	9p: Add mount option for negative dentry cache retention	Remi Pommarel	2	-11/+28
	Introduce a new mount option, negtimeout, for v9fs that allows users to specify how long negative dentries are retained in the cache. The retention time can be set in milliseconds (e.g. negtimeout=10000 for a 10secs retention time) or a negative value (e.g. negtimeout=-1) to keep negative entries until the buffer cache management removes them. For consistency reasons, this option should only be used in exclusive or read-only mount scenarios, aligning with the cache=loose usage. Signed-off-by: Remi Pommarel <repk@triplefau.lt> Message-ID: <b2d66500aa5a2f6540347c4aa46a4be10dd01bc6.1779355927.git.repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
36 hours	9p: Cache negative dentries for lookup performance	Remi Pommarel	7	-24/+126
	Not caching negative dentries can result in poor performance for workloads that repeatedly look up non-existent paths. Each such lookup triggers a full 9P transaction with the server, adding unnecessary overhead. A typical example is source compilation, where multiple cc1 processes are spawned and repeatedly search for the same missing header files over and over again. This change enables caching of negative dentries, so that lookups for known non-existent paths do not require a full 9P transaction. The cached negative dentries are retained for a configurable duration (expressed in milliseconds), as specified by the ndentry_timeout field in struct v9fs_session_info. If set to -1, negative dentries are cached indefinitely. This optimization reduces lookup overhead and improves performance for workloads involving frequent access to non-existent paths. Signed-off-by: Remi Pommarel <repk@triplefau.lt> Message-ID: <e542317dd03bbadb5249abd3ea6aecfdca692c19.1779355927.git.repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
36 hours	9p: avoid returning ERR_PTR(0) from mkdir operations	Hongling Zeng	2	-15/+8
	When mkdir succeeds, v9fs_vfs_mkdir_dotl() and v9fs_vfs_mkdir() return ERR_PTR(0) which is incorrect. They should return NULL instead for success and ERR_PTR() only with negative error codes for failure. Return NULL instead of passing to ERR_PTR while err is zero Fixes smatch warnings: fs/9p/vfs_inode_dotl.c:420 v9fs_vfs_mkdir_dotl() warn: passing zero to 'ERR_PTR' fs/9p/vfs_inode.c:695 v9fs_vfs_mkdir() warn: passing zero to 'ERR_PTR' The v9fs_vfs_mkdir() code was further simplified because v9fs_create() can never return NULL, so we do not need to check for fid being set separately, and the error path can be a simple return immediately after v9fs_create() failure. There is no intended functional change. Fixes: 88d5baf69082 ("Change inode_operations.mkdir to return struct dentry *") Suggested-by: David Laight <david.laight.linux@gmail.com> Acked-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn> Message-ID: <20260520022650.14217-1-zenghongling@kylinos.cn> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2026-05-19	9p: invalidate readdir buffer on seek	Pierre Barre	1	-0/+12
	The per-fid readdir buffer (fid->rdir) is populated lazily and only refilled when fully drained (rdir->head == rdir->tail). userspace lseek() on a directory fd updates file->f_pos via generic_file_llseek() but does not touch the cached buffer, so the next getdents() iterates the stale cache and emits entries from the previous position instead of the one the caller asked for. Track the file position the cached data corresponds to in struct p9_rdir, and drop the cache on entry to iterate_shared when it no longer matches ctx->pos. The 9p protocol's Tread/Treaddir already take an arbitrary offset on every request, so a refill at the new position is always legal; no .llseek override or seek restriction is needed. Reported-by: Pierre Barre <pierre@barre.sh> Link: https://lore.kernel.org/v9fs/496d10b9-40fe-4f81-8014-37497c37ff63@app.fastmail.com/ Signed-off-by: Pierre Barre <pierre@barre.sh> Message-ID: <20260512132032.369281-2-pierre@barre.sh> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2026-05-19	9p: use kvzalloc for readdir buffer	Pierre Barre	1	-1/+1
	The readdir buffer is sized to msize, so kzalloc() can fail under fragmentation with a page allocation failure in v9fs_alloc_rdir_buf() / v9fs_dir_readdir_dotl(). The buffer is only a response sink and is never pack_sg_list()'d, so kvzalloc() is safe for all transports, unlike the fcall buffers fixed in e21d451a82f3 ("9p: Use kvmalloc for message buffers on supported transports"). Signed-off-by: Pierre Barre <pierre@barre.sh> Message-ID: <20260512132032.369281-1-pierre@barre.sh> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2026-05-12	netfs: Fix potential for tearing in ->remote_i_size and ->zero_point	David Howells	3	-19/+12
	Fix potential tearing in using ->remote_i_size and ->zero_point by copying i_size_read() and i_size_write() and using the same seqcount as for i_size. We need to make sure that netfslib and the filesystems that use it always hold i_lock whilst updating any of the sizes to prevent i_size_seqcount from getting corrupted. Fixes: 4058f742105e ("netfs: Keep track of the actual remote file size") Fixes: 100ccd18bb41 ("netfs: Optimise away reads above the point at which there can be no data") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-6-dhowells@redhat.com cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-24	Merge tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux	Linus Torvalds	2	-3/+7
	Pull 9p updates from Dominique Martinet: - 9p access flag fix (cannot change access flag since new mount API implem) - some minor cleanup * tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux: 9p/trans_xen: replace simple_strto* with kstrtouint 9p/trans_xen: make cleanup idempotent after dataring alloc errors 9p: document missing enum values in kernel-doc comments 9p: fix access mode flags being ORed instead of replaced 9p: fix memory leak in v9fs_init_fs_context error path
2026-04-16	9p: fix access mode flags being ORed instead of replaced	Pierre Barre	1	-0/+4
	Since commit 1f3e4142c0eb ("9p: convert to the new mount API"), v9fs_apply_options() applies parsed mount flags with \|= onto flags already set by v9fs_session_init(). For 9P2000.L, session_init sets V9FS_ACCESS_CLIENT as the default, so when the user mounts with "access=user", both bits end up set. Access mode checks compare against exact values, so having both bits set matches neither mode. This causes v9fs_fid_lookup() to fall through to the default switch case, using INVALID_UID (nobody/65534) instead of current_fsuid() for all fid lookups. Root is then unable to chown or perform other privileged operations. Fix by clearing the access mask before applying the user's choice. Fixes: 1f3e4142c0eb ("9p: convert to the new mount API") Signed-off-by: Pierre Barre <pierre@barre.sh> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Message-ID: <0ddc72da-d196-4f01-8755-0086f670e779@app.fastmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2026-04-16	9p: fix memory leak in v9fs_init_fs_context error path	Sasha Levin	1	-3/+3
	Move the assignments of fc->ops and fc->fs_private to right after the kzalloc, before any fallible operations. Previously these were assigned at the end of the function, after the kstrdup calls for uname and aname. If either kstrdup failed, the error path would set fc->need_free but leave fc->ops NULL, so put_fs_context() would never call v9fs_free_fc() to free the allocated context and any already-duplicated strings. Fixes: 1f3e4142c0eb ("9p: convert to the new mount API") Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Sasha Levin <sashal@kernel.org> Message-ID: <20260225135745.351984-1-sashal@kernel.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2026-03-06	treewide: change inode->i_ino from unsigned long to u64	Jeff Layton	3	-8/+8
	On 32-bit architectures, unsigned long is only 32 bits wide, which causes 64-bit inode numbers to be silently truncated. Several filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that exceed 32 bits, and this truncation can lead to inode number collisions and other subtle bugs on 32-bit systems. Change the type of inode->i_ino from unsigned long to u64 to ensure that inode numbers are always represented as 64-bit values regardless of architecture. Update all format specifiers treewide from %lu/%lx to %llu/%llx to match the new type, along with corresponding local variable types. This is the bulk treewide conversion. Earlier patches in this series handled trace events separately to allow trace field reordering for better struct packing on 32-bit. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org Acked-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-21	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument	Linus Torvalds	1	-2/+2
	This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/\(alloc_objs(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21	treewide: Replace kmalloc with kmalloc_obj for non-scalar types	Kees Cook	1	-2/+2
	This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-09	Merge tag 'vfs-7.0-rc1.misc' of ↵	Linus Torvalds	1	-13/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains a mix of VFS cleanups, performance improvements, API fixes, documentation, and a deprecation notice. Scalability and performance: - Rework pid allocation to only take pidmap_lock once instead of twice during alloc_pid(), improving thread creation/teardown throughput by 10-16% depending on false-sharing luck. Pad the namespace refcount to reduce false-sharing - Track file lock presence via a flag in ->i_opflags instead of reading ->i_flctx, avoiding false-sharing with ->i_readcount on open/close hot paths. Measured 4-16% improvement on 24-core open-in-a-loop benchmarks - Use a consume fence in locks_inode_context() to match the store-release/load-consume idiom, eliminating a hardware fence on some architectures - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent false-sharing - Remove a redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu() that never fires since the caller already verifies it, eliminating a 100% mispredicted branch - Fix a 100% mispredicted likely() in devcgroup_inode_permission() that became wrong after a prior code reorder Bug fixes and correctness: - Make insert_inode_locked() wait for inode destruction instead of skipping, fixing a corner case where two matching inodes could exist in the hash - Move f_mode initialization before file_ref_init() in alloc_file() to respect the SLAB_TYPESAFE_BY_RCU ordering contract - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with no buffers attached, preventing a null pointer dereference when AS_RELEASE_ALWAYS is set but no release_folio op exists - Fix select restart_block to store end_time as timespec64, avoiding truncation of tv_sec on 32-bit architectures - Make dump_inode() use get_kernel_nofault() to safely access inode and superblock fields, matching the dump_mapping() pattern API modernization: - Make posix_acl_to_xattr() allocate the buffer internally since every single caller was doing it anyway. Reduces boilerplate and unnecessary error checking across ~15 filesystems - Replace deprecated simple_strtoul() with kstrtoul() for the ihash_entries, dhash_entries, mhash_entries, and mphash_entries boot parameters, adding proper error handling - Convert chardev code to use guard(mutex) and __free(kfree) cleanup patterns - Replace min_t() with min() or umin() in VFS code to avoid silently truncating unsigned long to unsigned int - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers already check the flag Deprecation: - Begin deprecating legacy BSD process accounting (acct(2)). The interface has numerous footguns and better alternatives exist (eBPF) Documentation: - Fix and complete kernel-doc for struct export_operations, removing duplicated documentation between ReST and source - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait() Testing: - Add a kunit test for initramfs cpio handling of entries with filesize > PATH_MAX Misc: - Add missing <linux/init_task.h> include in fs_struct.c" * tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) posix_acl: make posix_acl_to_xattr() alloc the buffer fs: make insert_inode_locked() wait for inode destruction initramfs_test: kunit test for cpio.filesize > PATH_MAX fs: improve dump_inode() to safely access inode fields fs: add <linux/init_task.h> for 'init_fs' docs: exportfs: Use source code struct documentation fs: move initializing f_mode before file_ref_init() exportfs: Complete kernel-doc for struct export_operations exportfs: Mark struct export_operations functions at kernel-doc exportfs: Fix kernel-doc output for get_name() acct(2): begin the deprecation of legacy BSD process accounting device_cgroup: remove branch hint after code refactor VFS: fix __start_dirop() kernel-doc warnings fs: Describe @isnew parameter in ilookup5_nowait() fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS select: store end_time as timespec64 in restart block chardev: Switch to guard(mutex) and __free(kfree) namespace: Replace simple_strtoul with kstrtoul to parse boot params dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries ...
2026-01-16	posix_acl: make posix_acl_to_xattr() alloc the buffer	Miklos Szeredi	1	-13/+3
	Without exception all caller do that. So move the allocation into the helper. This reduces boilerplate and removes unnecessary error checking. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://patch.msgid.link/20260115122341.556026-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12	fs: remove simple_nosetlease()	Jeff Layton	2	-4/+0
	Setting ->setlease() to a NULL pointer now has the same effect as setting it to simple_nosetlease(). Remove all of the setlease file_operations that are set to simple_nosetlease, and the function itself. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-24-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12	9p: don't allow delegations to be set on directories	Jeff Layton	1	-0/+2
	With the advent of directory leases, it's necessary to set the ->setlease() handler in directory file_operations to properly deny them. Fixes: e6d28ebc17eb ("filelock: push the S_ISREG check down to ->setlease handlers") Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260107-setlease-6-19-v1-3-85f034abcc57@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-07	Merge tag '9p-for-6.19-rc1' of https://github.com/martinetd/linux	Linus Torvalds	7	-292/+386
	Pull 9p updates from Dominique Martinet: - fix a bug with O_APPEND in cached mode causing data to be written multiple times on server - use kvmalloc for trans_fd to avoid problems with large msize and fragmented memory This should hopefully be used in more transports when time allows - convert to new mount API - minor cleanups * tag '9p-for-6.19-rc1' of https://github.com/martinetd/linux: 9p: fix new mount API cache option handling 9p: fix cache/debug options printing in v9fs_show_options 9p: convert to the new mount API 9p: create a v9fs_context structure to hold parsed options net/9p: move structures and macros to header files fs/fs_parse: add back fsparam_u32hex fs/9p: delete unnnecessary condition fs/9p: Don't open remote file with APPEND mode when writeback cache is used net/9p: cleanup: change p9_trans_module->def to bool 9p: Use kvmalloc for message buffers on supported transports
2025-12-05	9p: fix new mount API cache option handling	Eric Sandeen	1	-12/+32
	After commit 4eb3117888a92, 9p needs to be able to accept numerical cache= mount options as well as the string "shortcuts" because the option is printed numerically in /proc/mounts rather than by string. This was missed in the mount API conversion, which used an enum for the shortcuts and therefore could not handle a numeric equivalent as an argument to the cache option. Fix this by removing the enum and reverting to the slightly more open-coded option handling for Opt_cache, with the reinstated get_cache_mode() helper. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Message-ID: <48cdeec9-5bb9-4c7a-a203-39bb8e0ef443@redhat.com> Tested-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-12-05	9p: fix cache/debug options printing in v9fs_show_options	Eric Sandeen	1	-2/+2
	commit 4eb3117888a92 changed the cache= option to accept either string shortcuts or bitfield values. It also changed /proc/mounts to emit the option as the hexadecimal numeric value rather than the shortcut string. However, by printing "cache=%x" without the leading 0x, shortcuts such as "cache=loose" will emit "cache=f" and 'f' is not a string that is parseable by kstrtoint(), so remounting may fail if a remount with "cache=f" is attempted. debug=%x has had the same problem since options have been displayed in c4fac9100456 ("9p: Implement show_options") Fix these by adding the 0x prefix to the hexadecimal value shown in /proc/mounts. Fixes: 4eb3117888a92 ("fs/9p: Rework cache modes and add new options to Documentation") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Message-ID: <54b93378-dcf1-4b04-922d-c8b4393da299@redhat.com> [Dominique: use %#x at Al Viro's suggestion, also handle debug] Tested-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-12-01	Merge tag 'vfs-6.19-rc1.fs_header' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull fs header updates from Christian Brauner: "This contains initial work to start splitting up fs.h. Begin the long-overdue work of splitting up the monolithic fs.h header. The header has grown to over 3000 lines and includes types and functions for many different subsystems, making it difficult to navigate and causing excessive compilation dependencies. This series introduces new focused headers for superblock-related code: - Rename fs_types.h to fs_dirent.h to better reflect its actual content (directory entry types) - Add fs/super_types.h containing superblock type definitions - Add fs/super.h containing superblock function declarations This is the first step in a longer effort to modularize the VFS headers. Cleanups: - Inode Field Layout Optimization (Mateusz Guzik) Move inode fields used during fast path lookup closer together to improve cache locality during path resolution. - current_umask() Optimization (Mateusz Guzik) Inline current_umask() and move it to fs_struct.h. This improves performance by avoiding function call overhead for this frequently-used function, and places it in a more appropriate header since it operates on fs_struct" * tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: move inode fields used during fast path lookup closer together fs: inline current_umask() and move it to fs_struct.h fs: add fs/super.h header fs: add fs/super_types.h header fs: rename fs_types.h to fs_dirent.h
2025-12-01	Merge tag 'vfs-6.19-rc1.writeback' of ↵	Linus Torvalds	1	-13/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull writeback updates from Christian Brauner: "Features: - Allow file systems to increase the minimum writeback chunk size. The relatively low minimal writeback size of 4MiB means that written back inodes on rotational media are switched a lot. Besides introducing additional seeks, this also can lead to extreme file fragmentation on zoned devices when a lot of files are cached relative to the available writeback bandwidth. This adds a superblock field that allows the file system to override the default size, and sets it to the zone size for zoned XFS. - Add logging for slow writeback when it exceeds sysctl_hung_task_timeout_secs. This helps identify tasks waiting for a long time and pinpoint potential issues. Recording the starting jiffies is also useful when debugging a crashed vmcore. - Wake up waiting tasks when finishing the writeback of a chunk Cleanups: - filemap_* writeback interface cleanups. Adding filemap_fdatawrite_wbc ended up being a mistake, as all but the original btrfs caller should be using better high level interfaces instead. This series removes all these low-level interfaces, switches btrfs to a more specific interface, and cleans up other too low-level interfaces. With this the writeback_control that is passed to the writeback code is only initialized in three places. - Remove __filemap_fdatawrite, __filemap_fdatawrite_range, and filemap_fdatawrite_wbc - Add filemap_flush_nr helper for btrfs - Push struct writeback_control into start_delalloc_inodes in btrfs - Rename filemap_fdatawrite_range_kick to filemap_flush_range - Stop opencoding filemap_fdatawrite_range in 9p, ocfs2, and mm - Make wbc_to_tag() inline and use it in fs" * tag 'vfs-6.19-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: Make wbc_to_tag() inline and use it in fs. xfs: set s_min_writeback_pages for zoned file systems writeback: allow the file system to override MIN_WRITEBACK_PAGES writeback: cleanup writeback_chunk_size mm: rename filemap_fdatawrite_range_kick to filemap_flush_range mm: remove __filemap_fdatawrite_range mm: remove filemap_fdatawrite_wbc mm: remove __filemap_fdatawrite mm,btrfs: add a filemap_flush_nr helper btrfs: push struct writeback_control into start_delalloc_inodes btrfs: use the local tmp_inode variable in start_delalloc_inodes ocfs2: don't opencode filemap_fdatawrite_range in ocfs2_journal_submit_inode_data_buffers 9p: don't opencode filemap_fdatawrite_range in v9fs_mmap_vm_close mm: don't opencode filemap_fdatawrite_range in filemap_invalidate_inode writeback: Add logging for slow writeback (exceeds sysctl_hung_task_timeout_secs) writeback: Wake up waiting tasks when finishing the writeback of a chunk.
2025-12-01	Merge tag 'vfs-6.19-rc1.inode' of ↵	Linus Torvalds	2	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode updates from Christian Brauner: "Features: - Hide inode->i_state behind accessors. Open-coded accesses prevent asserting they are done correctly. One obvious aspect is locking, but significantly more can be checked. For example it can be detected when the code is clearing flags which are already missing, or is setting flags when it is illegal (e.g., I_FREEING when ->i_count > 0) - Provide accessors for ->i_state, converts all filesystems using coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2, overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to compile - Rework I_NEW handling to operate without fences, simplifying the code after the accessor infrastructure is in place Cleanups: - Move wait_on_inode() from writeback.h to fs.h - Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb for clarity - Cosmetic fixes to LRU handling - Push list presence check into inode_io_list_del() - Touch up predicts in __d_lookup_rcu() - ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage - Assert on ->i_count in iput_final() - Assert ->i_lock held in __iget() Fixes: - Add missing fences to I_NEW handling" * tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits) dcache: touch up predicts in __d_lookup_rcu() fs: push list presence check into inode_io_list_del() fs: cosmetic fixes to lru handling fs: rework I_NEW handling to operate without fences fs: make plain ->i_state access fail to compile xfs: use the new ->i_state accessors nilfs2: use the new ->i_state accessors overlayfs: use the new ->i_state accessors gfs2: use the new ->i_state accessors f2fs: use the new ->i_state accessors smb: use the new ->i_state accessors ceph: use the new ->i_state accessors btrfs: use the new ->i_state accessors Manual conversion to use ->i_state accessors of all places not covered by coccinelle Coccinelle-based conversion to use ->i_state accessors fs: provide accessors for ->i_state fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb fs: move wait_on_inode() from writeback.h to fs.h fs: add missing fences to I_NEW handling ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage ...
2025-11-05	fs: inline current_umask() and move it to fs_struct.h	Mateusz Guzik	1	-0/+1
	There is no good reason to have this as a func call, other than avoiding the churn of adding fs_struct.h as needed. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251104170448.630414-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-03	9p: convert to the new mount API	Eric Sandeen	3	-309/+380
	Convert 9p to the new mount API. This patch consolidates all parsing into fs/9p/v9fs.c, which stores all results into a filesystem context which can be passed to the various transports as needed. Some of the parsing helper functions such as get_cache_mode() have been eliminated in favor of using the new mount API's enum param type, for simplicity. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Message-ID: <20251010214222.1347785-5-sandeen@redhat.com> [ Dominique: handled source explicitly as per follow-up discussion ] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-11-03	fs/9p: delete unnnecessary condition	Dan Carpenter	1	-1/+0
	We already know that "retval" is negative, so there is no need to check again. Also the statement is not indented far enough. Delete it. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 43c36a56ccf6 ("Revert "fs/9p: Refresh metadata in d_revalidate for uncached mode too"") Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Message-ID: <aPtiSJl8EwSfVvqN@stanley.mountain> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-11-03	fs/9p: Don't open remote file with APPEND mode when writeback cache is used	Tingmao Wang	3	-6/+10
	When page cache is used, writebacks are done on a page granularity, and it is expected that the underlying filesystem (such as v9fs) should respect the write position. However, currently v9fs will passthrough O_APPEND to the server even on cached mode. This causes data corruption if a sync or fstat gets between two writes to the same file. This patch removes the APPEND flag from the open request we send to the server when writeback caching is involved. I believe keeping server-side APPEND is probably fine for uncached mode (even if two fds are opened, one without O_APPEND and one with it, this should still be fine since they would use separate fid for the writes). Signed-off-by: Tingmao Wang <m@maowtm.org> Fixes: 4eb3117888a9 ("fs/9p: Rework cache modes and add new options to Documentation") Message-ID: <20251102235631.8724-1-m@maowtm.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-10-29	9p: don't opencode filemap_fdatawrite_range in v9fs_mmap_vm_close	Christoph Hellwig	1	-13/+4
	Use filemap_fdatawrite_range instead of opencoding the logic using filemap_fdatawrite_wbc. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251024080431.324236-3-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-22	Revert "fs/9p: Refresh metadata in d_revalidate for uncached mode too"	Dominique Martinet	3	-22/+4
	This reverts commit 290434474c332a2ba9c8499fe699c7f2e1153280. That commit broke cache=mmap, a mode that doesn't cache metadata, but still has writeback cache. In commit 290434474c33 ("fs/9p: Refresh metadata in d_revalidate for uncached mode too") we considered metadata cache to be enough to not look at the server, but in writeback cache too looking at the server size would make the vfs consider the file has been truncated before the data has been flushed out, making the following repro fail (nothing is ever read back, the resulting file ends up with no data written) ``` #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> char buf[4096]; int main(int argc, char *argv[]) { int ret, i; int fdw, fdr; if (argc < 2) return 1; fdw = openat(AT_FDCWD, argv[1], O_RDWR\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0600); if (fdw < 0) { fprintf(stderr, "cannot open fdw\n"); return 1; } write(fdw, buf, sizeof(buf)); fdr = openat(AT_FDCWD, argv[1], O_RDONLY\|O_CLOEXEC); if (fdr < 0) { fprintf(stderr, "cannot open fdr\n"); close(fdw); return 1; } for (i = 0; i < 10; i++) { ret = read(fdr, buf, sizeof(buf)); fprintf(stderr, "i: %d, read returns %d\n", i, ret); } close(fdr); close(fdw); return 0; } ``` There is a fix for this particular reproducer but it looks like there are other problems around metadata refresh (e.g. around file rename), so revert this to avoid d_revalidate in uncached mode for now. Reported-by: Song Liu <song@kernel.org> Link: https://lkml.kernel.org/r/CAHzjS_u_SYdt5=2gYO_dxzMKXzGMt-TfdE_ueowg-Hq5tRCAiw@mail.gmail.com Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Link: https://lore.kernel.org/bpf/CAEf4BzZbCE4tLoDZyUf_aASpgAGFj75QMfSXX4a4dLYixnOiLg@mail.gmail.com/ Fixes: 290434474c33 ("fs/9p: Refresh metadata in d_revalidate for uncached mode too") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-10-20	Coccinelle-based conversion to use ->i_state accessors	Mateusz Guzik	2	-2/+2
	All places were patched by coccinelle with the default expecting that ->i_lock is held, afterwards entries got fixed up by hand to use unlocked variants as needed. The script: @@ expression inode, flags; @@ - inode->i_state & flags + inode_state_read(inode) & flags @@ expression inode, flags; @@ - inode->i_state &= ~flags + inode_state_clear(inode, flags) @@ expression inode, flag1, flag2; @@ - inode->i_state &= ~flag1 & ~flag2 + inode_state_clear(inode, flag1 \| flag2) @@ expression inode, flags; @@ - inode->i_state \|= flags + inode_state_set(inode, flags) @@ expression inode, flags; @@ - inode->i_state = flags + inode_state_assign(inode, flags) @@ expression inode, flags; @@ - flags = inode->i_state + flags = inode_state_read(inode) @@ expression inode, flags; @@ - READ_ONCE(inode->i_state) & flags + inode_state_read(inode) & flags Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-09	Merge tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux	Linus Torvalds	4	-13/+52
	Pull 9p updates from Dominique Martinet: "A bunch of unrelated fixes: - polling fix for trans fd that ought to have been fixed otherwise back in March, but apparently came back somewhere else... - USB transport buffer overflow fix - Some dentry lifetime rework to handle metadata update for currently opened files in uncached mode, or inode type change in cached mode - a double-put on invalid flush found by syzbot - and finally /sys/fs/9p/caches not advancing buffer and overwriting itself for large contents Thanks to everyone involved!" * tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux: 9p: sysfs_init: don't hardcode error to ENOMEM 9p: fix /sys/fs/9p/caches overwriting itself 9p: clean up comment typos 9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN net/9p: fix double req put in p9_fd_cancelled net/9p: Fix buffer overflow in USB transport layer fs/9p: Add p9_debug(VFS) in d_revalidate fs/9p: Invalidate dentry if inode type change detected in cached mode fs/9p: Refresh metadata in d_revalidate for uncached mode too
2025-10-03	Merge tag 'pull-finish_no_open' of ↵	Linus Torvalds	2	-32/+17
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull finish_no_open updates from Al Viro: "finish_no_open calling conventions change to simplify callers" * tag 'pull-finish_no_open' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: slightly simplify nfs_atomic_open() simplify gfs2_atomic_open() simplify fuse_atomic_open() simplify nfs_atomic_open_v23() simplify vboxsf_dir_atomic_open() simplify cifs_atomic_open() 9p: simplify v9fs_vfs_atomic_open_dotl() 9p: simplify v9fs_vfs_atomic_open() allow finish_no_open(file, ERR_PTR(-E...))
2025-09-27	9p: sysfs_init: don't hardcode error to ENOMEM	Randall P. Embry	1	-2/+5
	v9fs_sysfs_init() always returned -ENOMEM on failure; return the actual sysfs_create_group() error instead. Signed-off-by: Randall P. Embry <rpembry@gmail.com> Message-ID: <20250926-v9fs_misc-v1-3-a8b3907fc04d@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-09-27	9p: fix /sys/fs/9p/caches overwriting itself	Randall P. Embry	1	-1/+1
	caches_show() overwrote its buffer on each iteration, so only the last cache tag was visible in sysfs output. Properly append with snprintf(buf + count, …). Signed-off-by: Randall P. Embry <rpembry@gmail.com> Message-ID: <20250926-v9fs_misc-v1-2-a8b3907fc04d@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-09-27	9p: clean up comment typos	Randall P. Embry	1	-4/+3
	Fix a few minor typos in comments (e.g. "trasnport" → "transport"). Signed-off-by: Randall P. Embry <rpembry@gmail.com> Message-ID: <20250926-v9fs_misc-v1-1-a8b3907fc04d@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-09-16	9p: simplify v9fs_vfs_atomic_open_dotl()	Al Viro	1	-10/+5
	again, preexisting aliases will always be positive Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-09-16	9p: simplify v9fs_vfs_atomic_open()	Al Viro	1	-22/+12
	if v9fs_vfs_lookup() returns a preexisting alias, it is guaranteed to be positive. IOW, in that case we will immediately return finish_no_open(), leaving only the case res == NULL past that point. Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-09-15	fs: rename generic_delete_inode() and generic_drop_inode()	Mateusz Guzik	1	-1/+1
	generic_delete_inode() is rather misleading for what the routine is doing. inode_just_drop() should be much clearer. The new naming is inconsistent with generic_drop_inode(), so rename that one as well with inode_ as the suffix. No functional changes. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-23	fs/9p: Add p9_debug(VFS) in d_revalidate	Tingmao Wang	1	-4/+20
	This was a useful debugging / validation aid, and can explain why a GETATTR request is made. Signed-off-by: Tingmao Wang <m@maowtm.org> Message-ID: <00829a99549e33d26139fa4d756c466629f13e00.1743956147.git.m@maowtm.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-08-23	fs/9p: Invalidate dentry if inode type change detected in cached mode	Tingmao Wang	1	-1/+1
	This is an extension of the last commit to cached mode as well. While server-side changes when using cached mode is not expected, when it does happen we can get things like -EOPNOTSUPP. With this change at least when we realize the inode has updated (for example after a `touch` on the client), we can get a new one. Signed-off-by: Tingmao Wang <m@maowtm.org> Message-ID: <01afd3c77d5cda181780dc931baa8f3fc54562c8.1743956147.git.m@maowtm.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2025-08-23	fs/9p: Refresh metadata in d_revalidate for uncached mode too	Tingmao Wang	3	-3/+24
	Currently if another process keeps a file open, due to existing dentry in the dcache, other processes will not see updated metadata of that file if it is changed on the server, even in uncached mode. This can also manifest as -ENODATA when reading a file that has shrunk on the server (even if it's re-opened in another process), or -ENOTSUPP if the file has changed type (e.g. regular file to directory) on the server. We can end up in a situation where both `readdir` or `read` fails until the file is closed by all processes using it. This commit fixes that, and invalidates the dentry altogether if the inode type is changed (fo