aboutsummaryrefslogtreecommitdiff
path: root/fs/exfat
AgeCommit message (Collapse)AuthorFilesLines
3 daysexfat: add blank line after declarationsWilliam Hansen-Baird2-0/+2
Add a blank line after variable declarations in fatent.c and file.c. This improves readability and makes code style more consistent across the exfat subsystem. Signed-off-by: William Hansen-Baird <william.hansen.baird@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: remove unnecessary else after return statementWilliam Hansen-Baird1-2/+3
Else-branch is unnecessary after return statement in if-branch. Remove to enhance readability and reduce indentation. Signed-off-by: William Hansen-Baird <william.hansen.baird@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: support multi-cluster for exfat_get_clusterChi Zhiling3-8/+53
This patch introduces a count parameter to exfat_get_cluster, which serves as an input parameter for the caller to specify the desired number of clusters, and as an output parameter to store the length of consecutive clusters. This patch can improve read performance by reducing the number of get_block calls in sequential read scenarios. speacially in small cluster size. According to my test data, the performance improvement is approximately 10% when read FAT_CHAIN file with 512 bytes of cluster size. 454 MB/s -> 511 MB/s Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: return the start of next cache in exfat_cache_lookupChi Zhiling1-12/+37
Change exfat_cache_lookup to return the cluster number of the last cluster before the next cache (i.e., the end of the current cache range) or the given 'end' if there is no next cache. This allows the caller to know whether the next cluster after the current cache is cached. The function signature is changed to accept an 'end' parameter, which is the upper bound of the search range. The function now stops early if it finds a cache that starts within the current cache's tail, meaning caches are contiguous. The return value is the cluster number at which the next cache starts (minus one) or the original 'end' if no next cache is found. The new behavior is illustrated as follows: cache: [ccccccc-------ccccccccc] search: [..................] return: ^ Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: tweak cluster cache to support zero offsetChi Zhiling1-2/+2
The current cache mechanism does not support reading clusters starting from a file offset of zero. This patch enables that feature in preparation for subsequent reads of contiguous clusters from offset zero. 1. support finding clusters with zero offset. 2. allow clusters with zero offset to be cached. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: support multi-cluster for exfat_map_clusterChi Zhiling1-13/+17
This patch introduces a parameter 'count' to support fetching multiple clusters in exfat_map_cluster. The returned 'count' indicates the number of consecutive clusters, or 0 when the input cluster offset is past EOF. And the 'count' is also an input parameter for the caller to specify the required number of clusters. Only NO_FAT_CHAIN files enable multi-cluster fetching in this patch. After this patch, the time proportion of exfat_get_block has decreased, The performance data is as follows: Cluster size: 512 bytes Sequential read of a 30GB NO_FAT_CHAIN file: 2.4GB/s -> 2.5 GB/s proportion of exfat_get_block: 10.8% -> 0.02% Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: remove handling of non-file types in exfat_map_clusterChi Zhiling1-17/+1
Yuezhang said: "exfat_map_cluster() is only used for files. The code in this 'else' block is never executed and can be cleaned up." Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: reuse cache to improve exfat_get_clusterChi Zhiling1-1/+3
Since exfat_ent_get supports cache buffer head, we can use this option to reduce sb_bread calls when fetching consecutive entries. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: reduce the number of parameters for exfat_get_cluster()Chi Zhiling3-24/+11
Remove parameter 'fclus' and 'allow_eof': - The fclus parameter is changed to a local variable as it is not needed to be returned. - The passed allow_eof parameter was always 1, remove it and the associated error handling. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: remove the unreachable warning for cache miss casesChi Zhiling1-12/+1
The cache_id remains unchanged on a cache miss; its value is always exactly what was set by cache_init. Therefore, checking this value again is meaningless. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: remove the check for infinite cluster chain loopChi Zhiling1-10/+0
The infinite cluster chain loop check is not work because the loop will terminate when fclus reaches the parameter cluster, and the parameter cluster value is never greater than ei->valid_size. The following relationship holds: 'fclus' < 'cluster' ≤ ei->valid_size ≤ sb->num_clusters The check would only be triggered if a cluster number greater than sb->num_clusters is passed, but no caller currently does this. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: improve exfat_find_last_clusterChi Zhiling1-1/+3
Since exfat_ent_get support cache buffer head, let's apply it to exfat_find_last_cluster. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: improve exfat_count_num_clustersChi Zhiling1-1/+3
Since exfat_ent_get support cache buffer head, let's apply it to exfat_count_num_clusters. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: support reuse buffer head for exfat_ent_getChi Zhiling3-18/+27
This patch is part 2 of cached buffer head for exfat_ent_get, it introduces an argument for exfat_ent_get, and make sure this routine releases buffer head refcount when any error return. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: add cache option for __exfat_ent_getChi Zhiling1-7/+13
When multiple entries are obtained consecutively, these entries are mostly stored adjacent to each other. this patch introduces a "last" parameter to cache the last opened buffer head, and reuse it when possible, which reduces the number of sb_bread() calls. When the passed parameter "last" is NULL, it means cache option is disabled, the behavior unchanged as it was. Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: reduce unnecessary writes during mmap writeYuling Dong1-9/+6
During mmap write, exfat_page_mkwrite() currently extends valid_size to the end of the VMA range. For a large mapping, this can push valid_size far beyond the page that actually triggered the fault, resulting in unnecessary writes. valid_size only needs to extend to the end of the page being written. Signed-off-by: Yuling Dong <yuling-dong@qq.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
3 daysexfat: improve error code handling in exfat_find_empty_entry()Haotian Zhang1-2/+2
Change the type of 'ret' from unsigned int to int in exfat_find_empty_entry(). Although the implicit type conversion (int -> unsigned int -> int) does not cause actual bugs in practice, using int directly is more appropriate for storing error codes returned by exfat_alloc_cluster(). This improves code clarity and consistency with standard error handling practices. Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2026-01-12exfat: add setlease file operationJeff Layton2-0/+4
Add the setlease file_operation to exfat_file_operations and exfat_dir_operations, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-7-ea4dec9b67fa@kernel.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-03exfat: fix remount failure in different process environmentsYuezhang Mo1-4/+15
The kernel test robot reported that the exFAT remount operation failed. The reason for the failure was that the process's umask is different between mount and remount, causing fs_fmask and fs_dmask are changed. Potentially, both gid and uid may also be changed. Therefore, when initializing fs_context for remount, inherit these mount options from the options used during mount. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202511251637.81670f5c-lkp@intel.com Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-12-03exfat: fix divide-by-zero in exfat_allocate_bitmapNamjae Jeon1-1/+1
The variable max_ra_count can be 0 in exfat_allocate_bitmap(), which causes a divide-by-zero error in the subsequent modulo operation (i % max_ra_count), leading to a system crash. When max_ra_count is 0, it means that readahead is not used. This patch load the bitmap without readahead. Fixes: 9fd688678dd8 ("exfat: optimize allocation bitmap loading time") Reported-by: Jiaming Zhang <r772577952@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-12-03exfat: validate the cluster bitmap bits of directoryNamjae Jeon5-9/+46
Syzbot created this issue by testing an image that did not have the root cluster bitmap bit marked. After accessing a file through the root directory via exfat_lookup, when creating a file again with mkdir, the root cluster bit can be allocated for direcotry, which can cause the root cluster to be zeroed out and the same entry can be allocated in the same cluster. This patch improved this issue by adding exfat_test_bitmap to validate the cluster bits of the root directory and directory. And the first cluster bit of the root directory should never be unset except when storage is corrupted. This bit is set to allow operations after mount. Reported-by: syzbot+5216036fc59c43d1ee02@syzkaller.appspotmail.com Tested-by: syzbot+5216036fc59c43d1ee02@syzkaller.appspotmail.com Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-12-03exfat: zero out post-EOF page cache on file extensionYuezhang Mo1-0/+5
xfstests generic/363 was failing due to unzeroed post-EOF page cache that allowed mmap writes beyond EOF to become visible after file extension. For example, in following xfs_io sequence, 0x22 should not be written to the file but would become visible after the extension: xfs_io -f -t -c "pwrite -S 0x11 0 8" \ -c "mmap 0 4096" \ -c "mwrite -S 0x22 32 32" \ -c "munmap" \ -c "pwrite -S 0x33 512 32" \ $testfile This violates the expected behavior where writes beyond EOF via mmap should not persist after the file is extended. Instead, the extended region should contain zeros. Fix this by using truncate_pagecache() to truncate the page cache after the current EOF when extending the file. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-12-03exfat: fix refcount leak in exfat_findShuhao Fu1-10/+10
Fix refcount leaks in `exfat_find` related to `exfat_get_dentry_set`. Function `exfat_get_dentry_set` would increase the reference counter of `es->bh` on success. Therefore, `exfat_put_dentry_set` must be called after `exfat_get_dentry_set` to ensure refcount consistency. This patch relocate two checks to avoid possible leaks. Fixes: 82ebecdc74ff ("exfat: fix improper check of dentry.stream.valid_size") Fixes: 13940cef9549 ("exfat: add a check for invalid data size") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-11-17Merge tag 'vfs-6.18-rc7.fixes' of ↵Linus Torvalds1-1/+4
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix unitialized variable in statmount_string() - Fix hostfs mounting when passing host root during boot - Fix dynamic lookup to fail on cell lookup failure - Fix missing file type when reading bfs inodes from disk - Enforce checking of sb_min_blocksize() calls and update all callers accordingly - Restore write access before closing files opened by open_exec() in binfmt_misc - Always freeze efivarfs during suspend/hibernate cycles - Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to actually allow mount namespace file descriptor in addition to mount namespace ids - Fix tmpfs remount when noswap is specified - Switch Landlock to iput_not_last() to remove false-positives from might_sleep() annotations in iput() - Remove dead node_to_mnt_ns() code - Ensure that per-queue kobjects are successfully created * tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: landlock: fix splats from iput() after it started calling might_sleep() fs: add iput_not_last() shmem: fix tmpfs reconfiguration (remount) when noswap is set fs/namespace: correctly handle errors returned by grab_requested_mnt_ns power: always freeze efivarfs binfmt_misc: restore write access before closing files opened by open_exec() block: add __must_check attribute to sb_min_blocksize() virtio-fs: fix incorrect check for fsvq->kobj xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super isofs: check the return value of sb_min_blocksize() in isofs_fill_super exfat: check return value of sb_min_blocksize in exfat_read_boot_sector vfat: fix missing sb_min_blocksize() return value checks mnt: Remove dead code which might prevent from building bfs: Reconstruct file type when loading from disk afs: Fix dynamic lookup to fail on cell lookup failure hostfs: Fix only passing host root in boot stage with new mount fs: Fix uninitialized 'offp' in statmount_string()
2025-11-05exfat: check return value of sb_min_blocksize in exfat_read_boot_sectorYongpeng Yang1-1/+4
sb_min_blocksize() may return 0. Check its return value to avoid accessing the filesystem super block when sb->s_blocksize is 0. Cc: stable@vger.kernel.org # v6.15 Fixes: 719c1e1829166d ("exfat: add super block operations") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Link: https://patch.msgid.link/20251104125009.2111925-3-yangyongpeng.storage@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-15exfat: fix out-of-bounds in exfat_nls_to_ucs2()Jeongjun Park4-8/+5
Since the len argument value passed to exfat_ioctl_set_volume_label() from exfat_nls_to_utf16() is passed 1 too large, an out-of-bounds read occurs when dereferencing p_cstring in exfat_nls_to_ucs2() later. And because of the NLS_NAME_OVERLEN macro, another error occurs when creating a file with a period at the end using utf8 and other iocharsets. So to avoid this, you should remove the code that uses NLS_NAME_OVERLEN macro and make the len argument value be the length of the label string, but with a maximum length of FSLABEL_MAX - 1. Reported-by: syzbot+98cc76a76de46b3714d4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4 Fixes: d01579d590f7 ("exfat: Add support for FS_IOC_{GET,SET}FSLABEL") Suggested-by: Pali Rohár <pali@kernel.org> Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-10-15exfat: fix improper check of dentry.stream.valid_sizeJaehun Gou1-1/+5
We found an infinite loop bug in the exFAT file system that can lead to a Denial-of-Service (DoS) condition. When a dentry in an exFAT filesystem is malformed, the following system calls — SYS_openat, SYS_ftruncate, and SYS_pwrite64 — can cause the kernel to hang. Root cause analysis shows that the size validation code in exfat_find() does not check whether dentry.stream.valid_size is negative. As a result, the system calls mentioned above can succeed and eventually trigger the DoS issue. This patch adds a check for negative dentry.stream.valid_size to prevent this vulnerability. Co-developed-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Seunghun Han <kkamagui@gmail.com> Co-developed-by: Jihoon Kwon <jimmyxyz010315@gmail.com> Signed-off-by: Jihoon Kwon <jimmyxyz010315@gmail.com> Signed-off-by: Jaehun Gou <p22gone@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-10-03Merge tag 'exfat-for-6.18-rc1' of ↵Linus Torvalds10-35/+360
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Add support for FS_IOC_{GET,SET}FSLABEL ioctl - Two small clean-up patches - Optimizes allocation bitmap loading time on large partitions with small cluster sizes - Allow changes for discard, zero_size_dir, and errors options via remount - Validate that the clusters used for the allocation bitmap are correctly marked as in-use during mount, preventing potential data corruption from reallocating the bitmap's own space - Uses ratelimit to avoid too many error prints on I/O error path * tag 'exfat-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: Add support for FS_IOC_{GET,SET}FSLABEL exfat: combine iocharset and utf8 option setup exfat: support modifying mount options via remount exfat: optimize allocation bitmap loading time exfat: Remove unnecessary parentheses exfat: drop redundant conversion to bool exfat: validate cluster allocation bits of the allocation bitmap exfat: limit log print for IO error
2025-09-30exfat: Add support for FS_IOC_{GET,SET}FSLABELEthan Ferguson5-1/+226
Add support for reading / writing to the exfat volume label from the FS_IOC_GETFSLABEL and FS_IOC_SETFSLABEL ioctls Co-developed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Ethan Ferguson <ethan.ferguson@zetier.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: combine iocharset and utf8 option setupSang-Heon Jeon1-9/+15
Currently, exfat utf8 mount option depends on the iocharset option value. After exfat remount, utf8 option may become inconsistent with iocharset option. If the options are inconsistent; (specifically, iocharset=utf8 but utf8=0) readdir may reference uninitalized NLS, leading to a null pointer dereference. Extract and combine utf8/iocharset setup logic into exfat_set_iocharset(). Then Replace iocharset setup logic to exfat_set_iocharset to prevent utf8/iocharset option inconsistentcy after remount. Reported-by: syzbot+3e9cb93e3c5f90d28e19@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3e9cb93e3c5f90d28e19 Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com> Fixes: acab02ffcd6b ("exfat: support modifying mount options via remount") Tested-by: syzbot+3e9cb93e3c5f90d28e19@syzkaller.appspotmail.com Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: support modifying mount options via remountYuezhang Mo1-6/+38
Before this commit, all exfat-defined mount options could not be modified dynamically via remount, and no error was returned. After this commit, these three exfat-defined mount options (discard, zero_size_dir, and errors) can be modified dynamically via remount. While other exfat-defined mount options cannot be modified dynamically via remount because their old settings are cached in inodes or dentries, modifying them will be rejected with an error. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: optimize allocation bitmap loading timeNamjae Jeon1-0/+13
Loading the allocation bitmap is very slow if user set the small cluster size on large partition. For optimizing it, This patch uses sb_breadahead() read the allocation bitmap. It will improve the mount time. The following is the result of about 4TB partition(2KB cluster size) on my target. without patch: real 0m41.746s user 0m0.011s sys 0m0.000s with patch: real 0m2.525s user 0m0.008s sys 0m0.008s Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: Remove unnecessary parenthesesLiao Yuanhong1-1/+1
When using &, it's unnecessary to have parentheses afterward. Remove redundant parentheses to enhance readability. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: drop redundant conversion to boolXichao Zhao1-1/+1
The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: validate cluster allocation bits of the allocation bitmapNamjae Jeon1-12/+60
syzbot created an exfat image with cluster bits not set for the allocation bitmap. exfat-fs reads and uses the allocation bitmap without checking this. The problem is that if the start cluster of the allocation bitmap is 6, cluster 6 can be allocated when creating a directory with mkdir. exfat zeros out this cluster in exfat_mkdir, which can delete existing entries. This can reallocate the allocated entries. In addition, the allocation bitmap is also zeroed out, so cluster 6 can be reallocated. This patch adds exfat_test_bitmap_range to validate that clusters used for the allocation bitmap are correctly marked as in-use. Reported-by: syzbot+a725ab460fc1def9896f@syzkaller.appspotmail.com Tested-by: syzbot+a725ab460fc1def9896f@syzkaller.appspotmail.com Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-30exfat: limit log print for IO errorChi Zhiling1-5/+6
For exFAT filesystems with 4MB read_ahead_size, removing the storage device when the read operation is in progress, which cause the last read syscall spent 150s [1]. The main reason is that exFAT generates excessive log messages [2]. After applying this patch, approximately 300,000 lines of log messages were suppressed, and the delay of the last read() syscall was reduced to about 4 seconds. [1]: write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000120> read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000032> write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000119> read(4, 0x7fccf28ae000, 131072) = -1 EIO (Input/output error) <150.186215> [2]: [ 333.696603] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) [ 333.697378] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) [ 333.698156] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-09-15exfat_find(): constify qstr argumentAl Viro1-1/+1
Nothing outside of fs/dcache.c has any business modifying dentry names; passing &dentry->d_name as an argument should have that argument declared as a const pointer. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-08-01exfat: add cluster chain loop check for dirYuezhang Mo4-11/+48
An infinite loop may occur if the following conditions occur due to file system corruption. (1) Condition for exfat_count_dir_entries() to loop infinitely. - The cluster chain includes a loop. - There is no UNUSED entry in the cluster chain. (2) Condition for exfat_create_upcase_table() to loop infinitely. - The cluster chain of the root directory includes a loop. - There are no UNUSED entry and up-case table entry in the cluster chain of the root directory. (3) Condition for exfat_load_bitmap() to loop infinitely. - The cluster chain of the root directory includes a loop. - There are no UNUSED entry and bitmap entry in the cluster chain of the root directory. (4) Condition for exfat_find_dir_entry() to loop infinitely. - The cluster chain includes a loop. - The unused directory entries were exhausted by some operation. (5) Condition for exfat_check_dir_empty() to loop infinitely. - The cluster chain includes a loop. - The unused directory entries were exhausted by some operation. - All files and sub-directories under the directory are deleted. This commit adds checks to break the above infinite loop. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-08-01exfat: fdatasync flag should be same like generic_write_sync()Zhengxu Zhang1-3/+2
Test: androbench by default setting, use 64GB sdcard. the random write speed: without this patch 3.5MB/s with this patch 7MB/s After patch "11a347fb6cef", the random write speed decreased significantly. the .write_iter() interface had been modified, and check the differences with generic_file_write_iter(), when calling generic_write_sync() and exfat_file_write_iter() to call vfs_fsync_range(), the fdatasync flag is wrong, and make not use the fdatasync mode, and make random write speed decreased. So use generic_write_sync() instead of vfs_fsync_range(). Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Signed-off-by: Zhengxu Zhang <zhengxu.zhang@unisoc.com> Acked-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-07-28Merge tag 'vfs-6.17-rc1.mmap_prepare' of ↵Linus Torvalds1-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull mmap_prepare updates from Christian Brauner: "Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). This is preferred to the existing f_op->mmap() hook as it does require a VMA to be established yet, thus allowing the mmap logic to invoke this hook far, far earlier, prior to inserting a VMA into the virtual address space, or performing any other heavy handed operations. This allows for much simpler unwinding on error, and for there to be a single attempt at merging a VMA rather than having to possibly reattempt a merge based on potentially altered VMA state. Far more importantly, it prevents inappropriate manipulation of incompletely initialised VMA state, which is something that has been the cause of bugs and complexity in the past. The intent is to gradually deprecate f_op->mmap, and in that vein this series coverts the majority of file systems to using f_op->mmap_prepare. Prerequisite steps are taken - firstly ensuring all checks for mmap capabilities use the file_has_valid_mmap_hooks() helper rather than directly checking for f_op->mmap (which is now not a valid check) and secondly updating daxdev_mapping_supported() to not require a VMA parameter to allow ext4 and xfs to be converted. Commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for nested file systems") handles the nasty edge-case of nested file systems like overlayfs, which introduces a compatibility shim to allow f_op->mmap_prepare() to be invoked from an f_op->mmap() callback. This allows for nested filesystems to continue to function correctly with all file systems regardless of which callback is used. Once we finally convert all file systems, this shim can be removed. As a result, ecryptfs, fuse, and overlayfs remain unaltered so they can nest all other file systems. We additionally do not update resctl - as this requires an update to remap_pfn_range() (or an alternative to it) which we defer to a later series, equally we do not update cramfs which needs a mixed mapping insertion with the same issue, nor do we update procfs, hugetlbfs, syfs or kernfs all of which require VMAs for internal state and hooks. We shall return to all of these later" * tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: doc: update porting, vfs documentation to describe mmap_prepare() fs: replace mmap hook with .mmap_prepare for simple mappings fs: convert most other generic_file_*mmap() users to .mmap_prepare() fs: convert simple use of generic_file_*_mmap() to .mmap_prepare() mm/filemap: introduce generic_file_*_mmap_prepare() helpers fs/xfs: transition from deprecated .mmap hook to .mmap_prepare fs/ext4: transition from deprecated .mmap hook to .mmap_prepare fs/dax: make it possible to check dev dax support without a VMA fs: consistently use can_mmap_file() helper mm/nommu: use file_has_valid_mmap_hooks() helper mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
2025-07-28Merge tag 'vfs-6.17-rc1.misc' of ↵Linus Torvalds2-13/+14
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc VFS updates from Christian Brauner: "This contains the usual selections of misc updates for this cycle. Features: - Add ext4 IOCB_DONTCACHE support This refactors the address_space_operations write_begin() and write_end() callbacks to take const struct kiocb * as their first argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate to the filesystem's buffered I/O path. Ext4 is updated to implement handling of the IOCB_DONTCACHE flag and advertises support via the FOP_DONTCACHE file operation flag. Additionally, the i915 driver's shmem write paths are updated to bypass the legacy write_begin/write_end interface in favor of directly calling write_iter() with a constructed synchronous kiocb. Another i915 change replaces a manual write loop with kernel_write() during GEM shmem object creation. Cleanups: - don't duplicate vfs_open() in kernel_file_open() - proc_fd_getattr(): don't bother with S_ISDIR() check - fs/ecryptfs: replace snprintf with sysfs_emit in show function - vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes() - filelock: add new locks_wake_up_waiter() helper - fs: Remove three arguments from block_write_end() - VFS: change old_dir and new_dir in struct renamedata to dentrys - netfs: Remove unused declaration netfs_queue_write_request() Fixes: - eventpoll: Fix semi-unbounded recursion - eventpoll: fix sphinx documentation build warning - fs/read_write: Fix spelling typo - fs: annotate data race between poll_schedule_timeout() and pollwake() - fs/pipe: set FMODE_NOWAIT in create_pipe_files() - docs/vfs: update references to i_mutex to i_rwsem - fs/buffer: remove comment about hard sectorsize - fs/buffer: remove the min and max limit checks in __getblk_slow() - fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable - fs_context: fix parameter name in infofc() macro - fs: Prevent file descriptor table allocations exceeding INT_MAX" * tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits) netfs: Remove unused declaration netfs_queue_write_request() eventpoll: fix sphinx documentation build warning ext4: support uncached buffered I/O mm/pagemap: add write_begin_get_folio() helper function fs: change write_begin/write_end interface to take struct kiocb * drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter drm/i915: Use kernel_write() in shmem object create eventpoll: Fix semi-unbounded recursion vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes() fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable fs/buffer: remove the min and max limit checks in __getblk_slow() fs: Prevent file descriptor table allocations exceeding INT_MAX fs: Remove three arguments from block_write_end() fs/ecryptfs: replace snprintf with sysfs_emit in show function fs: annotate suspected data race between poll_schedule_timeout() and pollwake() docs/vfs: update references to i_mutex to i_rwsem fs/buffer: remove comment about hard sectorsize fs_context: fix parameter name in infofc() macro VFS: change old_dir and new_dir in struct renamedata to dentrys proc_fd_getattr(): don't bother with S_ISDIR() check ...
2025-07-16fs: change write_begin/write_end interface to take struct kiocb *Taotao Chen2-13/+14
Change the address_space_operations callbacks write_begin() and write_end() to take struct kiocb * as the first argument instead of struct file *. Update all affected function prototypes, implementations, call sites, and related documentation across VFS, filesystems, and block layer. Part of a series refactoring address_space_operations write_begin and write_end callbacks to use struct kiocb for passing write context and flags. Signed-off-by: Taotao Chen <chentaotao@didiglobal.com> Link: https://lore.kernel.org/20250716093559.217344-4-chentaotao@didiglobal.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-19fs: replace mmap hook with .mmap_prepare for simple mappingsLorenzo Stoakes1-4/+6
Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"), the f_op->mmap() hook has been deprecated in favour of f_op->mmap_prepare(). This callback is invoked in the mmap() logic far earlier, so error handling can be performed more safely without complicated and bug-prone state unwinding required should an error arise. This hook also avoids passing a pointer to a not-yet-correctly-established VMA avoiding any issues with referencing this data structure. It rather provides a pointer to the new struct vm_area_desc descriptor type which contains all required state and allows easy setting of required parameters without any consideration needing to be paid to locking or reference counts. Note that nested filesystems like overlayfs are compatible with an .mmap_prepare() callback since commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for nested file systems"). In this patch we apply this change to file systems with relatively simple mmap() hook logic - exfat, ceph, f2fs, bcachefs, zonefs, btrfs, ocfs2, orangefs, nilfs2, romfs, ramfs and aio. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/f528ac4f35b9378931bd800920fee53fc0c5c74d.1750099179.git.lorenzo.stoakes@oracle.com Acked-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-10new helper: set_default_d_op()Al Viro1-2/+2
... to be used instead of manually assigning to ->s_d_op. All in-tree filesystem converted (and field itself is renamed, so any out-of-tree ones in need of conversion will be caught by compiler). Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-26exfat: do not clear volume dirty flag during syncYuezhang Mo1-23/+7
xfstests generic/482 tests the file system consistency after each FUA operation. It fails when run on exfat. exFAT clears the volume dirty flag with a FUA operation during sync. Since s_lock is not held when data is being written to a file, sync can be executed at the same time. When data is being written to a file, the FAT chain is updated first, and then the file size is updated. If sync is executed between updating them, the length of the FAT chain may be inconsistent with the file size. To avoid the situation where the file system is inconsistent but the volume dirty flag is cleared, this commit moves the clearing of the volume dirty flag from exfat_fs_sync() to exfat_put_super(), so that the volume dirty flag is not cleared until unmounting. After the move, there is no additional action during sync, so exfat_fs_sync() can be deleted. Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-05-26exfat: fix double free in delayed_freeNamjae Jeon1-0/+1
The double free could happen in the following path. exfat_create_upcase_table() exfat_create_upcase_table() : return error exfat_free_upcase_table() : free ->vol_utbl exfat_load_default_upcase_table : return error exfat_kill_sb() delayed_free() exfat_free_upcase_table() <--------- double free This patch set ->vol_util as NULL after freeing it. Reported-by: Jianzhou Zhao <xnxc22xnxc22@qq.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-29exfat: call bh_read in get_block only when necessarySungjong Seo1-82/+77
With commit 11a347fb6cef ("exfat: change to get file size from DataLength"), exfat_get_block() can now handle valid_size. However, most partial unwritten blocks that could be mapped with other blocks are being inefficiently processed separately as individual blocks. Except for partial unwritten blocks that require independent processing, let's handle them simply as before. Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-29exfat: fix potential wrong error return from get_blockSungjong Seo1-0/+2
If there is no error, get_block() should return 0. However, when bh_read() returns 1, get_block() also returns 1 in the same manner. Let's set err to 0, if there is no error from bh_read() Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Cc: stable@vger.kernel.org Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-27exfat: fix missing shutdown checkYuezhang Mo1-2/+27
xfstests generic/730 test failed because after deleting the device that still had dirty data, the file could still be read without returning an error. The reason is the missing shutdown check in ->read_iter. I also noticed that shutdown checks were missing from ->write_iter, ->splice_read, and ->mmap. This commit adds shutdown checks to all of them. Fixes: f761fcdd289d ("exfat: Implement sops->shutdown and ioctl") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-27exfat: fix the infinite loop in exfat_find_last_cluster()Yuezhang Mo1-1/+1
In exfat_find_last_cluster(), the cluster chain is traversed until the EOF cluster. If the cluster chain includes a loop due to file system corruption, the EOF cluster cannot be traversed, resulting in an infinite loop. If the number of clusters indicated by the file size is inconsistent with the cluster chain length, exfat_find_last_cluster() will return an error, so if this inconsistency is found, the traversal can be aborted without traversing to the EOF cluster. Reported-by: syzbot+f7d147e6db52b1e09dba@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f7d147e6db52b1e09dba Tested-by: syzbot+f7d147e6db52b1e09dba@syzkaller.appspotmail.com Fixes: 31023864e67a ("exfat: add fat entry operations") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>