| Age | Commit message (Collapse) | Author | Files | Lines |
|
https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"New:
- reject inodes with zero non-DOS link count
- return folios from ntfs_lock_new_page()
- subset of W=1 warnings for stricter checks
- work around -Wmaybe-uninitialized warnings
- buffer boundary checks to run_unpack()
- terminate the cached volume label after UTF-8 conversion
Fixes:
- check return value of indx_find to avoid infinite loop
- prevent uninitialized lcn caused by zero len
- increase CLIENT_REC name field size to prevent buffer overflow
- missing run load for vcn0 in attr_data_get_block_locked()
- memory leak in indx_create_allocate()
- OOB write in attr_wof_frame_info()
- mount failure on volumes with fragmented MFT bitmap
- integer overflow in run_unpack() volume boundary check
- validate rec->used in journal-replay file record check
Updates:
- resolve compare function in public index APIs
- $LXDEV xattr lookup
- potential double iput on d_make_root() failure
- initialize err in ni_allocate_da_blocks_locked()
- correct the pre_alloc condition in attr_allocate_clusters()"
* tag 'ntfs3_for_7.1' of https://github.com/Paragon-Software-Group/linux-ntfs3:
fs/ntfs3: fix Smatch warnings
fs/ntfs3: validate rec->used in journal-replay file record check
fs/ntfs3: terminate the cached volume label after UTF-8 conversion
fs/ntfs3: fix potential double iput on d_make_root() failure
ntfs3: fix integer overflow in run_unpack() volume boundary check
ntfs3: add buffer boundary checks to run_unpack()
ntfs3: fix mount failure on volumes with fragmented MFT bitmap
fs/ntfs3: fix $LXDEV xattr lookup
ntfs3: fix OOB write in attr_wof_frame_info()
ntfs3: fix memory leak in indx_create_allocate()
ntfs3: work around false-postive -Wmaybe-uninitialized warnings
fs/ntfs3: fix missing run load for vcn0 in attr_data_get_block_locked()
fs/ntfs3: increase CLIENT_REC name field size
fs/ntfs3: prevent uninitialized lcn caused by zero len
fs/ntfs3: add a subset of W=1 warnings for stricter checks
fs/ntfs3: return folios from ntfs_lock_new_page()
fs/ntfs3: resolve compare function in public index APIs
ntfs3: reject inodes with zero non-DOS link count
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs resurrection from Namjae Jeon:
"Ever since Kari Argillander’s 2022 report [1] regarding the state of
the ntfs3 driver, I have spent the last 4 years working to provide
full write support and current trends (iomap, no buffer head, folio),
enhanced performance, stable maintenance, utility support including
fsck for NTFS in Linux.
This new implementation is built upon the clean foundation of the
original read-only NTFS driver, adding:
- Write support:
Implemented full write support based on the classic read-only NTFS
driver. Added delayed allocation to improve write performance
through multi-cluster allocation and reduced fragmentation of the
cluster bitmap.
- iomap conversion:
Switched buffered IO (reads/writes), direct IO, file extent
mapping, readpages, and writepages to use iomap.
- Remove buffer_head:
Completely removed buffer_head usage by converting to folios. As a
result, the dependency on CONFIG_BUFFER_HEAD has been removed from
Kconfig.
- Stability improvements:
The new ntfs driver passes 326 xfstests, compared to 273 for ntfs3.
All tests passed by ntfs3 are a complete subset of the tests passed
by this implementation. Added support for fallocate, idmapped
mounts, permissions, and more.
xfstests Results report:
Total tests run: 787
Passed : 326
Failed : 38
Skipped : 423
Failed tests breakdown:
- 34 tests require metadata journaling
- 4 other tests:
094: No unwritten extent concept in NTFS on-disk format
563: cgroup v2 aware writeback accounting not supported
631: RENAME_WHITEOUT support required
787: NFS delegation test"
Link: https://lore.kernel.org/all/da20d32b-5185-f40b-48b8-2986922d8b25@stargateuniverse.net/ [1]
[ Let's see if this undead filesystem ends up being of the "Easter
miracle" kind, or the "Nosferatu of filesystems" kind... ]
* tag 'ntfs-for-7.1-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs: (46 commits)
ntfs: remove redundant out-of-bound checks
ntfs: add bound checking to ntfs_external_attr_find
ntfs: add bound checking to ntfs_attr_find
ntfs: fix ignoring unreachable code warnings
ntfs: fix inconsistent indenting warnings
ntfs: fix variable dereferenced before check warnings
ntfs: prefer IS_ERR_OR_NULL() over manual NULL check
ntfs: harden ntfs_listxattr against EA entries
ntfs: harden ntfs_ea_lookup against malformed EA entries
ntfs: check $EA query-length in ntfs_ea_get
ntfs: validate WSL EA payload sizes
ntfs: fix WSL ea restore condition
ntfs: add missing newlines to pr_err() messages
ntfs: fix pointer/integer casting warnings
ntfs: use ->mft_no instead of ->i_ino in prints
ntfs: change mft_no type to u64
ntfs: select FS_IOMAP in Kconfig
ntfs: add MODULE_ALIAS_FS
ntfs: reduce stack usage in ntfs_write_mft_block()
ntfs: fix sysctl table registration and path
...
|
|
Initialize err in ni_allocate_da_blocks_locked() and correct the
pre_alloc condition in attr_allocate_clusters().
Suggested-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
check_file_record() validates rec->total against the record size but
never validates rec->used. The do_action() journal-replay handlers read
rec->used from disk and use it to compute memmove lengths:
DeleteAttribute: memmove(attr, ..., used - asize - roff)
CreateAttribute: memmove(..., attr, used - roff)
change_attr_size: memmove(..., used - PtrOffset(rec, next))
When rec->used is smaller than the offset of a validated attribute, or
larger than the record size, these subtractions can underflow allowing
us to copy huge amounts of memory in to a 4kb buffer, generally
considered a bad idea overall.
This requires a corrupted filesystem, which isn't a threat model the
kernel really needs to worry about, but checking for such an obvious
out-of-bounds value is good to keep things robust, especially on journal
replay
Fix this up by bounding rec->used correctly.
This is much like commit b2bc7c44ed17 ("fs/ntfs3: Fix slab-out-of-bounds
read in DeleteIndexEntryRoot") which checked different values in this
same switch statement.
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Fixes: b46acd6a6a62 ("fs/ntfs3: Add NTFS journal")
Cc: stable <stable@kernel.org>
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Mainly prepararatory work for ongoing development but it does reduce
stack usage and is an improvement.
- "mm, swap: swap table phase III: remove swap_map" (Kairui Song)
Offers memory savings by removing the static swap_map. It also yields
some CPU savings and implements several cleanups.
- "mm: memfd_luo: preserve file seals" (Pratyush Yadav)
File seal preservation to LUO's memfd code
- "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
Chen)
Additional userspace stats reportng to zswap
- "arch, mm: consolidate empty_zero_page" (Mike Rapoport)
Some cleanups for our handling of ZERO_PAGE() and zero_pfn
- "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
Han)
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by
prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based
metadata tracking system with serialization to a radix tree data
structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and
VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of
a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec",
which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
pages encode their relationship to the head page
- "mm/damon/core: improve DAMOS quota efficiency for core layer
filters" (SeongJae Park)
Improve two problematic behaviors of DAMOS that makes it less
efficient when core layer filters are used
- "mm/damon: strictly respect min_nr_regions" (SeongJae Park)
Improve DAMON usability by extending the treatment of the
min_nr_regions user-settable parameter
- "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)
The proper fix for a previously hotfixed SMP=n issue. Code
simplifications and cleanups ensued
- "mm: cleanups around unmapping / zapping" (David Hildenbrand)
A bunch of cleanups around unmapping and zapping. Mostly
simplifications, code movements, documentation and renaming of
zapping functions
- "support batched checking of the young flag for MGLRU" (Baolin Wang)
Batched checking of the young flag for MGLRU. It's part cleanups; one
benchmark shows large performance benefits for arm64
- "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)
memcg cleanup and robustness improvements
- "Allow order zero pages in page reporting" (Yuvraj Sakshith)
Enhance free page reporting - it is presently and undesirably order-0
pages when reporting free memory.
- "mm: vma flag tweaks" (Lorenzo Stoakes)
Cleanup work following from the recent conversion of the VMA flags to
a bitmap
- "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
Park)
Add some more developer-facing debug checks into DAMON core
- "mm/damon: test and document power-of-2 min_region_sz requirement"
(SeongJae Park)
An additional DAMON kunit test and makes some adjustments to the
addr_unit parameter handling
- "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe" (SeongJae Park)
Fix a hard-to-hit time overflow issue in DAMON core
- "mm/damon: improve/fixup/update ratio calculation, test and
documentation" (SeongJae Park)
A batch of misc/minor improvements and fixups for DAMON
- "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
Hildenbrand)
Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
movement was required.
- "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)
A somewhat random mix of fixups, recompression cleanups and
improvements in the zram code
- "mm/damon: support multiple goal-based quota tuning algorithms"
(SeongJae Park)
Extend DAMOS quotas goal auto-tuning to support multiple tuning
algorithms that users can select
- "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)
Fix the khugpaged sysfs handling so we no longer spam the logs with
reams of junk when starting/stopping khugepaged
- "mm: improve map count checks" (Lorenzo Stoakes)
Provide some cleanups and slight fixes in the mremap, mmap and vma
code
- "mm/damon: support addr_unit on default monitoring targets for
modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged
mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type
to the new vma_flags_t type which replaces it. Mainly in the vma
code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace
the deprecated f_op->mmap hook which has been the source of bugs and
security issues for some time. Cleanups, documentation, extension of
mmap_prepare into filesystem drivers
- "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)
Simplify and clean up zap_huge_pmd(). Additional cleanups around
vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm: fix deferred split queue races during migration
mm/khugepaged: fix issue with tracking lock
mm/huge_memory: add and use has_deposited_pgtable()
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
mm/huge_memory: separate out the folio part of zap_huge_pmd()
mm/huge_memory: use mm instead of tlb->mm
mm/huge_memory: remove unnecessary sanity checks
mm/huge_memory: deduplicate zap deposited table call
mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
mm/huge_memory: add a common exit path to zap_huge_pmd()
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
mm/huge: avoid big else branch in zap_huge_pmd()
mm/huge_memory: simplify vma_is_specal_huge()
mm: on remap assert that input range within the proposed VMA
mm: add mmap_action_map_kernel_pages[_full]()
uio: replace deprecated mmap hook with mmap_prepare in uio_info
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
mm: allow handling of stacked mmap_prepare hooks in more drivers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs buffer_head updates from Christian Brauner:
"This cleans up the mess that has accumulated over the years in
metadata buffer_head tracking for inodes.
It moves the tracking into dedicated structure in filesystem-private
part of the inode (so that we don't use private_list, private_data,
and private_lock in struct address_space), and also moves couple other
users of private_data and private_list so these are removed from
struct address_space saving 3 longs in struct inode for 99% of inodes"
* tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
fs: Drop i_private_list from address_space
fs: Drop mapping_metadata_bhs from address space
ext4: Track metadata bhs in fs-private inode part
minix: Track metadata bhs in fs-private inode part
udf: Track metadata bhs in fs-private inode part
fat: Track metadata bhs in fs-private inode part
bfs: Track metadata bhs in fs-private inode part
affs: Track metadata bhs in fs-private inode part
ext2: Track metadata bhs in fs-private inode part
fs: Provide functions for handling mapping_metadata_bhs directly
fs: Switch inode_has_buffers() to take mapping_metadata_bhs
fs: Make bhs point to mapping_metadata_bhs
fs: Move metadata bhs tracking to a separate struct
fs: Fold fsync_buffers_list() into sync_mapping_buffers()
fs: Drop osync_buffers_list()
kvm: Use private inode list instead of i_private_list
fs: Remove i_private_data
aio: Stop using i_private_data and i_private_lock
hugetlbfs: Stop using i_private_data
fs: Stop using i_private_data for metadata bh tracking
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs i_ino updates from Christian Brauner:
"For historical reasons, the inode->i_ino field is an unsigned long,
which means that it's 32 bits on 32 bit architectures. This has caused
a number of filesystems to implement hacks to hash a 64-bit identifier
into a 32-bit field, and deprives us of a universal identifier field
for an inode.
This changes the inode->i_ino field from an unsigned long to a u64.
This shouldn't make any material difference on 64-bit hosts, but
32-bit hosts will see struct inode grow by at least 4 bytes. This
could have effects on slabcache sizes and field alignment.
The bulk of the changes are to format strings and tracepoints, since
the kernel itself doesn't care that much about the i_ino field. The
first patch changes some vfs function arguments, so check that one out
carefully.
With this change, we may be able to shrink some inode structures. For
instance, struct nfs_inode has a fileid field that holds the 64-bit
inode number. With this set of changes, that field could be
eliminated. I'd rather leave that sort of cleanups for later just to
keep this simple"
* tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group()
EVM: add comment describing why ino field is still unsigned long
vfs: remove externs from fs.h on functions modified by i_ino widening
treewide: fix missed i_ino format specifier conversions
ext4: fix signed format specifier in ext4_load_inode trace event
treewide: change inode->i_ino from unsigned long to u64
nilfs2: widen trace event i_ino fields to u64
f2fs: widen trace event i_ino fields to u64
ext4: widen trace event i_ino fields to u64
zonefs: widen trace event i_ino fields to u64
hugetlbfs: widen trace event i_ino fields to u64
ext2: widen trace event i_ino fields to u64
cachefiles: widen trace event i_ino fields to u64
vfs: widen trace event i_ino fields to u64
net: change sock.sk_ino and sock_i_ino() to u64
audit: widen ino fields to u64
vfs: widen inode hash/lookup functions to u64
|
|
ntfs_fill_super() loads the on-disk volume label with utf16s_to_utf8s()
and stores the result in sbi->volume.label. The converted label is later
exposed through ntfs3_label_show() using %s, but utf16s_to_utf8s() only
returns the number of bytes written and does not add a trailing NUL.
If the converted label fills the entire fixed buffer,
ntfs3_label_show() can read past the end of sbi->volume.label while
looking for a terminator.
Terminate the cached label explicitly after a successful conversion and
clamp the exact-full case to the last byte of the buffer.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
d_make_root() consumes the reference to the passed inode: it either
attaches it to the newly created dentry on success, or drops it via
iput() on failure.
In the error path, the code currently does:
sb->s_root = d_make_root(inode);
if (!sb->s_root)
goto put_inode_out;
which leads to a second iput(inode) in put_inode_out. This results in
a double iput and may trigger a use-after-free if the inode gets freed
after the first iput().
Fix this by jumping directly to the common cleanup path, avoiding the
extra iput(inode).
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The volume boundary check `lcn + len > sbi->used.bitmap.nbits` uses raw
addition which can wrap around for large lcn and len values, bypassing
the validation. Use check_add_overflow() as is already done for the
adjacent prev_lcn + dlcn and vcn64 + len checks added by commit
3ac37e100385 ("ntfs3: Fix integer overflow in run_unpack()").
Found by fuzzing with a source-patched harness (LibAFL + QEMU).
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Cc: stable@vger.kernel.org
Signed-off-by: Tobias Gaertner <tob.gaertner@me.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
run_unpack() checks `run_buf < run_last` at the top of the while loop
but then reads size_size and offset_size bytes via run_unpack_s64()
without verifying they fit within the remaining buffer. A crafted NTFS
image with truncated run data in an MFT attribute triggers an OOB heap
read of up to 15 bytes when the filesystem is mounted.
Add boundary checks before each run_unpack_s64() call to ensure the
declared field size does not exceed the remaining buffer.
Found by fuzzing with a source-patched harness (LibAFL + QEMU).
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Cc: stable@vger.kernel.org
Signed-off-by: Tobias Gaertner <tob.gaertner@me.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
When the $MFT's $BITMAP attribute is fragmented across multiple MFT
records (base record + extent records), ntfs_fill_super() fails with
-ENOENT during wnd_init() because the MFT bitmap's run list only
contains runs from the base MFT record.
The issue is that wnd_init() (which calls wnd_rescan()) is invoked
before ni_load_all_mi(), so the extent MFT records containing
additional $BITMAP runs have not been loaded yet. When wnd_rescan()
tries to look up a VCN beyond the base record's runs, run_lookup_entry()
fails and returns -ENOENT.
This affects NTFS volumes with a large or heavily fragmented MFT, which
is common on long-used Windows systems where the MFT bitmap's run list
doesn't fit in the base MFT record and spills into extent records.
Fix this by:
1. Moving ni_load_all_mi() before wnd_init() so all extent records
are available.
2. After ni_load_all_mi(), iterating through the attribute list to
find any $BITMAP extent attributes and unpacking their runs into
sbi->mft.bitmap.run before wnd_init() is called.
Tested on a 664GB NTFS volume with 86 MFT bitmap runs spanning
records 0 (VCN 0-105) and 17 (VCN 106-165). Before the fix, mount
fails with -ENOENT. After the fix, mount succeeds and all read/write
operations work correctly. Stress-tested with 8 test categories
(large file integrity, 10K small files, copy, move, delete/recreate
cycles, concurrent writes, deep directories, overwrite persistence).
Signed-off-by: Ruslan Elishev <relishev@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Use correct xattr name ("$LXDEV") and buffer size when calling
ntfs_get_ea(), otherwise the attribute may not be read.
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
In attr_wof_frame_info(), the offset-table read range for a nonresident
WofCompressedData stream is:
u64 from = vbo[i] & ~(u64)(PAGE_SIZE - 1);
u64 to = min(from + PAGE_SIZE, wof_size);
...
ntfs_read_run(sbi, run, addr, from, to - from);
A crafted image sets WofCompressedData.nres.data_size to 0xfff while the
file is large enough to request frame 1024 (offset 0x400000). This gives
from=0x1000, to=0xfff. The unsigned (to - from) wraps to 0xffffffffffffffff
and ntfs_read_write_run() overflows the single-page offs_folio via memcpy.
Triggered by pread() on a mounted NTFS image. Depending on adjacent
memory layout at the time of the overflow, KASAN reports this as
slab-out-of-bounds, use-after-free, or slab-use-after-free all at
ntfs_read_write_run(). Secondary corruption/panic paths were also observed.
Reject the read when the offset-table page is outside the stream.
Signed-off-by: 0xkato <0xkkato@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Similar to vma_flags_test(), we have previously renamed vma_desc_test() to
vma_desc_test_any(). Now that is in place, we can reintroduce
vma_desc_test() to explicitly check for a single VMA flag.
As with vma_flags_test(), this is useful as often flag tests are against a
single flag, and vma_desc_test_any(flags, VMA_READ_BIT) reads oddly and
potentially causes confusion.
As with vma_flags_test() a combination of sparse and vma_flags_t being a
struct means that users cannot misuse this function without it getting
flagged.
Also update the VMA tests to reflect this change.
Link: https://lkml.kernel.org/r/3a65ca23defb05060333f0586428fe279a484564.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: vma flag tweaks".
The ongoing work around introducing non-system word VMA flags has
introduced a number of helper functions and macros to make life easier
when working with these flags and to make conversions from the legacy use
of VM_xxx flags more straightforward.
This series improves these to reduce confusion as to what they do and to
improve consistency and readability.
Firstly the series renames vma_flags_test() to vma_flags_test_any() to
make it abundantly clear that this function tests whether any of the flags
are set (as opposed to vma_flags_test_all()).
It then renames vma_desc_test_flags() to vma_desc_test_any() for the same
reason. Note that we drop the 'flags' suffix here, as
vma_desc_test_any_flags() would be cumbersome and 'test' implies a flag
test.
Similarly, we rename vma_test_all_flags() to vma_test_all() for
consistency.
Next, we have a couple of instances (erofs, zonefs) where we are now
testing for vma_desc_test_any(desc, VMA_SHARED_BIT) &&
vma_desc_test_any(desc, VMA_MAYWRITE_BIT).
This is silly, so this series introduces vma_desc_test_all() so these
callers can instead invoke vma_desc_test_all(desc, VMA_SHARED_BIT,
VMA_MAYWRITE_BIT).
We then observe that quite a few instances of vma_flags_test_any() and
vma_desc_test_any() are in fact only testing against a single flag.
Using the _any() variant here is just confusing - 'any' of single item
reads strangely and is liable to cause confusion.
So in these instances the series reintroduces vma_flags_test() and
vma_desc_test() as helpers which test against a single flag.
The fact that vma_flags_t is a struct and that vma_flag_t utilises sparse
to avoid confusion with vm_flags_t makes it impossible for a user to
misuse these helpers without it getting flagged somewhere.
The series also updates __mk_vma_flags() and functions invoked by it to
explicitly mark them always inline to match expectation and to be
consistent with other VMA flag helpers.
It also renames vma_flag_set() to vma_flags_set_flag() (a function only
used by __mk_vma_flags()) to be consistent with other VMA flag helpers.
Finally it updates the VMA tests for each of these changes, and introduces
explicit tests for vma_flags_test() and vma_desc_test() to assert that
they behave as expected.
This patch (of 6):
On reflection, it's confusing to have vma_flags_test() and
vma_desc_test_flags() test whether any comma-separated VMA flag bit is
set, while also having vma_flags_test_all() and vma_test_all_flags()
separately test whether all flags are set.
Firstly, rename vma_flags_test() to vma_flags_test_any() to eliminate this
confusion.
Secondly, since the VMA descriptor flag functions are becoming rather
cumbersome, prefer vma_desc_test*() to vma_desc_test_flags*(), and also
rename vma_desc_test_flags() to vma_desc_test_any().
Finally, rename vma_test_all_flags() to vma_test_all() to keep the
VMA-specific helper consistent with the VMA descriptor naming convention
and to help avoid confusion vs. vma_flags_test_all().
While we're here, also update whitespace to be consistent in helper
functions.
Link: https://lkml.kernel.org/r/cover.1772704455.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/0f9cb3c511c478344fac0b3b3b0300bb95be95e9.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Suggested-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When indx_create_allocate() fails after
attr_allocate_clusters() succeeds, run_deallocate()
frees the disk clusters but never frees the memory
allocated by run_add_entry() via kvmalloc() for the
runs_tree structure.
Fix this by adding run_close() at the out: label to
free the run.runs memory on all error paths. The
success path is unaffected as it returns 0 directly
without going through out:, transferring ownership
of the run memory to indx->alloc_run via memcpy().
Reported-by: syzbot+7adcddaeeb860e5d3f2f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7adcddaeeb860e5d3f2f
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
calls
ntfs3 never calls mark_buffer_dirty_inode() and thus its metadata
buffers list is always empty. Drop the pointless sync_mapping_buffers()
and invalidate_inode_buffers() calls.
CC: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
CC: ntfs3@lists.linux.dev
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260326095354.16340-45-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
gcc sometimes fails to analyse how two local variables in ntfs_write_bh()
are initialized, as the initialization happens only in the first pass
through the main loop:
fs/ntfs3/fsntfs.c: In function 'ntfs_write_bh':
fs/ntfs3/fsntfs.c:1443:17: error: 'fixup' may be used uninitialized [-Werror=maybe-uninitialized]
1443 | __le16 *fixup;
| ^~~~~
fs/ntfs3/fsntfs.c:1443:17: note: 'fixup' was declared here
1443 | __le16 *fixup;
| ^~~~~
fs/ntfs3/fsntfs.c:1487:30: error: 'sample' may be used uninitialized [-Werror=maybe-uninitialized]
1487 | *ptr = sample;
| ~~~~~^~~~~~~~
fs/ntfs3/fsntfs.c:1444:16: note: 'sample' was declared here
1444 | __le16 sample;
Initializing the two variables to bogus values shuts up the warning and
makes it clear that those cannot be used. I tried rearranging the loop to
move the initialization in front of it, but couldn't quite figure it out.
Fixes: 48d9b57b169f ("fs/ntfs3: add a subset of W=1 warnings for stricter checks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
When a compressed or sparse attribute has its clusters frame-aligned,
vcn is rounded down to the frame start using cmask, which can result
in vcn != vcn0. In this case, vcn and vcn0 may reside in different
attribute segments.
The code already handles the case where vcn is in a different segment
by loading its runs before allocation. However, it fails to load runs
for vcn0 when vcn0 resides in a different segment than vcn. This causes
run_lookup_entry() to return SPARSE_LCN for vcn0 since its segment was
never loaded into the in-memory run list, triggering the WARN_ON(1).
Fix this by adding a missing check for vcn0 after the existing vcn
segment check. If vcn0 falls outside the current segment range
[svcn, evcn1), find and load the attribute segment containing vcn0
before performing the run lookup.
The following scenario triggers the bug:
attr_data_get_block_locked()
vcn = vcn0 & cmask <- vcn != vcn0 after frame alignment
load runs for vcn segment <- vcn0 segment not loaded!
attr_allocate_clusters() <- allocation succeeds
run_lookup_entry(vcn0) <- vcn0 not in run -> SPARSE_LCN
WARN_ON(1) <- bug fires here!
Reported-by: syzbot+c1e9aedbd913fadad617@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c1e9aedbd913fadad617
Fixes: c380b52f6c57 ("fs/ntfs3: Change new sparse cluster processing")
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
This patch increases the size of the CLIENT_REC name field from 32 utf-16
chars to 64 utf-16 chars. It fixes the buffer overflow problem in
log_replay() reported by Robbert Morris.
Reported-by: <rtm@csail.mit.edu>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
ntfs3 copied the iomap code without attribution or talking to the
maintainers, to hook into the bio completion for (unexplained) zeroing.
Fix this by just overriding the bio completion handler in the submit
handler.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260223132021.292832-13-hch@lst.de
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Move the NULL check into the callers to simplify the callees.
Fuse was missing this before, but has a constant read_ctx that is
never NULL or changed, so no change here either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260223132021.292832-11-hch@lst.de
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This provides additional context for file systems.
Rename the fuse instance to match the method name while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260223132021.292832-10-hch@lst.de
Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
On 32-bit architectures, unsigned long is only 32 bits wide, which
causes 64-bit inode numbers to be silently truncated. Several
filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that
exceed 32 bits, and this truncation can lead to inode number collisions
and other subtle bugs on 32-bit systems.
Change the type of inode->i_ino from unsigned long to u64 to ensure that
inode numbers are always represented as 64-bit values regardless of
architecture. Update all format specifiers treewide from %lu/%lx to
%llu/%llx to match the new type, along with corresponding local variable
types.
This is the bulk treewide conversion. Earlier patches in this series
handled trace events separately to allow trace field reordering for
better struct packing on 32-bit.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org
Acked-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
syzbot reported a uninit-value in ntfs_iomap_begin [1].
Since runs was not touched yet, run_lookup_entry() immediately fails
and returns false, which makes the value of "*len" 0.
Simultaneously, the new value and err value are also 0, causing the
logic in attr_data_get_block_locked() to jump directly to ok, ultimately
resulting in *lcn being triggered before it is set [1].
In ntfs_iomap_begin(), the check for a 0 value in clen is moved forward
to before updating lcn to avoid this [1].
[1]
BUG: KMSAN: uninit-value in ntfs_iomap_begin+0x8c0/0x1460 fs/ntfs3/inode.c:825
ntfs_iomap_begin+0x8c0/0x1460 fs/ntfs3/inode.c:825
iomap_iter+0x9b7/0x1540 fs/iomap/iter.c:110
Local variable lcn created at:
ntfs_iomap_begin+0x15d/0x1460 fs/ntfs3/inode.c:786
Fixes: 10d7c95af043 ("fs/ntfs3: add delayed-allocation (delalloc) support")
Reported-by: syzbot+7be88937363ac7ab7bb0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7be88937363ac7ab7bb0
Tested-by: syzbot+7be88937363ac7ab7bb0@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Enable a subset of W=1-style compiler warnings for the ntfs3 tree so we
catch small bugs early (unused symbols, missing declarations/prototypes,
possible uninitialized/mis-sized uses, etc).
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
ntfs_lock_new_page() currently returns a struct page * but it primarily
operates on folios via __filemap_get_folio(). Convert it to return a
struct folio * and use folio_alloc() + __folio_set_locked() for the
temporary page used to avoid data corruption during decompression.
When the cached folio is not uptodate, preserve the existing behavior by
using folio_file_page() and converting the returned page back to a
folio.
Update ni_readpage_cmpr() and ni_decompress_file() to handle the new
return type while keeping the existing struct page * array and the
unlock_page()/put_page() cleanup paths unchanged.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202602072013.jwrURE2e-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202602071921.nGIiI1J5-lkp@intel.com/
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
[almaz.alexandrovich@paragon-software.com: removed ni_fiemap function,
added reported-by and closes tags to commit]
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Previously the comparator was stored in struct ntfs_index and
used by low-level helpers such as hdr_find_e(). This creates
a dependency on index state in private helpers.
Resolve the compare function in the public index APIs and pass
it explicitly to internal helpers.
This should make the ownership of the comparator explicit and keeps
low-level index code independent of index-root policy.
This also resolves the TODO comment about dropping the stored
comparator from struct ntfs_index.
Signed-off-by: Adarsh Das <adarshdas950@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
ntfs_read_mft() counts file name attributes into two variables:
names (all names including DOS 8.3) and links (non-DOS names
only). The validation at line 424 checks names but set_nlink()
at line 436 uses links. A corrupted NTFS image where all file
name attributes have type FILE_NAME_DOS passes the names check
but results in set_nlink(inode, 0).
When such an inode is loaded via a code path that passes name=NULL
to ntfs_iget5() and the nlink=0 inode enters the VFS. The subsequent
unlink, rmdir, or rename targeting this inode calls drop_nlink()
which triggers WARN_ON(inode->i_nlink == 0) in fs/inode.c.
An all-DOS-name MFT record cannot exist on a valid NTFS volume.
Reject such records by checking for links == 0 before
calling set_nlink().
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Introduce Kconfig and Makefile for remade ntfs.
And this patch make ntfs and ntfs3 mutually exclusive so only one can be
built-in(y), while both can still be built as modules(m).
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Reverts the following commits that introduced legacy ntfs
driver alias and related support code:
74871791ffa9 ntfs3: serve as alias for the legacy ntfs driver
1ff2e956608c fs/ntfs3: Redesign legacy ntfs support
9b872cc50daa ntfs3: add legacy ntfs file operations
d55f90e9b243 ntfs3: enforce read-only when used as legacy ntfs driver
The legacy ntfs driver has been remade as a new implementation, so the
alias and related codes in ntfs3 are no longer needed.
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
couple of issues in the demotion code - pages were failed demotion
and were finding themselves demoted into disallowed nodes (Bing Jiao)
- "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
mapledtree race and performs a number of cleanups (Liam Howlett)
- "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
them" implements a lot of cleanups following on from the conversion
of the VMA flags into a bitmap (Lorenzo Stoakes)
- "support batch checking of references and unmapping for large folios"
implements batching to greatly improve the performance of reclaiming
clean file-backed large folios (Baolin Wang)
- "selftests/mm: add memory failure selftests" does as claimed (Miaohe
Lin)
* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
mm/page_alloc: clear page->private in free_pages_prepare()
selftests/mm: add memory failure dirty pagecache test
selftests/mm: add memory failure clean pagecache test
selftests/mm: add memory failure anonymous page test
mm: rmap: support batched unmapping for file large folios
arm64: mm: implement the architecture |