aboutsummaryrefslogtreecommitdiff
path: root/fs/crypto
AgeCommit message (Collapse)AuthorFilesLines
2026-04-13Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds6-164/+54
Pull fscrypt updates from Eric Biggers: - Various cleanups for the interface between fs/crypto/ and filesystems, from Christoph Hellwig - Simplify and optimize the implementation of v1 key derivation by using the AES library instead of the crypto_skcipher API * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: use AES library for v1 key derivation ext4: use a byte granularity cursor in ext4_mpage_readpages fscrypt: pass a real sector_t to fscrypt_zeroout_range fscrypt: pass a byte length to fscrypt_zeroout_range fscrypt: pass a byte offset to fscrypt_zeroout_range fscrypt: pass a byte length to fscrypt_zeroout_range_inline_crypt fscrypt: pass a byte offset to fscrypt_zeroout_range_inline_crypt fscrypt: pass a byte offset to fscrypt_set_bio_crypt_ctx fscrypt: pass a byte offset to fscrypt_mergeable_bio fscrypt: pass a byte offset to fscrypt_generate_dun fscrypt: move fscrypt_set_bio_crypt_ctx_bh to buffer.c ext4, fscrypt: merge fscrypt_mergeable_bio_bh into io_submit_need_new_bio ext4: factor out a io_submit_need_new_bio helper ext4: open code fscrypt_set_bio_crypt_ctx_bh ext4: initialize the write hint in io_submit_init_bio
2026-03-25fscrypt: use AES library for v1 key derivationEric Biggers2-60/+29
Convert the implementation of the v1 (original / deprecated) fscrypt per-file key derivation algorithm to use the AES library instead of an "ecb(aes)" crypto_skcipher. This is much simpler. While the AES library doesn't support AES-ECB directly yet, we can still simply call aes_encrypt() in a loop. While that doesn't explicitly parallelize the AES encryptions, it doesn't really matter in this case, where a new key is used each time and only 16 to 64 bytes are encrypted. In fact, a quick benchmark (AMD Ryzen 9 9950X) shows that this commit actually greatly improves performance, from ~7000 cycles per key derived to ~1500. The times don't differ much between 32 bytes and 64 bytes either, so clearly the bottleneck is API stuff and key expansion. Granted, performance of the v1 key derivation is no longer very relevant: most users have moved onto v2 encryption policies. The v2 key derivation uses HKDF-SHA512 (which is ~3500 cycles on the same CPU). Still, it's nice that the simpler solution is much faster as well. Compatibility verified with xfstests generic/548. Link: https://lore.kernel.org/r/20260321075338.99809-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a real sector_t to fscrypt_zeroout_rangeChristoph Hellwig1-3/+2
While the pblk argument to fscrypt_zeroout_range is declared as a sector_t, it actually is interpreted as a logical block size unit, which is highly unusual. Switch to passing the 512 byte units that sector_t is defined for. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-14-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte length to fscrypt_zeroout_rangeChristoph Hellwig1-5/+6
Range lengths are usually expressed as bytes in the VFS, switch fscrypt_zeroout_range to this convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-13-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte offset to fscrypt_zeroout_rangeChristoph Hellwig1-4/+3
Logical offsets into an inode are usually expressed as bytes in the VFS. Switch fscrypt_zeroout_range to that convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-12-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte length to fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-8/+4
Range lengths are usually expressed as bytes in the VFS, switch fscrypt_zeroout_range_inline_crypt to this convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-11-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte offset to fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-3/+3
Logical offsets into an inode are usually expressed as bytes in the VFS. Switch fscrypt_zeroout_range_inline_crypt to that convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-10-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte offset to fscrypt_set_bio_crypt_ctxChristoph Hellwig2-7/+7
Logical offsets into an inode are usually expressed as bytes in the VFS. Switch fscrypt_set_bio_crypt_ctx to that convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-9-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte offset to fscrypt_mergeable_bioChristoph Hellwig2-4/+5
Logical offsets into an inode are usually expressed as bytes in the VFS. Switch fscrypt_mergeable_bio to that convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-8-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: pass a byte offset to fscrypt_generate_dunChristoph Hellwig3-11/+4
Logical offsets into an inode are usually expressed as bytes in the VFS. Switch fscrypt_generate_dun to that convention and remove the ci_data_units_per_block_bits member in struct fscrypt_inode_info that was only used to cache the DUN shift based on the logical block size granularity. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-7-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09fscrypt: move fscrypt_set_bio_crypt_ctx_bh to buffer.cChristoph Hellwig1-45/+0
fscrypt_set_bio_crypt_ctx_bh is only used by submit_bh_wbc now. Move it there and merge bh_get_inode_and_lblk_num into it. Note that this does not add ifdefs for fscrypt as the compiler will optimize away the dead code if it is not built in. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-6-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-09ext4, fscrypt: merge fscrypt_mergeable_bio_bh into io_submit_need_new_bioChristoph Hellwig1-23/+0
ext4 already has the inode and folio and can't have a NULL folio->mapping in this path. Open code fscrypt_mergeable_bio_bh in io_submit_need_new_bio based on these simplifying assumptions. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260302141922.370070-5-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-06treewide: change inode->i_ino from unsigned long to u64Jeff Layton4-5/+5
On 32-bit architectures, unsigned long is only 32 bits wide, which causes 64-bit inode numbers to be silently truncated. Several filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that exceed 32 bits, and this truncation can lead to inode number collisions and other subtle bugs on 32-bit systems. Change the type of inode->i_ino from unsigned long to u64 to ensure that inode numbers are always represented as 64-bit values regardless of architecture. Update all format specifiers treewide from %lu/%lx to %llu/%llx to match the new type, along with corresponding local variable types. This is the bulk treewide conversion. Earlier patches in this series handled trace events separately to allow trace field reordering for better struct packing on 32-bit. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org Acked-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds4-6/+6
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook4-6/+6
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-16Merge tag 'vfs-7.0-rc1.misc.2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull more misc vfs updates from Christian Brauner: "Features: - Optimize close_range() from O(range size) to O(active FDs) by using find_next_bit() on the open_fds bitmap instead of linearly scanning the entire requested range. This is a significant improvement for large-range close operations on sparse file descriptor tables. - Add FS_XFLAG_VERITY file attribute for fs-verity files, retrievable via FS_IOC_FSGETXATTR and file_getattr(). The flag is read-only. Add tracepoints for fs-verity enable and verify operations, replacing the previously removed debug printk's. - Prevent nfsd from exporting special kernel filesystems like pidfs and nsfs. These filesystems have custom ->open() and ->permission() export methods that are designed for open_by_handle_at(2) only and are incompatible with nfsd. Update the exportfs documentation accordingly. Fixes: - Fix KMSAN uninit-value in ovl_fill_real() where strcmp() was used on a non-null-terminated decrypted directory entry name from fscrypt. This triggered on encrypted lower layers when the decrypted name buffer contained uninitialized tail data. The fix also adds VFS-level name_is_dot(), name_is_dotdot(), and name_is_dot_dotdot() helpers, replacing various open-coded "." and ".." checks across the tree. - Fix read-only fsflags not being reset together with xflags in vfs_fileattr_set(). Currently harmless since no read-only xflags overlap with flags, but this would cause inconsistencies for any future shared read-only flag - Return -EREMOTE instead of -ESRCH from PIDFD_GET_INFO when the target process is in a different pid namespace. This lets userspace distinguish "process exited" from "process in another namespace", matching glibc's pidfd_getpid() behavior Cleanups: - Use C-string literals in the Rust seq_file bindings, replacing the kernel::c_str!() macro (available since Rust 1.77) - Fix typo in d_walk_ret enum comment, add porting notes for the readlink_copy() calling convention change" * tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: add porting notes about readlink_copy() pidfs: return -EREMOTE when PIDFD_GET_INFO is called on another ns nfsd: do not allow exporting of special kernel filesystems exportfs: clarify the documentation of open()/permission() expotrfs ops fsverity: add tracepoints fs: add FS_XFLAG_VERITY for fs-verity files rust: seq_file: replace `kernel::c_str!` with C-Strings fs: dcache: fix typo in enum d_walk_ret comment ovl: use name_is_dot* helpers in readdir code fs: add helpers name_is_dot{,dot,_dotdot} ovl: Fix uninit-value in ovl_fill_real fs: reset read-only fsflags together with xflags fs/file: optimize close_range() complexity from O(N) to O(Sparse)
2026-01-29fs: add helpers name_is_dot{,dot,_dotdot}Amir Goldstein1-1/+1
Rename the helper is_dot_dotdot() into the name_ namespace and add complementary helpers to check for dot and dotdot names individually. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20260128132406.23768-3-amir73il@gmail.com Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-11blk-crypto: handle the fallback above the block layerChristoph Hellwig1-1/+1
Add a blk_crypto_submit_bio helper that either submits the bio when it is not encrypted or inline encryption is provided, but otherwise handles the encryption before going down into the low-level driver. This reduces the risk from bio reordering and keeps memory allocation as high up in the stack as possible. Note that if the submitter knows that inline enctryption is known to be supported by the underyling driver, it can still use plain submit_bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-11fscrypt: keep multiple bios in flight in fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-32/+54
This should slightly improve performance for large zeroing operations, but more importantly prepares for blk-crypto refactoring that requires all fscrypt users to call submit_bio directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-11fscrypt: pass a real sector_t to fscrypt_zeroout_range_inline_cryptChristoph Hellwig1-5/+4
While the pblk argument to fscrypt_zeroout_range_inline_crypt is declared as a sector_t it actually is interpreted as a logical block size unit, which is highly unusual. Switch to passing the 512 byte units that sector_t is defined for. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-06Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵Linus Torvalds1-83/+6
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...
2025-12-01Merge tag 'vfs-6.19-rc1.inode' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode updates from Christian Brauner: "Features: - Hide inode->i_state behind accessors. Open-coded accesses prevent asserting they are done correctly. One obvious aspect is locking, but significantly more can be checked. For example it can be detected when the code is clearing flags which are already missing, or is setting flags when it is illegal (e.g., I_FREEING when ->i_count > 0) - Provide accessors for ->i_state, converts all filesystems using coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2, overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to compile - Rework I_NEW handling to operate without fences, simplifying the code after the accessor infrastructure is in place Cleanups: - Move wait_on_inode() from writeback.h to fs.h - Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb for clarity - Cosmetic fixes to LRU handling - Push list presence check into inode_io_list_del() - Touch up predicts in __d_lookup_rcu() - ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage - Assert on ->i_count in iput_final() - Assert ->i_lock held in __iget() Fixes: - Add missing fences to I_NEW handling" * tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits) dcache: touch up predicts in __d_lookup_rcu() fs: push list presence check into inode_io_list_del() fs: cosmetic fixes to lru handling fs: rework I_NEW handling to operate without fences fs: make plain ->i_state access fail to compile xfs: use the new ->i_state accessors nilfs2: use the new ->i_state accessors overlayfs: use the new ->i_state accessors gfs2: use the new ->i_state accessors f2fs: use the new ->i_state accessors smb: use the new ->i_state accessors ceph: use the new ->i_state accessors btrfs: use the new ->i_state accessors Manual conversion to use ->i_state accessors of all places not covered by coccinelle Coccinelle-based conversion to use ->i_state accessors fs: provide accessors for ->i_state fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb fs: move wait_on_inode() from writeback.h to fs.h fs: add missing fences to I_NEW handling ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage ...
2025-11-20fscrypt: replace local base64url helpers with lib/base64Guan-Chun Wu1-83/+6
Replace the base64url encoding and decoding functions in fscrypt with the generic base64_encode() and base64_decode() helpers from lib/base64. This removes the custom implementation in fscrypt, reduces code duplication, and relies on the shared Base64 implementation in lib. The helpers preserve RFC 4648-compliant URL-safe Base64 encoding without padding, so there are no functional changes. This change also improves performance: encoding is about 2.7x faster and decoding achieves 43-52x speedups compared to the previous implementation. Link: https://lkml.kernel.org/r/20251114060221.89734-1-409411716@gms.tku.edu.tw Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw> Cc: Christoph Hellwig <hch@lst.de> Cc: David Laight <david.laight.linux@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <kbusch@kernel.org> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: "Theodore Y. Ts'o" <tytso@mit.edu> Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Cc: Xiubo Li <xiubli@redhat.com> Cc: Yu-Sheng Huang <home7438072@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-04fscrypt: fix left shift underflow when inode->i_blkbits > PAGE_SHIFTYongpeng Yang1-2/+1
When simulating an nvme device on qemu with both logical_block_size and physical_block_size set to 8 KiB, an error trace appears during partition table reading at boot time. The issue is caused by inode->i_blkbits being larger than PAGE_SHIFT, which leads to a left shift of -1 and triggering a UBSAN warning. [ 2.697306] ------------[ cut here ]------------ [ 2.697309] UBSAN: shift-out-of-bounds in fs/crypto/inline_crypt.c:336:37 [ 2.697311] shift exponent -1 is negative [ 2.697315] CPU: 3 UID: 0 PID: 274 Comm: (udev-worker) Not tainted 6.18.0-rc2+ #34 PREEMPT(voluntary) [ 2.697317] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 2.697320] Call Trace: [ 2.697324] <TASK> [ 2.697325] dump_stack_lvl+0x76/0xa0 [ 2.697340] dump_stack+0x10/0x20 [ 2.697342] __ubsan_handle_shift_out_of_bounds+0x1e3/0x390 [ 2.697351] bh_get_inode_and_lblk_num.cold+0x12/0x94 [ 2.697359] fscrypt_set_bio_crypt_ctx_bh+0x44/0x90 [ 2.697365] submit_bh_wbc+0xb6/0x190 [ 2.697370] block_read_full_folio+0x194/0x270 [ 2.697371] ? __pfx_blkdev_get_block+0x10/0x10 [ 2.697375] ? __pfx_blkdev_read_folio+0x10/0x10 [ 2.697377] blkdev_read_folio+0x18/0x30 [ 2.697379] filemap_read_folio+0x40/0xe0 [ 2.697382] filemap_get_pages+0x5ef/0x7a0 [ 2.697385] ? mmap_region+0x63/0xd0 [ 2.697389] filemap_read+0x11d/0x520 [ 2.697392] blkdev_read_iter+0x7c/0x180 [ 2.697393] vfs_read+0x261/0x390 [ 2.697397] ksys_read+0x71/0xf0 [ 2.697398] __x64_sys_read+0x19/0x30 [ 2.697399] x64_sys_call+0x1e88/0x26a0 [ 2.697405] do_syscall_64+0x80/0x670 [ 2.697410] ? __x64_sys_newfstat+0x15/0x20 [ 2.697414] ? x64_sys_call+0x204a/0x26a0 [ 2.697415] ? do_syscall_64+0xb8/0x670 [ 2.697417] ? irqentry_exit_to_user_mode+0x2e/0x2a0 [ 2.697420] ? irqentry_exit+0x43/0x50 [ 2.697421] ? exc_page_fault+0x90/0x1b0 [ 2.697422] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 2.697425] RIP: 0033:0x75054cba4a06 [ 2.697426] Code: 5d e8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 19 83 e2 39 83 fa 08 75 11 e8 26 ff ff ff 66 0f 1f 44 00 00 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08 [ 2.697427] RSP: 002b:00007fff973723a0 EFLAGS: 00000202 ORIG_RAX: 0000000000000000 [ 2.697430] RAX: ffffffffffffffda RBX: 00005ea9a2c02760 RCX: 000075054cba4a06 [ 2.697432] RDX: 0000000000002000 RSI: 000075054c190000 RDI: 000000000000001b [ 2.697433] RBP: 00007fff973723c0 R08: 0000000000000000 R09: 0000000000000000 [ 2.697434] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [ 2.697434] R13: 00005ea9a2c027c0 R14: 00005ea9a2be5608 R15: 00005ea9a2be55f0 [ 2.697436] </TASK> [ 2.697436] ---[ end trace ]--- This situation can happen for block devices because when CONFIG_TRANSPARENT_HUGEPAGE is enabled, the maximum logical_block_size is 64 KiB. set_init_blocksize() then sets the block device inode->i_blkbits to 13, which is within this limit. File I/O does not trigger this problem because for filesystems that do not support the FS_LBS feature, sb_set_blocksize() prevents sb->s_blocksize_bits from being larger than PAGE_SHIFT. During inode allocation, alloc_inode()->inode_init_always() assigns inode->i_blkbits from sb->s_blocksize_bits. Currently, only xfs_fs_type has the FS_LBS flag, and since xfs I/O paths do not reach submit_bh_wbc(), it does not hit the left-shift underflow issue. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Fixes: 47dd67532303 ("block/bdev: lift block size restrictions to 64k") Cc: stable@vger.kernel.org [EB: use folio_pos() and consolidate the two shifts by i_blkbits] Link: https://lore.kernel.org/r/20251105003642.42796-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-20Coccinelle-based conversion to use ->i_state accessorsMateusz Guzik2-2/+2
All places were patched by coccinelle with the default expecting that ->i_lock is held, afterwards entries got fixed up by hand to use unlocked variants as needed. The script: @@ expression inode, flags; @@ - inode->i_state & flags + inode_state_read(inode) & flags @@ expression inode, flags; @@ - inode->i_state &= ~flags + inode_state_clear(inode, flags) @@ expression inode, flag1, flag2; @@ - inode->i_state &= ~flag1 & ~flag2 + inode_state_clear(inode, flag1 | flag2) @@ expression inode, flags; @@ - inode->i_state |= flags + inode_state_set(inode, flags) @@ expression inode, flags; @@ - inode->i_state = flags + inode_state_assign(inode, flags) @@ expression inode, flags; @@ - flags = inode->i_state + flags = inode_state_read(inode) @@ expression inode, flags; @@ - READ_ONCE(inode->i_state) & flags + inode_state_read(inode) & flags Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-29Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds9-161/+83
Pull fscrypt updates from Eric Biggers: "Make fs/crypto/ use the HMAC-SHA512 library functions instead of crypto_shash. This is simpler, faster, and more reliable" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: use HMAC-SHA512 library for HKDF fscrypt: Remove redundant __GFP_NOWARN
2025-09-05fscrypt: use HMAC-SHA512 library for HKDFEric Biggers8-160/+82
For the HKDF-SHA512 key derivation needed by fscrypt, just use the HMAC-SHA512 library functions directly. These functions were introduced in v6.17, and they provide simple and efficient direct support for HMAC-SHA512. This ends up being quite a bit simpler and more efficient than using crypto/hkdf.c, as it avoids the generic crypto layer: - The HMAC library can't fail, so callers don't need to handle errors - No inefficient indirect calls - No inefficient and error-prone dynamic allocations - No inefficient and error-prone loading of algorithm by name - Less stack usage Benchmarks on x86_64 show that deriving a per-file key gets about 30% faster, and FS_IOC_ADD_ENCRYPTION_KEY gets nearly twice as fast. The only small downside is the HKDF-Expand logic gets duplicated again. Then again, even considering that, the new fscrypt_hkdf_expand() is only 7 lines longer than the version that called hkdf_expand(). Later we could add HKDF support to lib/crypto/, but for now let's just do this. Link: https://lore.kernel.org/r/20250906035913.1141532-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-21fscrypt: add support for info in fs-specific part of inodeEric Biggers2-19/+28
Add an inode_info_offs field to struct fscrypt_operations, and update fs/crypto/ to support it. When set to a nonzero value, it specifies the offset to the fscrypt_inode_info pointer within the filesystem-specific part of the inode structure, to be used instead of inode::i_crypt_info. Since this makes inode::i_crypt_info no longer necessarily used, update comments that mentioned it. This is a prerequisite for a later commit that removes inode::i_crypt_info, saving memory and improving cache efficiency with filesystems that don't support fscrypt. Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/20250810075706.172910-3-ebiggers@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-21fscrypt: replace raw loads of info pointer with helper functionEric Biggers6-21/+27
Add and use a helper function fscrypt_get_inode_info_raw(). It loads an inode's fscrypt info pointer using a raw dereference, which is appropriate when the caller knows the key setup already happened. This eliminates most occurrences of inode::i_crypt_info in the source, in preparation for replacing that with a filesystem-specific field. Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/20250810075706.172910-2-ebiggers@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11fscrypt: Remove redundant __GFP_NOWARNQianfeng Rong1-1/+1
GFP_NOWAIT already includes __GFP_NOWARN, so let's remove the redundant __GFP_NOWARN. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Link: https://lore.kernel.org/r/20250803102243.623705-3-rongqianfeng@vivo.com Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-10fscrypt: Remove gfp_t argument from fscrypt_encrypt_block_inplace()Eric Biggers1-2/+1
This argument is no longer used, so remove it. Reviewed-by: Alex Markuze <amarkuze@redhat.com> Link: https://lore.kernel.org/r/20250710060754.637098-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-10fscrypt: Remove gfp_t argument from fscrypt_crypt_data_unit()Eric Biggers3-13/+7
This argument is no longer used, so remove it. Link: https://lore.kernel.org/r/20250710060754.637098-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-10fscrypt: Switch to sync_skcipher and on-stack requestsEric Biggers5-95/+61
Now that fscrypt uses only synchronous skciphers, switch to the actual sync_skcipher API and the corresponding on-stack requests. This eliminates a heap allocation per en/decryption operation. Link: https://lore.kernel.org/r/20250710060754.637098-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-10fscrypt: Drop FORBID_WEAK_KEYS flag for AES-ECBEric Biggers1-1/+0
This flag only has an effect for DES, 3DES, and XTS mode. It does nothing for AES-ECB, as there is no concept of weak keys for AES. Link: https://lore.kernel.org/r/20250710060754.637098-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-10fscrypt: Don't use asynchronous CryptoAPI algorithmsEric Biggers4-21/+18
Now that fscrypt's incomplete support for non-inline crypto engines has been removed, and none of the CPU-based algorithms have the CRYPTO_ALG_ASYNC flag set anymore, there is no need to accommodate asynchronous algorithms. Therefore, explicitly allocate only synchronous algorithms. Then, remove the code that handled waiting for asynchronous en/decryption operations to complete. This commit should *not* be backported to kernels that lack commit 0ba6ec5b2972 ("crypto: x86/aes - stop using the SIMD helper"), as then it would disable the use of the optimized AES code on x86. Link: https://lore.kernel.org/r/20250710060754.637098-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04fscrypt: Don't use problematic non-inline crypto enginesEric Biggers4-3/+22
Make fscrypt no longer use Crypto API drivers for non-inline crypto engines, even when the Crypto API prioritizes them over CPU-based code (which unfortunately it often does). These drivers tend to be really problematic, especially for fscrypt's workload. This commit has no effect on inline crypto engines, which are different and do work well. Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be excluded too. That's omitted for now to keep this commit backportable, since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.) There are two major issues with these drivers: bugs and performance. First, these drivers tend to be buggy. They're fundamentally much more error-prone and harder to test than the CPU-based code. They often don't get tested before kernel releases, and even if they do, the crypto self-tests don't properly test these drivers. Released drivers have en/decrypted or hashed data incorrectly. These bugs cause issues for fscrypt users who often didn't even want to use these drivers, e.g.: - https://github.com/google/fscryptctl/issues/32 - https://github.com/google/fscryptctl/issues/9 - https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7319.namprd02.prod.outlook.com These drivers have also similarly caused issues for dm-crypt users, including data corruption and deadlocks. Since Linux v5.10, dm-crypt has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY. Second, these drivers tend to be *much* slower than the CPU-based code. This may seem counterintuitive, but benchmarks clearly show it. There's a *lot* of overhead associated with going to a hardware driver, off the CPU, and back again. To prove this, I gathered as many systems with this type of crypto engine as I could, and I measured synchronous encryption of 4096-byte messages (which matches fscrypt's workload): Intel Emerald Rapids server: AES-256-XTS: xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES] qat_aes_xts 289 MB/s [Offload, Intel QuickAssist] Qualcomm SM8650 HDK: AES-256-XTS: xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions] xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine] i.MX 8M Nano LPDDR4 EVK: AES-256-XTS: xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions] xts(ecb-aes-caam) 20 MB/s [Offload, CAAM] AES-128-CBC-ESSIV: essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM] STM32MP157F-DK2: AES-256-XTS: xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON] xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine] AES-128-CBC-ESSIV: essiv(cbc-aes-neonbs,sha256-lib) 14.7 MB/s [CPU-based, ARM NEON] essiv(stm32-cbc-aes,sha256-lib) 3.2 MB/s [Offload, STM32 crypto engine] Adiantum: adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon) 52.8 MB/s [CPU-based, ARM scalar + NEON] So, there was no case in which the crypto engine was even *close* to being faster. On the first three, which have AES instructions in the CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2 which has a Cortex-A7 CPU that doesn't have AES instructions, AES was over 4 times faster on the CPU. And Adiantum encryption, which is what actually should be used on CPUs like that, was over 17 times faster. Other justifications that have been given for these non-inline crypto engines (almost always coming from the hardware vendors, not actual users) don't seem very plausible either: - The crypto engine throughput could be improved by processing multiple requests concurrently. Currently irrelevant to fscrypt, since it doesn't do that. This would also be complex, and unhelpful in many cases. 2 of the 4 engines I tested even had only one queue. - Some of the engines, e.g. STM32, support hardware keys. Also currently irrelevant to fscrypt, since it doesn't support these. Interestingly, the STM32 driver itself doesn't support this either. - Free up CPU for other tasks and/or reduce energy usage. Not very plausible considering the "short" message length, driver overhead, and scheduling overhead. There's just very little time for the CPU to do something else like run another task or enter low-power state, before the message finishes and it's time to process the next one. - Some of these engines resist power analysis and electromagnetic attacks, while the CPU-based crypto generally does not. In theory, this sounds great. In practice, if this benefit requires the use of an off-CPU offload that massively regresses performance and has a low-quality, buggy driver, the price for this hardening (which is not relevant to most fscrypt users, and tends to be incomplete) is just too high. Inline crypto engines are much more promising here, as are on-CPU solutions like RISC-V High Assurance Cryptography. Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities") Cc: stable@vger.kernel.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-20fscrypt: Explicitly include <linux/export.h>Eric Biggers9-11/+24
Fix build warnings with W=1 that started appearing after commit a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1"). While at it, also sort the include lists alphabetically. Link: https://lore.kernel.org/r/20250614221301.100803-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-04-08fscrypt: add support for hardware-wrapped keysEric Biggers6-65/+257
Add support for hardware-wrapped keys to fscrypt. Such keys are protected from certain attacks, such as cold boot attacks. For more information, see the "Hardware-wrapped keys" section of Documentation/block/inline-encryption.rst. To support hardware-wrapped keys in fscrypt, we allow the fscrypt master keys to be hardware-wrapped. File contents encryption is done by passing the wrapped key to the inline encryption hardware via blk-crypto. Other fscrypt operations such as filenames encryption continue to be done by the kernel, using the "software secret" which the hardware derives. For more information, see the documentation which this patch adds to Documentation/filesystems/fscrypt.rst. Note that this feature doesn't require any filesystem-specific changes. However it does depend on inline encryption support, and thus currently it is only applicable to ext4 and f2fs. The version of this feature introduced by this patch is mostly equivalent to the version that has existed downstream in the Android Common Kernels since 2020. However, a couple fixes are included. First, the flags field in struct fscrypt_add_key_arg is now placed in the proper location. Second, key identifiers for HW-wrapped keys are now derived using a distinct HKDF context byte; this fixes a bug where a raw key could have the same identifier as a HW-wrapped key. Note that as a result of these fixes, the version of this feature introduced by this patch is not UAPI or on-disk format compatible with the version in the Android Common Kernels, though the divergence is limited to just those specific fixes. This version should be used going forwards. This patch has been heavily rewritten from the original version by Gaurav Kashyap <quic_gaurkash@quicinc.com> and Barani Muthukumaran <bmuthuku@codeaurora.org>. Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> # sm8650 Link: https://lore.kernel.org/r/20250404225859.172344-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-26Merge tag 'for-6.15/block-20250322' of git://git.kernel.dk/linuxLinus Torvalds3-71/+19
Pull block updates from Jens Axboe: - Fixes for integrity handling - NVMe pull request via Keith: - Secure concatenation for TCP transport (Hannes) - Multipath sysfs visibility (Nilay) - Various cleanups (Qasim, Baruch, Wang, Chen, Mike, Damien, Li) - Correct use of 64-bit BARs for pci-epf target (Niklas) - Socket fix for selinux when used in containers (Peijie) - MD pull request via Yu: - fix recovery can preempt resync (Li Nan) - fix md-bitmap IO limit (Su Yue) - fix raid10 discard with REQ_NOWAIT (Xiao Ni) - fix raid1 memory leak (Zheng Qixing) - fix mddev uaf (Yu Kuai) - fix raid1,raid10 IO flags (Yu Kuai) - some refactor and cleanup (Yu Kuai) - Series cleaning up and fixing bugs in the bad block handling code - Improve support for write failure simulation in null_blk - Various lock ordering fixes - Fixes for locking for debugfs attributes - Various ublk related fixes and improvements - Cleanups for blk-rq-qos wait handling - blk-throttle fixes - Fixes for loop dio and sync handling - Fixes and cleanups for the auto-PI code - Block side support for hardware encryption keys in blk-crypto - Various cleanups and fixes * tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux: (105 commits) nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi) nvme-tcp: fix selinux denied when calling sock_sendmsg nvmet: pci-epf: Always configure BAR0 as 64-bit nvmet: Remove duplicate uuid_copy nvme: zns: Simplify nvme_zone_parse_entry() nvmet: pci-epf: Remove redundant 'flush_workqueue()' calls nvmet-fc: Remove unused functions nvme-pci: remove stale comment nvme-fc: Utilise min3() to simplify queue count calculation nvme-multipath: Add visibility for queue-depth io-policy nvme-multipath: Add visibility for numa io-policy nvme-multipath: Add visibility for round-robin io-policy nvmet: add tls_concat and tls_key debugfs entries nvmet-tcp: support secure channel concatenation nvmet: Add 'sq' argument to alloc_ctrl_args nvme-fabrics: reset admin connection for secure concatenation nvme-tcp: request secure channel concatenation nvme-keyring: add nvme_tls_psk_refresh() nvme: add nvme_auth_derive_tls_psk() nvme: add nvme_auth_generate_digest() ...
2025-03-25Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds1-12/+8
Pull fscrypt updates from Eric Biggers: "A fix for an issue where CONFIG_FS_ENCRYPTION could be enabled without some of its dependencies, and a small documentation update" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: mention init_on_free instead of page poisoning fscrypt: drop obsolete recommendation to enable optimized ChaCha20 Revert "fscrypt: relax Kconfig dependencies for crypto API algorithms"
2025-03-20crypto,fs: Separate out hkdf_extract() and hkdf_expand()Hannes Reinecke2-70/+16
Separate out the HKDF functions into a separate module to to make them available to other callers. And add a testsuite to the module with test vectors from RFC 5869 (and additional vectors for SHA384 and SHA512) to ensure the integrity of the algorithm. Signed-off-by: Hannes Reinecke <hare@kernel.org> Acked-by: Eric Biggers <ebiggers@kernel.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-05fscrypt: Change fscrypt_encrypt_pagecache_blocks() to take a folioMatthew Wilcox (Oracle)1-12/+10
ext4 and ceph already have a folio to pass; f2fs needs to be properly converted but this will do for now. This removes a reference to page->index and page->mapping as well as removing a call to compound_head(). Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Link: https://lore.kernel.org/r/20250304170224.523141-1-willy@infradead.org Acked-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>