aboutsummaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)AuthorFilesLines
2024-06-11Merge tag 'vfs-6.10-rc4.fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "Misc: - Restore debugfs behavior of ignoring unknown mount options - Fix kernel doc for netfs_wait_for_oustanding_io() - Fix struct statx comment after new addition for this cycle - Fix a check in find_next_fd() iomap: - Fix data zeroing behavior when an extent spans the block that contains i_size - Restore i_size increasing in iomap_write_end() for now to avoid stale data exposure on xfs with a realtime device Cachefiles: - Remove unneeded fdtable.h include - Improve trace output for cachefiles_obj_{get,put}_ondemand_fd() - Remove requests from the request list to prevent accessing already freed requests - Fix UAF when issuing restore command while the daemon is still alive by adding an additional reference count to requests - Fix UAF by grabbing a reference during xarray lookup with xa_lock() held - Simplify error handling in cachefiles_ondemand_daemon_read() - Add consistency checks read and open requests to avoid crashes - Add a spinlock to protect ondemand_id variable which is used to determine whether an anonymous cachefiles fd has already been closed - Make on-demand reads killable allowing to handle broken cachefiles daemon better - Flush all requests after the kernel has been marked dead via CACHEFILES_DEAD to avoid hung-tasks - Ensure that closed requests are marked as such to avoid reusing them with a reopen request - Defer fd_install() until after copy_to_user() succeeded and thereby get rid of having to use close_fd() - Ensure that anonymous cachefiles on-demand fds are reused while they are valid to avoid pinning already freed cookies" * tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: Fix iomap_adjust_read_range for plen calculation iomap: keep on increasing i_size in iomap_write_end() cachefiles: remove unneeded include of <linux/fdtable.h> fs/file: fix the check in find_next_fd() cachefiles: make on-demand read killable cachefiles: flush all requests after setting CACHEFILES_DEAD cachefiles: Set object to close if ondemand_id < 0 in copen cachefiles: defer exposing anon_fd until after copy_to_user() succeeds cachefiles: never get a new anonymous fd if ondemand_id is valid cachefiles: add spin_lock for cachefiles_ondemand_info cachefiles: add consistency check for copen/cread cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() cachefiles: remove requests from xarray during flushing requests cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd statx: Update offset commentary for struct statx netfs: fix kernel doc for nets_wait_for_outstanding_io() debugfs: continue to ignore unknown mount options
2024-06-08Merge tag 'for-linus-2024060801' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential read out of bounds in hid-asus (Andrew Ballance) - fix endian-conversion on little endian systems in intel-ish-hid (Arnd Bergmann) - A couple of new input event codes (Aseda Aboagye) - errors handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo (Christophe JAILLET), hid-logitech-dj (José Expósito) - current leakage fix while the device is in suspend on a i2c-hid laptop (Johan Hovold) - other assorted smaller fixes and device ID / quirk entry additions * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for ELAN touchscreens 2F2C and 4116 HID: i2c-hid: elan: fix reset suspend current leakage dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema input: Add support for "Do Not Disturb" input: Add event code for accessibility key hid: asus: asus_report_fixup: fix potential read out of bounds HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro HID: intel-ish-hid: fix endian-conversion HID: nintendo: Fix an error handling path in nintendo_hid_probe() HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() HID: core: remove unnecessary WARN_ON() in implement() HID: nvidia-shield: Add missing check for input_ff_create_memless HID: intel-ish-hid: Fix build error for COMPILE_TEST
2024-06-07input: Add support for "Do Not Disturb"Aseda Aboagye1-0/+1
HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07input: Add event code for accessibility keyAseda Aboagye1-0/+1
HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-01Merge tag 'tty-6.10-rc2' of ↵Linus Torvalds1-49/+47
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty fix from Greg KH: "Here is a single revert for a much-reported regression in 6.10-rc1 when it comes to a few older architectures. Turns out that the VT ioctls don't work the same across all cpu types because of some old compatibility requrements for stuff like alpha and powerpc. So revert the change that attempted to have them use the _IO() macros and go back to the known-working values instead. This has NOT been in linux-next but has had many reports that it fixes the issue with 6.10-rc1" * tag 'tty-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "VT: Use macros to define ioctls"
2024-06-01Revert "VT: Use macros to define ioctls"Greg Kroah-Hartman1-49/+47
This reverts commit 8c467f3300591a206fa8dcc6988d768910799872. Turns out this breaks many architectures as the vt ioctls do not all match up everywhere due to historical reasons, so the original commit is invalid for many values. Reported-by: Nick Bowler <nbowler@draconx.ca> Reported-by: Arnd Bergmann <arnd@kernel.org> Reported-by: Jiri Slaby <jirislaby@kernel.org> Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Gladkov <legion@kernel.org> Link: https://lore.kernel.org/r/ad4e561c-1d49-4f25-882c-7a36c6b1b5c0@draconx.ca Link: https://lore.kernel.org/r/0da9785e-ba44-4718-9d08-4e96c1ba7ab2@kernel.org Link: https://lore.kernel.org/all/34d848f4-670b-4493-bf21-130ef862521b@xenosoft.de/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-30Merge tag 'net-6.10-rc2' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - gro: initialize network_offset in network layer - tcp: reduce accepted window in NEW_SYN_RECV state Current release - new code bugs: - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized - eth: ice: check for unregistering correct number of devlink params Previous releases - regressions: - bpf: Allow delete from sockmap/sockhash only if update is allowed - sched: taprio: extend minimum interval restriction to entire cycle too - netfilter: ipset: add list flush to cancel_gc - ipv4: fix address dump when IPv4 is disabled on an interface - sock_map: avoid race between sock_map_close and sk_psock_put - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules Previous releases - always broken: - core: fix __dst_negative_advice() race - bpf: - fix multi-uprobe PID filtering logic - fix pkt_type override upon netkit pass verdict - netfilter: tproxy: bail out if IP has been disabled on the device - af_unix: annotate data-race around unix_sk(sk)->addr - eth: mlx5e: fix UDP GSO for encapsulated packets - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers - eth: i40e: fully suspend and resume IO operations in EEH case - eth: octeontx2-pf: free send queue buffers incase of leaf to inner - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound" * tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) netdev: add qstat for csum complete ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound net: ena: Fix redundant device NUMA node override ice: check for unregistering correct number of devlink params ice: fix 200G PHY types to link speed mapping i40e: Fully suspend and resume IO operations in EEH case i40e: factoring out i40e_suspend/i40e_resume e1000e: move force SMBUS near the end of enable_ulp function net: dsa: microchip: fix RGMII error in KSZ DSA driver ipv4: correctly iterate over the target netns in inet_dump_ifaddr() net: fix __dst_negative_advice() race nfc/nci: Add the inconsistency check between the input data length and count MAINTAINERS: dwmac: starfive: update Maintainer net/sched: taprio: extend minimum interval restriction to entire cycle too net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support net: ti: icssg-prueth: Fix start counter for ft1 filter sock_map: avoid race between sock_map_close and sk_psock_put ...
2024-05-30netdev: add qstat for csum completeJakub Kicinski1-0/+1
Recent commit 0cfe71f45f42 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f45f42 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-29statx: Update offset commentary for struct statxJohn Garry1-1/+1
In commit 2a82bb02941f ("statx: stx_subvol"), a new member was added to struct statx, but the offset comment was not correct. Update it. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240529081725.3769290-1-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-24Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds2-4/+23
Pull drm fixes from Dave Airlie: "Some fixes for the end of the merge window, mostly amdgpu and panthor, with one nouveau uAPI change that fixes a bad decision we made a few months back. nouveau: - fix bo metadata uAPI for vm bind panthor: - Fixes for panthor's heap logical block. - Reset on unrecoverable fault - Fix VM references. - Reset fix. xlnx: - xlnx compile and doc fixes. amdgpu: - Handle vbios table integrated info v2.3 amdkfd: - Handle duplicate BOs in reserve_bo_and_cond_vms - Handle memory limitations on small APUs dp/mst: - MST null deref fix. bridge: - Don't let next bridge create connector in adv7511 to make probe work" * tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel: drm/amdgpu/atomfirmware: add intergrated info v2.3 table drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2 drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms drm/bridge: adv7511: Attach next bridge without creating connector drm/buddy: Fix the warn on's during force merge drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations drm/panthor: Call panthor_sched_post_reset() even if the reset failed drm/panthor: Reset the FW VM to NULL on unplug drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level drm/panthor: Force an immediate reset on unrecoverable faults drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints drm/panthor: Fix an off-by-one in the heap context retrieval logic drm/panthor: Relax the constraints on the tiler chunk size drm/panthor: Make sure the tiler initial/max chunks are consistent drm/panthor: Fix tiler OOM handling to allow incremental rendering drm: xlnx: zynqmp_dpsub: Fix compilation error drm: xlnx: zynqmp_dpsub: Fix few function comments
2024-05-24Merge tag 'mm-stable-2024-05-24-11-49' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more mm updates from Andrew Morton: "Jeff Xu's implementation of the mseal() syscall" * tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: selftest mm/mseal read-only elf memory segment mseal: add documentation selftest mm/mseal memory sealing mseal: add mseal syscall mseal: wire up mseal syscall
2024-05-24connector: Fix invalid conversion in cn_proc.hMatt Jan1-2/+1
The implicit conversion from unsigned int to enum proc_cn_event is invalid, so explicitly cast it for compilation in a C++ compiler. /usr/include/linux/cn_proc.h: In function 'proc_cn_event valid_event(proc_cn_event)': /usr/include/linux/cn_proc.h:72:17: error: invalid conversion from 'unsigned int' to 'proc_cn_event' [-fpermissive] 72 | ev_type &= PROC_EVENT_ALL; | ^ | | | unsigned int Signed-off-by: Matt Jan <zoo868e@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-23mseal: wire up mseal syscallJeff Xu1-1/+4
Patch series "Introduce mseal", v10. This patchset proposes a new mseal() syscall for the Linux kernel. In a nutshell, mseal() protects the VMAs of a given virtual memory range against modifications, such as changes to their permission bits. Modern CPUs support memory permissions, such as the read/write (RW) and no-execute (NX) bits. Linux has supported NX since the release of kernel version 2.6.8 in August 2004 [1]. The memory permission feature improves the security stance on memory corruption bugs, as an attacker cannot simply write to arbitrary memory and point the code to it. The memory must be marked with the X bit, or else an exception will occur. Internally, the kernel maintains the memory permissions in a data structure called VMA (vm_area_struct). mseal() additionally protects the VMA itself against modifications of the selected seal type. Memory sealing is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. Memory sealing can automatically be applied by the runtime loader to seal .text and .rodata pages and applications can additionally seal security critical data at runtime. A similar feature already exists in the XNU kernel with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4]. Also, Chrome wants to adopt this feature for their CFI work [2] and this patchset has been designed to be compatible with the Chrome use case. Two system calls are involved in sealing the map: mmap() and mseal(). The new mseal() is an syscall on 64 bit CPU, and with following signature: int mseal(void addr, size_t len, unsigned long flags) addr/len: memory range. flags: reserved. mseal() blocks following operations for the given memory range. 1> Unmapping, moving to another location, and shrinking the size, via munmap() and mremap(), can leave an empty space, therefore can be replaced with a VMA with a new set of attributes. 2> Moving or expanding a different VMA into the current location, via mremap(). 3> Modifying a VMA via mmap(MAP_FIXED). 4> Size expansion, via mremap(), does not appear to pose any specific risks to sealed VMAs. It is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA. 5> mprotect() and pkey_mprotect(). 6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous memory, when users don't have write permission to the memory. Those behaviors can alter region contents by discarding pages, effectively a memset(0) for anonymous memory. The idea that inspired this patch comes from Stephen Röttger’s work in V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this API. Indeed, the Chrome browser has very specific requirements for sealing, which are distinct from those of most applications. For example, in the case of libc, sealing is only applied to read-only (RO) or read-execute (RX) memory segments (such as .text and .RELRO) to prevent them from becoming writable, the lifetime of those mappings are tied to the lifetime of the process. Chrome wants to seal two large address space reservations that are managed by different allocators. The memory is mapped RW- and RWX respectively but write access to it is restricted using pkeys (or in the future ARM permission overlay extensions). The lifetime of those mappings are not tied to the lifetime of the process, therefore, while the memory is sealed, the allocators still need to free or discard the unused memory. For example, with madvise(DONTNEED). However, always allowing madvise(DONTNEED) on this range poses a security risk. For example if a jump instruction crosses a page boundary and the second page gets discarded, it will overwrite the target bytes with zeros and change the control flow. Checking write-permission before the discard operation allows us to control when the operation is valid. In this case, the madvise will only succeed if the executing thread has PKEY write permissions and PKRU changes are protected in software by control-flow integrity. Although the initial version of this patch series is targeting the Chrome browser as its first user, it became evident during upstream discussions that we would also want to ensure that the patch set eventually is a complete solution for memory sealing and compatible with other use cases. The specific scenario currently in mind is glibc's use case of loading and sealing ELF executables. To this end, Stephen is working on a change to glibc to add sealing support to the dynamic linker, which will seal all non-writable segments at startup. Once this work is completed, all applications will be able to automatically benefit from these new protections. In closing, I would like to formally acknowledge the valuable contributions received during the RFC process, which were instrumental in shaping this patch: Jann Horn: raising awareness and providing valuable insights on the destructive madvise operations. Liam R. Howlett: perf optimization. Linus Torvalds: assisting in defining system call signature and scope. Theo de Raadt: sharing the experiences and insight gained from implementing mimmutable() in OpenBSD. MM perf benchmarks ================== This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to check the VMAs’ sealing flag, so that no partial update can be made, when any segment within the given memory range is sealed. To measure the performance impact of this loop, two tests are developed. [8] The first is measuring the time taken for a particular system call, by using clock_gettime(CLOCK_MONOTONIC). The second is using PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have similar results. The tests have roughly below sequence: for (i = 0; i < 1000, i++) create 1000 mappings (1 page per VMA) start the sampling for (j = 0; j < 1000, j++) mprotect one mapping stop and save the sample delete 1000 mappings calculates all samples. Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz, 4G memory, Chromebook. Based on the latest upstream code: The first test (measuring time) syscall__ vmas t t_mseal delta_ns per_vma % munmap__ 1 909 944 35 35 104% munmap__ 2 1398 1502 104 52 107% munmap__ 4 2444 2594 149 37 106% munmap__ 8 4029 4323 293 37 107% munmap__ 16 6647 6935 288 18 104% munmap__ 32 11811 12398 587 18 105% mprotect 1 439 465 26 26 106% mprotect 2 1659 1745 86 43 105% mprotect 4 3747 3889 142 36 104% mprotect 8 6755 6969 215 27 103% mprotect 16 13748 14144 396 25 103% mprotect 32 27827 28969 1142 36 104% madvise_ 1 240 262 22 22 109% madvise_ 2 366 442 76 38 121% madvise_ 4 623 751 128 32 121% madvise_ 8 1110 1324 215 27 119% madvise_ 16 2127 2451 324 20 115% madvise_ 32 4109 4642 534 17 113% The second test (measuring cpu cycle) syscall__ vmas cpu cmseal delta_cpu per_vma % munmap__ 1 1790 1890 100 100 106% munmap__ 2 2819 3033 214 107 108% munmap__ 4 4959 5271 312 78 106% munmap__ 8 8262 8745 483 60 106% munmap__ 16 13099 14116 1017 64 108% munmap__ 32 23221 24785 1565 49 107% mprotect 1 906 967 62 62 107% mprotect 2 3019 3203 184 92 106% mprotect 4 6149 6569 420 105 107% mprotect 8 9978 10524 545 68 105% mprotect 16 20448 21427 979 61 105% mprotect 32 40972 42935 1963 61 105% madvise_ 1 434 497 63 63 115% madvise_ 2 752 899 147 74 120% madvise_ 4 1313 1513 200 50 115% madvise_ 8 2271 2627 356 44 116% madvise_ 16 4312 4883 571 36 113% madvise_ 32 8376 9319 943 29 111% Based on the result, for 6.8 kernel, sealing check adds 20-40 nano seconds, or around 50-100 CPU cycles, per VMA. In addition, I applied the sealing to 5.10 kernel: The first test (measuring time) syscall__ vmas t tmseal delta_ns per_vma % munmap__ 1 357 390 33 33 109% munmap__ 2 442 463 21 11 105% munmap__ 4 614 634 20 5 103% munmap__ 8 1017 1137 120 15 112% munmap__ 16 1889 2153 263 16 114% munmap__ 32 4109 4088 -21 -1 99% mprotect 1 235 227 -7 -7 97% mprotect 2 495 464 -30 -15 94% mprotect 4 741 764 24 6 103% mprotect 8 1434 1437 2 0 100% mprotect 16 2958 2991 33 2 101% mprotect 32 6431 6608 177 6 103% madvise_ 1 191 208 16 16 109% madvise_ 2 300 324 24 12 108% madvise_ 4 450 473 23 6 105% madvise_ 8 753 806 53 7 107% madvise_ 16 1467 1592 125 8 108% madvise_ 32 2795 3405 610 19 122% The second test (measuring cpu cycle) syscall__ nbr_vma cpu cmseal delta_cpu per_vma % munmap__ 1 684 715 31 31 105% munmap__ 2 861 898 38 19 104% munmap__ 4 1183 1235 51 13 104% munmap__ 8 1999 2045 46 6 102% munmap__ 16 3839 3816 -23 -1 99% munmap__ 32 7672 7887 216 7 103% mprotect 1 397 443 46 46 112% mprotect 2 738 788 50 25 107% mprotect 4 1221 1256 35 9 103% mprotect 8 2356 2429 72 9 103% mprotect 16 4961 4935 -26 -2 99% mprotect 32 9882 10172 291 9 103% madvise_ 1 351 380 29 29 108% madvise_ 2 565 615 49 25 109% madvise_ 4 872 933 61 15 107% madvise_ 8 1508 1640 132 16 109% madvise_ 16 3078 3323 245 15 108% madvise_ 32 5893 6704 811 25 114% For 5.10 kernel, sealing check adds 0-15 ns in time, or 10-30 CPU cycles, there is even decrease in some cases. It might be interesting to compare 5.10 and 6.8 kernel The first test (measuring time) syscall__ vmas t_5_10 t_6_8 delta_ns per_vma % munmap__ 1 357 909 552 552 254% munmap__ 2 442 1398 956 478 316% munmap__ 4 614 2444 1830 458 398% munmap__ 8 1017 4029 3012 377 396% munmap__ 16 1889 6647 4758 297 352% munmap__ 32 4109 11811 7702 241 287% mprotect 1 235 439 204 204 187% mprotect 2 495 1659 1164 582 335% mprotect 4 741 3747 3006 752 506% mprotect 8 1434 6755 5320 665 471% mprotect 16 2958 13748 10790 674 465% mprotect 32 6431 27827 21397 669 433% madvise_ 1 191 240 49 49 125% madvise_ 2 300 366 67 33 122% madvise_ 4 450 623 173 43 138% madvise_ 8 753 1110 357 45 147% madvise_ 16 1467 2127 660 41 145% madvise_ 32 2795 4109 1314 41 147% The second test (measuring cpu cycle) syscall__ vmas cpu_5_10 c_6_8 delta_cpu per_vma % munmap__ 1 684 1790 1106 1106 262% munmap__ 2 861 2819 1958 979 327% munmap__ 4 1183 4959 3776 944 419% munmap__ 8 1999 8262 6263 783 413% munmap__ 16 3839 13099 9260 579 341% munmap__ 32 7672 23221 15549 486 303% mprotect 1 397 906 509 509 228% mprotect 2 738 3019 2281 1140 409% mprotect 4 1221 6149 4929 1232 504% mprotect 8 2356 9978 7622 953 423% mprotect 16 4961 20448 15487 968 412% mprotect 32 9882 40972 31091 972 415% madvise_ 1 351 434 82 82 123% madvise_ 2 565 752 186 93 133% madvise_ 4 872 1313 442 110 151% madvise_ 8 1508 2271 763 95 151% madvise_ 16 3078 4312 1234 77 140% madvise_ 32 5893 8376 2483 78 142% From 5.10 to 6.8 munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma. mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma. madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma. In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the increase from 5.10 to 6.8 is significantly larger, approximately ten times greater for munmap and mprotect. When I discuss the mm performance with Brian Makin, an engineer who worked on performance, it was brought to my attention that such performance benchmarks, which measuring millions of mm syscall in a tight loop, may not accurately reflect real-world scenarios, such as that of a database service. Also this is tested using a single HW and ChromeOS, the data from another HW or distribution might be different. It might be best to take this data with a grain of salt. This patch (of 5): Wire up mseal syscall for all architectures. Link: https://lkml.kernel.org/r/20240415163527.626541-1-jeffxu@chromium.org Link: https://lkml.kernel.org/r/20240415163527.626541-2-jeffxu@chromium.org Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Jann Horn <jannh@google.com> [Bug #2] Cc: Jeff Xu <jeffxu@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Stephen Röttger <sroettger@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Amer Al Shanawany <amer.shanawany@gmail.com> Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-23Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-0/+2
Pull virtio updates from Michael Tsirkin: "Several new features here: - virtio-net is finally supported in vduse - virtio (balloon and mem) interaction with suspend is improved - vhost-scsi now handles signals better/faster And fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits) virtio-pci: Check if is_avq is NULL virtio: delete vq in vp_find_vqs_msix() when request_irq() fails MAINTAINERS: add Eugenio Pérez as reviewer vhost-vdpa: Remove usage of the deprecated ida_simple_xx() API vp_vdpa: don't allocate unused msix vectors sound: virtio: drop owner assignment fuse: virtio: drop owner assignment scsi: virtio: drop owner assignment rpmsg: virtio: drop owner assignment nvdimm: virtio_pmem: drop owner assignment wifi: mac80211_hwsim: drop owner assignment vsock/virtio: drop owner assignment net: 9p: virtio: drop owner assignment net: virtio: drop owner assignment net: caif: virtio: drop owner assignment misc: nsm: drop owner assignment iommu: virtio: drop owner assignment drm/virtio: drop owner assignment gpio: virtio: drop owner assignment firmware: arm_scmi: virtio: drop owner assignment ...
2024-05-22Merge tag 'char-misc-6.10-rc1' of ↵Linus Torvalds2-2/+28
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver subsystem updates from Greg KH: "Here is the big set of char/misc and other driver subsystem updates for 6.10-rc1. Nothing major here, just lots of new drivers and updates for apis and new hardware types. Included in here are: - big IIO driver updates with more devices and drivers added - fpga driver updates - hyper-v driver updates - uio_pruss driver removal, no one uses it, other drivers control the same hardware now - binder minor updates - mhi driver updates - excon driver updates - counter driver updates - accessability driver updates - coresight driver updates - other hwtracing driver updates - nvmem driver updates - slimbus driver updates - spmi driver updates - other smaller misc and char driver updates All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (319 commits) misc: ntsync: mark driver as "broken" to prevent from building spmi: pmic-arb: Add multi bus support spmi: pmic-arb: Register controller for bus instead of arbiter spmi: pmic-arb: Make core resources acquiring a version operation spmi: pmic-arb: Make the APID init a version operation spmi: pmic-arb: Fix some compile warnings about members not being described dt-bindings: spmi: Deprecate qcom,bus-id dt-bindings: spmi: Add X1E80100 SPMI PMIC ARB schema spmi: pmic-arb: Replace three IS_ERR() calls by null pointer checks in spmi_pmic_arb_probe() spmi: hisi-spmi-controller: Do not override device identifier dt-bindings: spmi: hisilicon,hisi-spmi-controller: clean up example dt-bindings: spmi: hisilicon,hisi-spmi-controller: fix binding references spmi: make spmi_bus_type const extcon: adc-jack: Document missing struct members extcon: realtek: Remove unused of_gpio.h extcon: usbc-cros-ec: Convert to platform remove callback returning void extcon: usb-gpio: Convert to platform remove callback returning void extcon: max77843: Convert to platform remove callback returning void extcon: max3355: Convert to platform remove callback returning void extcon: intel-mrfld: Convert to platform remove callback returning void ...
2024-05-22Merge tag 'tty-6.10-rc1' of ↵Linus Torvalds1-47/+49
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial updates from Greg KH: "Here is the big set of tty/serial driver changes for 6.10-rc1. Included in here are: - Usual good set of api cleanups and evolution by Jiri Slaby to make the serial interfaces move out of the 1990's by using kfifos instead of hand-rolling their own logic. - 8250_exar driver updates - max3100 driver updates - sc16is7xx driver updates - exar driver updates - sh-sci driver updates - tty ldisc api addition to help refuse bindings - other smaller serial driver updates All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (113 commits) serial: Clear UPF_DEAD before calling tty_port_register_device_attr_serdev() serial: imx: Raise TX trigger level to 8 serial: 8250_pnp: Simplify "line" related code serial: sh-sci: simplify locking when re-issuing RXDMA fails serial: sh-sci: let timeout timer only run when DMA is scheduled serial: sh-sci: describe locking requirements for invalidating RXDMA serial: sh-sci: protect invalidating RXDMA on shutdown tty: add the option to have a tty reject a new ldisc serial: core: Call device_set_awake_path() for console port dt-bindings: serial: brcm,bcm2835-aux-uart: convert to dtschema tty: serial: uartps: Add support for uartps controller reset arm64: zynqmp: Add resets property for UART nodes dt-bindings: serial: cdns,uart: Add optional reset property serial: 8250_pnp: Switch to DEFINE_SIMPLE_DEV_PM_OPS() serial: 8250_exar: Keep the includes sorted serial: 8250_exar: Make type of bit the same in exar_ee_*_bit() serial: 8250_exar: Use BIT() in exar_ee_read() serial: 8250_exar: Switch to use dev_err_probe() serial: 8250_exar: Return directly from switch-cases serial: 8250_exar: Decrease indentation level ...
2024-05-22Merge tag 'riscv-for-linus-6.10-mw1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Add byte/half-word compare-and-exchange, emulated via LR/SC loops - Support for Rust - Support for Zihintpause in hwprobe - Add PR_RISCV_SET_ICACHE_FLUSH_CTX prctl() - Support lockless lockrefs * tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) riscv: defconfig: Enable CONFIG_CLK_SOPHGO_CV1800 riscv: select ARCH_HAS_FAST_MULTIPLIER riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required riscv: Annotate pgtable_l{4,5}_enabled with __ro_after_init riscv: Remove redundant CONFIG_64BIT from pgtable_l{4,5}_enabled riscv: mm: Always use an ASID to flush mm contexts riscv: mm: Preserve global TLB entries when switching contexts riscv: mm: Make asid_bits a local variable riscv: mm: Use a fixed layout for the MM context ID riscv: mm: Introduce cntx2asid/cntx2version helper macros riscv: Avoid TLB flush loops when affected by SiFive CIP-1200 riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma riscv: mm: Combine the SMP and UP TLB flush code riscv: Only send remote fences when some other CPU is online riscv: mm: Broadcast kernel TLB flushes only when needed riscv: Use IPIs for remote cache/TLB flushes by default riscv: Factor out page table TLB synchronization riscv: Flush the instruction cache during SMP bringup riscv: hwprobe: export Zihintpause ISA extension riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code ...
2024-05-22virtio-mem: support suspend+resumeDavid Hildenbrand1-0/+2
With virtio-mem, primarily hibernation is problematic: as the machine shuts down, the virtio-mem device loses its state. Powering the machine back up is like losing a bunch of DIMMs. While there would be ways to add limited support, suspend+resume is more commonly used for VMs and "easier" to support cleanly. s2idle can be supported without any device dependencies. Similarly, one would expect suspend-to-ram (i.e., S3) to work out of the box. However, QEMU currently unplugs all device memory when resuming the VM, using a cold reset on the "wakeup" path. In order to support S3, we need a feature flag for the device to tell us if memory remains plugged when waking up. In the future, QEMU will implement this feature. So let's always support s2idle and support S3 with plugged memory only if the device indicates support. Block hibernation early using the PM notifier. Trying to hibernate now fails early: # echo disk > /sys/power/state [ 26.455369] PM: hibernation: hibernation entry [ 26.458271] virtio_mem virtio0: hibernation is not supported. [ 26.462498] PM: hibernation: hibernation exit -bash: echo: write error: Operation not permitted s2idle works even without the new feature bit: # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state [ 52.083725] PM: suspend entry (s2idle) [ 52.095950] Filesystems sync: 0.010 seconds [ 52.101493] Freezing user space processes [ 52.104213] Freezing user space processes completed (elapsed 0.001 seconds) [ 52.106520] OOM killer disabled. [ 52.107655] Freezing remaining freezable tasks [ 52.110880] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 52.113296] printk: Suspending console(s) (use no_console_suspend to debug) S3 does not work without the feature bit when memory is plugged: # echo deep > /sys/power/mem_sleep # echo mem > /sys/power/state [ 32.788281] PM: suspend entry (deep) [ 32.816630] Filesystems sync: 0.027 seconds [ 32.820029] Freezing user space processes [ 32.823870] Freezing user space processes completed (elapsed 0.001 seconds) [ 32.827756] OOM killer disabled. [ 32.829608] Freezing remaining freezable tasks [ 32.833842] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 32.837953] printk: Suspending console(s) (use no_console_suspend to debug) [ 32.916172] virtio_mem virtio0: suspend+resume with plugged memory is not supported [ 32.916181] virtio-pci 0000:00:02.0: PM: pci_pm_suspend(): virtio_pci_freeze+0x0/0x50 returns -1 [ 32.916197] virtio-pci 0000:00:02.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -1 [ 32.916210] virtio-pci 0000:00:02.0: PM: failed to suspend async: error -1 But S3 works with the new feature bit when memory is plugged (patched QEMU): # echo deep > /sys/power/mem_sleep # echo mem > /sys/power/state [ 33.983694] PM: suspend entry (deep) [ 34.009828] Filesystems sync: 0.024 seconds [ 34.013589] Freezing user space processes [ 34.016722] Freezing user space processes completed (elapsed 0.001 seconds) [ 34.019092] OOM killer disabled. [ 34.020291] Freezing remaining freezable tasks [ 34.023549] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 34.026090] printk: Suspending console(s) (use no_console_suspend to debug) Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240318120645.105664-1-david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-05-21Merge tag 'pci-v6.10-changes' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Skip E820 checks for MCFG ECAM regions for new (2016+) machines, since there's no requirement to describe them in E820 and some platforms require ECAM to work (Bjorn Helgaas) - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien Le Moal) - Remove last user and pci_enable_device_io() (Heiner Kallweit) - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen) - Skip waiting for devices that have been disconnected while suspended (Ilpo Järvinen) - Clear Secondary Status errors after enumeration since Master Aborts and Unsupported Request errors are an expected part of enumeration (Vidya Sagar) MSI: - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas) Error handling: - Mask Genesys GL975x SD host controller Replay Timer Timeout correctable errors caused by a hardware defect; the errors cause interrupts that prevent system suspend (Kai-Heng Feng) - Fix EDR-related _DSM support, which previously evaluated revision 5 but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan) ASPM: - Simplify link state definitions and mask calculation (Ilpo Järvinen) Power management: - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS apparently doesn't know how to put them back in D0 (Mario Limonciello) CXL: - Support resetting CXL devices; special handling required because CXL Ports mask Secondary Bus Reset by default (Dave Jiang) DOE: - Support DOE Discovery Version 2 (Alexey Kardashevskiy) Endpoint framework: - Set endpoint BAR to be 64-bit if the driver says that's all the device supports, in addition to doing so if the size is >2GB (Niklas Cassel) - Simplify endpoint BAR allocation and setting interfaces (Niklas Cassel) Cadence PCIe controller driver: - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof Kozlowski) Cadence PCIe endpoint driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) Freescale Layerscape PCIe controller driver: - Convert DT binding to YAML (Frank Li) MediaTek MT7621 PCIe controller driver: - Add DT binding missing 'reg' property for child Root Ports (Krzysztof Kozlowski) - Fix theoretical string truncation in PHY name (Sergio Paracuellos) NVIDIA Tegra194 PCIe controller driver: - Return success for endpoint probe instead of falling through to the failure path (Vidya Sagar) Renesas R-Car PCIe controller driver: - Add DT binding missing IOMMU properties (Geert Uytterhoeven) - Add DT binding R-Car V4H compatible for host and endpoint mode (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski) - Set the Subsystem Vendor ID, which was previously zero because it was masked incorrectly (Rick Wertenbroek) Synopsys DesignWare PCIe controller driver: - Restructure DBI register access to accommodate devices where this requires Refclk to be active (Manivannan Sadhasivam) - Remove the deinit() callback, which was only need by the pcie-rcar-gen4, and do it directly in that driver (Manivannan Sadhasivam) - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean up things like eDMA (Manivannan Sadhasivam) - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel to dw_pcie_ep_init() (Manivannan Sadhasivam) - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to reflect the actual functionality (Manivannan Sadhasivam) - Call dw_pcie_ep_init_registers() directly from all the glue drivers, not just those that require active Refclk from the host (Manivannan Sadhasivam) - Remove the "core_init_notifier" flag, which was an obscure way for glue drivers to indicate that they depend on Refclk from the host (Manivannan Sadhasivam) TI J721E PCIe driver: - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli) - Add DT binding J722S SoC support (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Add DT binding missing num-viewport, phys and phy-name properties (Jan Kiszka) Miscellaneous: - Constify and annotate with __ro_after_init (Heiner Kallweit) - Convert DT bindings to YAML (Krzysztof Kozlowski) - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming Zhou)" * tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Do not wait for disconnected devices when resuming x86/pci: Skip early E820 check for ECAM region PCI: Remove unused pci_enable_device_io() ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io() PCI: Update pci_find_capability() stub return types PCI: Remove PCI_IRQ_LEGACY scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios Revert "genirq/msi: Provide constants for PCI/IMS support" Revert "x86/apic/msi: Enable PCI/IMS" Revert "iommu/vt-d: Enable PCI/IMS" Revert "iommu/amd: Enable PCI/IMS" Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support" ...
2024-05-19Merge tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefsLinus Torvalds1-0/+1
Pull bcachefs updates from Kent Overstreet: - More safety fixes, primarily found by syzbot - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is primarily for testing fsck/recovery in dry run mode, so it shouldn't change anything besides disabling writes and holding dirty metadata in memory. The idea here was to reduce the amount of activity if we can't write anything out, so that bringing up a filesystem in "super ro" mode would be more lilkely to work for data recovery - but norecovery is the correct option for this. - btree_trans->locked; we now track whether a btree_trans has any btree nodes locked, and this is used for improved assertions related to trans_unlock() and trans_relock(). We'll also be using it for improving how we work with lockdep in the future: we don't want lockdep to be tracking individual btree node locks because we take too many for lockdep to track, and it's not necessary since we have a cycle detector. - Trigger improvements that are prep work for online fsck - BTREE_TRIGGER_check_repair; this regularizes how we do some repair work for extents that goes with running triggers in fsck, and fixes some subtle issues with transaction restarts there. - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot equivalence classes are for when snapshot deletion leaves behind redundant snapshot nodes, but snapshot deletion now cleans this up right away, so the abstraction doesn't need to leak. - Improvements to how we resume writing to the journal in recovery. The code for picking the new place to write when reading the journal is greatly simplified and we also store the position in the superblock for when we don't read the journal; this means that we preserve more of the journal for list_journal debugging. - Improvements to sysfs btree_cache and btree_node_cache, for debugging memory reclaim. - We now detect when we've blocked for 10 seconds on the allocator in the write path and dump some useful info. - Safety fixes for devices references: this is a big series that changes almost all device lookups to properly check if the device exists and take a reference to it. Previously we assumed that if a bkey exists that references a device then the device must exist, and this was enforced in .invalid methods, but this was incorrect because it meant device removal relied on accounting being correct to not leave keys pointing to invalid devices, and that's not something we can assume. Getting the "pointer to invalid device" checks out of our .invalid() methods fixes some long standing device removal bugs; the only outstanding bug with device removal now is a race between the discard path and deleting alloc info, which should be easily fixed. - The allocator now prefers not to expand the new member_info.btree_allocated bitmap, meaning if repair ever requires scanning for btree nodes (because of a corrupt interior nodes) we won't have to scan the whole device(s). - New coding style document, which among other things talks about the correct usage of assertions * tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs: (155 commits) bcachefs: add no_invalid_checks flag bcachefs: add counters for failed shrinker reclaim bcachefs: Fix sb_field_downgrade validation bcachefs: Plumb bch_validate_flags to sb_field_ops.validate() bcachefs: s/bkey_invalid_flags/bch_validate_flags bcachefs: fsync() should not return -EROFS bcachefs: Invalid devices are now checked for by fsck, not .invalid methods bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs() bcachefs: kill bch2_dev_bkey_exists() in bch2_read_endio() bcachefs: bch2_dev_get_ioref() checks for device not present bcachefs: bch2_dev_get_ioref2(); io_read.c bcachefs: bch2_dev_get_ioref2(); debug.c bcachefs: bch2_dev_get_ioref2(); journal_io.c bcachefs: bch2_dev_get_ioref2(); io_write.c bcachefs: bch2_dev_get_ioref2(); btree_io.c bcachefs: bch2_dev_get_ioref2(); backpointers.c bcachefs: bch2_dev_get_ioref2(); alloc_background.c bcachefs: for_each_bset() declares loop iter bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h bcachefs: Improve sysfs internal/btree_cache ...
2024-05-18Merge tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds2-1/+47
Pull nfsd updates from Chuck Lever: "This is a light release containing mostly optimizations, code clean- ups, and minor bug fixes. This development cycle has focused on non- upstream kernel work: 1. Continuing to build upstream CI for NFSD, based on kdevops 2. Backporting NFSD filecache-related fixes to selected LTS kernels One notable new feature in v6.10 NFSD is the addition of a new netlink protocol dedicated to configuring NFSD. A new user space tool, nfsdctl, is to be added to nfs-utils. Lots more to come here. As always I am very grateful to NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle" * tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (29 commits) NFSD: Force all NFSv4.2 COPY requests to be synchronous SUNRPC: Fix gss_free_in_token_pages() NFS/knfsd: Remove the invalid NFS error 'NFSERR_OPNOTSUPP' knfsd: LOOKUP can return an illegal error value nfsd: set security label during create operations NFSD: Add COPY status code to OFFLOAD_STATUS response NFSD: Record status of async copy operation in struct nfsd4_copy SUNRPC: Remove comment for sp_lock NFSD: add listener-{set,get} netlink command SUNRPC: add a new svc_find_listener helper SUNRPC: introduce svc_xprt_create_from_sa utility routine NFSD: add write_version to netlink command NFSD: convert write_threads to netlink command NFSD: allow callers to pass in scope string to nfsd_svc NFSD: move nfsd_mutex handling into nfsd_svc callers lockd: host: Remove unnecessary statements'host = NULL;' nfsd: don't create nfsv4recoverydir in nfsdfs when not used. nfsd: optimise recalculate_deny_mode() for a common case nfsd: add tracepoint in mark_client_expired_locked nfsd: new tracepoint for check_slot_seqid ...
2024-05-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds4-1/+33
Pull rdma updates from Jason Gunthorpe: "Aside from the usual things this has an arch update for __iowrite64_copy() used by the RDMA drivers. This API was intended to generate large 64 byte MemWr TLPs on PCI. These days most processors had done this by just repeating writel() in a loop. S390 and some new ARM64 designs require a special helper to get this to generate. - Small improvements and fixes for erdma, efa, hfi1, bnxt_re - Fix a UAF crash after module unload on leaking restrack entry - Continue adding full RDMA support in mana with support for EQs, GID's and CQs - Improvements to the mkey cache in mlx5 - DSCP traffic class support in hns and several bug fixes - Cap the maximum number of MADs in the receive queue to avoid OOM - Another batch of rxe bug fixes from large scale testing - __iowrite64_copy() optimizations for write combining MMIO memory - Remove NULL checks before dev_put/hold()