linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
12 days	Merge tag 'kvmarm-fixes-7.2-2' of ↵	Paolo Bonzini	1	-11/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 7.2, take #2 - Move locking for kvm_io_bus_get_dev() into the caller, ensuring race-free checks that the returned object is of the correct type - Fix initialisation of the page-table walk level when relaxing permissions - Correctly update the XN attribute when relaxing permissions - Fix the sign extension of loads from emulated MMIO regions - Assorted collection of fixes for pKVM's FFA proxy, together with a couple of FFA driver adjustments
2026-07-06	KVM: Move kvm_io_bus_get_dev() locking responsibilities to callers	Marc Zyngier	1	-11/+5
	kvm_io_bus_get_dev() returns a device that is only matched by the address, and nothing else. This can cause a lifetime issue if the matched device is not the expected type, as by the time the caller can introspect the object, it might be gone (the srcu lock having been dropped). Given that there is only a single user of this helper, the simplest option is to move the locking responsibility to the caller, which can keep the srcu lock held for as long as it wants. Note that this aligns with other kvm_io_bus*() helpers, which already require the srcu lock to be held by the callers. Reported-by: Will Deacon <will@kernel.org> Fixes: 8a39d00670f07 ("KVM: kvm_io_bus: Add kvm_io_bus_get_dev() call") Link: https://lore.kernel.org/all/20260626111344.802555-1-maz@kernel.org Cc: stable@vger.kernel.org Reviewed-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20260627105105.1005990-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-06-25	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-7/+5
	Pull kvm fixes from Paolo Bonzini: "s390: - Fix S390_USER_OPEREXEC so it can now be enabled regardless of other unrelated capabilities - Fix handling of the _PAGE_UNUSED pte bit that could lead to guest memory corruption in some scenarios - A bunch of misc gmap fixes (locking, behaviour under memory pressure) - Fix CMMA dirty tracking x86: - Tidy up some WARN_ON() and BUG_ON(), replacing them with WARN_ON_ONCE() or KVM_BUG_ON(). All of these have obviously never triggered, or somebody would have been annoyed earlier, but still... - Fix missing interrupt due to stale CR8 intercept - Add a statistic that can come in handy to debug leaks as well as the vulnerability to a class of recently-discovered issues - Do not ask arch/x86/kernel to export default_cpu_present_to_apicid() just for KVM" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) x86/apic: KVM: Use cpu_physical_id() to get APIC ID of running vCPU for AVIC KVM: x86/mmu: Expose number of shadow MMU shadow pages as a stat KVM: x86: Unconditionally recompute CR8 intercept on PPR update KVM: VMX: Grab vmcs12 on CR8 interception update iff vCPU is in guest mode KVM: x86: WARN (once) if RTC pending EOI tracking goes off the rails KVM: x86: WARN and fail kvm_set_irq() if a PIC or I/O APIC vector is invalid KVM: x86: Bug the VM, not the kernel, if the ISR count {under,over}flows KVM: x86/mmu: Bug the VM, not the host kernel, if KVM write-protects upper SPTEs KVM: x86: Replace BUG_ON() with WARN_ON_ONCE() on "bad" nested GPA translation KVM: Replace guest-triggerable BUG_ON() in ioeventfd datamatch with get_unaligned() KVM: s390: Return failure in case of failure in kvm_s390_set_cmma_bits() KVM: s390: selftests: Fix cmma selftest KVM: s390: Fix cmma dirty tracking KVM: s390: Fix locking in kvm_s390_set_mem_control() KVM: s390: Fix handle_{sske,pfmf} under memory pressure KVM: s390: Fix code typo in gmap_protect_asce_top_level() KVM: s390: Do not set special large pages dirty KVM: s390: Fix dat_peek_cmma() overflow s390/mm: Fix handling of _PAGE_UNUSED pte bit KVM: s390: Fix typo in UCONTROL documentation ...
2026-06-24	KVM: Replace guest-triggerable BUG_ON() in ioeventfd datamatch with ↵	Sean Christopherson	1	-7/+5
	get_unaligned() Drop a BUG_ON() that has been reachable since it was first added, way back in 2009, and instead use get_unaligned() to perform potentially-unaligned accesses. For a given store, KVM x86's emulator tracks the entire value in the destination operand, x86_emulate_ctxt.dst. If the destination is memory, and the target splits multiple pages and/or is emulated MMIO, then KVM handles each fragment independently. E.g. on a page split starting at page offset 0xffc, KVM writes 4 bytes to the first page, then the remaining bytes to the second page, using ctxt->dst as the source for both (with appropriate offsets). If the destination splits a page and hits emulated MMIO on the second page, then KVM will complete the write to the first page, then emulate the MMIO access to the second page. If there is a datamatch-enabled ioeventfd at offset 0 of the second page, then KVM will process the remainder of the store as a potential ioeventfd signal. Putting it all together, if the guest emits a store that splits a page starting at page offset N, and the second page has a datamatch-enabled ioeventfd at offset 0, then KVM will check for datamatch using &dst.valptr[N] as the source. Due to dst (and thus dst.valptr) being 32-byte aligned, if N is not aligned to @len, the BUG_ON() fires. E.g. with a 16-byte store at page offset 0xffc, to an ioeventfd of len 8, all initial checks in ioeventfd_in_range() will succeed, and the BUG_ON() fires due to @val being 4-byte aligned, but not 8-byte aligned. ------------[ cut here ]------------ kernel BUG at arch/x86/kvm/../../../virt/kvm/eventfd.c:783! Oops: invalid opcode: 0000 [#1] SMP CPU: 0 UID: 1000 PID: 615 Comm: repro Not tainted 7.1.0-rc2-ff238429d1ea #365 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:ioeventfd_write+0x6c/0x70 [kvm] Call Trace: <TASK> __kvm_io_bus_write+0x85/0xb0 [kvm] kvm_io_bus_write+0x53/0x80 [kvm] vcpu_mmio_write+0x66/0xf0 [kvm] emulator_read_write_onepage+0x12a/0x540 [kvm] emulator_read_write+0x109/0x2b0 [kvm] x86_emulate_insn+0x4f8/0xfb0 [kvm] x86_emulate_instruction+0x181/0x790 [kvm] kvm_mmu_page_fault+0x313/0x630 [kvm] vmx_handle_exit+0x18a/0x590 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xc81/0x1c90 [kvm] kvm_vcpu_ioctl+0x2d5/0x970 [kvm] __x64_sys_ioctl+0x8a/0xd0 do_syscall_64+0xb7/0x890 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f19c931a9bf </TASK> Modules linked in: kvm_intel kvm irqbypass ---[ end trace 0000000000000000 ]--- In a perfect world, the fix would be to simply delete the BUG_ON(), as KVM x86 doesn't perform alignment checks on "normal" memory accesses at CPL0. Sadly, C99 ruins all the fun; while the x86 architecture plays nice, dereferencing an unaligned pointer directly is undefined behavior in C, e.g. triggers splats when running with CONFIG_UBSAN_ALIGNMENT=y. Fixes: d34e6b175e61 ("KVM: add ioeventfd support") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20260612225241.678509-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-06-19	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	4	-84/+63
	Pull kvm updates from Paolo Bonzini: "arm64: This is a bit of an odd merge window on the KVM/arm64 front. There is absolutely no new feature in the pull request. It is purely fixes, because it is simply becoming too hard to review new stuff when so many AI-fuelled fixes hit the list. - Significant cleanup of the vgic-v5 PPI support which was merged in 7.1. This makes the code more maintainable, and squashes a couple of bugs in the meantime - Set of fixes for the handling of the MMU in an NV context, particularly VNCR-triggered faults. S1POE support is fixed as well - Large set of pKVM fixes, mostly addressing recurring issues around hypervisor tracking of donated pages in obscure cases where the donation could fail and leave things in a bizarre state - Fixes for the so-called "lazy vgic init", which resulted in sleeping operations in non-preemptible sections. This turned out to be far more invasive than initially expected.. - Reduce the overhead of L1/L2 context switch by not touching the FP registers - Fix the way non-implemented page sizes are dealt with when a guest insist on using them for S2 translation - The usual set of low-impact fixes and cleanups all over the map Loongarch: - On a request for lazy FPU load, load all FPU state that the VM supports instead of enabling only the part (FPU, LSX or LASX) that caused the FPU load request - Some enhancements about interrupt injection - Some bug fixes and other small changes RISC-V: - Batch G-stage TLB flushes for GPA range based page table updates - Convert HGEI line management to fully per-HART - Fix missing CSR dirty marking when FWFT state updated via ONE_REG - Fix stale FWFT feature exposure to Guest/VM - Speed up dirty logging write faults using MMU rwlock and atomic PTE updates using cmpxchg() for permission-only changes - Use flexible array for APLIC IRQ state - Use kvm_slot_dirty_track_enabled() for logging enable check on a memslot - Avoid skipping valid pages in kvm_riscv_gstage_wp_range() - Avoid skipping valid pages in kvm_riscv_gstage_unmap_range() - Use endian-specific __lelong for NACL shared memory S390: - KVM_PRE_FAULT_MEMORY support - Support for 2G hugepages - Support for the ASTFLEIE 2 facility - Support for fast inject using kvm_arch_set_irq_inatomic - Fix potential leak of uninitialized bytes - A few more misc gmap fixes x86: - Generic support for the more granular permissions allowed by EPT, namely "read" (which was previously usurping the U bit) and separate execution bits for kernel and userspace - Do not assume that all page tables start with U=1/W=1/NX=0 at the root, as AMD GMET needs to have U=0 at the root - Introduce common assembly macros for use within Intel and AMD vendor-specific vmentry code. This touches the SPEC_CTRL handling, which is now entirely done in assembly for Intel (by reusing the AMD code that already existed), and register save/restore which uses some macro magic to compute the offsets in the struct. Both of these are preparatory changes for upcoming APX support - Clean up KVM's register tracking and storage, primarily to prepare for APX support, which expands the maximum number of GPRs from 16 to 32 - Keep a single copy of the PDPTRs rather than two, since architecturally there is just one - Handle EXIT_FASTPATH_EXIT_USERSPACE in vendor code to ensure vendor code gets a chance to handle things like reaping the PML buffer - Update KVM's view of PV async enabling if and only if the MSR write fully succeeds - Fix a variety of issues where the emulator doesn't honor guest-debug state, and clean up related code along the way - Synthesize EPT Violation and #NPF "error code" bits when injecting faults into L1 that didn't originate in hardware (in which case the VMCS/VMCB doesn't hold relevant information) - Add support for virtualizing (well, emulating) AMD's flavor of CPL>0 CPUID faulting - Clean up the GPR APIs so that KVM's use of "raw" is consistent, and fix a variety of minor bugs along the way - Fix an OOB memory access due to not checking the VP ID when handling a Hyper-V PV TLB flush for L2 - Fix a bug in the mediated PMU's handling of fixed counters that allowed the guest to bypass the PMU event filter - Allow userspace to return EAGAIN when handling SNP and TDX hypercalls, so the KVM can forward a "retry" status code to the guest, and reserve all unused error codes for future usage - Overhaul the TDP MMU => S-EPT code to move as much S-EPT specific logic as possible into the TDX code, and to funnel (almost) all S-EPT updates into a single chokepoint. The motivation is largely to prepare for upcoming Dynamic PAMT support, but the cleanups are nice to have on their own - Plug a hole in shadow page table handling, where KVM fails to recursively zap nested EPT/NPT shadow page tables when the nested hypervisor tears down its own EPT/NPT page tables from the bottom up x86 (Intel): - Support for nested MBEC (Mode-Based Execute Control), see above in the generic section; also run with MBEC enabled even for non-nested mode - Use the kernel's "enum pg_level" in the TDX APIs instead of the TDX-Module's level definitions (which are 0-based) - Rework the TDX memory APIs to not require/assume that guest memory is backed by "struct page" (in prepartion for guest_memfd hugepage support) - Fix a largely benign bug where KVM TDX would incorrectly state it could emulate several x2APIC MSRs - Use the "safe" WRMSR API when proxying LBR MSR writes as the to-be-written value is guest controlled and completely unvalidated x86 (AMD): - Support for nested GMET (Guest Mode Execution Trap), see above in the generic section; also run with GMET enabled even for non-nested mode - Fixes and minor cleanups to GHCB handling, on top of the earlier work already merged into 7.1-rc - Ensure KVM's copy of CR0 and CR3 are up-to-date prior to invoking fastpath handlers - Add support for virtualizing gPAT (KVM previously just used L1's PAT when running L2) - Fix goofs where KVM mishandles side effects (e.g. single-step and PMC updates) when emulating VMRUN - Fix a variety of bugs in AVIC's handling of x2APIC MSR interception, most notably where KVM didn't disable interception of IRR, ISR, and TMR regs - Add support for virtualizing Host-Only/Guest-Only bits in the mediated PMU - Don't advertise support for unusable VM types, and account for VM types that are disabled by firmware, e.g. to mitigate security vulnerabilities - Rewrite the SEV {en,de}crypt debug ioctls as they were riddle with bugs and unnecessarily complicated, and add comprehensive tests - Clean up and deduplicate the SEV page pinning code - Fix minor goofs related to writing back CPUID information after firmware rejects a CPUID page for an SNP vCPU Generic: - Rename invalidate_begin() to invalidate_start() throughout KVM to follow the kernel's nomenclature, e.g. for mmu_notifiers - Use guard() to cleanup up various KVM+VFIO flows - Minor cleanups guest_memfd: - Return -EEXIST instead of -EINVAL if userspace attempts to bind a gmem range to multiple memslots, and fix the test that was supposed to ensure KVM returns -EEXIST - Treat memslot binding offsets and sizes as unsigned values to fix a bug where KVM interprets a large "offset + size" as a negative value and allows a nonsensical offset - Use the inode number instead of the page offset for the NUMA interleaving index to fix a bug where the effective index would jump by two for consecutive pages (the caller also adds in the page offset) Selftests: - Randomize the dirty log test's delay when reaping the bitmap on the first pass, as always waiting only 1ms hid a KVM RISC-V bug as the test reaped the bitmap before KVM could build up enough state to hit the bug - A pile of one-off fixes and cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (326 commits) KVM: x86/mmu: Ensure hugepage is in by slot before checking max mapping level KVM: x86: Fix shadow paging use-after-free due to unexpected role KVM: s390: Introducing kvm_arch_set_irq_inatomic fast inject KVM: s390: Enable adapter_indicators_set to use mapped pages KVM: s390: Add map/unmap ioctl and clean mappings post-guest riscv: kvm: Use endian-specific __lelong for NACL shared memory KVM: selftests: access_tracking_perf_test: bump number of NUMA nodes to 32 KVM: s390: vsie: Implement ASTFLEIE facility 2 KVM: s390: vsie: Refactor handle_stfle s390/sclp: Detect ASTFLEIE 2 facility KVM: s390: Minor refactor of base/ext facility lists KVM: x86/mmu: move pdptrs out of the MMU KVM: x86: check that kvm_handle_invpcid is only invoked with shadow paging KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions KVM: nVMX: remove unnecessary code in prepare_vmcs02_rare KVM: x86: remove nested_mmu from mmu_is_nested() KVM: arm64: vgic-its: Make ABI commit helpers return void KVM: s390: Initialize KVM_S390_GET_CMMA_BITS memory LoongArch: KVM: Add missing slots_lock for device register/unregister LoongArch: KVM: Validate irqchip index in irqfd routing ...
2026-06-15	Merge tag 'vfs-7.2-rc1.misc' of ↵	Linus Torvalds	1	-2/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Reduce pipe->mutex contention by pre-allocating pages outside the lock in anon_pipe_write(). anon_pipe_write() called alloc_page() once per page while holding pipe->mutex. The allocation can sleep doing direct reclaim and runs memcg charging, which extends the critical section and stalls any concurrent reader on the same mutex. Now up to 8 pages are pre-allocated before the mutex is taken, leftovers are recycled into the per-pipe tmp_page[] cache before unlock, and any remainder is released after unlock, keeping the allocator out of the critical section on both sides. On a writers x readers sweep with 64KB writes against a 1 MB pipe throughput improves 6-28% and average write latency drops 5-22%; under memory pressure - when the cost of holding the mutex across reclaim is highest - throughput improves 21-48% and latency drops 17-33%. The microbenchmark is added to selftests. - uaccess/sockptr: fix the ignored_trailing logic in copy_struct_to_user() to behave as documented and the usize check in copy_struct_from_sockptr() for user pointers, and add copy_struct_{from,to}_bounce_buffer() and copy_struct_to_sockptr() helpers for upcoming users (IPPROTO_SMBDIRECT, IPPROTO_QUIC). - bpf: add a sleepable bpf_real_inode() kfunc that resolves the real inode backing a dentry via d_real_inode(). On overlayfs the inode attached to the dentry doesn't carry the underlying device information; this is used by the filesystem restriction BPF program that was merged into systemd. - docs: add guidelines for submitting new filesystems, motivated by the maintenance burden abandoned and untestable filesystems impose on VFS developers, blocking infrastructure work like folio conversions and iomap migration. Fixes: - libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo() and drop the now-redundant assignments in callers. This began as a one-line dma-buf fix for a path_noexec() warning; a pseudo filesystem has no reason not to set SB_I_NOEXEC. All init_pseudo() callers were audited: the only visible effect is on dma-buf where SB_I_NOEXEC silences the warning. - Handle set_blocksize() failures in legacy filesystems (bfs, hpfs, qnx4, jfs, befs, affs, isofs, minix, ntfs3, omfs). Mounting a device with a sector size > PAGE_SIZE crashed roughly half of them; the rest had the same missing error handling pattern. Plus a follow-up releasing the superblock buffer_head when setting the minix v3 block size fails. - mount: honour SB_NOUSER in the new mount API. - fs/fcntl: fix a SOFTIRQ-unsafe lock order in fasync signaling by switching the process-group paths of send_sigio() and send_sigurg() from read_lock(&tasklist_lock) to RCU, matching the single-PID path. - vfs: add an FS_USERNS_DELEGATABLE flag and set it for NFS, fixing delegated NFS mounts (fsopen() in a container with the mount performed by a privileged daemon) that broke when non-init s_user_ns was tied to FS_USERNS_MOUNT. - selftests/namespaces: fix a hang in nsid_test where an unreaped grandchild kept the TAP pipe write-end open, a waitpid(-1) race in listns_efault_test, and a false FAIL on kernels without listns() where the tests should SKIP. - filelock: fix the break_lease() stub signature for CONFIG_FILE_LOCKING=n. - init/initramfs_test: wait for the async initramfs unpacking before running; the test and do_populate_rootfs() share the parser state. - fs/coredump: reduce redundant log noise in validate_coredump_safety(). - iomap: pass the correct length to fserror_report_io() in __iomap_write_begin(). - backing-file: fix the backing_file_open() kerneldoc. Cleanups: - initramfs: refactor the cpio hex header parsing to use hex2bin() instead of the hand-rolled simple_strntoul() which is reverted, and extend the initramfs KUnit tests to cover header fields with 0x prefixes. - Replace __get_free_pages() and friends with kmalloc()/kzalloc() across quota, proc, ocfs2/dlm, nilfs2, nfs, nfsd, libfs, jfs, jbd2, isofs, fuse, select, namespace, configfs, binfmt_misc, bfs, and the do_mounts init code - part of the larger work of replacing page allocator calls with kmalloc(). - Use clear_and_wake_up_bit() in unlock_buffer() and journal_end_buffer_io_sync() instead of open-coding the sequence. - Drop unused VFS exports: unexport drop_super_exclusive(), remove start_removing_user_path_at(), and fold __start_removing_path() into start_removing_path(). - fs/read_write: narrow the __kernel_write() export with EXPORT_SYMBOL_FOR_MODULES(). - vfs: uapi: retire octal and hex constants in favor of (1 << n) for the O_ flags. Finding a free bit for a new flag across the architectures was needlessly hard with the mixed bases. - dcache: add extra sanity checks of dead dentries in dentry_free() via a new DENTRY_WARN_ONCE() that also prints d_flags. - iov_iter: use kmemdup_array() in dup_iter() to harden the allocation against multiplication overflow. - fs/pipe: write to ->poll_usage only once. - vfs: remove an always-taken if-branch in find_next_fd(). - dcache: use kmalloc_flex() for struct external_name in __d_alloc(). - namei: use QSTR() instead of QSTR_INIT() in path_pts(). - sync_file_range: delete dead S_ISLNK code. - Comment fixes: retire a stale comment in fget_task_next() and fix assorted spelling mistakes" * tag 'vfs-7.2-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (73 commits) backing-file: fix backing_file_open() kerneldoc parameter iomap: pass the correct len to fserror_report_io in __iomap_write_begin vfs: add FS_USERNS_DELEGATABLE flag and set it for NFS filelock: fix break_lease() stub signature for CONFIG_FILE_LOCKING=n vfs: uapi: retire octal and hex numbers in favor of (1 << n) for O_ flags bpf: add bpf_real_inode() kfunc fs/read_write: Do not export __kernel_write() to the entire world libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo() mount: honour SB_NOUSER in the new mount API fs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling selftests/pipe: add pipe_bench microbenchmark fs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write fs: retire stale comment in fget_task_next() fs: fix spelling mistakes in comment bfs: replace get_zeroed_page() with kzalloc() binfmt_misc: replace __get_free_page() with kmalloc() configfs: replace __get_free_pages() with kzalloc() fs/namespace: use __getname() to allocate mntpath buffer fs/select: replace __get_free_page() with kmalloc() ...
2026-06-12	Merge tag 'kvm-x86-vfio-7.2' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	1	-63/+35
	KVM VFIO changes for 7.2 Use guard() to cleanup up various KVM+VFIO flows.
2026-06-12	Merge tag 'kvm-x86-sev-7.2' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	1	-2/+4
	KVM SEV changes for 7.2 - Don't advertise support for unusuable VM types, and account for VM types that are disabled by firmware, e.g. to mitigate security vulnerabilities. - Rewrite the SEV {en,de}crypt debug ioctls as they were riddle with bugs and unnecessarily complicated, and add comprehensive tests. - Clean up and deduplicate the SEV page pinning code. - Fix minor goofs related to writing back CPUID information after firmware rejects a CPUID page for an SNP vCPU.
2026-06-12	Merge tag 'kvm-x86-gmem-7.2' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	2	-9/+14
	KVM guest_memfd changes for 7.2 - Return -EEXIST instead of -EINVAL if userspace attempts to bind a gmem range to multiple memslots, and fix the test that was supposed to ensure KVM returns -EEXIST. - Treat memslot binding offsets and sizes as unsigned values to fix a bug where KVM interprets a large "offset + size" as a negative value and allows a nonsensical offset. - Use the inode number instead of the page offset for the NUMA interleaving index to fix a bug where the effective index would jump by two for consecutive pages (the caller also adds in the page offset).
2026-06-12	Merge tag 'kvm-x86-generic-7.2' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	2	-10/+10
	KVM generic changes for 7.2 - Rename invalidate_begin() to invalidate_start() throughout KVM to follow the kernel's nomenclature, e.g. for mmu_notifiers. - Minor cleanups.
2026-06-08	KVM: guest_memfd: fix NUMA interleave index double-counting	Michael S. Tsirkin	1	-3/+4
	kvm_gmem_get_policy() sets the interleave index (the output param that's typically named "ilx") to the full page offset (vm_pgoff + vma offset). But get_vma_policy() adds the page offset on top of the interleave index, and so the offset is counted twice. This causes NUMA interleaving to skip nodes: for order-0 pages the effective index jumps by 2 for each consecutive page. The vm_op.get_policy() implementation should return only a per-file bias in the interleave index (like shmem_get_policy does with inode->i_ino), letting get_vma_policy() add the page-offset component. Fix by setting the output interleave index to the inode number (a la shmem) instead of the full page offset, as the index is intended to be a constant, semi-random value for a given file, e.g. so that interleaving doesn't start at the same node for every file, and so that allocations are round-robined across nodes based on the page offset (the selected node would bounce/skip around if the index isn't constant). Found by Sashiko (sashiko.dev) AI code review. Fixes: ed1ffa810bd6 ("KVM: guest_memfd: Enforce NUMA mempolicy using shared policy") Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Fixes: 7f3779a3ac3e ("mm/filemap: Add NUMA mempolicy support to filemap_alloc_folio()") Link: https://patch.msgid.link/0eff0a90667b900bee837d06b5db5025e1f304b5.1780501924.git.mst@redhat.com [sean: use reverse fir-tree, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-06-04	libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers	John Hubbard	1	-2/+0
	init_pseudo() now sets SB_I_NOEXEC and SB_I_NODEV by default, so the per-caller assignments are redundant. Drop them. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Link: https://patch.msgid.link/20260604025315.245910-3-jhubbard@nvidia.com Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-06-03	KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying	Sean Christopherson	1	-1/+2
	When marking a page dirty, complain about not having a running/loaded vCPU if and only if the VM is still alive, i.e. its refcount is non-zero. This will allow fixing a memory leak for x86 SEV-ES guests without hitting what is effectively a false positive on the WARN. For some SEV-ES VM-Exits, KVM keeps a writable mapping of a guest page across an exit to userspace, and typically unmaps the page on the next KVM_RUN. But if userspace never calls KVM_RUN after such an exit, then KVM needs to unmap the page when the vCPU is destroyed, which in turn triggers the WARN about not having a running vCPU. Alternatively, SEV-ES could temporarily load the vCPU to suppress the WARN, as is done in nested_vmx_free_vcpu() (but for completely unrelated reasons; suppressing WARN from nested_put_vmcs12_pages() is pure happenstance). But loading a vCPU during destruction is gross (ideally nVMX code would be cleaned up), risks complicating the SEV-ES code (KVM would need to ensure the temporarily load()+put() only runs when the vCPU isn't already loaded), and is ultimately pointless. The motivation for the WARN is to guard against KVM dirtying guest memory without pushing the corresponding GFN to the active vCPU's dirty ring, e.g. to ensure userspace doesn't miss a dirty page. But for the VM's refcount to reach zero, there can't be _any_ userspace mappings to the dirty ring, as mapping the dirty ring requires doing mmap() on the vCPU FD. I.e. if userspace had a valid mapping for the dirty ring, then the vCPU file and thus the owning VM would still be alive. And so since userspace can't possibly reach the dirty ring, whether or not KVM technically "misses" a push to the dirty ring is irrelevant. Reported-by: Michael Roth <michael.roth@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20260501202250.2115252-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20260529183549.1104619-15-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-06-03	KVM: guest_memfd: Treat memslot binding offset+size as unsigned values	Sean Christopherson	2	-6/+9
	When binding a memslot to a guest_memfd file, treat the offset and size as unsigned values to fix a bug where the sum of the two can result in a false negative when checking for overflow against the size of the file. Passing unsigned values also avoids relying on somewhat obscure checks in other flows for safety, and tracks the offset and size as they are intended to be tracked, as unsigned values. On 64-bit kernels, the number of pages a memslot contains and thus the size (and offset) of its guest_memfd binding are unsigned 64-bit values. Taking the offset+size as an loff_t instead of a uoff_t inadvertently converts the unsigned value to a signed value if the offset and/or size is massive. Locally storing the offset and size as signed values is benign in and of itself (though even that is extremely difficult to discern), but operating on their sum is not. For the offset, KVM explicitly checks against a negative value, which might seem like a bug as KVM could incorrectly reject a legitimate binding, but that's not actually the case as KVM_CREATE_GUEST_MEMFD takes a signed value for its size, i.e. a would-be-negative offset is also greater than the maximum possible size of any guest_memfd file. Regarding the size, while KVM lacks an explicit check for a negative value, i.e. seemingly has a flawed overflow check, KVM restricts the number of pages in a single memslot to the largest positive signed 32-bit value: if (id < KVM_USER_MEM_SLOTS && (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES) return -EINVAL; and so that maximum "size" will ever be is 0x7fffffff000. The sum of the two is, however, problematic. While the size is restricted by KVM's memslot logic, the offset is not, i.e. the offset is completely unchecked until the "offset + size > i_size_read(inode)" check. If the offset is the (nearly) largest possible _positive_ value, then adding size to the offset can result in a signed, negative 64-bit value. When compared against the size of the file (guaranteed to be positive), the negative sum is always smaller, and KVM incorrectly allows the absurd offset. Opportunistically add missing includes in kvm_mm.h (instead of relying on its parents). Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable@vger.kernel.org Cc: Ackerley Tng <ackerleytng@google.com> Reviewed-by: Michael Roth <michael.roth@amd.com> Reviewed-by: Ackerley Tng <ackerleytng@google.com> Link: https://patch.msgid.link/20260602170921.1304394-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-26	KVM: guest_memfd: Return -EEXIST for overlapping bindings	Zongyao Chen	1	-0/+1
	KVM_SET_USER_MEMORY_REGION2 rejects guest_memfd ranges that overlap an existing binding, but kvm_gmem_bind() currently reports the failure through its generic -EINVAL path. That makes binding conflicts indistinguishable from malformed guest_memfd parameters. Return -EEXIST when the target guest_memfd range is already bound, matching the errno used for overlapping GPA memslots and making the two types of range conflicts report the same class of error to userspace. Note, returning -EINVAL was definitely not intentional, as guest_memfd support was accompanied by a selftest to verify that attempting to create overlapping bindings fails with -EEXIST. Except the selftest was also flawed in that it unintentionally overlapped memslot GPAs, and so failed on KVM's common memslot checks before reaching guest_memfd. Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Signed-off-by: Zongyao Chen <ZongYao.Chen@linux.alibaba.com> Reviewed-by: Ackerley Tng <ackerleytng@google.com> Tested-by: Ackerley Tng <ackerleytng@google.com> [sean: call out that the original intent was to return -EEXIST] Link: https://patch.msgid.link/20260522172151.3530267-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-26	KVM: SEV: Pin source page for write when adding CPUID data for SNP guest	Sean Christopherson	1	-2/+4
	When populating a guest_memfd instance with the initial CPUID data for an SNP guest, acquire a writable pin on the source page as KVM will write back the "correct" CPUID information if the userspace provided data is rejected by trusted firmware. Because KVM writes to the source page using a kernel mapping, pinning for read could result in KVM clobbering read-only memory. Note, well-behaved VMMs are unlikely to be affected, as CPUID information is almost always dynamically generated by userspace, i.e. it's unlikely for the CPUID information to be backed by a read-only mapping. Fixes: 2a62345b30529 ("KVM: guest_memfd: GUP source pages prior to populating guest memory") Cc: stable@vger.kernel.org Signed-off-by: Ackerley Tng <ackerleytng@google.com> Link: https://patch.msgid.link/20260522-fix-sev-gmem-post-populate-v2-1-3f196bfad5a1@google.com [sean: rewrite shortlog and changelog, tag for stable@] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-13	KVM: VFIO: update coherency only if file was deleted	Carlos López	1	-10/+4
	When servicing a KVM_DEV_VFIO_FILE_DEL request, if a file is removed from kv->file_list, kv->noncoherent needs to be updated, in case we can revert to using coherent DMA. However, if we found no candidate to remove, there is no need to re-scan the list, so do it only if a matching file was found. To simplify the control flow, use a mutex guard so that we can return early from within the search loop if the maching file is found. Signed-off-by: Carlos López <clopez@suse.de> Reviewed-by: Alex Williamson <alex@shazbot.org> Link: https://patch.msgid.link/20260313122040.1413091-7-clopez@suse.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-13	KVM: VFIO: deduplicate file release logic	Carlos López	1	-21/+18
	There are two callsites which destroy files in kv->file_list: the function servicing KVM_DEV_VFIO_FILE_DEL, and the relase of the whole KVM VFIO device. The process involves several steps, so move all those into a single function, removing duplicate code. Signed-off-by: Carlos López <clopez@suse.de> Reviewed-by: Alex Williamson <alex@shazbot.org> Link: https://patch.msgid.link/20260313122040.1413091-6-clopez@suse.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-13	KVM: VFIO: use mutex guard in kvm_vfio_file_set_spapr_tce()	Carlos López	1	-14/+6
	Use a mutex guard to hold a lock for the entirety of the function, which removes the need for a goto (whose label even has a misleading name since 8152f8201088 ("fdget(), more trivial conversions")) Signed-off-by: Carlos López <clopez@suse.de> Reviewed-by: Alex Williamson <alex@shazbot.org> Link: https://patch.msgid.link/20260313122040.1413091-5-clopez@suse.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-13	KVM: VFIO: clean up control flow in kvm_vfio_file_add()	Carlos López	1	-20/+9
	The struct file that this function fgets() is always passed to fput() before returning, so use automatic cleanup via __free() to avoid several jumps to the end of the function. Similarly, use a mutex guard to completely remove the need to use gotos. Signed-off-by: Carlos López <clopez@suse.de> Reviewed-by: Alex Williamson <alex@shazbot.org> Link: https://patch.msgid.link/20260313122040.1413091-4-clopez@suse.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-13	KVM: Rename invalidate_begin to invalidate_start for consistency	Takahiro Itazuri	2	-10/+10
	Rename kvm_mmu_invalidate_begin() to kvm_mmu_invalidate_start() to align with mmu_notifier_ops.invalidate_range_start(), which is the callback that ultimately drives KVM's MMU invalidation. While the naming within KVM itself is a close split between "_begin" and "_start": $ git grep -E "invalidate(_range)?_begin" */kvm \| wc -l 12 $ git grep -E "invalidate(_range)?_start" */kvm \| wc -l 21 All two of the begin() uses are in KVM: $ git grep -E "invalidate(_range)?_begin" * \| wc -l 14 And those two holdouts are bugs in invalidate_range_start()'s comment, i.e. will also be fixed sooner or later[]. On the other hand, use of _start() is pervasive throughout the kernel: $ git grep -E "invalidate(_range)?_start" \| wc -l 117 Even if that weren't the case, conforming to the mmu_notifier_ops naming is the right call since invalidate_range_start() is the external API that KVM hooks into. No functional change intended. Link: https://lore.kernel.org/all/20260513163546.1176742-1-seanjc@google.com [*] Signed-off-by: Takahiro Itazuri <itazur@amazon.com> Link: https://patch.msgid.link/20260420154720.29012-4-itazur@amazon.com [sean: massage changelog to provide more (accurate) numbers] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-05-12	KVM: Reject wrapped offset in kvm_reset_dirty_gfn()	Aaron Sacks	1	-1/+2
	kvm_reset_dirty_gfn() guards the gfn range with if (!memslot \|\| (offset + __fls(mask)) >= memslot->npages) return; but offset is u64 and the addition is unchecked. The check can be silently bypassed by a u64 wrap. The dirty ring backing those entries is MAP_SHARED at KVM_DIRTY_LOG_PAGE_OFFSET of the vcpu fd, so the VMM can rewrite the slot and offset fields of any entry between when the kernel pushes them and when KVM_RESET_DIRTY_RINGS consumes them. On reset, kvm_dirty_ring_reset() re-reads the values via READ_ONCE() and feeds them straight back into this check; only the flags handshake is treated as the handover, the slot/offset payload is taken on trust. Crafting two entries entry[i].offset = 0xffffffffffffffc1 entry[i+1].offset = 0 makes the coalescing loop in kvm_dirty_ring_reset() compute delta = (s64)(0 - 0xffffffffffffffc1) = 63 which falls in [0, BITS_PER_LONG), so it folds entry[i+1] into the existing mask by setting bit 63. The trailing kvm_reset_dirty_gfn() call then sees offset = 0xffffffffffffffc1 and __fls(mask) = 63; the sum is 0 in u64 and the bounds check passes. That offset propagates into kvm_arch_mmu_enable_log_dirty_pt_masked() unchanged. On the legacy MMU path -- kvm_memslots_have_rmaps() == true, i.e. shadow paging, any VM that has allocated shadow roots, or a write-tracked slot -- it reaches gfn_to_rmap(), which indexes slot->arch.rmap[0][] with a near-U64_MAX gfn. That is an out-of-bounds load of a kvm_rmap_head, followed by a conditional clear of PT_WRITABLE_MASK in whatever the loaded pointer points at. The path is reachable from any process holding /dev/kvm. Range-check offset on its own first, so the addition cannot wrap. memslot->npages is bounded well below U64_MAX, so once offset < npages holds, offset + __fls(mask) (with __fls(mask) < BITS_PER_LONG) stays in range. Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking") Cc: stable@vger.kernel.org Signed-off-by: Aaron Sacks <contact@xchglabs.com> Link: https://patch.msgid.link/20260512060742.1628959-1-contact@xchglabs.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-04-17	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	2	-20/+28
	Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not all of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ...
2026-04-15	Merge tag 'mm-stable-2026-04-13-21-45' of ↵	Linus Torvalds	1	-21/+11
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's m