| Age | Commit message (Collapse) | Author | Files | Lines |
|
Replace all variations of "paddr" variables in KVM selftests with "gpa",
with the exception of the ELF structures, as those fields are not specific
to guest virtual addresses, to complete the conversion from vm_paddr_t to
gpa_t.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
In x86's nested TDP APIs, use the appropriate gpa_t typedef and rename
variables from nested_paddr to l2_gpa to match KVM x86's nomenclature.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use gpa_t instead of u64 for obvious declarations of GPA variables.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Replace all variations of "vaddr" variables in KVM selftests with "gva",
with the exception of the ELF structures, as those fields are not specific
to guest virtual addresses, to complete the conversion from vm_vaddr_t to
gva_t.
Opportunistically use gva_t instead of u64 for relevant variables, and
fixup indentation as appropriate.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the helper
for populating the initial GVA bitmap to drop the defunct terminology and
use "vm" for the scope.
Opportunistically fixup the declaration of the API, which has been broken
since day 1. The flaw went unnoticed because the sole caller is defined
after the weak version, i.e. can see the prototype without a previous
declaration.
No functional change intended.
Fixes: e8b9a055fa04 ("KVM: arm64: selftests: Align VA space allocator with TTBR0")
Link: https://patch.msgid.link/20260420212004.3938325-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the API
for finding an unused range of virtual memory to drop the defunct
terminology and use "vm" for the scope.
Opportunistically clean up the function comment to drop superfluous
and redundant information.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that KVM selftests use gva_t instead of vm_vaddr_t, drop "vaddr_" from
the core memory allocation APIs as the information is extraneous and does
more harm than good. E.g. the APIs don't _just_ allocate virtual memory,
they allocate backing physical memory and install mappings in the guest
page tables. And as proven by kmalloc() and malloc(), developers generally
expect that allocations come with a working virtual address.
Opportunistically clean up the function comment for vm_alloc(), and drop
the misleading and superfluous comments for its wrappers.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use u8 instead of uint8_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint8_t/u8/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use u16 instead of uint16_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint16_t/u16/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use s32 instead of int32_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/int32_t/s32/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use u32 instead of uint32_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint32_t/u32/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use s64 instead of int64_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/int64_t/s64/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use u64 instead of uint64_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint64_t/u64/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
Include <linux/types.h> in include/kvm_util_types.h, iinclude/test_util.h,
and include/x86/pmu.h to pick up the tools-defined u64. Arguably, all
headers (especially kvm_util_types.h) should have already been including
stdint.h to get uint64_t from the libc headers, but the missing dependency
only rears its head once KVM uses u64 instead of uint64_t.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
[sean: rename pread_uint64() => pread_u64, expand on types.h include]
Link: https://patch.msgid.link/20260420212004.3938325-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Replace all occurrences of vm_paddr_t with gpa_t to align with KVM code
and with the conversion helpers (e.g. addr_hva2gpa()).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/vm_paddr_/gpa_/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
[sean: drop bogus changelog blurb about renaming functions]
Link: https://patch.msgid.link/20260420212004.3938325-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Replace all occurrences of vm_vaddr_t with gva_t to align with KVM code
and with the conversion helpers (e.g. addr_gva2hva()).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/vm_vaddr_/gva_/g'
Then by manually adjusting whitespace to make checkpatch.pl happy, and
dropping renames of functions that allocate memory within a given VM.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
[sean: drop renames of allocator APIs]
Link: https://patch.msgid.link/20260420212004.3938325-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
- ESA nesting support
- 4k memslots
- LPSW/E fix
|
|
KVM nested SVM changes for 7.1 (with one common x86 fix)
- To minimize the probability of corrupting guest state, defer KVM's
non-architectural delivery of exception payloads (e.g. CR2 and DR6) until
consumption of the payload is imminent, and force delivery of the payload
in all paths where userspace saves relevant state.
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT to fix a
bug where L2's CR2 can get corrupted after a save/restore, e.g. if the VM
is migrated while L2 is faulting in memory.
- Fix a class of nSVM bugs where some fields written by the CPU are not
synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not
up-to-date when saved by KVM_GET_NESTED_STATE.
- Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and
KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after
save+restore.
- Add a variety of missing nSVM consistency checks.
- Fix several bugs where KVM failed to correctly update VMCB fields on nested
#VMEXIT.
- Fix several bugs where KVM failed to correctly synthesize #UD or #GP for
SVM-related instructions.
- Add support for save+restore of virtualized LBRs (on SVM).
- Refactor various helpers and macros to improve clarity and (hopefully) make
the code easier to maintain.
- Aggressively sanitize fields when copying from vmcb12 to guard against
unintentionally allowing L1 to utilize yet-to-be-defined features.
- Fix several bugs where KVM botched rAX legality checks when emulating SVM
instructions. Note, KVM is still flawed in that KVM doesn't address size
prefix overrides for 64-bit guests; this should probably be documented as a
KVM erratum.
- Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of
somewhat arbitrarily synthesizing #GP (i.e. don't bastardize AMD's already-
sketchy behavior of generating #GP if for "unsupported" addresses).
- Cache all used vmcb12 fields to further harden against TOCTOU bugs.
|
|
KVM selftests changes for 7.1
- Add support for Hygon CPUs in KVM selftests.
- Fix a bug in the MSR test where it would get false failures on AMD/Hygon
CPUs with exactly one of RDPID or RDTSCP.
- Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a
bug where the kernel would attempt to collapse guest_memfd folios against
KVM's will.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 7.1
* New features:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis.
This comes with a full infrastructure for 'remote' trace buffers
that can be exposed by non-kernel entities such as firmware.
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM.
- Finally add support for pKVM protected guests, with anonymous
memory being used as a backing store. About time!
* Improvements and bug fixes:
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to
the various helpers and rendering a substantial amount of
state immutable.
- Expand the Stage-2 page table dumper to support NV shadow
page tables on a per-VM basis.
- Tidy up the pKVM PSCI proxy code to be slightly less hard
to follow.
- Fix both SPE and TRBE in non-VHE configurations so that they
do not generate spurious, out of context table walks that
ultimately lead to very bad HW lockups.
- A small set of patches fixing the Stage-2 MMU freeing in error
cases.
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls.
- The usual cleanups and other selftest churn.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v7.1
1. Use CSR_CRMD_PLV in kvm_arch_vcpu_in_kernel().
2. Let vcpu_is_preempted() a macro & some enhanments.
3. Add DMSINTC irqchip in kernel support.
4. Add KVM PMU test cases for tools/selftests.
|
|
Extend the PMU test suite to cover overflow interrupts. The test enables
the PMI (Performance Monitor Interrupt), sets counter 0 to one less than
the overflow value, and verifies that an interrupt is raised when the
counter overflows. A guest interrupt handler checks the interrupt cause
and disables further PMU interrupts upon success.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Introduce a basic PMU test that verifies hardware event counting for
four performance counters. The test enables the events for CPU cycles,
instructions retired, branch instructions, and branch misses, runs a
fixed number of loops, and checks that the counter values fall within
expected ranges. It also validates that the host supports PMU and that
the VM feature is enabled.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add helper macros and functions to read and write CPU configuration
registers (cpucfg) from the guest and from the VMM. This interface is
required in upcoming selftests for querying and setting CPU features,
such as PMU capabilities.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Remove the 1M memslot alignment requirement for s390, since it is not
needed anymore.
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
The current sbi_pmu_test attempts to read firmware counters without
configuring them first with SBI_EXT_PMU_COUNTER_CFG_MATCH.
Previously this did not fail because KVM incorrectly allowed the read
and accessed fw_event[] with an out-of-bounds index when the counter
was unconfigured. After fixing that bug, the read now correctly returns
SBI_ERR_INVALID_PARAM, causing the selftest to fail.
Update the test to configure a firmware event before reading the
counter. Also add a negative test to ensure that attempting to read an
unconfigured firmware counter fails gracefully.
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://lore.kernel.org/r/20260316014533.2312254-3-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Add RISC-V KVM selftests to verify the SBI Steal-Time Accounting (STA)
shared memory alignment requirements.
The SBI specification requires the STA shared memory GPA to be 64-byte
aligned, or set to all-ones to explicitly disable steal-time accounting.
This test verifies that KVM enforces the expected behavior when
configuring the SBI STA shared memory via KVM_SET_ONE_REG.
Specifically, the test checks that:
- misaligned GPAs are rejected with -EINVAL
- 64-byte aligned GPAs are accepted
- all-ones GPA is accepted
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260303010859.1763177-4-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
This basic selftest creates a vgic_v5 device (if supported), and tests
that one of the PPI interrupts works as expected with a basic
single-vCPU guest.
Upon starting, the guest enables interrupts. That means that it is
initialising all PPIs to have reasonable priorities, but marking them
as disabled. Then the priority mask in the ICC_PCR_EL1 is set, and
interrupts are enable in ICC_CR0_EL1. At this stage the guest is able
to receive interrupts. The architected SW_PPI (64) is enabled and
KVM_IRQ_LINE ioctl is used to inject the state into the guest.
The guest's interrupt handler has an explicit WFI in order to ensure
that the guest skips WFI when there are pending and enabled PPI
interrupts.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-41-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to
verify that the guest can read and write the registers, without hitting
e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a
register.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260310211841.2552361-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
'virt' is confusing in the VMCB because it is relative and ambiguous.
The 'virt_ext' field includes bits for LBR virtualization and
VMSAVE/VMLOAD virtualization, so it's just another miscellaneous control
field. Name it as such.
While at it, move the definitions of the bits below those for
'misc_ctl' and rename them for consistency.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260303003421.2185681-20-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The 'nested_ctl' field is misnamed. Although the first bit is for nested
paging, the other defined bits are for SEV/SEV-ES. Other bits in the
same field according to the APM (but not defined by KVM) include "Guest
Mode Execution Trap", "Enable INVLPGB/TLBSYNC", and other control bits
unrelated to 'nested'.
There is nothing common among these bits, so just name the field
misc_ctl. Also rename the flags accordingly.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260303003421.2185681-19-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a selftest exercising save/restore with usage of LBRs in both L1 and
L2, and making sure all LBRs remain intact.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260303003421.2185681-5-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Extend kvm_syscalls.h to wrap madvise() to assert success. This will be
used in the next patch.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/455483ca29a3a3042efee0cf3bbd0e2548cbeb1c.1771630983.git.ackerleytng@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Most of KVM x86 selftests for AMD are compatible with Hygon architecture
(but not all), add a flag "host_cpu_is_amd_compatible" to figure out
these cases.
Following test failures on Hygon platform can be fixed:
* Fix hypercall test: Hygon architecture also uses VMMCALL as guest
hypercall instruction.
* Following test failures due to access reserved memory address regions:
- access_tracking_perf_test
- demand_paging_test
- dirty_log_perf_test
- dirty_log_test
- kvm_page_table_test
- memslot_modification_stress_test
- pre_fault_memory_test
- x86/dirty_log_page_splitting_test
Hygon CSV also makes the "physical address space width reduction", the
reduced physical address bits are reported by bits 11:6 of
CPUID[0x8000001f].EBX as well, so the existed logic is totally
applicable for Hygon processors. Mapping memory into these regions and
accessing to them results in a #PF.
Signed-off-by: Zhiquan Li <zhiquan_li@163.com>
Link: https://patch.msgid.link/20260212103841.171459-3-zhiquan_li@163.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Currently some KVM selftests are failed on Hygon CPUs due to missing
vendor detection and edge-case handling specific to Hygon's
architecture.
Add CPU vendor detection for Hygon and add a global variable
"host_cpu_is_hygon" as the basic facility for the following fixes.
Signed-off-by: Zhiquan Li <zhiquan_li@163.com>
Link: https://patch.msgid.link/20260212103841.171459-2-zhiquan_li@163.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
KVM x86 APIC-ish changes for 6.20
- Fix a benign bug where KVM could use the wrong memslots (ignored SMM) when
creating a vCPU-specific mapping of guest memory.
- Clean up KVM's handling of marking mapped vCPU pages dirty.
- Drop a pile of *ancient* sanity checks hidden behind in KVM's unused
ASSERT() macro, most of which could be trivially triggered by the guest
and/or user, and all of which were useless.
- Fold "struct dest_map" into its sole user, "struct rtc_status", to make it
more obvious what the weird parameter is used for, and to allow burying the
RTC shenanigans behind CONFIG_KVM_IOAPIC=y.
- Bury all of ioapic.h and KVM_IRQCHIP_KERNEL behind CONFIG_KVM_IOAPIC=y.
- Add a regression test for recent APICv update fixes.
- Rework KVM's handling of VMCS updates while L2 is active to temporarily
switch to vmcs01 instead of deferring the update until the next nested
VM-Exit. The deferred updates approach directly contributed to several
bugs, was proving to be a maintenance burden due to the difficulty in
auditing the correctness of deferred updates, and was polluting
"struct nested_vmx" with a growing pile of booleans.
- Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv()
to consolidate the updates, and to co-locate SVI updates with the updates
for KVM's own cache of ISR information.
- Drop a dead function declaration.
|
|
KVM/riscv changes for 6.20
- Fixes for issues discoverd by KVM API fuzzing in
kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(),
and kvm_riscv_vcpu_aia_imsic_update()
- Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM
- Add riscv vm satp modes in KVM selftests
- Transparent huge page support for G-stage
- Adjust the number of available guest irq files based on
MMIO register sizes in DeviceTree or ACPI
|
|
KVM SVM changes for 6.20
- Drop a user-triggerable WARN on nested_svm_load_cr3() failure.
- Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS
relies on an upcoming, publicly announced change in the APM to reduce the
set of conditions where hardware (i.e. KVM) *must* flush the RAP.
- Ignore nSVM intercepts for instructions that are not supported according to
L1's virtual CPU model.
- Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath
for EPT Misconfig.
- Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM,
and allow userspace to restore nested state with GIF=0.
- Treat exit_code as an unsigned 64-bit value through all of KVM.
- Add support for fetching SNP certificates from userspace.
- Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD
or VMSAVE on behalf of L2.
- Misc fixes and cleanups.
|
|
KVM selftests changes for 6.20
- Add a regression test for TPR<=>CR8 synchronization and IRQ masking.
- Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support,
and extend x86's infrastructure to support EPT and NPT (for L2 guests).
- Extend several nested VMX tests to also cover nested SVM.
- Add a selftest for nested VMLOAD/VMSAVE.
- Rework the nested dirty log test, originally added as a regression test for
PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage
and to hopefully make the test easier to understand and maintain.
|
|
Current vm modes cannot represent riscv guest modes precisely, here add
all 9 combinations of P(56,40,41) x V(57,48,39). Also the default vm
mode is detected on runtime instead of hardcoded one, which might not be
supported on specific machine.
Signed-off-by: Wu Fei <wu.fei9@sanechips.com.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251105151442.28767-1-wu.fei9@sanechips.com.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Update the nested dirty log test to validate KVM's handling of READ faults
when dirty logging is enabled. Specifically, set the Dirty bit in the
guest PTEs used to map L2 GPAs, so that KVM will create writable SPTEs
when handling L2 read faults. When handling read faults in the shadow MMU,
KVM opportunistically creates a writable SPTE if the mapping can be
writable *and* the gPTE is dirty (or doesn't support the Dirty bit), i.e.
if KVM doesn't need to intercept writes in order to emulate Dirty-bit
updates.
To actually test the L2 READ=>WRITE sequence, e.g. without masking a false
pass by other test activity, route the READ=>WRITE and WRITE=>WRITE
sequences to separate L1 pages, and differentiate between "marked dirty
due to a WRITE access/fault" and "marked dirty due to creating a writable
SPTE for a READ access/fault". The updated sequence exposes the bug fixed
by KVM commit 1f4e5fc83a42 ("KVM: x86: fix nested guest live migration
with PML") when the guest performs a READ=>WRITE sequence with dirty guest
PTEs.
Opportunistically tweak and rename the address macros, and add comments,
to make it more obvious what the test is doing. E.g. NESTED_TEST_MEM1
vs. GUEST_TEST_MEM doesn't make it all that obvious that the test is
creating aliases in both the L2 GPA and GVA address spaces, but only when
L1 is using TDP to run L2.
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260115172154.709024-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`.
- Correct the argument description for `vcpu_args_set` in `kvm_util.h`,
which incorrectly listed `vm` instead of `vcpu`.
- Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" ->
"executing").
- Correct the return value description for `vm_vaddr_unused_gap` in
`kvm_util.c` to match the implementation, which returns an address "at
or above" `vaddr_min`, not "at or below".
No functional change intended.
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
To avoid code duplication, move page_align() to the shared `kvm_util.h`
header file. Rename it to vm_page_align(), to make it clear that the
alignment is done with respect to the guest's base page size.
No functional change intended.
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
KVM selftests map all guest code and data into the lower virtual address
range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...)
managed by TTBR1_EL1 is unused and uninitialized.
If a guest accesses the upper range, the MMU attempts a translation
table walk using uninitialized registers, leading to unpredictable
behavior.
Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1,
ensuring that any access to the upper range generates an immediate
Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to
ensure that tagged pointers in the upper range also deterministically
trigger a Translation Fault via EPD1.
Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in
`processor.h` to support this configuration. These are based on their
definitions in `arch/arm64/include/asm/pgtable-hwdef.h`.
Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1
intercepts for VMSAVE/VMLOAD always work regardless of
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK.
Then, more interestingly, it makes sure that when L1 does not intercept
VMLOAD/VMSAVE, they work as intended in L2. When
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from
L2 should interpret the GPA as an L2 GPA and translate it through the
NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1,
VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA.
To test this, put two VMCBs (0 and 1) in L1's physical address space,
and have a single L2 GPA where:
- L2 VMCB GPA == L1 VMCB(0) GPA
- L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1.
This setup allows detecting how the GPA is interpreted based on which L1
VMCB is actually accessed.
In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by
VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The
test userspace code then checks that the write was made to the correct
VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1),
and writes a new value to that VMCB. L2 then executes VMLOAD to load the
new value and makes sure it's reflected correctly in KERNERL_GS_BASE.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Fix KVM's long-standing buggy handling of SVM's exit_code as a 32-bit
value. Per the APM and Xen commit d1bd157fbc ("Big merge the HVM
full-virtualisation abstractions.") (which is arguably more trustworthy
than KVM), offset 0x70 is a single 64-bit value:
070h 63:0 EXITCODE
Track exit_code as a single u64 to prevent reintroducing bugs where KVM
neglects to correctly set bits 63:32.
Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface")
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230211347.4099600-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a test to verify KVM correctly handles a variety of edge cases related
to APICv updates, and in particular updates that are triggered while L2 is
actively running.
Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Shorten the API to get a PTE as the "PTE" acronym is ubiquitous, and the
"page table entry" makes it unnecessarily difficult to quickly understand
what callers are doing.
No functional change intended.
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
According to the APM, NPT walks are treated as user accesses. In
preparation for supporting NPT mappings, set the 'user' bit on NPTs by
adding a mask of bits to always be set on PTEs in kvm_mmu.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Implement nCR3 and NPT initialization functions, similar to the EPT
equivalents, and create common TDP helpers for enablement checking and
initialization. Enable NPT for nested guests by default if the TDP MMU
was initialized, similar to VMX.
Reuse the PTE masks from the main MMU in the NPT MMU, except for the C
and S bits related to confidential VMs.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230230150.4150236-17-seanjc@google.com
[sean: apply Yosry's fixup for ncr3_gpa]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|