aboutsummaryrefslogtreecommitdiff
path: root/drivers/virt/coco
AgeCommit message (Collapse)AuthorFilesLines
2024-12-06Merge tag 'arm64-fixes' of ↵Linus Torvalds2-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "Nothing major, some left-overs from the recent merging window (MTE, coco) and some newly found issues like the ptrace() ones. - MTE/hugetlbfs: - Set VM_MTE_ALLOWED in the arch code and remove it from the core code for hugetlbfs mappings - Fix copy_highpage() warning when the source is a huge page but not MTE tagged, taking the wrong small page path - drivers/virt/coco: - Add the pKVM and Arm CCA drivers under the arm64 maintainership - Fix the pkvm driver to fall back to ioremap() (and warn) if the MMIO_GUARD hypercall fails - Keep the Arm CCA driver default 'n' rather than 'm' - A series of fixes for the arm64 ptrace() implementation, potentially leading to the kernel consuming uninitialised stack variables when PTRACE_SETREGSET is invoked with a length of 0 - Fix zone_dma_limit calculation when RAM starts below 4GB and ZONE_DMA is capped to this limit - Fix early boot warning with CONFIG_DEBUG_VIRTUAL=y triggered by a call to page_to_phys() (from patch_map()) which checks pfn_valid() before vmemmap has been set up - Do not clobber bits 15:8 of the ASID used for TTBR1_EL1 and TLBI ops when the kernel assumes 8-bit ASIDs but running under a hypervisor on a system that implements 16-bit ASIDs (found running Linux under Parallels on Apple M4) - ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A as it is using the same SMMU PMCG as HIP09 and suffers from the same errata - Add GCS to cpucap_is_possible(), missed in the recent merge" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: ptrace: fix partial SETREGSET for NT_ARM_GCS arm64: ptrace: fix partial SETREGSET for NT_ARM_POE arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL arm64: cpufeature: Add GCS to cpucap_is_possible() coco: virt: arm64: Do not enable cca guest driver by default arm64: mte: Fix copy_highpage() warning on hugetlb folios arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entry drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails arm64: patching: avoid early page_to_phys() arm64: mm: Fix zone_dma_limit calculation arm64: mte: set VM_MTE_ALLOWED for hugetlbfs at correct place
2024-12-05coco: virt: arm64: Do not enable cca guest driver by defaultSuzuki K Poulose1-1/+0
As per the guidelines, new drivers may not be set to default on. An expert user can always select it. Reported-by: Dan Williams <dan.j.williams@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: Sami Mujawar <sami.mujawar@arm.com> Link: https://lore.kernel.org/r/6750c695194cd_2508129427@dwillia2-xfh.jf.intel.com.notmuch Link: https://lore.kernel.org/r/20241205143634.306114-1-suzuki.poulose@arm.com Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-03drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD failsWill Deacon1-5/+1
Calling the MMIO_GUARD hypercall from guests which have not been enrolled (e.g. because they are running without pvmfw) results in -EINVAL being returned. In this case, MMIO_GUARD is not active and so we can simply proceed with the normal ioremap() routine. Don't fail ioremap() if MMIO_GUARD fails; instead WARN_ON_ONCE() to highlight that the pvm environment is slightly wonky. Fixes: 0f1269495800 ("drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercall") Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202145731.6422-2-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds2-2/+2
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-19Merge tag 'x86_sev_for_v6.13' of ↵Linus Torvalds2-259/+161
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Do the proper memory conversion of guest memory in order to be able to kexec kernels in SNP guests along with other adjustments and cleanups to that effect - Start converting and moving functionality from the sev-guest driver into core code with the purpose of supporting the secure TSC SNP feature where the hypervisor cannot influence the TSC exposed to the guest anymore - Add a "nosnp" cmdline option in order to be able to disable SNP support in the hypervisor and thus free-up resources which are not going to be used - Cleanups [ Reminding myself about the endless TLA's again: SEV is the AMD Secure Encrypted Virtualization - Linus ] * tag 'x86_sev_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Cleanup vc_handle_msr() x86/sev: Convert shared memory back to private on kexec x86/mm: Refactor __set_clr_pte_enc() x86/boot: Skip video memory access in the decompressor for SEV-ES/SNP virt: sev-guest: Carve out SNP message context structure virt: sev-guest: Reduce the scope of SNP command mutex virt: sev-guest: Consolidate SNP guest messaging parameters to a struct x86/sev: Cache the secrets page address x86/sev: Handle failures from snp_init() virt: sev-guest: Use AES GCM crypto library x86/virt: Provide "nosnp" boot option for sev kernel command line x86/virt: Move SEV-specific parsing into arch/x86/virt/svm
2024-10-23virt: arm-cca-guest: TSM_REPORT support for realmsSami Mujawar5-0/+240
Introduce an arm-cca-guest driver that registers with the configfs-tsm module to provide user interfaces for retrieving an attestation token. When a new report is requested the arm-cca-guest driver invokes the appropriate RSI interfaces to query an attestation token. The steps to retrieve an attestation token are as follows: 1. Mount the configfs filesystem if not already mounted mount -t configfs none /sys/kernel/config 2. Generate an attestation token report=/sys/kernel/config/tsm/report/report0 mkdir $report dd if=/dev/urandom bs=64 count=1 > $report/inblob hexdump -C $report/outblob rmdir $report Signed-off-by: Sami Mujawar <sami.mujawar@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Link: https://lore.kernel.org/r/20241017131434.40935-11-steven.price@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-16virt: sev-guest: Carve out SNP message context structureNikunj A Dadhania1-91/+87
Currently, the sev-guest driver is the only user of SNP guest messaging. The snp_guest_dev structure holds all the allocated buffers, secrets page and VMPCK details. In preparation for adding messaging allocation and initialization APIs, decouple snp_guest_dev from messaging-related information by carving out the guest message context structure(snp_msg_desc). Incorporate this newly added context into snp_send_guest_request() and all related functions, replacing the use of the snp_guest_dev. No functional change. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20241009092850.197575-7-nikunj@amd.com
2024-10-16virt: sev-guest: Reduce the scope of SNP command mutexNikunj A Dadhania1-27/+8
The SNP command mutex is used to serialize access to the shared buffer, command handling, and message sequence number. All shared buffer, command handling, and message sequence updates are done within snp_send_guest_request(), so moving the mutex to this function is appropriate and maintains the critical section. Since the mutex is now taken at a later point in time, remove the lockdep checks that occur before taking the mutex. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20241009092850.197575-6-nikunj@amd.com
2024-10-16virt: sev-guest: Consolidate SNP guest messaging parameters to a structNikunj A Dadhania1-30/+54
Add a snp_guest_req structure to eliminate the need to pass a long list of parameters. This structure will be used to call the SNP Guest message request API, simplifying the function arguments. Update the snp_issue_guest_request() prototype to include the new guest request structure. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20241009092850.197575-5-nikunj@amd.com
2024-10-16virt: sev-guest: Use AES GCM crypto libraryNikunj A Dadhania2-139/+40
The sev-guest driver encryption code uses the crypto API for SNP guest messaging with the AMD Security processor. In order to enable secure TSC, SEV-SNP guests need to send such a TSC_INFO message before the APs are booted. Details from the TSC_INFO response will then be used to program the VMSA before the APs are brought up. However, the crypto API is not available this early in the boot process. In preparation for moving the encryption code out of sev-guest to support secure TSC and to ease review, switch to using the AES GCM library implementation instead. Drop __enc_payload() and dec_payload() helpers as both are small and can be moved to the respective callers. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Peter Gonda <pgonda@google.com> Link: https://lore.kernel.org/r/20241009092850.197575-2-nikunj@amd.com
2024-09-27[tree-wide] finally take no_llseek outAl Viro1-1/+0
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-16Merge tag 'arm64-upstream' of ↵Linus Torvalds5-0/+142
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The highlights are support for Arm's "Permission Overlay Extension" using memory protection keys, support for running as a protected guest on Android as well as perf support for a bunch of new interconnect PMUs. Summary: ACPI: - Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms. - Ensure arm64-specific IORT header is covered by MAINTAINERS. CPU Errata: - Enable workaround for hardware access/dirty issue on Ampere-1A cores. Memory management: - Define PHYSMEM_END to fix a crash in the amdgpu driver. - Avoid tripping over invalid kernel mappings on the kexec() path. - Userspace support for the Permission Overlay Extension (POE) using protection keys. Perf and PMUs: - Add support for the "fixed instruction counter" extension in the CPU PMU architecture. - Extend and fix the event encodings for Apple's M1 CPU PMU. - Allow LSM hooks to decide on SPE permissions for physical profiling. - Add support for the CMN S3 and NI-700 PMUs. Confidential Computing: - Add support for booting an arm64 kernel as a protected guest under Android's "Protected KVM" (pKVM) hypervisor. Selftests: - Fix vector length issues in the SVE/SME sigreturn tests - Fix build warning in the ptrace tests. Timers: - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with non-determinism arising from the architected counter. Miscellaneous: - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs don't succeed. - Minor fixes and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits) perf: arm-ni: Fix an NULL vs IS_ERR() bug arm64: hibernate: Fix warning for cast from restricted gfp_t arm64: esr: Define ESR_ELx_EC_* constants as UL arm64: pkeys: remove redundant WARN perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled MAINTAINERS: List Arm interconnect PMUs as supported perf: Add driver for Arm NI-700 interconnect PMU dt-bindings/perf: Add Arm NI-700 PMU perf/arm-cmn: Improve format attr printing perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check arm64/mm: use lm_alias() with addresses passed to memblock_free() mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags() arm64: Expose the end of the linear map in PHYSMEM_END arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() arm64/mm: Delete __init region from memblock.reserved perf/arm-cmn: Support CMN S3 dt-bindings: perf: arm-cmn: Add CMN S3 perf/arm-cmn: Refactor DTC PMU register access perf/arm-cmn: Make cycle counts less surprising perf/arm-cmn: Improve build-time assertion ...
2024-08-30drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercallWill Deacon1-0/+35
Hook up pKVM's MMIO_GUARD hypercall so that ioremap() and friends will register the target physical address as MMIO with the hypervisor, allowing guest exits to that page to be emulated by the host with full syndrome information. Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240830130150.8568-7-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/virt: pkvm: Hook up mem_encrypt API using pKVM hypercallsWill Deacon1-0/+55
If we detect the presence of pKVM's SHARE and UNSHARE hypercalls, then register a backend implementation of the mem_encrypt API so that things like DMA buffers can be shared appropriately with the host. Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240830130150.8568-5-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/virt: pkvm: Add initial support for running as a protected guestWill Deacon5-0/+52
Implement a pKVM protected guest driver to probe the presence of pKVM and determine the memory protection granule using the HYP_MEMINFO hypercall. Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240830130150.8568-3-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-08-27virt: sev-guest: Ensure the SNP guest messages do not exceed a pageNikunj A Dadhania1-0/+2
Currently, struct snp_guest_msg includes a message header (96 bytes) and a payload (4000 bytes). There is an implicit assumption here that the SNP message header will always be 96 bytes, and with that assumption the payload array size has been set to 4000 bytes - a magic number. If any new member is added to the SNP message header, the SNP guest message will span more than a page. Instead of using a magic number for the payload, declare struct snp_guest_msg in a way that payload plus the message header do not exceed a page. [ bp: Massage. ] Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240731150811.156771-5-nikunj@amd.com
2024-08-27virt: sev-guest: Fix user-visible stringsNikunj A Dadhania1-4/+4
User-visible abbreviations should be in capitals, ensure messages are readable and clear. No functional change. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240731150811.156771-4-nikunj@amd.com
2024-08-27virt: sev-guest: Rename local guest message variablesNikunj A Dadhania1-58/+59
Rename local guest message variables for more clarity. No functional change. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240731150811.156771-3-nikunj@amd.com
2024-08-27virt: sev-guest: Replace dev_dbg() with pr_debug()Nikunj A Dadhania1-4/+5
In preparation for moving code to arch/x86/coco/sev/core.c, replace dev_dbg with pr_debug. No functional change. Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Peter Gonda <pgonda@google.com> Link: https://lore.kernel.org/r/20240731150811.156771-2-nikunj@amd.com
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-65/+0
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-07-16Merge branch 'kvm-6.11-sev-attestation' into HEADPaolo Bonzini2-65/+0
The GHCB 2.0 specification defines 2 GHCB request types to allow SNP guests to send encrypted messages/requests to firmware: SNP Guest Requests and SNP Extended Guest Requests. These encrypted messages are used for things like servicing attestation requests issued by the guest. Implementing support for these is required to be fully GHCB-compliant. For the most part, KVM only needs to handle forwarding these requests to firmware (to be issued via the SNP_GUEST_REQUEST firmware command defined in the SEV-SNP Firmware ABI), and then forwarding the encrypted response to the guest. However, in the case of SNP Extended Guest Requests, the host is also able to provide the certificate data corresponding to the endorsement key used by firmware to sign attestation report requests. This certificate data is provided by userspace because: 1) It allows for different keys/key types to be used for each particular guest with requiring any sort of KVM API to configure the certificate table in advance on a per-guest basis. 2) It provides additional flexibility with how attestation requests might be handled during live migration where the certificate data for source/dest might be different. 3) It allows all synchronization between certificates and firmware/signing key updates to be handled purely by userspace rather than requiring some in-kernel mechanism to facilitate it. [1] To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data, but is still enough to provide compliance with the GHCB 2.0 spec.
2024-07-16x86/sev: Move sev_guest.h into common SEV headerMichael Roth2-65/+0
sev_guest.h currently contains various definitions relating to the format of SNP_GUEST_REQUEST commands to SNP firmware. Currently only the sev-guest driver makes use of them, but when the KVM side of this is implemented there's a need to parse the SNP_GUEST_REQUEST header to determine whether additional information needs to be provided to the guest. Prepare for this by moving those definitions to a common header that's shared by host/guest code so that KVM can also make use of them. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240701223148.3798365-3-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-20virt: sev-guest: Mark driver struct with __refdata to prevent section mismatchUwe Kleine-König1-1/+6
As described in the added code comment, a reference to .exit.text is ok for drivers registered via module_platform_driver_probe(). Make this explicit to prevent the following section mismatch warning: WARNING: modpost: drivers/virt/coco/sev-guest/sev-guest: section mismatch in reference: \ sev_guest_driver+0x10 (section: .data) -> sev_guest_remove (section: .exit.text) that triggers on an allmodconfig W=1 build. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/4a81b0e87728a58904283e2d1f18f73abc69c2a1.1711748999.git.u.kleine-koenig@pengutronix.de
2024-06-17x86/sev: Extend the config-fs attestation support for an SVSMTom Lendacky2-2/+270
When an SVSM is present, the guest can also request attestation reports from it. These SVSM attestation reports can be used to attest the SVSM and any services running within the SVSM. Extend the config-fs attestation support to provide such. This involves creating four new config-fs attributes: - 'service-provider' (input) This attribute is used to determine whether the attestation request should be sent to the specified service provider or to the SEV firmware. The SVSM service provider is represented by the value 'svsm'. - 'service_guid' (input) Used for requesting the attestation of a single service within the service provider. A null GUID implies that the SVSM_ATTEST_SERVICES call should be used to request the attestation report. A non-null GUID implies that the SVSM_ATTEST_SINGLE_SERVICE call should be used. - 'service_manifest_version' (input) Used with the SVSM_ATTEST_SINGLE_SERVICE call, the service version represents a specific service manifest version be used for the attestation report. - 'manifestblob' (output) Used to return the service manifest associated with the attestation report. Only display these new attributes when running under an SVSM. [ bp: Massage. - s/svsm_attestation_call/svsm_attest_call/g ] Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/965015dce3c76bb8724839d50c5dea4e4b5d598f.1717600736.git.thomas.lendacky@amd.com
2024-06-17x86/sev: Take advantage of configfs visibility support in TSMTom Lendacky3-46/+69
The TSM attestation report support provides multiple configfs attribute types (both for standard and binary attributes) to allow for additional attributes to be displayed for SNP as compared to TDX. With the ability to hide attributes via configfs, consolidate the multiple attribute groups into a single standard attribute group and a single binary attribute group. Modify the TDX support to hide the attributes that were previously "hidden" as a result of registering the selective attribute groups. Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/8873c45d0c8abc35aaf01d7833a55788a6905727.1717600736.git.thomas.lendacky@amd.com
2024-06-17sev-guest: configfs-tsm: Allow the privlevel_floor attribute to be updatedTom Lendacky1-1/+4
With the introduction of an SVSM, Linux will be running at a non-zero VMPL. Any request for an attestation report at a higher privilege VMPL than what Linux is currently running will result in an error. Allow for the privlevel_floor attribute to be updated dynamically. [ bp: Trim commit message. ] Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/5a736be9384aebd98a0b7c929660f8a97cbdc366.1717600736.git.thomas.lendacky@amd.com
2024-06-17virt: sev-guest: Choose the VMPCK key based on executing VMPLTom Lendacky1-3/+14
Currently, the sev-guest driver uses the vmpck-0 key by default. When an SVSM is present, the kernel is running at a VMPL other than 0 and the vmpck-0 key is no longer available. If a specific vmpck key has not be requested by the user via the vmpck_id module parameter, choose the vmpck key based on the active VMPL level. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/b88081c5d88263176849df8ea93e90a404619cab.1717600736.git.thomas.lendacky@amd.com
2024-04-25x86/sev: Shorten struct name snp_secrets_page_layout to snp_secrets_pageTom Lendacky1-14/+14
Ending a struct name with "layout" is a little redundant, so shorten the snp_secrets_page_layout name to just snp_secrets_page. No functional change. [ bp: Rename the local pointer to "secrets" too for more clarity. ] Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/bc8d58302c6ab66c3beeab50cce3ec2c6bd72d6c.1713974291.git.thomas.lendacky@amd.com
2024-03-09virt: efi_secret: Convert to platform remove callback returning voidUwe Kleine-König1-3/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-01-02virt: sev-guest: Convert to platform remove callback returning voidUwe Kleine-König1-4/+2
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/52826a50250304ab0af14c594009f7b901c2cd31.1703596577.git.u.kleine-koenig@pengutronix.de
2023-10-19virt: tdx-guest: Add Quote generation support using TSM_REPORTSKuppuswamy Sathyanarayanan2-1/+229
In TDX guest, the attestation process is used to verify the TDX guest trustworthiness to other entities before provisioning secrets to the guest. The first step in the attestation process is TDREPORT generation, which involves getting the guest measurement data in the format of TDREPORT, which is further used to validate the authenticity of the TDX guest. TDREPORT by design is integrity-protected and can only be verified on the local machine. To support remote verification of the TDREPORT in a SGX-based attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave (QE) to convert it to a remotely verifiable Quote. SGX QE by design can only run outside of the TDX guest (i.e. in a host process or in a normal VM) and guest can use communication channels like vsock or TCP/IP to send the TDREPORT to the QE. But for security concerns, the TDX guest may not support these communication channels. To handle such cases, TDX defines a GetQuote hypercall which can be used by the guest to request the host VMM to communicate with the SGX QE. More details about GetQuote hypercall can be found in TDX Guest-Host Communication Interface (GHCI) for Intel TDX 1.0, section titled "TDG.VP.VMCALL<GetQuote>". Trusted Security Module (TSM) [1] exposes a common ABI for Confidential Computing Guest platforms to get the measurement data via ConfigFS. Extend the TSM framework and add support to allow an attestation agent to get the TDX Quote data (included usage example below). report=/sys/kernel/config/tsm/report/report0 mkdir $report dd if=/dev/urandom bs=64 count=1 > $report/inblob hexdump -C $report/outblob rmdir $report GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer with TDREPORT data as input, which is further used by the VMM to copy the TD Quote result after successful Quote generation. To create the shared buffer, allocate a large enough memory and mark it shared using set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used for GetQuote requests in the TDX TSM handler. Although this method reserves a fixed chunk of memory for GetQuote requests, such one time allocation can help avoid memory fragmentation related allocation failures later in the uptime of the guest. Since the Quote generation process is not time-critical or frequently used, the current version uses a polling model for Quote requests and it also does not support parallel GetQuote requests. Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1] Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Erdem Aktas <erdemaktas@google.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Peter Gonda <pgonda@google.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-19virt: sevguest: Add TSM_REPORTS support for SNP_GET_EXT_REPORTDan Williams2-0/+136
The sevguest driver was a first mover in the confidential computing space. As a first mover that afforded some leeway to build the driver without concern for common infrastructure. Now that sevguest is no longer a singleton [1] the common operation of building and transmitting attestation report blobs can / should be made common. In this model the so called "TSM-provider" implementations can share a common envelope ABI even if the contents of that envelope remain vendor-specific. When / if the industry agrees on an attestation record format, that definition can also fit in the same ABI. In the meantime the kernel's maintenance burden is reduced and collaboration on the commons is increased. Convert sevguest to use CONFIG_TSM_REPORTS to retrieve the data that the SNP_GET_EXT_REPORT ioctl produces. An example flow follows for retrieving the report blob via the TSM interface utility, assuming no nonce and VMPL==2: report=/sys/kernel/config/tsm/report/report0 mkdir $report echo 2 > $report/privlevel dd if=/dev/urandom bs=64 count=1 > $report/inblob hexdump -C $report/outblob # SNP report hexdump -C $report/auxblob # cert_table rmdir $report Given that the platform implementation is free to return empty certificate data if none is available it lets configfs-tsm be simplified as it only needs to worry about wrapping SNP_GET_EXT_REPORT, and leave SNP_GET_REPORT alone. The old ioctls can be lazily deprecated, the main motivation of this effort is to stop the proliferation of new ioctls, and to increase cross-vendor collaboration. Link: http://lore.kernel.org/r/64961c3baf8ce_142af829436@dwillia2-xfh.jf.intel.com.notmuch [1] Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Dionna Glaze <dionnaglaze@google.com> Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-19virt: sevguest: Prep for kernel internal get_ext_report()Dan Williams1-12/+32
In preparation for using the configs-tsm facility to convey attestation blobs to userspace, switch to using the 'sockptr' api for copying payloads to provided buffers where 'sockptr' handles user vs kernel buffers. While configfs-tsm is meant to replace existing confidential computing ioctl() implementations for attestation report retrieval the old ioctl() path needs to stick around for a deprecation period. No behavior change intended. Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Dionna Glaze <dionnaglaze@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-19configfs-tsm: Introduce a shared ABI for attestation reportsDan Williams3-0/+431
One of the common operations of a TSM (Trusted Security Module) is to provide a way for a TVM (confidential computing guest execution environment) to take a measurement of its launch state, sign it and submit it to a verifying party. Upon successful attestation that verifies the integrity of the TVM additional secrets may be deployed. The concept is common across TSMs, but the implementations are unfortunately vendor specific. While the industry grapples with a common definition of this attestation format [1], Linux need not make this problem worse by defining a new ABI per TSM that wants to perform a similar operation. The current momentum has been to invent new ioctl-ABI per TSM per function which at best is an abdication of the kernel's responsibility to make common infrastructure concepts share common ABI. The proposal, targeted to conceptually work with TDX, SEV-SNP, COVE if not more, is to define a configfs interface to retrieve the TSM-specific blob. report=/sys/kernel/config/tsm/report/report0 mkdir $report dd if=binary_userdata_plus_nonce > $report/inblob hexdump $report/outblob This approach later allows for the standardization of the attestation blob format without needing to invent a new ABI. Once standardization happens the standard format can be emitted by $report/outblob and indicated by $report/provider, or a new attribute like "$report/tcg_coco_report" can emit the standard format alongside the vendor format. Review of previous iterations of this interface identified that there is a need to scale report generation for multiple container environments [2]. Configfs enables a model where each container can bind mount one or more report generation item instances. Still, within a container only a single thread can be manipulating a given configuration instance at a time. A 'generation' count is provided to detect conflicts between multiple threads racing to configure a report instance. The SEV-SNP concepts of "extended reports" and "privilege levels" are optionally enabled by selecting 'tsm_report_ext_type' at register_tsm() time. The expectation is that those concepts are generic enough that they may be adopted by other TSM implementations. In other words, configfs-tsm aims to address a superset of TSM specific functionality with a common ABI where attributes may appear, or not appear, based on the set of concepts the implementation supports. Link: http://lore.kernel.org/r/64961c3baf8ce_142af829436@dwillia2-xfh.jf.intel.com.notmuch [1] Link: http://lore.kernel.org/r/57f3a05e-8fcd-4656-beea-56bb8365ae64@linux.microsoft.com [2] Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Peter Gonda <pgonda@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Samuel Ortiz <sameo@rivosinc.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-19virt: coco: Add a coco/Makefile and coco/KconfigDan Williams2-0/+16
In preparation for adding another coco build target, relieve drivers/virt/Makefile of the responsibility to track new compilation unit additions to drivers/virt/coco/, and do the same for drivers/virt/Kconfig. Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-10virt: sevguest: Fix passing a stack buffer as a scatterlist targetDan Williams1-20/+25