aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)AuthorFilesLines
2026-06-16Merge tag 'x86_microcode_for_v7.2_rc1' of ↵Linus Torvalds3-244/+249
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loader updates from Borislav Petkov: - Move the zero-revision fixup for AMD microcode to the patch level retrieval function and restrict it to Zen family processors, ensuring patch level arithmetic always operates on a valid revision - Fix an incorrect comment about which CPUID bit is checked when determining whether the microcode loader should be disabled - Add the latest Intel microcode revision data for a broad range of processor models and steppings and add the script which generates the header of minimum expected Intel microcode revisions * tag 'x86_microcode_for_v7.2_rc1' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Move the no-revision fixup to get_patch_level() x86/microcode: Fix comment in microcode_loader_disabled() scripts/x86/intel: Add a script to update the old microcode list x86/microcode/intel: Refresh old_microcode defines with Nov 2025 release
2026-06-16Merge tag 'x86_cleanups_for_v7.2_rc1' of ↵Linus Torvalds1-1/+0
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - The usual pile of cleanups and fixlets the cat dragged in * tag 'x86_cleanups_for_v7.2_rc1' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: x86/cpu: Remove obsolete aperfmperf_get_khz() declaration x86/pmem: Check for platform_device_alloc() retval x86/platform/uv: Use str_enabled_disabled() in uv_nmi_setup_hubless_intr() x86/cpu: Keep the PROCESSOR_SELECT menu together x86/tlb: Convert copy_from_user() + kstrtouint() to kstrtouint_from_user() x86/purgatory: Fix #endif comment x86/boot: Get rid of kstrtoull() x86/boot/compressed: Use boot_kstrtoul() for hugepages= parsing
2026-06-16Merge tag 'x86_cache_for_v7.2_rc1' of ↵Linus Torvalds1-0/+1
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: "Preparatory work for MPAM counter assignment: - Simplify the error handling path when creating monitor group event configuration directories - Make the MBM event filter configurable only on architectures that support it and expose this with the respective file modes in the event config - Disallow the MBA software controller on systems where MBM counters are assignable, as it requires continuous bandwidth measurement that assignable counters do not guarantee - Replace a compile-time Kconfig option for fixed counter assignment with a per-architecture runtime property, and expose whether the counter assignment mode is changeable to userspace - Continue counter allocation across all domains instead of aborting at the first failure - Document that automatic MBM counter assignment is best effort and may not assign counters to all domains - Document the behavior of task ID 0 and idle tasks in the resctrl tasks file" * tag 'x86_cache_for_v7.2_rc1' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: fs/resctrl: Document tasks file behaviour for task id 0 and idle tasks fs/resctrl: Document that automatic counter assignment is best effort fs/resctrl: Continue counter allocation after failure fs/resctrl: Add monitor property 'mbm_cntr_assign_fixed' fs/resctrl: Disallow the software controller when MBM counters are assignable x86,fs/resctrl: Create 'event_filter' files read only if they're not configurable fs/resctrl: Tidy up the error path in resctrl_mkdir_event_configs()
2026-06-15Merge tag 'x86-cpu-2026-06-14' of ↵Linus Torvalds26-48/+338
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Ingo Molnar: - CPUID API updates (Ahmed S. Darwish): - Introduce a centralized CPUID parser - Introduce a centralized CPUID data model - Introduce <asm/cpuid/leaf_types.h> - Rename cpuid_leaf()/cpuid_subleaf() APIs - treewide: Explicitly include the x86 CPUID headers - Update to x86-cpuid-db v3.1 (Maciej Wieczor-Retman) - Continued removal of pre-i586 support and related simplifications (Ingo Molnar) - Add Intel CPU model number for rugged Panther Lake (Tony Luck) - Misc fixes, updates and cleanups by Arnd Bergmann, Chao Gao, Lukas Bulwahn, Sohil Mehta, Maciej Wieczor-Retman. * tag 'x86-cpu-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/cpu: Make CONFIG_X86_CX8 unconditional x86/cpu: Remove unused !CONFIG_X86_TSC code x86/cpuid: Update bitfields to x86-cpuid-db v3.1 tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.1 x86/cpu: Make CONFIG_X86_TSC unconditional MAINTAINERS: Drop obsolete FPU EMULATOR section x86/cpu: Fix a F00F bug warning and clean up surrounding code x86/cpu: Add Intel CPU model number for rugged Panther Lake x86/cpuid: Introduce a centralized CPUID parser x86/cpu: Introduce a centralized CPUID data model x86/cpuid: Introduce <asm/cpuid/leaf_types.h> x86/cpuid: Rename cpuid_leaf()/cpuid_subleaf() APIs x86/cpu: Do not include the CPUID API header in asm/processor.h Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs x86/cpu, cpufreq: Remove AMD ELAN support x86/fpu: Remove the math-emu/ FPU emulation library x86/fpu: Remove the 'no387' boot option x86/fpu: Remove MATH_EMULATION and related glue code treewide: Explicitly include the x86 CPUID headers x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk ...
2026-06-15Merge tag 'x86-msr-2026-06-14' of ↵Linus Torvalds3-9/+9
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86/msr updates from Ingo Molnar: - Large series to reorganize the rdmsr/wrmsr APIs to remove 32-bit variants and convert to 64-bit variants (Juergen Gross) - Fix W=1 warning (HyeongJun An) * tag 'x86-msr-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: x86/msr: Remove wrmsrl() x86/msr: Switch wrmsrl() users to wrmsrq() x86/msr: Remove rdmsrl() x86/msr: Switch rdmsrl() users to rdmsrq() x86/msr: Remove wrmsr_safe_on_cpu() x86/msr: Switch wrmsr_safe_on_cpu() users to wrmsrq_safe_on_cpu() x86/msr: Remove rdmsr_safe_on_cpu() x86/msr: Switch rdmsr_safe_on_cpu() users to rdmsrq_safe_on_cpu() x86/msr: Don't use rdmsr_safe_on_cpu() in rdmsrq_safe_on_cpu() x86/msr: Remove wrmsr_on_cpu() x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu() x86/msr: Remove rdmsr_on_cpu() x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu() x86/msr: Remove rdmsrl_on_cpu() x86/msr: Switch rdmsrl_on_cpu() user to rdmsrq_on_cpu() x86/process: Convert rdmsr() to rdmsrq() in arch_post_acpi_subsys_init() to address W=1 warning
2026-06-15Merge tag 'irq-core-2026-06-13' of ↵Linus Torvalds5-11/+7
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull interrupt core updates from Thomas Gleixner: - Rework of /proc/interrupt handling: /proc/interrupts was subject to micro optimizations for a long time, but most of the low hanging fruit was left on the table. This rework addresses the major time consuming issues: - Printing a long series of zeros one by one via a format string instead of counting subsequent zeros and emitting a string constant. - Simplify and cache the conditions whether interrupts should be printed - Use a proper iteration over the interrupt descriptor xarray instead of walking and testing one by one. - Provide helper functions for the architecture code to emit the architecture specific counters - Convert the counter structure in x86 to an array, which simplifies the output and add mechanisms to suppress unused architecture interrupts, which just occupy space for nothing. Adopt the new core mechanisms. This adjusts the gdb scripts related to interrupt counter statistics to work with the new mechanisms. - Prevent a string overflow in the /proc/irq/$N/ directory name creation code. * tag 'irq-core-2026-06-13' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: x86/irq: Add missing 's' back to thermal event printout genirq/proc: Speed up /proc/interrupts iteration genirq/proc: Runtime size the chip name genirq: Expose irq_find_desc_at_or_after() in core code genirq: Add rcuref count to struct irq_desc genirq/proc: Increase default interrupt number precision to four genirq: Calculate precision only when required genirq: Cache the condition for /proc/interrupts exposure genirq/manage: Make NMI cleanup RT safe genirq: Expose nr_irqs in core code scripts/gdb: Update x86 interrupts to the array based storage x86/irq: Move IOAPIC misrouted and PIC/APIC error counts into irq_stats x86/irq: Suppress unlikely interrupt stats by default x86/irq: Make irqstats array based genirq/proc: Utilize irq_desc::tot_count to avoid evaluation genirq/proc: Avoid formatting zero counts in /proc/interrupts x86/irq: Optimize interrupts decimals printing genirq/proc: Size interrupt directory names for 10-digit interrupt numbers
2026-06-08x86/msr: Switch wrmsrl() users to wrmsrq()Juergen Gross1-1/+1
wrmsrl() is a deprecated synonym for wrmsrq(). Switch its users to wrmsrq(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Long Li <longli@microsoft.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Link: https://patch.msgid.link/20260608082809.3492719-4-jgross@suse.com
2026-06-08x86/msr: Switch rdmsrl() users to rdmsrq()Juergen Gross1-1/+1
rdmsrl() is a deprecated synonym for rdmsrq(). Switch its users to rdmsrq(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Long Li <longli@microsoft.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Link: https://patch.msgid.link/20260608082809.3492719-2-jgross@suse.com
2026-06-08x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()Juergen Gross1-1/+1
In order to prepare retiring wrmsr_on_cpu() switch wrmsr_on_cpu() users to wrmsrq_on_cpu(). Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Daniel Lezcano <daniel.lezcano@kernel.org> Link: https://patch.msgid.link/20260608051741.3207435-6-jgross@suse.com
2026-06-08x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()Juergen Gross2-7/+7
In order to prepare retiring rdmsr_on_cpu() switch rdmsr_on_cpu() users to rdmsrq_on_cpu(). Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Daniel Lezcano <daniel.lezcano@kernel.org> Link: https://patch.msgid.link/20260608051741.3207435-4-jgross@suse.com
2026-06-05x86/cpu: Remove obsolete aperfmperf_get_khz() declarationJunxiao Chang1-1/+0
aperfmperf_get_khz() was replaced by arch_freq_get_on_cpu(). The remaining declaration in the header file is no longer used and should be removed. Fixes: f3eca381bd49 ("x86/aperfmperf: Replace arch_freq_get_on_cpu()") Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://patch.msgid.link/20260606021514.1433619-1-junxiao.chang@intel.com
2026-06-05x86/resctrl: Only check Intel systems for SNCTony Luck1-1/+6
topology_num_nodes_per_package() reports values greater than one on certain AMD systems resulting in resctrl's Intel model specific SNC detection printing the confusing message: "CoD enabled system? Resctrl not supported" Add a check for Intel systems before looking at the topology. [ reinette: Add Closes tag, fix tag typos, rework changelog ] Fixes: 59674fc9d0bf ("x86/resctrl: Fix SNC detection") Reported-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://patch.msgid.link/9849330f45ac86344cc5ac54df2d313906d70bc4.1780634584.git.reinette.chatre@intel.com Closes: https://lore.kernel.org/lkml/37ac0376-43a3-4283-a3d5-4d57b3bec578@amd.com/
2026-06-04x86/microcode/AMD: Move the no-revision fixup to get_patch_level()Borislav Petkov (AMD)1-5/+7
On machines which don't have microcode applied yet, the revision is 0. However, this doesn't work with the Zen family/model/stepping patch arithmetic. So move the fixup to the patch level getter function and this way make sure the patch level is always proper and thus the arithmetic always works. And now that it can be called on any family, make this Zen-only. Assisted-by: claude/claude-opus-4-6 Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20260530024213.86137-1-bp@kernel.org
2026-06-01x86/CPU/AMD: Add more Zen6 modelsPratik Vishwakarma1-1/+1
Family 0x1a, models 0xd0 - 0xef are Zen6, so add them to the range which sets X86_FEATURE_ZEN6. [ bp: Massage commit message. ] Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260530061819.9721-1-Pratik.Vishwakarma@amd.com
2026-05-28x86/cpu: Fix a F00F bug warning and clean up surrounding codeSohil Mehta1-12/+3
On x86 SMP systems with the F00F bug present, do_clear_cpu_cap() rightfully warns that the code clears the X86_BUG_F00F flag after alternatives have been patched. X86_BUG_F00F is first cleared in intel_workarounds() and then set for the affected models. This sequence works fine on the BSP but on AP bringup, where alternatives have already been patched and clearing the flag there triggers the warning. There is no technical reason for clearing the flag before setting it. It is mainly an artifact of introducing the X86_BUG_F00F flag in e2604b49e8a8 ("x86, cpu: Convert F00F bug detection"). Remove the unnecessary clearing of the flag. While at it, remove the kernel notification and the surrounding logic to inform the user about the workaround exactly once. If needed, the presence of the F00F bug can be determined through /proc/cpuinfo. Additionally, the F00F bug was the last remaining user of clear_cpu_bug(). With no users left, get rid of this helper as well. [ bp: Massage commit message. ] Co-developed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ahmed S. Darwish <darwi@linutronix.de> Link: https://patch.msgid.link/20260528184826.3642051-1-sohil.mehta@intel.com
2026-05-26x86/microcode: Do not access MSR_IA32_PLATFORM_ID when running as a guestBorislav Petkov4-15/+15
Patch in Fixes: causes the usual: unchecked MSR access error: RDMSR from 0x17 at ... (intel_get_platform_id) Call Trace: early_init_intel early_cpu_init setup_arch _printk start_kernel x86_64_start_reservations x86_64_start_kernel common_startup_64 because the kernel is booted in a guest. In order to avoid it, this MSR access needs to be prevented when running virtualized. That is usually done by checking X86_FEATURE_HYPERVISOR but for this particular case it is too early yet. The platform ID needs to be read as early as when microcode is loaded on the BSP: load_ucode_bsp ... -> get_microcode_blob ... -> intel_find_matching_signature and by that time, CPUID leafs haven't been parsed yet. The microcode loader already has logic to check early whether the kernel is running virtualized so make that globally available to arch/x86/. The query whether running virtualized is getting more and more prominent in recent times so might as well make it an arch-global var which the rest of the code can use. Fixes: d8630b67ca1ed ("x86/cpu: Add platform ID to CPU info structure") Reported-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Tested-by: Binbin Wu <binbin.wu@linux.intel.com> Link: https://lore.kernel.org/all/20260430020953.1405535-1-binbin.wu@linux.intel.com
2026-05-26x86/irq: Make irqstats array basedThomas Gleixner5-11/+7
Having the x86 specific interrupt statistics as a data structure with individual members instead of an array is just stupid as it requires endless copy and paste in arch_show_interrupts() and arch_irq_stat_cpu(), where the latter does not even take the latest interrupt additions into account. The resulting #ifdef orgy is just disgusting. Convert it to an array of counters, which does not make a difference in the actual interrupt hotpath increment as the array index is constant and therefore not any different than the member based access. But in arch_show_interrupts() and arch_irq_stat_cpu() this just turns into a loop, which reduces the text size by ~2k (~12%): text data bss dec hex filename 19643 15250 904 35797 8bd5 ../build/arch/x86/kernel/irq.o 17355 15250 904 33509 82e5 ../build/arch/x86/kernel/irq.o Adding a new vector or software counter only requires to update the table and everything just works. Using the core provided emit function which speeds up 0 outputs makes it significantly faster. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Radu Rendec <radu@rendec.net> Link: https://patch.msgid.link/20260517194931.196070643@kernel.org
2026-05-20x86/mm: Disable broadcast TLB flush when PCID is disabledTom Lendacky1-0/+1
Booting with "nopcid" clears X86_FEATURE_PCID and keeps CR4.PCIDE from being set to one. On AMD CPUs that support INVLPGB, broadcast TLB flushing remains enabled. There are two checks that decide whether the global ASID code runs, mm_global_asid() and consider_global_asid(), that key off of the X86_FEATURE_INVLPGB feature. Once an mm becomes active on more than three CPUs, consider_global_asid() assigns it a global ASID, after which flush_tlb_mm_range() takes the broadcast_tlb_flush() path using a non-zero PCID. Issuing an INVLPGB with a non-zero PCID while CR4.PCIDE is not set results in a #GP: Oops: general protection fault, kernel NULL pointer dereference 0x1: 0000 [#1] SMP NOPTI CPU: 158 UID: 0 PID: 3119 Comm: snap Not tainted 7.1.0-rc3 #1 PREEMPT(full) Hardware name: ... RIP: 0010:broadcast_tlb_flush Code: ... 89 da 48 83 c8 07 <0f> 01 fe eb 08 cc cc cc ... Call Trace: <TASK> flush_tlb_mm_range ptep_clear_flush wp_page_copy ? _raw_spin_unlock __handle_mm_fault handle_mm_fault do_user_addr_fault exc_page_fault asm_exc_page_fault All processors that support broadcast TLB invalidation also have PCID support, so it is only the "nopcid" scenario that is of concern. In this situation just disable the broadcast TLB support using the CPUID dependency support by making X86_FEATURE_INVLPGB dependent on X86_FEATURE_PCID. [ bp: Massage commit message. ] Fixes: 4afeb0ed1753 ("x86/mm: Enable broadcast TLB invalidation for multi-threaded processes") Suggested-by: Dave Hansen <dave.hansen@intel.com> Assisted-by: Claude:claude-opus-4.7 Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Rik van Riel <riel@surriel.com> Cc: <stable@kernel.org> Link: https://patch.msgid.link/b915acfd63e8b2a094fdeb8dc608738072518764.1779296450.git.thomas.lendacky@amd.com
2026-05-17Merge tag 'ras-urgent-2026-05-17' of ↵Linus Torvalds1-28/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MCE fix from Ingo Molnar: - Fix an MCE polling interval adjustment regression (Borislav Petkov) * tag 'ras-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Restore MCA polling interval halving
2026-05-13x86/mce: Restore MCA polling interval halvingBorislav Petkov (AMD)1-28/+5
RongQing reported that the MCA polling interval doesn't halve when an error gets logged. It was traced down to the commit in Fixes:, because: mce_timer_fn() |-> mce_poll_banks() |-> machine_check_poll() |-> mce_log() which will queue the work and return. Now, back in mce_timer_fn(): /* * Alert userspace if needed. If we logged an MCE, reduce the polling * interval, otherwise increase the polling interval. */ if (mce_notify_irq()) <--- here we haven't ran the notifier chain yet so mce_need_notify is not set yet so this won't hit and we won't halve the interval iv. Now the notifier chain runs. mce_early_notifier() sets the bit, does mce_notify_irq(), that clears the bit and then the notifier chain a little later logs the error. So this is a silly timing issue. But, that's all unnecessary. All it needs to happen here is, the "should we notify of a logged MCE" mce_notify_irq() asks, should be simply a question to the mce gen pool: "Are you empty?" And that then turns into a simple yes or no answer and it all JustWorks(tm). So do that and also distribute the functionality where it belongs: - Print that MCE events have been logged in mce_log() - Trigger the mcelog tool specific work in the first notifier As a result, mce_notify_irq() can go now. Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector") Reported-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20260112082747.2842-1-lirongqing@baidu.com
2026-05-12x86/microcode: Fix comment in microcode_loader_disabled()Xiaoyao Li1-1/+1
The code in microcode_loader_disabled() actually checks for the bit 31 in CPUID[1]:ECX being set. Update the comment to match the code. No functional change intended. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260512152754.671760-1-xiaoyao.li@intel.com
2026-05-11x86/CPU/AMD: Prevent improper isolation of shared resources in Zen2's op cachePrathyushi Nangia1-0/+3
Make sure resources are not improperly shared in the op cache and cause instruction corruption this way. Signed-off-by: Prathyushi Nangia <prathyushi.nangia@amd.com> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-05-11x86/cpuid: Introduce a centralized CPUID parserAhmed S. Darwish4-1/+307
Introduce a CPUID parser for populating the system's CPUID tables. Since accessing a leaf within the CPUID table requires compile time tokenization, split the parser into two stages: (a) Compile-time macros for tokenizing the leaf/subleaf offsets within the CPUID table. (b) Generic runtime code to fill the CPUID data, using a parsing table which collects these compile-time offsets. For actual CPUID output parsing, support both generic and leaf-specific read functions. To ensure CPUID data early availability, invoke the parser during early boot, early Xen boot, and at early secondary CPUs bring up. Provide call site APIs to refresh a single leaf, or a leaf range, within the CPUID tables. This is for sites issuing MSR writes that partially change the CPU's CPUID layout. Doing full CPUID table rescans in such cases will be destructive since the CPUID tables will host all of the kernel's X86_FEATURE flags at a later stage. Suggested-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20260327021645.555257-1-darwi@linutronix.de
2026-05-08x86/cpuid: Rename cpuid_leaf()/cpuid_subleaf() APIsAhmed S. Darwish2-2/+2
A new CPUID model will be added where its APIs will be designated as the official CPUID API. Free the cpuid_leaf() and cpuid_subleaf() function names for that API. Rename them accordingly to cpuid_read() and cpuid_read_subleaf(). For kernel/cpuid.c, rename its local file operations read function from cpuid_read() to cpuid_read_f() so that it does not conflict with the new API. No functional change. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20260327021645.555257-1-darwi@linutronix.de
2026-05-08x86/fpu: Remove the 'no387' boot optionIngo Molnar1-7/+0
Without math emulation there's no point to this option. Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ahmed S. Darwish <darwi@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250425084216.3913608-10-mingo@kernel.org
2026-05-06x86,fs/resctrl: Create 'event_filter' files read only if they're not ↵Ben Horgan1-0/+1
configurable When the counter assignment mode is mbm_event resctrl assumes the MBM events are configurable and exposes the 'event_filter' files. These files live at info/L3_MON/event_configs/<event>/event_filter and are used to display and set the event configuration. The MPAM architecture has support for configuring the memory bandwidth utilization (MBWU) counters to only count reads or only count writes. However, in MPAM, this event filtering support is optional in the hardware (and not yet implemented in the MPAM driver) but MBM counter assignment is always possible for MPAM MBWU counters. In order to support mbm_event mode with MPAM, create the 'event_filter' files read only if the event configuration can't be changed. A user can still chmod the file and so also return early with an error from event_filter_write(). Introduce a new monitor property, mbm_cntr_configurable, to indicate whether or not assignable MBM counters are configurable. On x86, set this to true whenever mbm_cntr_assignable is true to keep existing behaviour. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/20260506082855.3694761-1-ben.horgan@arm.com
2026-05-06treewide: Explicitly include the x86 CPUID headersAhmed S. Darwish20-0/+26
Modify all CPUID call sites which implicitly include any of the CPUID headers to explicitly include them instead. For KVM's reverse_cpuid.h, just include <asm/cpuid/types.h> since it references the CPUID_EAX..EDX symbols without using the CPUID APIs. Note, this allows removing the inclusion of <asm/cpuid/api.h> from within <asm/processor.h> next. That allows the CPUID API headers to include <asm/processor.h> without introducing a circular dependency. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20260327021645.555257-1-darwi@linutronix.de
2026-05-06x86/cpu: Remove CPU_SUP_UMC_32 supportIngo Molnar1-26/+0
These are 486 based CPUs, which build option (M486) is now gone upstream. Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250425084216.3913608-4-mingo@kernel.org
2026-04-29x86/microcode/intel: Refresh old_microcode defines with Nov 2025 releaseSohil Mehta1-238/+241
Update the minimum expected revisions of Intel microcode based on the microcode-20251111 (November 2025) release. Add an SPDX header as well as a comment to specify that the file is auto-generated. While at it, remove an extra space before the model field to match the formatting of the other fields. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20260407014226.1169040-2-sohil.mehta@intel.com
2026-04-22Merge tag 'hyperv-next-signed-20260421' of ↵Linus Torvalds1-2/+13
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V updates from Wei Liu: - Fix cross-compilation for hv tools (Aditya Garg) - Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain) - Limit channel interrupt scan to relid high water mark (Michael Kelley) - Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui) - Fix cleanup and shutdown issues for MSHV (Jork Loeser) - Introduce more tracing support for MSHV (Stanislav Kinsburskii) * tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Skip LP/VP creation on kexec x86/hyperv: move stimer cleanup to hv_machine_shutdown() Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing mshv: Add tracepoint for GPA intercept handling mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER tools: hv: Fix cross-compilation Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv mshv: Introduce tracing support Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
2026-04-22x86/hyperv: Skip LP/VP creation on kexecJork Loeser1-0/+7
After a kexec the logical processors and virtual processors already exist in the hypervisor because they were created by the previous kernel. Attempting to add them again causes either a BUG_ON or corrupted VP state leading to MCEs in the new kernel. Add hv_lp_exists() to probe whether an LP is already present by calling HVCALL_GET_LOGICAL_PROCESSOR_RUN_TIME. When it succeeds the LP exists and we skip the add-LP and create-VP loops entirely. Also add hv_call_notify_all_processors_started() which informs the hypervisor that all processors are online. This is required after adding LPs (fresh boot) and is a no-op on kexec since we skip that path. Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Co-developed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-04-22x86/hyperv: move stimer cleanup to hv_machine_shutdown()Jork Loeser1-2/+6
Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to hv_machine_shutdown() in the platform code. This ensures stimer cleanup happens before the vmbus unload, which is required for root partition kexec to work correctly. Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-04-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ...
2026-04-17x86/CPU: Fix FPDSS on Zen1Borislav Petkov (AMD)1-0/+3
Zen1's hardware divider can leave, under certain circumstances, partial results from previous operations. Those results can be leaked by another, attacker thread. Fix that with a chicken bit. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-15Merge tag 'mm-stable-2026-04-13-21-45' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompressio