aboutsummaryrefslogtreecommitdiff
path: root/drivers/pci/msi
AgeCommit message (Collapse)AuthorFilesLines
2026-02-23PCI/MSI: Add TODO comment about legacy pcim_enable_device() side-effectShawn Lin1-0/+10
Add a TODO comment in pci/msi/msi.c to document that the automatic IRQ vector management activated by pcim_enable_device() is a dangerous and confusing. Suggested-by: Philipp Stanner <phasta@kernel.org> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1770798299-202288-4-git-send-email-shawn.lin@rock-chips.com
2026-02-23PCI/MSI: Clarify pci_free_irq_vectors() usage for managed devicesShawn Lin1-0/+5
Update pci_free_irq_vectors() documentation to clarify that drivers using pcim_enable_device() must not call pci_free_irq_vectors(). For legacy reasons, pcim_enable_device() switches several normally un-managed functions into managed mode. Currently, the only function affected in this way is pcim_setup_msi_release(), which results in automatic IRQ vector management. This behavior is dangerous and confusing. Drivers using pcim_enable_device() should rely on the automatic IRQ vector management and avoid calling pci_free_irq_vectors() manually. Suggested-by: Philipp Stanner <phasta@kernel.org> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [bhelgaas: squash both updates to pci_free_irq_vectors() documentation] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1770798299-202288-2-git-send-email-shawn.lin@rock-chips.com Link: https://patch.msgid.link/1770798299-202288-3-git-send-email-shawn.lin@rock-chips.com
2026-02-10Merge tag 'irq-msi-2026-02-09' of ↵Linus Torvalds2-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the [PCI] MSI subsystem: - Add interrupt redirection infrastructure Some PCI controllers use a single demultiplexing interrupt for the MSI interrupts of subordinate devices. This prevents setting the interrupt affinity of device interrupts, which causes device interrupts to be delivered to a single CPU. That obviously is counterproductive for multi-queue devices and interrupt balancing. To work around this limitation the new infrastructure installs a dummy irq_set_affinity() callback which captures the affinity mask and picks a redirection target CPU out of the mask. When the PCI controller demultiplexes the interrupts it invokes a new handling function in the core, which either runs the interrupt handler in the context of the target CPU or delegates it to irq_work on the target CPU. - Utilize the interrupt redirection mechanism in the PCI DWC host controller driver. This allows affinity control for the subordinate device MSI interrupts instead of being randomly executed on the CPU which runs the demultiplex handler. - Replace the binary 64-bit MSI flag with a DMA mask Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. With the binary 64-bit flag there is no other choice than disabling 64-bit MSI support which leaves the device disfunctional. By using a DMA mask the address limit of a device can be described correctly which provides support for the above scenario. - Make use of the DMA mask based address limit in the hda/intel and radeon drivers to enable them on affected platforms - The usual small cleanups and improvements" * tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ALSA: hda/intel: Make MSI address limit based on the device DMA limit drm/radeon: Make MSI address limit based on the device DMA limit PCI/MSI: Check the device specific address mask in msi_verify_entries() PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask genirq/redirect: Prevent writing MSI message on affinity change PCI/MSI: Unmap MSI-X region on error genirq: Update effective affinity for redirected interrupts PCI: dwc: Enable MSI affinity support PCI: dwc: Code cleanup genirq: Add interrupt redirection infrastructure genirq/msi: Correct kernel-doc in <linux/msi.h>
2026-01-31PCI/MSI: Check the device specific address mask in msi_verify_entries()Vivian Wang1-3/+5
Instead of a 32-bit/64-bit dichotomy, check the MSI address against the device specific address mask. This allows platforms with an MSI doorbell address above the 32-bit limit to work with devices without full 64-bit MSI address support, as long as the doorbell is within the addressable range of the device. [ tglx: Massaged changelog ] Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260129-pci-msi-addr-mask-v4-2-70da998f2750@iscas.ac.cn
2026-01-31PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address maskVivian Wang2-2/+2
Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. Currently, no_64bit_msi bit is set for these devices, meaning that only 32-bit MSI addresses are allowed for them. However, on some platforms the MSI doorbell address is above the 32-bit limit but within the addressable range of the device. As a first step to enable MSI on those combinations of devices and platforms, convert the boolean no_64bit_msi flag to a DMA mask and fixup the affected usage sites: - no_64bit_msi = 1 -> msi_addr_mask = DMA_BIT_MASK(32) - no_64bit_msi = 0 -> msi_addr_mask = DMA_BIT_MASK(64) - if (no_64bit_msi) -> if (msi_addr_mask < DMA_BIT_MASK(64)) Since no values other than DMA_BIT_MASK(32) and DMA_BIT_MASK(64) are used, this is functionally equivalent. This prepares for changing the binary decision between 32 and 64 bit to a DMA mask based decision which allows to support systems which have a DMA address space less than 64bit but a MSI doorbell address above the 32-bit limit. [ tglx: Massaged changelog ] Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> # ionic Reviewed-by: Thomas Gleixner <tglx@kernel.org> Acked-by: Takashi Iwai <tiwai@suse.de> # sound Link: https://patch.msgid.link/20260129-pci-msi-addr-mask-v4-1-70da998f2750@iscas.ac.cn
2026-01-27irqchip/gic-v5: Add ACPI ITS probingLorenzo Pieralisi1-0/+2
On ACPI ARM64 systems the GICv5 ITS configuration and translate frames are described in the MADT table. Refactor the current GICv5 ITS driver code to share common functions between ACPI and OF and implement ACPI probing in the GICv5 ITS driver. Add iort_msi_xlate() to map a device ID and retrieve an MSI controller fwnode node for ACPI systems and update pci_msi_map_rid_ctlr_node() to use it in its ACPI code path. Add the required functions to IORT code for deviceID retrieval and IRQ domain registration and look-up so that the GICv5 ITS driver in an ACPI based system can be successfully probed. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-5-c13a9a150388@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27PCI/MSI: Make the pci_msi_map_rid_ctlr_node() interface firmware agnosticLorenzo Pieralisi1-5/+16
To support booting with OF and ACPI seamlessly, GIC ITS parent code requires the PCI/MSI irqdomain layer to implement a function to retrieve an MSI controller fwnode and map an RID in a firmware agnostic way (ie pci_msi_map_rid_ctlr_node()). Convert pci_msi_map_rid_ctlr_node() to an OF agnostic interface (fwnode_handle based) and update the GIC ITS MSI parent code to reflect the pci_msi_map_rid_ctlr_node() change. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-2-c13a9a150388@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-26PCI/MSI: Unmap MSI-X region on errorHaoxiang Li1-1/+3
msix_capability_init() fails to unmap the MSI-X region if msix_setup_interrupts() fails. Add the missing iounmap() for that error path. [ tglx: Massaged change log ] Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260125144452.2103812-1-lihaoxiang@isrc.iscas.ac.cn
2025-10-16PCI/MSI: Delete pci_msi_create_irq_domain()Nam Cao1-90/+0
pci_msi_create_irq_domain() is now unused. Delete it. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-10-01Merge tag 'devicetree-for-6.18' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "DT core: - Update dtc to upstream version v1.7.2-35-g52f07dcca47c - Add stub for of_get_next_child_with_prefix() - Convert of_msi_map_id() callers to of_msi_xlate() DT bindings: - Convert multiple text board bindings to DT schema format - Add bindings for synaptics,synaptics_i2c touchscreen controller, innolux,n133hse-ea1 and nlt,nl12880bc20-spwg-24 displays, and NXP vf610 reboot controller - Add new Arm Cortex-A320/A520AE/A720AE and C1-Nano/Pro/Premium/Ultra CPUs. Add missing Applied Micro CPU compatibles. Add pu-supply and fsl,soc-operating-points properties for CPU nodes. - Add QCom Glymur PDC and tegra264-agic interrupt controllers - Add samsung,exynos8890-mali GPU to Arm Mali Midgard - Drop Samsung S3C2410 display related bindings - Allow separate DP lane and AUX connections in dp-connector - Add some missing, undocumented vendor prefixes - Add missing '#address-cells' properties in interrupt controller bindings which dtc now warns about - Drop duplicate socfpga-sdram-edac.txt, moxa,moxart-watchdog.txt, fsl/mpic.txt, ti,opa362.txt, and cavium-thunder2.txt legacy text bindings which are already covered by existing schemas. - Various binding fixes for Mediatek platforms in mailbox, regulator, pinctrl, timer, and display - Drop work-around for yamllint quoting of values containing ',' - Various spelling, typo, grammar, and duplicated words fixes in DT bindings and docs - Add binding guidelines for defining properties at top level of schemas, lack of node name ABI, and usage of simple-mfd" * tag 'devicetree-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (81 commits) dt-bindings: arm: altera: Drop socfpga-sdram-edac.txt dt-bindings: gpu: Convert nvidia,gk20a to DT schema dt-bindings: rng: sparc_sun_oracle_rng: convert to DT schema dt-bindings: vendor-prefixes: update regex for properties without a prefix dt-bindings: display: bridge: convert megachips-stdpxxxx-ge-b850v3-fw.txt to yaml scripts: dt_to_config: fix grammar and a typo in --help text dt-bindings: fix spelling, typos, grammar, duplicated words docs: dt: fix grammar and spelling of: base: Add of_get_next_child_with_prefix() stub dt-bindings: trivial-devices: Add compatible string synaptics,synaptics_i2c dt-bindings: soc: mediatek: pwrap: Add power-domains property dt-bindings: pinctrl: mt65xx: Allow gpio-line-names dt-bindings: media: Convert MediaTek mt8173-vpu bindings to DT schema dt-bindings: arm: mediatek: Support mt8183-audiosys variant dt-bindings: mailbox: mediatek,gce-mailbox: Make clock-names optional dt-bindings: regulator: mediatek,mt6331: Add missing compatible dt-bindings: regulator: mediatek,mt6331: Fix various regulator names dt-bindings: regulator: mediatek,mt6332-regulator: Add missing compatible dt-bindings: pinctrl: mediatek,mt7622-pinctrl: Add missing base reg dt-bindings: pinctrl: mediatek,mt7622-pinctrl: Add missing pwm_ch7_2 ...
2025-09-09PCI/MSI: Remove the conditional parent [un]mask logicThomas Gleixner1-20/+0
Now that msi_lib_init_dev_msi_info() overwrites the irq_[un]mask() callbacks when the MSI_FLAG_PCI_MSI_MASK_PARENT flag is set by the parent domain, the conditional [un]mask logic is obsolete. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20250903135433.444329373@linutronix.de
2025-09-03of/irq: Convert of_msi_map_id() callers to of_msi_xlate()Lorenzo Pieralisi1-1/+1
With the introduction of the of_msi_xlate() function, the OF layer provides an API to map a device ID and retrieve the MSI controller node the ID is mapped to with a single call. of_msi_map_id() is currently used to map a deviceID to a specific MSI controller node; of_msi_xlate() can be used for that purpose too, there is no need to keep the two functions. Convert of_msi_map_id() to of_msi_xlate() calls and update the of_msi_xlate() documentation to describe how the struct device_node pointer passed in should be set-up to either provide the MSI controller node target or receive its pointer upon mapping completion. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250805133443.936955-1-lpieralisi@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-09-02PCI/MSI: Check MSI_FLAG_PCI_MSI_MASK_PARENT in cond_[startup|shutdown]_parent()Inochi Amaoto1-0/+5
For MSI controllers which only support MSI_FLAG_PCI_MSI_MASK_PARENT, the newly added callback irq_startup() and irq_shutdown() for pci_msi[x]_template will not unmask or mask the interrupt when startup() resp. shutdown() is invoked. This prevents the interrupt from being enabled resp. disabled. Invoke irq_[un]mask_parent() in cond_[startup|shutdown]_parent(), when the interrupt has the MSI_FLAG_PCI_MSI_MASK_PARENT flag set. Fixes: 54f45a30c0d0 ("PCI/MSI: Add startup/shutdown for per device domains") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Wei Fang <wei.fang@nxp.com> Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox/SG2042 Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250827230943.17829-1-inochiama@gmail.com Closes: https://lore.kernel.org/regressions/aK4O7Hl8NCVEMznB@monster/ Closes: https://lore.kernel.org/regressions/20250826220959.GA4119563@ax162/ Closes: https://lore.kernel.org/all/20250827093911.1218640-1-wei.fang@nxp.com/
2025-08-23PCI/MSI: Add startup/shutdown for per device domainsInochi Amaoto1-0/+52
As the RISC-V PLIC cannot apply affinity settings without invoking irq_enable(), it will make the interrupt unavailble when used as an underlying interrupt chip for the MSI controller. Implement the irq_startup() and irq_shutdown() callbacks for the PCI MSI and MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT, the parent startup and shutdown functions are invoked. That allows the interrupt on the parent chip to be enabled if the interrupt has not been enabled during allocation. This is necessary for MSI controllers which use PLIC as underlying parent interrupt chip. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox Reviewed-by: Chen Wang <unicorn_wang@outlook.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250813232835.43458-3-inochiama@gmail.com
2025-08-01Merge tag 'pci-v6.17-changes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Allow built-in drivers, not just modular drivers, to use async initial probing (Lukas Wunner) - Support Immediate Readiness even on devices with no PM Capability (Sean Christopherson) - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the required delay between a reset and sending config requests to a device (Niklas Cassel) - Add pci_is_display() to check for "Display" base class and use it in ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello) - Allow 'isolated PCI functions' (multi-function devices without a function 0) for LoongArch, similar to s390 and jailhouse (Huacai Chen) Power control: - Add ability to enable optional slot clock for cases where the PCIe host controller and the slot are supplied by different clocks (Marek Vasut) PCIe native device hotplug: - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by misinterpreting a config read failure after a device has been removed (Lukas Wunner) - Avoid creating a useless PCIe port service device for pciehp if the slot is handled by the ACPI hotplug driver (Lukas Wunner) - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports (Lukas Wunner) Virtualization: - Save VF resizable BAR state and restore it after reset (Michał Winiarski) - Allow IOV resources (VF BARs) to be resized (Michał Winiarski) - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał Winiarski) Endpoint framework: - Add RC-to-EP doorbell support using platform MSI controller, including a test case (Frank Li) - Allow BAR assignment via configfs so platforms have flexibility in determining BAR usage (Jerome Brunet) Native PCIe controller drivers: - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie, axis,artpec6-pcie, marvell,armada-3700-pcie, st,spear1340-pcie to DT schema format (Rob Herring) - Use dev_fwnode() instead of of_fwnode_handle() to remove OF dependency in altera (fixes an unused variable), designware-host, mediatek, mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma, xilinx-nwl (Jiri Slaby, Arnd Bergmann) - Convert aardvark, altera, brcmstb, designware-host, iproc, mediatek, mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx, xilinx-dma, xilinx-nwl from using pci_msi_create_irq_domain() to using msi_create_parent_irq_domain() instead; this makes the interrupt controller per-PCI device, allows dynamic allocation of vectors after initialization, and allows support of IMS (Nam Cao) APM X-Gene PCIe controller driver: - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug bits, use device-managed memory allocations, and clean things up (Marc Zyngier) - Probe xgene-msi as a standard platform driver rather than a subsys_initcall (Marc Zyngier) Broadcom STB PCIe controller driver: - Add optional DT 'num-lanes' property and if present, use it to override the Maximum Link Width advertised in Link Capabilities (Jim Quinlan) Cadence PCIe controller driver: - Use PCIe Message routing types from the PCI core rather than defining private ones (Hans Zhang) Freescale i.MX6 PCIe controller driver: - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu) - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features (Richard Zhu) - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can trigger doorbel on Endpoint (Frank Li) - Remove apps_reset (LTSSM_EN) from imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug regression on i.MX8MM (Richard Zhu) - Delay Endpoint link start until configfs 'start' written (Richard Zhu) Intel VMD host bridge driver: - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo) Qualcomm PCIe controller driver: - Add DT binding and driver support for SA8255p, which supports ECAM for Configuration Space access (Mayank Rana) - Update DT binding and driver to describe PHYs and per-Root Port resets in a Root Port stanza and deprecate describing them in the host bridge; this makes it possible to support multiple Root Ports in the future (Krishna Chaitanya Chundru) - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang) - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang) - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT bindings (Konrad Dybcio) - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue Zhang) - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas Cassel) Rockchip PCIe controller driver: - Drop unused PCIe Message routing and code definitions (Hans Zhang) - Remove several unused header includes (Hans Zhang) - Use standard PCIe config register definitions instead of rockchip-specific redefinitions (Geraldo Nascimento) - Set Target Link Speed to 5.0 GT/s before retraining so we have a chance to train at a higher speed (Geraldo Nascimento) Rockchip DesignWare PCIe controller driver: - Prevent race between link training and register update via DBI by inhibiting link training after hot reset and link down (Wilfred Mallawa) - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas Cassel) Sophgo PCIe controller driver: - Add DT binding and driver for Sophgo SG2044 PCIe controller driver in Root Complex mode (Inochi Amaoto) Synopsys DesignWare PCIe controller driver: - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on Ports that support > 5.0 GT/s. Slower Ports still rely on the not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while waiting for the Link (Niklas Cassel)" * tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits) dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset dt-bindings: PCI: Remove 83xx-512x-pci.txt dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema dt-bindings: PCI: Convert apm,xgene-pcie to DT schema dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema dt-bindings: PCI: Convert st,spear1340-pcie to DT schema PCI: Move is_pciehp check out of pciehp_is_native() PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports selftests: pci_endpoint: Add doorbell test case misc: pci_endpoint_test: Add doorbell test case PCI: endpoint: pci-epf-test: Add doorbell test support PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode PCI: vmd: Switch to msi_create_parent_irq_domain() PCI: vmd: Convert to lock guards ...
2025-07-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+20
Pull kvm updates from Paolo Bonzini: "ARM: - Host driver for GICv5, the next generation interrupt controller for arm64, including support for interrupt routing, MSIs, interrupt translation and wired interrupts - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on GICv5 hardware, leveraging the legacy VGIC interface - Userspace control of the 'nASSGIcap' GICv3 feature, allowing userspace to disable support for SGIs w/o an active state on hardware that previously advertised it unconditionally - Map supporting endpoints with cacheable memory attributes on systems with FEAT_S2FWB and DIC where KVM no longer needs to perform cache maintenance on the address range - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest hypervisor to inject external aborts into an L2 VM and take traps of masked external aborts to the hypervisor - Convert more system register sanitization to the config-driven implementation - Fixes to the visibility of EL2 registers, namely making VGICv3 system registers accessible through the VGIC device instead of the ONE_REG vCPU ioctls - Various cleanups and minor fixes LoongArch: - Add stat information for in-kernel irqchip - Add tracepoints for CPUCFG and CSR emulation exits - Enhance in-kernel irqchip emulation - Various cleanups RISC-V: - Enable ring-based dirty memory tracking - Improve perf kvm stat to report interrupt events - Delegate illegal instruction trap to VS-mode - MMU improvements related to upcoming nested virtualization s390x - Fixes x86: - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC, and PIT emulation at compile time - Share device posted IRQ code between SVM and VMX and harden it against bugs and runtime errors - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1) instead of O(n) - For MMIO stale data mitigation, track whether or not a vCPU has access to (host) MMIO based on whether the page tables have MMIO pfns mapped; using VFIO is prone to false negatives - Rework the MSR interception code so that the SVM and VMX APIs are more or less identical - Recalculate all MSR intercepts from scratch on MSR filter changes, instead of maintaining shadow bitmaps - Advertise support for LKGS (Load Kernel GS base), a new instruction that's loosely related to FRED, but is supported and enumerated independently - Fix a user-triggerable WARN that syzkaller found by setting the vCPU in INIT_RECEIVED state (aka wait-for-SIPI), and then putting the vCPU into VMX Root Mode (post-VMXON). Trying to detect every possible path leading to architecturally forbidden states is hard and even risks breaking userspace (if it goes from valid to valid state but passes through invalid states), so just wait until KVM_RUN to detect that the vCPU state isn't allowed - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of APERF/MPERF reads, so that a "properly" configured VM can access APERF/MPERF. This has many caveats (APERF/MPERF cannot be zeroed on vCPU creation or saved/restored on suspend and resume, or preserved over thread migration let alone VM migration) but can be useful whenever you're interested in letting Linux guests see the effective physical CPU frequency in /proc/cpuinfo - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been created, as there's no known use case for changing the default frequency for other VM types and it goes counter to the very reason why the ioctl was added to the vm file descriptor. And also, there would be no way to make it work for confidential VMs with a "secure" TSC, so kill two birds with one stone - Dynamically allocation the shadow MMU's hashed page list, and defer allocating the hashed list until it's actually needed (the TDP MMU doesn't use the list) - Extract many of KVM's helpers for accessing architectural local APIC state to common x86 so that they can be shared by guest-side code for Secure AVIC - Various cleanups and fixes x86 (Intel): - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest. Failure to honor FREEZE_IN_SMM can leak host state into guests - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF x86 (AMD): - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the nested SVM MSRPM offsets tracker can't handle an MSR (which is pretty much a static condition and therefore should never happen, but still) - Fix a variety of flaws and bugs in the AVIC device posted IRQ code - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware supports) instead of rejecting vCPU creation - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning clear in the vCPU's physical ID table entry - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by erratum #1235, to allow (safely) enabling AVIC on such CPUs - Request GA Log interrupts if and only if the target vCPU is blocking, i.e. only if KVM needs a notification in order to wake the vCPU - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the vCPU's CPUID model - Accept any SNP policy that is accepted by the firmware with respect to SMT and single-socket restrictions. An incompatible policy doesn't put the kernel at risk in any way, so there's no reason for KVM to care - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and use WBNOINVD instead of WBINVD when possible for SEV cache maintenance - When reclaiming memory from an SEV guest, only do cache flushes on CPUs that have ever run a vCPU for the guest, i.e. don't flush the caches for CPUs that can't possibly have cache lines with dirty, encrypted data Generic: - Rework irqbypass to track/match producers and consumers via an xarray instead of a linked list. Using a linked list leads to O(n^2) insertion times, which is hugely problematic for use cases that create large numbers of VMs. Such use cases typically don't actually use irqbypass, but eliminating the pointless registration is a future problem to solve as it likely requires new uAPI - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *", to avoid making a simple concept unnecessarily difficult to understand - Decouple device posted IRQs from VFIO device assignment, as binding a VM to a VFIO group is not a requirement for enabling device posted IRQs - Clean up and document/comment the irqfd assignment code - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e. ensure an eventfd is bound to at most one irqfd through the entire host, and add a selftest to verify eventfd:irqfd bindings are globally unique - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues related to private <=> shared memory conversions - Drop guest_memfd's .getattr() implementation as the VFS layer will call generic_fillattr() if inode_operations.getattr is NULL - Fix issues with dirty ring harvesting where KVM doesn't bound the processing of entries in any way, which allows userspace to keep KVM in a tight loop indefinitely - Kill off kvm_arch_{start,end}_assignment() and x86's associated tracking, now that KVM no longer uses assigned_device_count as a heuristic for either irqbypass usage or MDS mitigation Selftests: - Fix a comment typo - Verify KVM is loaded when getting any KVM module param so that attempting to run a selftest without kvm.ko loaded results in a SKIP message about KVM not being loaded/enabled (versus some random parameter not existing) - Skip tests that hit EACCES when attempting to access a file, and print a "Root required?" help message. In most cases, the test just needs to be run with elevated permissions" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (340 commits) Documentation: KVM: Use unordered list for pre-init VGIC registers RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map() RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs RISC-V: perf/kvm: Add reporting of interrupt events RISC-V: KVM: Enable ring-based dirty memory tracking RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap RISC-V: KVM: Delegate illegal instruction fault to VS mode RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs RISC-V: KVM: Factor-out g-stage page table management RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence RISC-V: KVM: Introduce struct kvm_gstage_mapping RISC-V: KVM: Factor-out MMU related declarations into separate headers RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect() RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range() RISC-V: KVM: Don't flush TLB when PTE is unchanged RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize() RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init() RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list ...
2025-07-30Merge tag 'net-next-6.17' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Wrap datapath globals into net_aligned_data, to avoid false sharing - Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container) - Add SO_INQ and SCM_INQ support to AF_UNIX - Add SIOCINQ support to AF_VSOCK - Add TCP_MAXSEG sockopt to MPTCP - Add IPv6 force_forwarding sysctl to enable forwarding per interface - Make TCP validation of whether packet fully fits in the receive window and the rcv_buf more strict. With increased use of HW aggregation a single "packet" can be multiple 100s of kB - Add MSG_MORE flag to optimize large TCP transmissions via sockmap, improves latency up to 33% for sockmap users - Convert TCP send queue handling from tasklet to BH workque - Improve BPF iteration over TCP sockets to see each socket exactly once - Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code - Support enabling kernel threads for NAPI processing on per-NAPI instance basis rather than a whole device. Fully stop the kernel NAPI thread when threaded NAPI gets disabled. Previously thread would stick around until ifdown due to tricky synchronization - Allow multicast routing to take effect on locally-generated packets - Add output interface argument for End.X in segment routing - MCTP: add support for gateway routing, improve bind() handling - Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink - Add a new neighbor flag ("extern_valid"), which cedes refresh responsibilities to userspace. This is needed for EVPN multi-homing where a neighbor entry for a multi-homed host needs to be synced across all the VTEPs among which the host is multi-homed - Support NUD_PERMANENT for proxy neighbor entries - Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM - Add sequence numbers to netconsole messages. Unregister netconsole's console when all net targets are removed. Code refactoring. Add a number of selftests - Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol should be used for an inbound SA lookup - Support inspecting ref_tracker state via DebugFS - Don't force bonding advertisement frames tx to ~333 ms boundaries. Add broadcast_neighbor option to send ARP/ND on all bonded links - Allow providing upcall pid for the 'execute' command in openvswitch - Remove DCCP support from Netfilter's conntrack - Disallow multiple packet duplications in the queuing layer - Prevent use of deprecated iptables code on PREEMPT_RT Driver API: - Support RSS and hashing configuration over ethtool Netlink - Add dedicated ethtool callbacks for getting and setting hashing fields - Add support for power budget evaluation strategy in PSE / Power-over-Ethernet. Generate Netlink events for overcurrent etc - Support DPLL phase offset monitoring across all device inputs. Support providing clock reference and SYNC over separate DPLL inputs - Support traffic classes in devlink rate API for bandwidth management - Remove rtnl_lock dependency from UDP tunnel port configuration Device drivers: - Add a new Broadcom driver for 800G Ethernet (bnge) - Add a standalone driver for Microchip ZL3073x DPLL - Remove IBM's NETIUCV device driver - Ethernet high-speed NICs: - Broadcom (bnxt): - support zero-copy Tx of DMABUF memory - take page size into account for page pool recycling rings - Intel (100G, ice, idpf): - idpf: XDP and AF_XDP support preparations - idpf: add flow steering - add link_down_events statistic - clean up the TSPLL code - preparations for live VM migration - nVidia/Mellanox: - support zero-copy Rx/Tx interfaces (DMABUF and io_uring) - optimize context memory usage for matchers - expose serial numbers in devlink info - support PCIe congestion metrics - Meta (fbnic): - add 25G, 50G, and 100G link modes to phylink - support dumping FW logs - Marvell/Cavium: - support for CN20K generation of the Octeon chips - Amazon: - add HW clock (without timestamping, just hypervisor time access) - Ethernet virtual: - VirtIO net: - support segmentation of UDP-tunnel-encapsulated packets - Google (gve): - support packet timestamping and clock synchronization - Microsoft vNIC: - add handler for device-originated servicing events - allow dynamic MSI-X vector allocation - support Tx bandwidth clamping - Ethernet NICs consumer, and embedded: - AMD: - amd-xgbe: hardware timestamping and PTP clock support - Broadcom integrated MACs (bcmgenet, bcmasp): - use napi_complete_done() return value to support NAPI polling - add support for re-starting auto-negotiation - Broadcom switches (b53): - support BCM5325 switches - add bcm63xx EPHY power control - Synopsys (stmmac): - lots of code refactoring and cleanups - TI: - icssg-prueth: read firmware-names from device tree - icssg: PRP offload support - Microchip: - lan78xx: convert to PHYLINK for improved PHY and MAC management - ksz: add KSZ8463 switch support - Intel: - support similar queue priority scheme in multi-queue and time-sensitive networking (taprio) - support packet pre-emption in both - RealTek (r8169): - enable EEE at 5Gbps on RTL8126 - Airoha: - add PPPoE offload support - MDIO bus controller for Airoha AN7583 - Ethernet PHYs: - support for the IPQ5018 internal GE PHY - micrel KSZ9477 switch-integrated PHYs: - add MDI/MDI-X control support - add RX error counters - add cable test support - add Signal Quality Indicator (SQI) reporting - dp83tg720: improve reset handling and reduce link recovery time - support bcm54811 (and its MII-Lite interface type) - air_en8811h: support resume/suspend - support PHY counters for QCA807x and QCA808x - support WoL for QCA807x - CAN drivers: - rcar_canfd: support for Transceiver Delay Compensation - kvaser: report FW versions via devlink dev info - WiFi: - extended regulatory info support (6 GHz) - add statistics and beacon monitor for Multi-Link Operation (MLO) - support S1G aggregation, improve S1G support - add Radio Measurement action fields - support per-radio RTS threshold - some work around how FIPS affects wifi, which was wrong (RC4 is used by TKIP, not only WEP) - improvements for unsolicited probe response handling - WiFi drivers: - RealTek (rtw88): - IBSS mode for SDIO devices - RealTek (rtw89): - BT coexistence for MLO/WiFi7 - concurrent station + P2P support - support for USB devices RTL8851BU/RTL8852BU - Intel (iwlwifi): - use embedded PNVM in (to be released) FW images to fix compatibility issues - many cleanups (unused FW APIs, PCIe code, WoWLAN) - some FIPS interoperability - MediaTek (mt76): - firmware recovery improvements - more MLO work - Qualcomm/Atheros (ath12k): - fix scan on multi-radio devices - more EHT/Wi-Fi 7 features - encapsulation/decapsulation offload - Broadcom (brcm80211): - support SDIO 43751 device - Bluetooth: - hci_event: add support for handling LE BIG Sync Lost event - ISO: add socket option to report packet seqnum via CMSG - ISO: support SCM_TIMESTAMPING for ISO TS - Bluetooth drivers: - intel_pcie: support Function Level Reset - nxpuart: add support for 4M baudrate - nxpuart: implement powerup sequence, reset, FW dump, and FW loading" * tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits) dpll: zl3073x: Fix build failure selftests: bpf: fix legacy netfilter options ipv6: annotate data-races around rt->fib6_nsiblings ipv6: fix possible infinite loop in fib6_info_uses_dev() ipv6: prevent infinite loop in rt6_nlmsg_size() ipv6: add a retry logic in net6_rt_notify() vrf: Drop existing dst reference in vrf_ip6_input_dst net/sched: taprio: align entry index attr validation with mqprio net: fsl_pq_mdio: use dev_err_probe selftests: rtnetlink.sh: remove esp4_offload after test vsock: remove unnecessary null check in vsock_getname() igb: xsk: solve negative overflow of nb_pkts in zerocopy mode stmmac: xsk: fix negative overflow of budget in zerocopy mode dt-bindings: ieee802154: Convert at86rf230.txt yaml format net: dsa: microchip: Disable PTP function of KSZ8463 net: dsa: microchip: Setup fiber ports for KSZ8463 net: dsa: microchip: Write switch MAC address differently for KSZ8463 net: dsa: microchip: Use different registers for KSZ8463 net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver dt-bindings: net: dsa: microchip: Add KSZ8463 switch support ...
2025-07-29Merge tag 'irq-msi-2025-07-27' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI update from Thomas Gleixner: "A trivial cleanup in the PCI/MSI code to remove a duplicated back and forth conversion" * tag 'irq-msi-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Remove duplicated to_pci_dev() conversion
2025-07-29Merge tag 'kvmarm-6.17' of ↵Paolo Bonzini1-0/+20
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.17, round #1 - Host driver for GICv5, the next generation interrupt controller for arm64, including support for interrupt routing, MSIs, interrupt translation and wired interrupts. - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on GICv5 hardware, leveraging the legacy VGIC interface. - Userspace control of the 'nASSGIcap' GICv3 feature, allowing userspace to disable support for SGIs w/o an active state on hardware that previously advertised it unconditionally. - Map supporting endpoints with cacheable memory attributes on systems with FEAT_S2FWB and DIC where KVM no longer needs to perform cache maintenance on the address range. - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest hypervisor to inject external aborts into an L2 VM and take traps of masked external aborts to the hypervisor. - Convert more system register sanitization to the config-driven implementation. - Fixes to the visibility of EL2 registers, namely making VGICv3 system registers accessible through the VGIC device instead of the ONE_REG vCPU ioctls. - Various cleanups and minor fixes.
2025-07-23PCI: Fix typosBjorn Helgaas1-1/+1
Fix typos. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20250722213743.2822761-1-helgaas@kernel.org
2025-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+3
Cross-merge networking fixes after downstream PR (net-6.16-rc7). Conflicts: Documentation/netlink/specs/ovpn.yaml 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets") af52020fc599 ("ovpn: reject unexpected netlink attributes") drivers/net/phy/phy_device.c a44312d58e78 ("net: phy: Don't register LEDs for genphy") f0f2b992d818 ("net: phy: Don't register LEDs for genphy") https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org drivers/net/wireless/intel/iwlwifi/fw/regulatory.c drivers/net/wireless/intel/iwlwifi/mld/regulatory.c 5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap") ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware") net/ipv6/mcast.c ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()") a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()") https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10PCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()Himanshu Madhani1-1/+3
pci_msix_write_tph_tag() takes the per device MSI descriptor mutex and then invokes msi_domain_get_virq(), which takes the same mutex again. That obviously results in a system hang which is exposed by a softlockup or lockdep warning. Move the lock guard after the invocation of msi_domain_get_virq() to fix this. [ tglx: Massage changelog by adding a proper explanation and removing the not really useful stacktrace ] Fixes: d5124a9957b2 ("PCI/MSI: Provide a sane mechanism for TPH") Reported-by: Jorge Lopez <jorge.jo.lopez@oracle.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jorge Lopez <jorge.jo.lopez@oracle.com> Link: https://lore.kernel.org/all/20250708222530.1041477-1-himanshu.madhani@oracle.com
2025-07-08PCI/MSI: Add pci_msi_map_rid_ctlr_node() helper functionLorenzo Pieralisi1-0/+20
IRQchip drivers need a PCI/MSI function to map a RID to a MSI controller deviceID namespace and at the same time retrieve the struct device_node pointer of the MSI controller the RID is mapped to. Add pci_msi_map_rid_ctlr_node() to achieve this purpose. Cc Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-25-12e71f1b3528@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-18PCI/MSI: Remove duplicated to_pci_dev() conversionChris Li1-3/+3
In pci_msi_update_mask(), "lock = &to_pci_dev()" does the to_pci_dev() lookup, and there's another one buried inside msi_desc_to_pci_dev(). Introduce a local variable to remove that duplication. Signed-off-by: Chris Li <chrisl@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/all/20250617-pci-msi-avoid-dup-pcidev-v1-1-ed75b0419023@kernel.org
2025-06-17PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocationsShradha Gupta1-2/+3
For supporting dynamic MSI-X vector allocation by PCI controllers, enabling the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN is not enough, msix_prepare_msi_desc() to prepare the MSI descriptor is also needed. Export pci_msix_prepare_desc() to allow PCI controllers to support dynamic MSI-X vector allocation. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-06-04PCI/MSI: Size device MSI domain with the maximum number of vectorsMarc Zyngier3-7/+8
Zenghui reports that since 1396e89e09f0 ("genirq/msi: Move prepare() call to per-device allocation"), his Multi-MSI capable device isn't working anymore. This is a consequence of 15c72f824b32 ("PCI/MSI: Add support for per device MSI[X] domains"), which always creates a MSI domain of size 1, even in the presence of Multi-MSI. While this was somehow working until then, moving the .prepare() call ends up sizing the ITS table with a tiny value for this device, and making the endpoint driver unhappy. Instead, always create the domain and call the .prepare() helper with the maximum expected size. Fixes: 1396e89e09f0 ("genirq/msi: Move prepare() call to per-device allocation") Fixes: 15c72f824b32 ("PCI/MSI: Add support for per device MSI[X] domains") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/all/20250603141801.915305-1-maz@kernel.org Closes: https://lore.kernel.org/r/0b1d7aec-1eac-a9cd-502a-339e216e08a1@huawei.com
2025-05-27Merge tag 'irq-msi-2025-05-25' of ↵Linus Torvalds3-73/+116
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the MSI subsystem (core code and PCI): - Switch the MSI descriptor locking to lock guards - Replace a broken and naive implementation of PCI/MSI-X control word updates in the PC