aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/nova-core/gsp
AgeCommit message (Collapse)AuthorFilesLines
11 daysgpu: nova-core: convert to keyworded projection syntaxGary Guo1-3/+3
Use "build" to denote that the index bounds checking here is performed at build time. Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Gary Guo <gary@garyguo.net> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-5-6989470f5440@garyguo.net Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-15Merge tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds8-381/+821
Pull drm updates from Dave Airlie: "Highlights: - new DRM RAS infrastructure using netlink - amdgpu: enable DC on CIK APUs, and more IP enablement, and more user queue work - xe: purgeable BO support, and new hw enablement - dma-buf : add revocable operations Full summary: mm: - two-pass MMU interval notifiers - add gpu active/reclaim per-node stat counters math: - provide __KERNEL_DIV_ROUND_CLOSEST() in UAPI - implement DIV_ROUND_CLOSEST() with __KERNEL_DIV_ROUND_CLOSEST() rust: - shared tag with driver-core: register macro and io infra - core: rework DMA coherent API - core: add interop::list to interop with C linked lists - core: add more num::Bounded operations - core: enable generic_arg_infer and add EMSGSIZE - workqueue: add ARef<T> support for work and delayed work - add GPU buddy allocator abstraction - add DRM shmem GEM helper abstraction - allow drm:::Device to dispatch work and delayed work items to driver private data - add dma_resv_lock helper and raw accessors core: - introduce DRM RAS infrastructure over netlink - add connector panel_type property - fourcc: add ARM interleaved 64k modifier - colorop: add destroy helper - suballoc: split into alloc and init helpers - mode: provide DRM_ARGB_GET*() macros for reading color components edid: - provide drm_output_color_Format dma-buf: - provide revoke mechanism for shared buffers - rename move_notify to invalidate_mappings - always enable move_notify - protect dma_fence_ops with RCU and improve locking - clean pages with helpers atomic: - allocate drm_private_state via callback - helper: use system_percpu_wq buddy: - make buddy allocator available to gpu level - add kernel-doc for buddy allocator - improve aligned allocation ttm: - fix fence signalling - improve tests and docs - improve handling of gfp_retry_mayfail - use per-node stat counters to track memory allocations - port pool to use list_lru - drop NUMA specific pools - make pool shrinker numa aware - track allocated pages per numa node coreboot: - cleanup coreboot framebuffer support sched: - fix race condition in drm_sched_fini pagemap: - enable THP support - pass pagemap_addr by reference gem-shmem: - Track page accessed/dirty status across mmap/vmap gpusvm: - reenable device to device migration - fix unbalanced unclock bridge: - anx7625: Support USB-C plus DT bindings - connector: Fix EDID detection - dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve others - fsl-ldb: Fix visual artifacts plus related DT property 'enable-termination-resistor' - imx8qxp-pixel-link: Improve bridge reference handling - lt9611: Support Port-B-only input plus DT bindings - tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up - Support TH1520 HDMI plus DT bindings - waveshare-dsi: Fix register and attach; Support 1..4 DSI lanes plus DT bindings - anx7625: Fix USB Type-C handling - cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check - Support Lontium LT8713SX DP MST bridge plus DT bindings - analogix_dp: Use DP helpers for link training panel: - panel-jdi-lt070me05000: Use mipi-dsi multi functions - panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN N116BCL-EAK (C2); Support FriendlyELEC plus DT changes - panel-edp: Fix timings for BOE NV140WUM-N64 - ilitek-ili9882t: Allow GPIO calls to sleep - jadard: Support TAIGUAN XTI05101-01A - lxd: Support LXD M9189A plus DT bindings - mantix: Fix pixel clock; Clean up - motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings - novatek: Support Novatek/Tianma NT37700F plus DT bindings - simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3" - novatek-nt36672a: Use mipi_dsi_*_multi() functions - panel-edp: Support BOE NV153WUM-N42, CMN N153JCA-ELK, CSW MNF307QS3-2 - support Himax HX83121A plus DT bindings - support JuTouch JT070TM041 plus DT bindings - support Samsung S6E8FC0 plus DT bindings - himax-hx83102c: support Samsung S6E8FC0 plus DT bindings; support backlight - ili9806e: support Rocktech RK050HR345-CT106A plus DT bindings - simple: support Tianma TM050RDH03 plus DT bindings amdgpu: - enable DC by default on CIK APUs - userq fence ioctl param size fixes - set panel_type to OLED for eDP - refactor DC i2c code - FAMS2 update - rework ttm handling to allow multiple engines - DC DCE 6.x cleanup - DC support for NUTMEG/TRAVIS DP bridge - DCN 4.2 support - GC12 idle power fix for compute - use struct drm_edid in non-DC code - enable NV12/P010 support on primary planes - support newer IP discovery tables - VCN/JPEG 5.0.2 support - GC/MES 12.1 updates - USERQ fixes - add DC idle state manager - eDP DSC seamless boot amdkfd: - GC 12.1 updates - non 4K page fixes xe: - basic Xe3p_LPG and NVL-P enabling patches - allow VM_BIND decompress support - add purgeable buffer object support - add xe_vm_get_property_ioctl - restrict multi-lrc to VCS/VECS engines - allow disabling VM overcommit in fault mode - dGPU memory optimizations - Workaround cleanups and simplification - Allow VFs VRAM quote changes using sysfs - convert GT stats to per-cpu counters - pagefault refactors - enable multi-queue on xe3p_xpc - disable DCC on PTL - make MMIO communication more robust - disable D3Cold for BMG on specific platforms - vfio: improve FLR sync for Xe VFIO i915/display: - C10/C20/LT PHY PLL divider verification - use trans push mechanism to generate PSR frame change on LNL+ - refactor DP DSC slice config - VGA decode refactoring - refactor DPT, gen2-4 overlay, masked field register macro helpers - refactor stolen memory allocation decisions - prepare for UHBR DP tunnels - refactor LT PHY PLL to use DPLL framework - implement register polling/waiting in display code - add shared stepping header between i915 and display i915: - fix potential overflow of shmem scatterlist length nouveau: - provide Z cull info to userspace - initial GA100 support - shutdown on PCI device shutdown nova-core: - harden GSP command queue - add support for large RPCs - simplify GSP sequencer and message handling - refactor falcon firmware handling - convert to new register macro - conver to new DMA coherent API - use checked arithmetic - add debugfs support for gsp-rm log buffers - fix aux device registration for multi-GPU msm: - CI: - Uprev mesa - Restore CI jobs for Qualcomm APQ8016 and APQ8096 devices - Core: - Switched to of_get_available_child_by_name() - DPU: - Fixes for DSC panels - Fixed brownout because of the frequency / OPP mismatch - Quad pipe preparation (not enabled yet) - Switched to virtual planes by default - Dropped VBIF_NRT support - Added support for Eliza platform - Reworked alpha handling - Switched to correct CWB definitions on Eliza - Dropped dummy INTF_0 on MSM8953 - Corrected INTFs related to DP-MST - DP: - Removed debug prints looking into PHY internals - DSI: - Fixes for DSC panels - RGB101010 support - Support for SC8280XP - Moved PHY bindings from display/ to phy/ - GPU: - Preemption support for x2-85 and a840 - IFPC support for a840 - SKU detection support for x2-85 and a840 - Expose AQE support (VK ray-pipeline) - Avoid locking in VM_BIND fence signaling path - Fix to avoid reclaim in GPU snapshot path - Disallow foreign mapping of _NO_SHARE BOs - HDMI: - Fixed infoframes programming - MDP5: - Dropped support for MSM8974v1 - Dropped now unused code for MSM8974 v1 and SDM660 / MSM8998 panthor: - add tracepoints for power and IRQs - fix fence handling - extend timestamp query with flags - support various sources for timestamp queries tyr: - fix names and model/versions rockchip: - vop2: use drm logging function - rk3576 displayport support - support CRTC background color atmel-hlcdc: - support sana5d65 LCD controller tilcdc: - use DT bindings schema - use managed DRM interfaces - support DRM_BRIDGE_ATTACH_NO_CONNECTOR verisilicon: - support DC8200 + DT bindings virtgpu: - support PRIME import with 3D enabled komeda: - fix integer overflow in AFBC checks mcde: - improve bridge handling gma500: - use drm client buffer for fbdev framebuffer amdxdna: - add sensors ioctls - provide NPU power estimate - support column utilization sensor - allow forcing DMA through IOMMU IOVA - support per-BO mem usage queries - refactor GEM implementation ivpu: - update boot API to v3.29.4 - limit per-user number of doorbells/contexts - perform engine reset on TDR error loongson: - replace custom code with drm_gem_ttm_dumb_map_offset() imx: - support planes behind the primary plane - fix bus-format selection vkms: - support CRTC background color v3d: - improve handling of struct v3d_stats komeda: - support Arm China Linlon D6 plus DT bindings imagination: - improve power-off sequence - support context-reset notification from firmware mediatek: - mtk_dsi: enable hs clock during pre-enable - Remove all conflicting aperture devices during probe - Add support for mt8167 display blocks" * tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel: (1735 commits) drm/ttm/tests: Remove checks from ttm_pool_free_no_dma_alloc drm/ttm/tests: fix lru_count ASSERT drm/vram: remove DRM_VRAM_MM_FILE_OPERATIONS from docs drm/fb-helper: Fix a locking bug in an error path dma-fence: correct kernel-doc function parameter @flags ttm/pool: track allocated_pages per numa node. ttm/pool: make pool shrinker NUMA aware (v2) ttm/pool: drop numa specific pools ttm/pool: port to list_lru. (v2) drm/ttm: use gpu mm stats to track gpu memory allocations. (v4) mm: add gpu active/reclaim per-node stat counters (v2) gpu: nova-core: fix missing colon in SEC2 boot debug message gpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing gpu: nova-core: bitfield: fix broken Default implementation gpu: nova-core: falcon: pad firmware DMA object size to required block alignment gpu: nova-core: gsp: fix undefined behavior in command queue code drm/shmem_helper: Make sure PMD entries get the writeable upgrade accel/ivpu: Trigger recovery on TDR with OS scheduling drm/msm: Use of_get_available_child_by_name() dt-bindings: display/msm: move DSI PHY bindings to phy/ subdir ...
2026-04-08Merge tag 'rust-timekeeping-for-v7.1' of ↵Miguel Ojeda3-93/+103
https://github.com/Rust-for-Linux/linux into rust-next Pull timekeeping updates from Andreas Hindborg: - Expand the example section in the 'HrTimer' documentation. - Mark the 'ClockSource' trait as unsafe to ensure valid values for 'ktime_get()'. - Add 'Delta::from_nanos()'. This is a back merge since the pull request has a newer base -- we will avoid that in the future. And, given it is a back merge, it happens to resolve the "subtle" conflict around '--remap-path-{prefix,scope}' that I discussed in linux-next [1], plus a few other common conflicts. The result matches what we did for next-20260407. The actual diffstat (i.e. using a temporary merge of upstream first) is: rust/kernel/time.rs | 32 ++++- rust/kernel/time/hrtimer.rs | 336 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 362 insertions(+), 6 deletions(-) Link: https://lore.kernel.org/linux-next/CANiq72kdxB=W3_CV1U44oOK3SssztPo2wLDZt6LP94TEO+Kj4g@mail.gmail.com/ [1] * tag 'rust-timekeeping-for-v7.1' of https://github.com/Rust-for-Linux/linux: hrtimer: add usage examples to documentation rust: time: make ClockSource unsafe trait rust/time: Add Delta::from_nanos()
2026-04-07rust: bump Clippy's MSRV and clean `incompatible_msrv` allowsMiguel Ojeda1-5/+1
Following the Rust compiler bump, we can now update Clippy's MSRV we set in the configuration, which will improve the diagnostics it generates. Thus do so and clean a few of the `allow`s that are not needed anymore. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-7-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-07gpu: nova-core: bindings: remove unneeded `cfg_attr`Miguel Ojeda1-3/+0
These were likely copied from the `bindings` and `uapi` crates, but are unneeded since there are no `cfg(test)`s in the bindings. In addition, the issue that triggered the addition in those crates originally is also fixed in `bindgen` (please see the previous commit). Thus remove them. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260405235309.418950-5-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2026-04-06gpu: nova-core: fix missing colon in SEC2 boot debug messageDavid Carlier1-1/+1
The SEC2 mailbox debug output formats MBOX1 without a colon separator, producing "MBOX10xdead" instead of "MBOX1: 0xdead". The GSP debug message a few lines above uses the correct format. Fixes: 5949d419c193 ("gpu: nova-core: gsp: Boot GSP") Signed-off-by: David Carlier <devnexen@gmail.com> Link: https://patch.msgid.link/20260331103744.605683-1-devnexen@gmail.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-04-05gpu: nova-core: gsp: fix undefined behavior in command queue codeAlexandre Courbot1-46/+68
`driver_read_area` and `driver_write_area` are internal methods that return slices containing the area of the command queue buffer that the driver has exclusive read or write access, respectively. While their returned value is correct and safe to use, internally they temporarily create a reference to the whole command-buffer slice, including GSP-owned regions. These regions can change without notice, and thus creating a slice to them, even if never accessed, is undefined behavior. Fix this by making these methods create slices to valid regions only. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Reported-by: Danilo Krummrich <dakr@kernel.org> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/ Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260404-cmdq-ub-fix-v5-1-53d21f4752f5@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-26gpu: nova-core: convert PFB registers to kernel register macroAlexandre Courbot2-7/+8
Convert all PFB registers to use the kernel's register macro and update the code accordingly. NV_PGSP_QUEUE_HEAD was somehow caught in the PFB section, so move it to its own section and convert it as well. Reviewed-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260325-b4-nova-register-v4-4-bdf172f0f6ca@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-26gpu: nova-core: convert PBUS registers to kernel register macroAlexandre Courbot1-1/+4
Convert all PBUS registers to use the kernel's register macro and update the code accordingly. Reviewed-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260325-b4-nova-register-v4-3-bdf172f0f6ca@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-24gpu: nova-core: gsp: move Cmdq's DMA handle to a struct memberAlexandre Courbot2-12/+16
The command-queue structure has a `dma_handle` method that returns the DMA handle to the memory segment shared with the GSP. This works, but is not ideal for the following reasons: - That method is effectively only ever called once, and is technically an accessor method since the handle doesn't change over time, - It feels a bit out-of-place with the other methods of `Cmdq` which only deal with the sending or receiving of messages, - The method has `pub(crate)` visibility, allowing other driver code to access this highly-sensitive handle. Address all these issues by turning `dma_handle` into a struct member with `pub(super)` visibility. This keeps the method space focused, and also ensures the member is not visible outside of the modules that need it. Reviewed-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260319-b4-cmdq-dma-handle-v1-1-57840b4a4f90@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-23gpu: nova-core: convert to new dma::Coherent APIGary Guo2-43/+24
Remove all usages of dma::CoherentAllocation and use the new dma::Coherent type instead. Signed-off-by: Gary Guo <gary@garyguo.net> Co-developed-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://patch.msgid.link/20260320194626.36263-9-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-23gpu: nova-core: convert Gsp::new() to use CoherentBoxDanilo Krummrich1-18/+44
Convert libos (LibosMemoryRegionInitArgument) and rmargs (GspArgumentsPadded) to use CoherentBox / Coherent::init() and simplify the initialization. This also avoids separate initialization on the stack. Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://patch.msgid.link/20260320194626.36263-8-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-23gpu: nova-core: use Coherent::init to initialize GspFwWprMetaDanilo Krummrich2-10/+17
Convert wpr_meta to use Coherent::init() and simplify the initialization. It also avoids a separate initialization of GspFwWprMeta on the stack. Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://patch.msgid.link/20260320194626.36263-7-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-18gpu: nova-core: gsp: add mutex locking to CmdqEliot Courtney4-77/+107
Wrap `Cmdq`'s mutable state in a new struct `CmdqInner` and wrap that in a Mutex. This lets `Cmdq` methods take &self instead of &mut self, which lets required commands be sent e.g. while unloading the driver. The mutex is held over both send and receive in `send_command` to make sure that it doesn't get the reply of some other command that could have been sent just beforehand. Reviewed-by: Zhi Wang <zhiw@nvidia.com> Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260318-cmdq-locking-v5-5-18b37e3f9069@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18gpu: nova-core: gsp: make `Cmdq` a pinned typeEliot Courtney1-5/+4
Make `Cmdq` a pinned type. This is needed to use Mutex, which is needed to add locking to `Cmdq`. Reviewed-by: Zhi Wang <zhiw@nvidia.com> Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260318-cmdq-locking-v5-4-18b37e3f9069@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18gpu: nova-core: gsp: add reply/no-reply info to `CommandToGsp`Eliot Courtney4-16/+75
Add type infrastructure to know what reply is expected from each `CommandToGsp`. Uses a marker type `NoReply` which does not implement `MessageFromGsp` to mark commands which don't expect a response. Update `send_command` to wait for a reply and add `send_command_no_wait` which sends a command that has no reply, without blocking. This prepares for adding locking to the queue. Tested-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260318-cmdq-locking-v5-3-18b37e3f9069@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18gpu: nova-core: gsp: add `RECEIVE_TIMEOUT` constant for command queueEliot Courtney3-4/+6
Remove magic numbers and add a default timeout for callers to use. Tested-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260318-cmdq-locking-v5-2-18b37e3f9069@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-18gpu: nova-core: gsp: fix stale doc comments on command queue methodsEliot Courtney1-9/+8
Fix some inaccuracies / old doc comments. Reviewed-by: Zhi Wang <zhiw@nvidia.com> Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260318-cmdq-locking-v5-1-18b37e3f9069@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-15Merge tag 'v7.0-rc4' into drm-rust-nextDanilo Krummrich3-95/+103
We need the latest fixes from drm-rust-fixes in drm-rust-next as well to build on top of. Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-11gpu: nova-core: gsp: fix UB in DmaGspMem pointer accessorsDanilo Krummrich2-88/+84
The DmaGspMem pointer accessor methods (gsp_write_ptr, gsp_read_ptr, cpu_read_ptr, cpu_write_ptr, advance_cpu_read_ptr, advance_cpu_write_ptr) dereference a raw pointer to DMA memory, creating an intermediate reference before calling volatile read/write methods. This is undefined behavior since DMA memory can be concurrently modified by the device. Fix this by moving the implementations into a gsp_mem module in fw.rs that uses the dma_read!() / dma_write!() macros, making the original methods on DmaGspMem thin forwarding wrappers. An alternative approach would have been to wrap the shared memory in Opaque, but that would have required even more unsafe code. Since the gsp_mem module lives in fw.rs (to access firmware-specific binding field names), GspMem, Msgq and their relevant fields are temporarily widened to pub(super). This will be reverted once IoView projections are available. Cc: Gary Guo <gary@garyguo.net> Closes: https://lore.kernel.org/nouveau/DGUT14ILG35P.1UMNRKU93JUM1@kernel.org/ Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://patch.msgid.link/20260309225408.27714-1-dakr@kernel.org [ Use pub(super) where possible; replace bitwise-and with modulo operator analogous to [1]. - Danilo ] Link: https://lore.kernel.org/all/20260129-nova-core-cmdq1-v3-1-2ede85493a27@nvidia.com/ [1] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-10gpu: nova-core: fix stack overflow in GSP memory allocationTim Kovalenko1-2/+12
The `Cmdq::new` function was allocating a `PteArray` struct on the stack and was causing a stack overflow with 8216 bytes. Modify the `PteArray` to calculate and write the Page Table Entries directly into the coherent DMA buffer one-by-one. This reduces the stack usage quite a lot. Reported-by: Gary Guo <gary@garyguo.net> Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/509436-Nova/topic/.60Cmdq.3A.3Anew.60.20uses.20excessive.20stack.20size/near/570375549 Link: https://lore.kernel.org/rust-for-linux/CANiq72mAQxbRJZDnik3Qmd4phvFwPA01O2jwaaXRh_T+2=L-qA@mail.gmail.com/ Fixes: f38b4f105cfc ("gpu: nova-core: Create initial Gsp") Acked-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Tim Kovalenko <tim.kovalenko@proton.me> Link: https://patch.msgid.link/20260309-drm-rust-next-v4-4-4ef485b19a4c@proton.me [ * Use PteArray::entry() in LogBuffer::new(), * Add TODO comment to use IoView projections once available, * Add PTE_ARRAY_SIZE constant to avoid duplication. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-10gpu: nova-core: apply the one "use" item per line policy to commands.rsJohn Hubbard1-3/+9
As per [1], we need one "use" item per line, in order to reduce merge conflicts. Furthermore, we need a trailing ", //" in order to tell rustfmt(1) to leave it alone. This does that for commands.rs, which is the only file in nova-core that has any remaining instances of the old style. [1] https://docs.kernel.org/rust/coding-guidelines.html#imports Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Link: https://patch.msgid.link/20260310021125.117855-7-jhubbard@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: add FbRange.len() and use it in boot.rsJohn Hubbard1-1/+1
A tiny simplification: now that FbLayout uses its own specific FbRange type, add an FbRange.len() method, and use that to (very slightly) simplify the calculation of Frts::frts_size initialization. Suggested-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Link: https://patch.msgid.link/20260310021125.117855-3-jhubbard@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: add tests for continuation recordsEliot Courtney1-0/+138
Add tests for continuation record splitting. They cover boundary conditions at the split points to make sure the right number of continuation records are made. They also check that the data concatenated is correct. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-9-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: support large RPCs via continuation recordEliot Courtney3-2/+207
Splits large RPCs if necessary and sends the remaining parts using continuation records. RPCs that do not need continuation records continue to write directly into the command buffer. Ones that do write into a staging buffer first, so there is one copy. Continuation record for receive is not necessary to support at the moment because those replies do not need to be read and are currently drained by retrying `receive_msg` on `ERANGE`. Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-8-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: add `size` helper to `CommandToGsp`Eliot Courtney1-3/+9
Add a default method to `CommandToGsp` which computes the size of a command. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-7-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: unconditionally call variable payload handlingEliot Courtney1-9/+7
Unconditionally call the variable length payload code, which is a no-op if there is no such payload but could defensively catch some coding errors by e.g. checking that the allocated size is completely filled. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-6-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: clarify invariant on command queueEliot Courtney1-1/+3
Clarify why using only the first returned slice from allocate_command for the message headers is okay. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-5-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: add checking oversized commandsEliot Courtney3-1/+11
The limit is 16 pages for a single command sent to the GSP. Return an error if `allocate_command` is called with a too large size. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-4-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: add mechanism to wait for space on command queueEliot Courtney1-12/+30
Add a timeout to `allocate_command` which waits for space on the GSP command queue. It uses a similar timeout to nouveau. This lets `send_command` wait for space to free up in the command queue. This is required to support continuation records which can fill up the queue. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-2-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-10gpu: nova-core: gsp: sort `MsgFunction` variants alphabeticallyEliot Courtney1-32/+35
There is no particular order required here and keeping them alphabetical will help preventing future mistakes. Tested-by: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260306-cmdq-continuation-v6-1-cc7b629200ee@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-09gpu: nova-core: use the Generic Bootloader to boot FWSEC on TuringTimur Tabi1-3/+12
On Turing and GA100, a new firmware image called the Generic Bootloader (gen_bootloader) must be used to load FWSEC into Falcon memory. The driver loads the generic bootloader into Falcon IMEM, passes a descriptor that points to FWSEC using DMEM, and then boots the generic bootloader. The bootloader will then load FWSEC into IMEM and boot it. Signed-off-by: Timur Tabi <ttabi@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260306-turing_prep-v11-12-8f0042c5d026@nvidia.com
2026-03-09gpu: nova-core: create falcon firmware DMA objects lazilyAlexandre Courbot1-1/+1
When DMA was the only loading option for falcon firmwares, we decided to store them in DMA objects as soon as they were loaded from disk and patch them in-place to avoid having to do an extra copy. This decision complicates the PIO loading patch considerably, and actually does not even stand on its own when put into perspective with the fact that it requires 8 unsafe statements in the code that wouldn't exist if we stored the firmware into a `KVVec` and copied it into a DMA object at the last minute. The cost of the copy is, as can be expected, imperceptible at runtime. Thus, switch to a lazy DMA object creation model and simplify our code a bit. This will also have the nice side-effect of being more fit for PIO loading. Reviewed-by: Eliot Courtney <ecourtney@nvidia.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260306-turing_prep-v11-1-8f0042c5d026@nvidia.com [acourbot@nvidia.com: add TODO item to switch back to a coherent allocation when it becomes convenient to do so.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-03-07rust: dma: use pointer projection infra for `dma_{read,write}` macroGary Guo2-4/+8
Current `dma_read!`, `dma_write!` macros also use a custom `addr_of!()`-based implementation for projecting pointers, which has soundness issue as it relies on absence of `Deref` implementation on types. It also has a soundness issue where it does not protect against unaligned fields (when `#[repr(packed)]` is used) so it can generate misaligned accesses. This commit migrates them to use the general pointer projection infrastructure, which handles these cases correctly. As part of migration, the macro is updated to have an improved surface syntax. The current macro have dma_read!(a.b.c[d].e.f) to mean `a.b.c` is a DMA coherent allocation and it should project into it with `[d].e.f` and do a read, which is confusing as it makes the indexing operator integral to the macro (so it will break if you have an array of `CoherentAllocation`, for example). This also is problematic as we would like to generalize `CoherentAllocation` from just slices to arbitrary types. Make the macro expects `dma_read!(path.to.dma, .path.inside.dma)` as the canonical syntax. The index operator is no longer special and is just one type of projection (in additional to field projection). Similarly, make `dma_write!(path.to.dma, .path.inside.dma, value)` become the canonical syntax for writing. Another issue of the current macro is that it is always fallible. This makes sense with existing design of `CoherentAllocation`, but once we support fixed size arrays with `CoherentAllocation`, it is desirable to have the ability to perform infallible indexing as well, e.g. doing a `[0]` index of `[Foo; 2]` is okay and can be checked at build-time, so forcing falliblity is non-ideal. To capture this, the macro is changed to use `[idx]` as infallible projection and `[idx]?` as fallible index projection (those syntax are part of the general projection infra). A benefit of this is that while individual indexing operation may fail, the overall read/write operation is not fallible. Fixes: ad2907b4e308 ("rust: add dma coherent allocator abstraction") Reviewed-by: Benno Lossin <lossin@kernel.org> Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260302164239.284084-4-gary@kernel.org [ Capitalize safety comments; slightly improve wording in doc-comments. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-02-25gpu: nova-core: gsp: derive Zeroable for GspStaticConfigInfoAlexandre Courbot1-4/+1
We can now derive `Zeroable` on tuple structs, so do this instead of providing our own implementation. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-6-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: derive `Debug` on more sequencer typesAlexandre Courbot2-5/+6
Being able to print these is useful when debugging the sequencer. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-5-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: remove unneeded sequencer traitAlexandre Courbot1-11/+6
The `GspSeqCmdRunner` trait is never used as we never call the `run` methods from generic code. Remove it. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-4-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: simplify sequencer opcode parsingAlexandre Courbot1-35/+5
The opcodes are already the right type in the C union, so we can use them directly instead of converting them to a byte stream and back again using `FromBytes`. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-3-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: remove unnecessary Display implsAlexandre Courbot2-55/+1
We only ever display these in debug context, for which the automatically derived `Debug` impls work just fine - so use them and remove these boilerplate-looking implementations. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-2-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: warn if data remains after processing a messageAlexandre Courbot1-1/+11
Not processing the whole data from a received message is a strong indicator of a bug - emit a warning when such cases are detected. Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260217-nova-misc-v3-1-b4e2d45eafbc@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: fix improper indexing in driver_read_areaEliot Courtney1-8/+13
The current code indexes into `after_rx` using `tx` which is an index for the whole buffer, not the split buffer `after_rx`. Also add more rigorous no-panic proofs. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260129-nova-core-cmdq1-v3-5-2ede85493a27@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: fix improper handling of empty slot in cmdqEliot Courtney1-14/+20
The current code hands out buffers that go all the way up to and including `rx - 1`, but we need to maintain an empty slot to prevent the ring buffer from wrapping around into having 'tx == rx', which means empty. Also add more rigorous no-panic proofs. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260129-nova-core-cmdq1-v3-4-2ede85493a27@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: use empty slices instead of [0..0] rangesEliot Courtney1-4/+4
The current code unnecessarily uses, for example, &before_rx[0..0] to return an empty slice. Instead, just use an empty slice. Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260129-nova-core-cmdq1-v3-3-2ede85493a27@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: clarify comments about invariants and pointer rolesEliot Courtney1-8/+10
Disambiguate a few things in comments in cmdq.rs. Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260129-nova-core-cmdq1-v3-2-2ede85493a27@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-25gpu: nova-core: gsp: fix incorrect advancing of write pointerEliot Courtney1-1/+1
We should modulo not bitwise-and here. The current code could, for example, set wptr to MSGQ_NUM_PAGES which is not valid. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Signed-off-by: Eliot Courtney <ecourtney@nvidia.com> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260129-nova-core-cmdq1-v3-1-2ede85493a27@nvidia.com Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
2026-02-24gpu: nova-core: remove redundant `.as_ref()` for `dev_*` printGary Guo1-25/+7
This is now handled by the macro itself. Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260123175854.176735-7-gary@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-02-11Merge tag 'driver-core-7.0-rc1' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "Bus: - Ensure bus->match() is consistently called with the device lock held - Improve type safety of bus_find_device_by_acpi_dev() Devtmpfs: - Parse 'devtmpfs.mount=' boot parameter with kstrtoint() instead of simple_strtoul() - Avoid sparse warning by making devtmpfs_context_ops static IOMMU: - Do not register the qcom_smmu_tbu_driver in arm_smmu_device_probe() MAINTAINERS: - Add the new driver-core mailing list (driver-core@lists.linux.dev) to all relevant entries - Add missing tree location for "FIRMWARE LOADER (request_firmware)" - Add driver-model documentation to the "DRIVER CORE" entry - Add missing driver-core maintainers to the "AUXILIARY BUS" entry Misc: - Change return type of attribute_container_register() to void; it has always been infallible - Do not export sysfs_change_owner(), sysfs_file_change_owner() and device_change_owner() - Move devres_for_each_res() from the public devres header to drivers/base/base.h - Do not use a static struct device for the faux bus; allocate it dynamically Revocable: - Patches for the revocable synchronization primitive have been scheduled for v7.0-rc1, but have been reverted as they need some more refinement Rust: - Device: - Support dev_printk on all device types, not just the core Device struct; remove now-redundant .as_ref() calls in dev_* print calls - Devres: - Introduce an internal reference count in Devres<T> to avoid a deadlock condition in c