path: root/drivers/gpu
Age  Commit message  Author  Files  Lines
2025-12-08drm/amdkfd: Fix PTE clearing during SVM unmap on GFX 12.1Mukul Joshi1-1/+1
During migration from VRAM to RAM, when PTE is cleared, reset the PTE to always ensure that PTE.P=1 is set on GFX 12.1. If PTE.P is not set, it can lead to TF faults. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Enable PDE.C usage on GFX 12.1Mukul Joshi1-12/+4
On GFX 12.1, PDE.C is ignored if (PDE|PTE)_REQUEST_PHYSICAL is not set up in the GCVM control register. Always set this field to enable PDE.C usage. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Always set snoop bit in PDE on GFX 12.1Mukul Joshi1-0/+2
GFX 12.1 has the requirement to always set snoop bit in PDE to maintain coherency. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add per-ASIC PTE init flagMukul Joshi3-2/+4
On GFX12.1, default PTE setup needs an additional bit to be set. Add PTE initialization flags to handle default PTE setup on a per-ASIC basis. While at it, fix up the coding style too. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add gmc v12_1 gmc callbacksHawking Zhang4-2/+362
Implement gmc v12_1 gmc callbacks v2: revert temporary PDE MTYPE to UC setting Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add gmc v12_1 supportLikun Gao1-6/+28
Add gmc support for gc version 12_1_0. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add gfxhub v12_1 supportHawking Zhang3-1/+928
gfxhub v12_1 is a new generation ip v2: squash in update to new IP headers v3: squash in cast fix Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add gc v12_1_0 ip headers v4Hawking Zhang2-0/+57027
Add header files for gc v12_1_0 register offsets and shift masks v2: Update gc v12_1_0 ip headers v3: Update gc v12_1_0 ip headers v4, v5: Clean up registers (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add osssys v7_1_0 ip headers v3Hawking Zhang2-0/+1304
Add header files for osssys v7_1_0 register offsets and shift masks v2: Update osssys v7_1_0 ip headers to the latest version v3: Clean up registers (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add initial support for mmhub v4_2Likun Gao3-1/+943
Add initial support for mmhub v4_2_0. v2: squash in cast fix Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: fix spelling in gmc9/10 codeAlex Deucher2-2/+2
onyl -> only Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu/ras: Move ras data alloc before bad page checkAsad Kamal1-5/+5
In the rare event that the eeprom has only invalid address entries, allocation is skipped, which causes the following NULL pointer issue:
[ 547.103445] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 547.118897] #PF: supervisor read access in kernel mode
[ 547.130292] #PF: error_code(0x0000) - not-present page
[ 547.141689] PGD 124757067 P4D 0
[ 547.148842] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 547.158504] CPU: 49 PID: 8167 Comm: cat Tainted: G OE 6.8.0-38-generic #38-Ubuntu
[ 547.177998] Hardware name: Supermicro AS -8126GS-TNMR/H14DSG-OD, BIOS 1.7 09/12/2025
[ 547.195178] RIP: 0010:amdgpu_ras_sysfs_badpages_read+0x2f2/0x5d0 [amdgpu]
[ 547.210375] Code: e8 63 78 82 c0 45 31 d2 45 3b 75 08 48 8b 45 a0 73 44 44 89 f1 48 8b 7d 88 48 89 ca 48 c1 e2 05 48 29 ca 49 8b 4d 00 48 01 d1 <48> 83 79 10 00 74 17 49 63 f2 48 8b 49 08 41 83 c2 01 48 8d 34 76
[ 547.252045] RSP: 0018:ffa0000067287ac0 EFLAGS: 00010246
[ 547.263636] RAX: ff11000167c28130 RBX: ff11000127600000 RCX: 0000000000000000
[ 547.279467] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff11000125b1c800
[ 547.295298] RBP: ffa0000067287b50 R08: 0000000000000000 R09: 0000000000000000
[ 547.311129] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 547.326959] R13: ff11000217b1de00 R14: 0000000000000000 R15: 0000000000000092
[ 547.342790] FS: 0000746e59d14740(0000) GS:ff11017dfda80000(0000) knlGS:0000000000000000
[ 547.360744] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.373489] CR2: 0000000000000010 CR3: 000000019585e001 CR4: 0000000000f71ef0
[ 547.389321] PKRU: 55555554
[ 547.395316] Call Trace:
[ 547.400737] <TASK>
[ 547.405386] ? show_regs+0x6d/0x80
[ 547.412929] ? __die+0x24/0x80
[ 547.419697] ? page_fault_oops+0x99/0x1b0
[ 547.428588] ? do_user_addr_fault+0x2ee/0x6b0
[ 547.438249] ? exc_page_fault+0x83/0x1b0
[ 547.446949] ? asm_exc_page_fault+0x27/0x30
[ 547.456225] ? amdgpu_ras_sysfs_badpages_read+0x2f2/0x5d0 [amdgpu]
[ 547.470040] ? mas_wr_modify+0xcd/0x140
[ 547.478548] sysfs_kf_bin_read+0x63/0xb0
[ 547.487248] kernfs_file_read_iter+0xa1/0x190
[ 547.496909] kernfs_fop_read_iter+0x25/0x40
[ 547.506182] vfs_read+0x255/0x390
This also results in the remaining space being assigned negative values. Moving the data alloc call before the bad page check resolves both issues. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Map/Unmap MMIO_REMAP as BAR register window; add TTM sg helpers; wire dma-bufSrinivasan Shanmugam5-0/+127
MMIO_REMAP (HDP flush page) exposes a hardware MMIO register window via a PCI BAR; there are no struct pages backing it (it is not normal RAM). But when one device shares memory with another through dma-buf, the receiver still expects a delivery route, a list of DMA-able chunks called an sg_table. For the BAR window we cannot build that list from pages (there are none), so we instead create a one-entry list that points directly to the BAR's physical bus address and tell DMA to use this I/O span: a single, contiguous byte range on the PCI bus (start DMA address + length). That is why we build the sg_table from a BAR physical span, set sg_set_page(..., NULL, ...) and map it with dma_map_resource(); DMA reads/writes are then performed directly on that range.
This patch centralizes the BAR-I/O mapping in TTM and wires dma-buf to it. Add amdgpu_ttm_mmio_remap_alloc_sgt() / amdgpu_ttm_mmio_remap_free_sgt(). They walk the TTM resource via amdgpu_res_cursor, add the byte offset to adev->rmmio_remap.bus_addr, build a one-entry sg_table with sg_set_page(NULL, …), and map/unmap it with dma_map_resource(). In dma-buf map/unmap, if the BO is in AMDGPU_PL_MMIO_REMAP, call the new helpers.
Single place for BAR-I/O handling: amdgpu_ttm.c in amdgpu_ttm_mmio_remap_alloc_sgt() and ..._free_sgt().
No struct pages: sg_set_page(sg, NULL, cur.size, 0); inside amdgpu_ttm_mmio_remap_alloc_sgt().
Minimal sg_table: sg_alloc_table(*sgt, 1, GFP_KERNEL); inside amdgpu_ttm_mmio_remap_alloc_sgt().
Hooked into dma-buf: amdgpu_dma_buf_map()/unmap() in amdgpu_dma_buf.c call these helpers for AMDGPU_PL_MMIO_REMAP.
v2: squash in fix for set/get tiling Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
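The address arithmetic behind the one-entry list can be sketched in plain userspace C. This is only a model under stated assumptions, not the driver code: `struct model_sg_entry` and `model_mmio_remap_entry()` are hypothetical names, and the sketch assumes the DMA address equals the bus address, whereas a real dma_map_resource() call may return a translated address when an IOMMU is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry; not the kernel struct. */
struct model_sg_entry {
    const void *page;     /* NULL: no struct page backs a BAR window */
    uint64_t dma_address; /* address the importing device would use */
    uint64_t length;      /* span length in bytes */
};

/*
 * Model of the described arithmetic: the single entry points at
 * (BAR bus base + byte offset of the resource inside the window).
 * Assumes an identity bus-to-DMA mapping (no IOMMU translation).
 */
struct model_sg_entry model_mmio_remap_entry(uint64_t bar_bus_addr,
                                             uint64_t bar_size,
                                             uint64_t res_offset,
                                             uint64_t res_size)
{
    struct model_sg_entry e;

    /* The span must lie entirely inside the BAR window. */
    assert(res_size > 0);
    assert(res_offset <= bar_size && res_size <= bar_size - res_offset);

    e.page = NULL;
    e.dma_address = bar_bus_addr + res_offset;
    e.length = res_size;
    return e;
}
```

In this model a 4 KiB resource at offset 0x1000 of a window based at 0xfe000000 yields a single entry with address 0xfe001000, length 0x1000 and no page pointer.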
2025-12-08drm/amdgpu/ttm: Pin 4K MMIO_REMAP Singleton BO at Init v2Srinivasan Shanmugam1-0/+32
MMIO_REMAP (HDP flush page) is a hardware I/O window exposed via a PCI BAR. It must not migrate or be evicted. Allocate a single 4 KB GEM BO in AMDGPU_GEM_DOMAIN_MMIO_REMAP during TTM initialization when the hardware exposes a remap bus address and the host page size is <= 4 KiB. Reserve the BO and pin it at the TTM level so it remains fixed for its lifetime. No CPU mapping is established here. On teardown, reserve, unpin, and free the BO if present. This prepares the object to be shared (e.g., via dma-buf) without triggering placement changes or no CPU-access migration v2: Added extra NULL checks Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Compatible with legacy sriov hostYiPeng Chai5-1/+42
If sriov host is legacy, the guest uniras will be disabled. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Add sriov ras preprocessing before gpu resetYiPeng Chai3-1/+23
The sriov host may clear all VF commands registered to the auto update list during VF reset. Set ecc.auto_uUpdate block to false before VF reset; after VF reset is complete, the RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command will be re-registered to the auto update list of the sriov host. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add mmhub v4_2_0 ip headers v5Hawking Zhang2-0/+3815
Add header files for mmhub v4_2_0 register offsets and shift masks v2: Update mmhub v4_2_0 ip headers v3: Update mmhub v4_2_0 ip headers v4: Clean up registers (Alex) v5: Clean up registers (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Support high-frequency querying sriov ras block error countYiPeng Chai3-0/+154
Support high-frequency querying of the sriov ras block error count: 1. Create shared memory and fill it with the RAS_CMD__GET_LAL_LOC_STATUS ras command. 2. The RAS_CMD_GET_ALL_BLOCK_ECC_STATUS command and the shared memory are registered to the sriov host ras auto-update list via the RAS_CMD_SET_CMD_AUTO_UPDATE command. 3. Once the sriov host detects a ras error, it automatically executes the RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command and writes the result to the shared memory. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Add ras command to retrieve cper data from sriov hostYiPeng Chai2-1/+173
In order to reduce the number of interactions with sriov host and the amount of data exchanged, a set of ras commands is first used to obtain the raw data used to generate cper from the host, then, guest driver generates cper based on the obtained raw data. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/pm: Enable system power caps for smu_v13_0_12Asad Kamal1-1/+4
Enable system power caps to fetch system power and threshold for smu_v13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/pm: Fetch ubb power for smu_v13_0_12Asad Kamal3-0/+45
Fetch ubb power from system metrics table for smu_v13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Support sriov uniras to obtain cper dataYiPeng Chai2-3/+6
Support sriov uniras to obtain cper data. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: sriov supports handling VF ras commands.YiPeng Chai6-10/+207
Add basic framework code to sriov to handle VF ras commands. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add virt command to send VF ras commandYiPeng Chai4-2/+44
Add virt command and interface to send VF ras command. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: fix the calculation of RAS bad page numberTao Zhou1-4/+0
__amdgpu_ras_restore_bad_pages is responsible for the maintenance of bad page number, drop the unnecessary bad page number update in the error handling path of add_bad_pages. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/pm: Add sysfs node for ubb powerAsad Kamal2-2/+65
Add sysfs node to expose ubb power limit for smu_v13_0_12 v2: Update sysfs node name to baseboard_power & baseboard_power_limit to make it consistent with other node names (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/pm: Update pmfw headers for smu_v13_0_12Asad Kamal1-0/+8
Update pmfw headers for smu_v13_0_12 to include ubb power Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Expand kernel-doc in amdgpu_ringRodrigo Siqueira2-3/+10
Expand the kernel-doc about amdgpu_ring and add some tiny improvements. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Enable IH CAM on IH 7.1.0Mukul Joshi1-1/+29
Enable IH CAM to handle retry faults on IH 7.1.0. Also increase the soft ring size. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Use ih v7_0 ip block for ih v7_1Hawking Zhang1-0/+1
ih v7_1 and ih v7_0 share the same ip block implementation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Set psp ip block and funcs for v15.0.8Le Ma2-0/+7
Set psp ip block and funcs for MP0 15.0.8 Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Upload a single sdma fw copy when using psp v15.0.8Hawking Zhang1-1/+3
The driver only needs to upload a single sdma firmware copy for all sdma instances when using PSP v15.0.8 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Set skip_tmr to true for psp v15_0_8Hawking Zhang1-0/+1
psp v15_0_8 does not require tmr created by gpu driver Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add psp v15.0.8 ip block v3Le Ma5-1/+383
Add psp_v15_0_8.c for MPASP 15.0.8 v2: drop memory training intf as they are only necessary for GDDR memory v3: Implement psp_v15_0_8_get_fw_type (Feifei) Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add mp v15_0_8 ip headers v4Hawking Zhang2-0/+1484
Add header files for mp v15_0_8 register offsets and shift masks v2: Update mp v15_0_8 ip headers v3: Update mp v15_0_8 ip headers v4: Clean up registers (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: update psp_get_fw_type() functionFeifei Xu2-3/+10
In psp 15.0.8, the mes and sdma GFX_FW_TYPE values have been changed. Define a psp common function: psp_get_fw_type(). Hide the GFX_FW_TYPE changes in each ip's psp->funcs_get_fw_type callback (like psp_v15_0_8_get_fw_type()). If there is no GFX_FW_TYPE change, reuse amdgpu_psp_get_fw_type(). Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
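The dispatch pattern described here (use the per-IP callback when one is set, otherwise fall back to the shared mapping) can be sketched as a small userspace model. All names prefixed with `model_` or `MODEL_` are hypothetical, and the numeric firmware type values are made up for illustration; they are not the real GFX_FW_TYPE values.

```c
#include <assert.h>
#include <stddef.h>

enum model_ucode_id { MODEL_UCODE_SDMA, MODEL_UCODE_MES, MODEL_UCODE_OTHER };

struct model_psp_funcs {
    /* Optional per-IP override; NULL means "no firmware type change". */
    int (*get_fw_type)(enum model_ucode_id id, int *type);
};

/* Stand-in for the shared default mapping (values are invented). */
int model_common_get_fw_type(enum model_ucode_id id, int *type)
{
    switch (id) {
    case MODEL_UCODE_SDMA: *type = 10; return 0;
    case MODEL_UCODE_MES:  *type = 20; return 0;
    default:               return -1; /* unknown ucode id */
    }
}

/* Stand-in for a per-IP override that remaps only SDMA and MES. */
int model_v15_get_fw_type(enum model_ucode_id id, int *type)
{
    switch (id) {
    case MODEL_UCODE_SDMA: *type = 110; return 0;
    case MODEL_UCODE_MES:  *type = 120; return 0;
    default:               return model_common_get_fw_type(id, type);
    }
}

/* Common entry point: prefer the callback, else reuse the default. */
int model_psp_get_fw_type(const struct model_psp_funcs *funcs,
                          enum model_ucode_id id, int *type)
{
    assert(type != NULL);
    if (funcs && funcs->get_fw_type)
        return funcs->get_fw_type(id, type);
    return model_common_get_fw_type(id, type);
}
```

Keeping the fallback inside the common entry point means callers never need to know whether a given IP version overrides the mapping.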
2025-12-08drm/amdgpu: Add rlcv firmware for frontdoor loading.Feifei Xu6-1/+96
Rlcv is required to be loaded for frontdoor. 1. Add 2 rlcv ucode ids: AMDGPU_UCODE_RLC_IRAM_1 and AMDGPU_UCODE_RLC_DRAM_1 2. Add rlc_firmware_header_v2_5 for above 2 rlcv headers. 3. Add 2 types in psp_fw_gfx_if interface interacting with asp: GFX_FW_TYPE_RLX6_UCODE_CORE1 - RLCV IRAM GFX_FW_TYPE_RLX6_DRAM_BOOT_CORE1 - RLCV DRAM BOOT Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Initialize smuio functions for smuio v15_0_8Hawking Zhang1-0/+4
Add initialization for smuio funcs specific to v15.0.8 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add smuio v15_0_8 support v4Hawking Zhang4-1/+248
v15_0_8 is a new generation smuio ip block v2: Add smuio callbacks for interface id v3: Add smuio callback to identify custom hbm v4: comment out unused functions (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Add smuio v15_0_8 ip headers v4Hawking Zhang2-0/+1625
Add header files for smuio v15_0_8 register offsets and shift masks v2: Update smuio v15_0_8 ip headers v3: Update smuio v15_0_8 ip headers v4: Clean up registers (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Remove hard‑coded GC IP version checks from kfd_node_by_irq_idsSreekant Somasekharan1-3/+5
Replace the hard-coded GC IP version check with a multi-aid check in kfd_node_by_irq_ids(). If aid_mask is not set, we immediately return dev->nodes[0]; otherwise we iterate and match using kfd_irq_is_from_node(). Signed-off-by: Sreekant Somasekharan <Sreekant.Somasekharan@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
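The lookup logic described above can be sketched as a userspace model. The structures and the matching predicate below are hypothetical simplifications (the real kfd_irq_is_from_node() may compare different fields); only the control flow, returning the first node when no aid_mask is set and iterating otherwise, follows the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node: matched by node id and a VMID range. */
struct model_kfd_node {
    uint32_t irq_node_id;
    uint32_t vmid_first;
    uint32_t vmid_last;
};

struct model_kfd_dev {
    uint32_t aid_mask; /* 0: not a multi-aid device in this model */
    size_t num_nodes;
    struct model_kfd_node *nodes;
};

/* Assumed predicate standing in for kfd_irq_is_from_node(). */
int model_irq_is_from_node(const struct model_kfd_node *n,
                           uint32_t node_id, uint32_t vmid)
{
    return n->irq_node_id == node_id &&
           vmid >= n->vmid_first && vmid <= n->vmid_last;
}

struct model_kfd_node *model_node_by_irq_ids(struct model_kfd_dev *dev,
                                             uint32_t node_id, uint32_t vmid)
{
    size_t i;

    assert(dev != NULL);

    /* No aid_mask: there is nothing to disambiguate, use the first node. */
    if (!dev->aid_mask)
        return dev->num_nodes ? &dev->nodes[0] : NULL;

    /* Multi-aid: find the node the interrupt ids belong to. */
    for (i = 0; i < dev->num_nodes; i++)
        if (model_irq_is_from_node(&dev->nodes[i], node_id, vmid))
            return &dev->nodes[i];

    return NULL;
}
```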
2025-12-08drm/amdgpu: Update vm start, end, hole to support 57bit addressPhilip Yang10-25/+33
Change the gmc macros AMDGPU_GMC_HOLE_START/END/MASK to 57bit if the vm root level is PDB3 for 5-level page tables. The macros access adev without taking it as a parameter, which minimizes the code changes needed to support 57bit but means an adev variable has to be added in several places that use the macros. Because the adev definition is not available in all amdgpu c files which include amdgpu_gmc.h, change the inline function amdgpu_gmc_sign_extend to a macro. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
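As an illustration of how the hole boundaries scale with the address width, the following userspace sketch derives start, end and mask from the number of virtual address bits and applies the usual sign extension. The function names are hypothetical and the sketch passes the width explicitly instead of reading it from adev; the 57bit values are derived from the bit width here, not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* First address inside the non-canonical hole: bit (va_bits - 1) set. */
uint64_t model_hole_start(unsigned va_bits)
{
    assert(va_bits == 48 || va_bits == 57);
    return 1ULL << (va_bits - 1);
}

/* First address above the hole: all bits from (va_bits - 1) upward set. */
uint64_t model_hole_end(unsigned va_bits)
{
    assert(va_bits == 48 || va_bits == 57);
    return ~0ULL << (va_bits - 1);
}

/* Mask that keeps only the low va_bits of an address. */
uint64_t model_hole_mask(unsigned va_bits)
{
    assert(va_bits == 48 || va_bits == 57);
    return (1ULL << va_bits) - 1;
}

/* Sign-extend an address that has its top implemented bit set. */
uint64_t model_sign_extend(uint64_t addr, unsigned va_bits)
{
    if (addr >= model_hole_start(va_bits))
        addr |= model_hole_end(va_bits);
    return addr;
}
```

With 48 bits this gives a hole from 0x0000800000000000 to 0xffff800000000000; with 57 bits the hole starts at 0x0100000000000000 and ends at 0xff00000000000000.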
2025-12-08drm/amdgpu: GPU vm support 5-level page tablePhilip Yang3-1/+23
If the GPU supports a 5-level page table but the CPU has 5-level page tables disabled (boot option no5lvl) or lacks the CPU feature, the virtual address is 48bit and there is no need to enable the 5-level page table on the GPU vm. If adev->vm_manager.num_level, the number of pde levels, is set to 4, then the gfxhub and mmhub registers VM_CONTEXTx_CNTL/PAGE_TABLE_DEPTH are set to 4 to enable the 5-level page table in the page table walker. vm_manager.root_level is set to AMDGPU_VM_PDE3, so that updating a GPU mapping allocates and updates the PDE3/PDE2/PDE1/PDE0/PTB 5-level page tables. If max_level is not 4, the logic that supports features needed by old ASICs is unchanged. v2: squash in CONFIG fix Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
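The decision described above can be sketched as a small userspace model. The function names are hypothetical, and the sketch makes two assumptions that the commit message does not state: that the 4-level case corresponds to num_level 3, and that each level indexes 9 address bits on top of a 12-bit (4 KiB) page offset.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Number of PDE levels in this model: 4 selects 5-level page tables,
 * 3 (assumed value for existing ASICs) keeps 4-level page tables.
 * 5-level tables are only used when both the GPU and the CPU use them.
 */
unsigned model_num_level(bool gpu_has_5lvl, bool cpu_uses_5lvl)
{
    return (gpu_has_5lvl && cpu_uses_5lvl) ? 4u : 3u;
}

/*
 * Virtual address width, assuming 4 KiB pages (12 offset bits) and
 * 9 index bits for each of the (num_level + 1) table levels.
 */
unsigned model_va_bits(unsigned num_level)
{
    assert(num_level == 3 || num_level == 4);
    return 12u + 9u * (num_level + 1u);
}
```

Under these assumptions, num_level 3 gives a 48bit address space and num_level 4 gives 57bit, which matches the widths named in the commit messages.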
2025-12-08drm/amdgpu: Add soc v1_0 enum headerHawking Zhang1-0/+33
Add soc v1_0 enum header Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: update VRAM typesHawking Zhang1-1/+2
Update VRAM types. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Move XCP_INST_MASK amdgpu_xcp.hHawking Zhang2-3/+3
Move the common macro for the xcp manager to amdgpu_xcp.h Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Verify dpm setting for enabling smu with direct fw loadingHawking Zhang1-2/+4
Ensure that amdgpu_dpm kernel module parameter is set to 1 when enabling smu with direct firmware loading Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: refactor rlc/gfx spmJames Zhu9-17/+24
for adding multiple xcc support. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Bing Ma <Bing.Ma@amd.com> Reviewed-by: Gang Ba <gaba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu: Generalize HQD and VMID mask calculation for MESMukul Joshi1-2/+12
Generalize the calculation for determining the HQD mask and VMID mask passed to MES during initialization. v2: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdgpu/mes: add multi-xcc supportJack Xiao13-108/+158
a. extend mes pipe instances to num_xcc * max_mes_pipe b. initialize mes schq/kiq pipes per xcc c. submit mes packet to mes ring according to xcc_id v2: rebase (Alex) Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
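Point (a) and (c) above suggest a flat array of pipe instances indexed by xcc and pipe. The sketch below shows one plausible layout; the names and the value of `MODEL_MAX_MES_PIPES` are hypothetical, and the row-major indexing is an assumption, not something the commit message confirms.

```c
#include <assert.h>

/* Hypothetical per-xcc pipe count (for example one schq and one kiq pipe). */
#define MODEL_MAX_MES_PIPES 2u

/* Total pipe instances once every xcc gets its own set of pipes. */
unsigned model_mes_total_pipes(unsigned num_xcc)
{
    return num_xcc * MODEL_MAX_MES_PIPES;
}

/* Assumed flat index of a pipe: all pipes of xcc 0 first, then xcc 1, ... */
unsigned model_mes_pipe_inst(unsigned xcc_id, unsigned pipe_id)
{
    assert(pipe_id < MODEL_MAX_MES_PIPES);
    return xcc_id * MODEL_MAX_MES_PIPES + pipe_id;
}
```

With this layout, a packet for a given xcc_id is routed by computing the flat index and selecting the ring stored at that position.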