| Age | Commit message (Collapse) | Author | Files | Lines |
|
queue_check_job_completion calls queue_reset_timeout_locked to reset the
timeout when progress is made. We want the reset to happen when the
timeout is running, not when it is suspended.
Fixes: 345c5b7cc0f85 ("drm/panthor: Make the timeout per-queue instead of per-job")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20251202174028.1600218-1-olvaffe@gmail.com
|
|
This commit removes the redundant call to disable the MCU firmware
in the suspend path.
Fixes: 514072549865 ("drm/panthor: Support GLB_REQ.STATE field for Mali-G1 GPUs")
Signed-off-by: Akash Goel <akash.goel@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20251203091911.145623-1-akash.goel@arm.com
|
|
An AS can be disabled in the middle of a VM operation (VM being
evicted from an AS slot, for instance). In that case, we need the
locked section to be unlocked before releasing the slot.
v2:
- Add an lockdep_assert_held() in panthor_mmu_as_disable()
- Collect R-bs
v3:
- Don't reset the locked_region range in the as_disable() path
Fixes: 6e2d3b3e8589 ("drm/panthor: Add support for atomic page table updates")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20251203121750.404340-4-boris.brezillon@collabora.com
|
|
When we re-assign a slot to a different VM, we need to make sure the
old VM caches are flushed before doing the switch. Specialize
panthor_mmu_as_disable() so we can skip the slot programmation while
still getting the cache flushing, and call this helper from
panthor_vm_active() when an idle slot is recycled.
v2:
- Collect R-bs
Fixes: 6e2d3b3e8589 ("drm/panthor: Add support for atomic page table updates")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20251203121750.404340-3-boris.brezillon@collabora.com
|
|
It appears the timeout can still be enabled when we reach that point,
because of the asynchronous progress check done on queues that resets
the timer when jobs are still in-flight, but progress was made.
We could add more checks to make sure the timer is not re-enabled when
a group can't run anymore, but we don't have a group to pass to
queue_check_job_completion() in some context.
It's just as safe (we just want to be sure the timer is stopped before
we destroy the queue) and simpler to drop the WARN_ON() in
group_free_queue().
v2:
- Collect R-bs
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20251203121750.404340-2-boris.brezillon@collabora.com
|
|
Backmerging to bring in a needed dependency for the Xe VFIO
driver variant. This should ideally have been done before we
commited that, so we now have a small window in drm-xe-next
where that driver doesn't compile.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512030331.I8CveRre-lkp@intel.com/
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Using drm_mode_size_dumb() to compute the size of dumb buffers introduced
an 8-byte alignment constraint on the pitch that wasn’t present before.
Let’s remove this constraint, which isn’t necessarily required and may
cause buffers to be allocated larger than needed.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 4977dcecb931 ("drm/gem-shmem: Compute dumb-buffer sizes with drm_mode_size_dumb()")
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251126-lcd_pitch_alignment-v1-2-991610a1e369@microchip.com
|
|
Using drm_mode_size_dumb() to compute the size of dumb buffers introduced
an 8-byte alignment constraint on the pitch that wasn’t present before.
Let’s remove this constraint, which isn’t necessarily required and may
cause buffers to be allocated larger than needed.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: dcacfcd35cef ("drm/gem-dma: Compute dumb-buffer sizes with drm_mode_size_dumb()")
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251126-lcd_pitch_alignment-v1-1-991610a1e369@microchip.com
|
|
CASF requires the second scaler for sharpness.
Do not expose the SHARPNESS_STRENGTH property if
the CRTC has fewer than two scalers.
v2: Modify header and commit message. [Ankit]
Signed-off-by: Nemesa Garg <nemesa.garg@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patch.msgid.link/20251126084152.3905905-1-nemesa.garg@intel.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.19-2025-12-02:
amdgpu:
- Unified MES fix
- SMU 11 unbalanced irq fix
- Fix for driver reloading on APUs
- pp_table sysfs fix
- Fix memory leak in fence handling
- HDMI fix
- DC cursor fixes
- eDP panel parsing fix
- Brightness fix
- DC analog fixes
- EDID retry fixes
- UserQ fixes
- RAS fixes
- IP discovery fix
- Add missing locking in amdgpu_ttm_access_memory_sdma()
- Smart Power OLED fix
- PRT and page fault fixes for GC 6-8
- VMID reservation fix
- ACP platform device fix
- Add missing vm fault handling for GC 11-12
- VPE fix
amdkfd:
- Partitioning fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251202220101.2039347-1-alexander.deucher@amd.com
|
|
Bspec states that the new AUX power enable/disable sequences are
associated with the LT PHY. As such, use HAS_LT_PHY() instead of IP
checks in those paths in the driver code.
While at it, also move the comment that we can't use the power status
flag to the "else" branch, since that comment is not applicable for the
LT PHY.
Bspec: 68967
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-9-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
We will need to HAS_LT_PHY() that macro in code outside of LT PHY
implementation. Move its definition to intel_display_device.h.
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-8-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
NVL uses the Lake Tahoe PHY for display output and the driver recently
added the macro HAS_LT_PHY() to allow selecting code paths specific for
that type of PHY.
While NVL uses Xe3p_LPD as display IP, the type of PHY is actually
defined at the SoC level, so use a platform check instead of display
version.
Bspec: 74199
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-7-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Add platform description and PCI IDs for NVL-S.
BSpec: 74201
Signed-off-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-6-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Xe3p_LPD added several bits containing information that can be relevant
to debugging FIFO underruns. Add the logic necessary to handle them
when reporting underruns.
This was adapted from the initial patch[1] from Sai Teja Pottumuttu.
[1] https://lore.kernel.org/all/20251015-xe3p_lpd-basic-enabling-v1-12-d2d1e26520aa@intel.com/
v2:
- Handle FBC-related bits in intel_fbc.c. (Ville)
- Do not use intel_fbc_id_for_pipe (promoted from
skl_fbc_id_for_pipe), but use pipe's primary plane to get the FBC
instance. (Ville, Matt)
- Improve code readability by moving logic specific to logging fields
of UNDERRUN_DBG1 to a separate function. (Jani)
v3:
- Use crtc->base.primary instead of cycling through all crtcs
Bspec: 69111, 69561, 74411, 74412
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-5-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Starting with Xe3p_LPD, the VBT has a new field, called in the driver
"dedicated_external", which tells that a Type-C capable port is
physically connected to a PHY outside of the Type-C subsystem. When
that's the case, the driver must not do the extra Type-C programming for
that port. Update intel_encoder_is_tc() to check for that case.
While at it, add a note to intel_phy_is_tc() to remind us that it is
about whether the respective port is a Type-C capable port rather than
the PHY itself.
(Maybe it would be a nice idea to rename intel_phy_is_tc()?)
Note that this was handled with a new bool member added to struct
intel_digital_port instead of having querying the VBT directly because
VBT memory is freed (intel_bios_driver_remove) before encoder cleanup
(intel_ddi_encoder_destroy), which would cause an oops to happen when
the latter calls intel_encoder_is_tc(). This could be fixed by keeping
VBT data around longer, but that's left for a follow-up work, if deemed
necessary.
v2:
- Drop printing info about dedicated external, now that we are doing
it when parsing the VBT. (Jani)
- Add a FIXME comment on the code explaining why we need to store
dedicated_external in struct intel_digital_port. (Jani)
v3:
- Simplify the code by using NULL check for dig_port to avoid using
intel_encoder_is_dig_port(). (Imre)
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Shekhar Chauhan <shekhar.chauhan@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-4-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Starting with Xe3p_LPD, when intel_phy_is_tc() returns true, it does
not necessarily mean that the port is connected to a PHY in the Type-C
subsystem. The reason is that there is now a VBT field called
dedicated_external that will indicate that a Type-C capable port is
connected to a (most likely) combo/dedicated PHY. When that's the case,
we must not do the extra programming required for Type-C connections.
In an upcoming change, we will modify intel_encoder_is_tc() to take the
VBT field dedicated_external into consideration. Update
intel_display_power_well.c to use that function instead of
intel_phy_is_tc().
Note that, even though icl_aux_power_well_{enable,disable} are not part
of Xe3p_LPD's display paths, we modify them anyway for uniformity.
v2:
- Add and use icl_aux_pw_is_tc_phy() helper to avoid explicit encoder
lookup. (Imre)
Cc: Imre Deak <imre.deak@intel.com>
Cc: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> # v1
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-3-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
VBT version 264 adds new fields associated to Xe3p_LPD's new ways of
configuring SoC for TC ports and PHYs. Update the code to match the
updates in VBT.
The new field dedicated_external is used to represent TC ports that are
connected to PHYs outside of the Type-C subsystem, meaning that they
behave like dedicated ports and don't require the extra Type-C
programming. In an upcoming change, we will update the driver to take
this field into consideration when detecting the type of port.
The new field dyn_port_over_tc is used to inform that the TC port can be
dynamically allocated for a legacy connector in the Type-C subsystem,
which is a new feature in Xe3p_LPD. In upcoming changes, we will use
that field in order to handle the IOM resource management programming
required for that.
Note that, when dedicated_external is set, the fields dp_usb_type_c and
tbt are tagged as "don't care" in the spec, so they should be ignored in
that case, so also add a sanitization function to take care of forcing
them to zero when dedicated_external is true.
v2:
- Use sanitization function to force dp_usb_type_c and tbt fields to
be zero instead of adding a
intel_bios_encoder_is_dedicated_external() check in each of their
respective accessor functions. (Jani)
- Print info about dedicated external ports in print_ddi_port().
(Jani)
v3:
- Also zero out field dyn_port_over_tc when dedicated_external is set.
(Imre)
- Use intel_bios_encoder_is_dedicated_external() directly instead of
storing return value into variable in print_ddi_port(). (Imre)
- Also print info for dyn_port_over_tc in print_ddi_port(). (Imre)
Bspec: 20124, 68954, 74304
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Shekhar Chauhan <shekhar.chauhan@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20251202012306.9315-2-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Skipping power ungate exposed some scenarios that will fail
like below:
```
amdgpu: Register(0) [regVPEC_QUEUE_RESET_REQ] failed to reach value 0x00000000 != 0x00000001n
amdgpu 0000:c1:00.0: amdgpu: VPE queue reset failed
...
amdgpu: [drm] *ERROR* wait_for_completion_timeout timeout!
```
The underlying s2idle issue that prompted this commit is going to
be fixed in BIOS.
This reverts commit 2a6c826cfeedd7714611ac115371a959ead55bda.
Fixes: 2a6c826cfeed ("drm/amd: Skip power ungate during suspend for VPE")
Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Konstantin <answer2019@yandex.ru>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220812
Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use common definitions for the fault bits in the IH sourc
data for the gmc9-12 memory hub faults
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to call amdgpu_vm_handle_fault() on page fault
on all gfx9 and newer parts to properly update the
page tables, not just for recoverable page faults.
Cc: stable@vger.kernel.org
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to call amdgpu_vm_handle_fault() on page fault
on all gfx9 and newer parts to properly update the
page tables, not just for recoverable page faults.
Cc: stable@vger.kernel.org
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
mfd_add_hotplug_devices() assigns child platform devices with
PLATFORM_DEVID_AUTO, but the ACP machine drivers expect the platform
device names to never change. Use mfd_add_devices() instead and give
each cell a unique id.
Signed-off-by: Brady Norander <bradynorander@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
protected-fence fix
On GFX11.0.3, earlier SDMA firmware versions issue the
PROTECTED_FENCE write from the user VMID (e.g. VMID 8) instead of
VMID 0. This causes a GPU VM protection fault when SDMA tries to
write the secure fence location, as seen in the UMQ SDMA test
(cs-sdma-with-IP-DMA-UMQ)
Fixes the below GPU page fault:
[ 514.037189] amdgpu 0000:0b:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32770)
[ 514.037199] amdgpu 0000:0b:00.0: amdgpu: Process pid 0 thread pid 0
[ 514.037205] amdgpu 0000:0b:00.0: amdgpu: in page starting at address 0x00007fff00409000 from client 10
[ 514.037212] amdgpu 0000:0b:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841A51
[ 514.037217] amdgpu 0000:0b:00.0: amdgpu: Faulty UTCL2 client ID: SDMA0 (0xd)
[ 514.037223] amdgpu 0000:0b:00.0: amdgpu: MORE_FAULTS: 0x1
[ 514.037227] amdgpu 0000:0b:00.0: amdgpu: WALKER_ERROR: 0x0
[ 514.037232] amdgpu 0000:0b:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 514.037236] amdgpu 0000:0b:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 514.037241] amdgpu 0000:0b:00.0: amdgpu: RW: 0x1
v2: Updated commit message
v3: s/gfx11.0.3/sdma 6.0.3/ in patch title (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise userspace may be fooled into believing it has a reserved VMID
when in reality it doesn't, ultimately leading to GPU hangs when SPM is
used.
Fixes: 80e709ee6ecc ("drm/amdgpu: add option params to enforce process isolation between graphics and compute")
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On old GPUs, it may be an issue that handling the interrupts from
VM faults is too slow and the interrupt handler (IH) ring may
overflow, which can cause an eventual hang.
Delegate the processing of all VM faults to the soft
IRQ handler ring.
As a result, we spend much less time in the IRQ handler that
interacts with the HW IH ring, which significantly reduces the
chance of hangs/reboots.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On old GPUs, it may be an issue that handling the interrupts from
VM faults is too slow and the interrupt handler (IH) ring may
overflow, which can cause an eventual hang.
Delegate the processing of all VM faults to the soft
IRQ handler ring.
As a result, we spend much less time in the IRQ handler that
interacts with the HW IH ring, which significantly reduces the
chance of hangs/reboots.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On old GPUs, it may be an issue that handling the interrupts from
VM faults is too slow and the interrupt handler (IH) ring may
overflow, which can cause an eventual hang.
Delegate the processing of all VM faults to the soft
IRQ handler ring.
As a result, we spend much less time in the IRQ handler that
interacts with the HW IH ring, which significantly reduces the
chance of hangs/reboots.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Call amdgpu_vm_update_fault_cache on GMC v6 similarly to how we
do in GMC v7-v8 so that VM fault info can be used later by
userspace for debugging.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The VM_CONTEXT1_PROTECTION_FAULT_MCCLIENT register
doesn't exist on GMC v6 so we can't print the MC client as a
string like we do on GMC v7-v8. However, we still print the
mc_id from VM_CONTEXT1_PROTECTION_FAULT_STATUS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We are going to use the soft IRQ handler ring on GMC v8
to process interrupts from VM faults.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We are going to use the soft IRQ handler ring on GMC v8
to process interrupts from VM faults.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We are going to use the soft IRQ handler ring on GMC v8
to process interrupts from VM faults.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We are going to use the soft IRQ handler ring on GMC v7 (CIK)
to process interrupts from VM faults.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We are going to use the soft IRQ handler ring on GMC v6 (SI)
to process interrupts from VM faults.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix a typo in a comment, change "enviroment" to "environment" in
drivers/gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h
Fixes: e6a8a000cfe6 ("drm/amd/display: Rename dml2 to dml2_0 folder")
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aditya Gollamudi <adigollamudi@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[HOW]
Before enable smart power OLED, we need to call set pipe to let
DMUB get correct ABM config.
Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Ian Chen <ian.chen@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
DVI-A & VGA connectors are applicable to DCE ASICs, so move them to
dce110_hwseq.c to block audio sync on SIGNAL_TYPE_RGB for DCE ASICs.
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Users of ttm entities need to hold the gtt_window_lock before using them
to guarantee proper ordering of jobs.
Cc: stable@vger.kernel.org
Fixes: cb5cc4f573e1 ("drm/amdgpu: improve debug VRAM access performance using sdma")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Switch runtime PM code to use scope-based forcewake for consistency with
other parts of the driver.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251128082212.294592-1-raag.jadav@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
VF migration sends a marker to the GUC before resource fixups begin,
and repeats the marker with the RESFIX_DONE notification. This prevents
the GUC from submitting jobs during double migration events.
To reliably test double migration, a second migration must be triggered
while fixups from the first migration are still in progress. Since fixups
complete quickly, reproducing this scenario is difficult. Introduce
debugfs controls to add delays in the post-fixup phase, creating a
deterministic window for subsequent migrations.
New debugfs entries:
/sys/kernel/debug/dri/BDF/
├── tile0
│ ├─gt0
│ │ ├──vf
│ │ │ ├── resfix_stoppers
resfix_stoppers: Predefined checkpoints that allow the migration process
to pause at specific stages. The stages are given below.
VF_MIGRATION_WAIT_RESFIX_START - BIT(0)
VF_MIGRATION_WAIT_FIXUPS - BIT(1)
VF_MIGRATION_WAIT_RESTART_JOBS - BIT(2)
VF_MIGRATION_WAIT_RESFIX_DONE - BIT(3)
Each state will pause with a 1-second delay per iteration, continuing until
its corresponding bit is cleared.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Acked-by: Adam Miszczak <adam.miszczak@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251201095011.21453-10-satyanarayana.k.v.p@intel.com
|
|
Handle GuC response `XE_GUC_RESPONSE_VF_MIGRATED` as a special case in the
VF post-migration recovery flow. When this error occurs, it indicates that
a new migration was detected while the resource fixup process was still in
progress. Instead of failing immediately, requeue the VF into the recovery
path to allow proper handling of the new migration event.
This improves robustness of VF recovery in SR-IOV environments where
migrations can overlap with resource fixup steps.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251201095011.21453-9-satyanarayana.k.v.p@intel.com
|
|
In scenarios involving double migration, the VF KMD may encounter
situations where it is instructed to re-migrate before having the
opportunity to send RESFIX_DONE for the initial migration. This can occur
when the fix-up for the prior migration is still underway, but the VF KMD
is migrated again.
Consequently, this may lead to the possibility of sending two migration
notifications (i.e., pending fix-up for the first migration and a second
notification for the new migration). Upon receiving the first RES_FIX
notification, the GuC will resume VF submission on the GPU, potentially
resulting in undefined behavior, such as system hangs or crashes.
To avoid this, post migration, a marker is sent to the GUC prior to the
start of resource fixups to indicate start of resource fixups. The same
marker is sent along with RESFIX_DONE notification so that GUC can avoid
submitting jobs to HW in case of double migration.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251201095011.21453-8-satyanarayana.k.v.p@intel.com
|
|
Enable VF migration starting with GuC 70.54.0 (compatibility version
1.27.0) which supports additional VF2GUC_RESFIX_START message required
to handle migration recovery in a more robust way.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251201095011.21453-7-satyanarayana.k.v.p@intel.com
|
|
Add intel_bo_key_check() next to intel_bo_is_protected() where it feels
like it belongs, and drop the extra pxp compat header.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20251201172730.2154668-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Commit 965930962a41 ("drm/i915/frontbuffer: Fix intel_frontbuffer
lifetime handling") dropped the last xe display users of the
headers. They're still used in intel_overlay.c, but it's not built as
part of xe.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20251201171050.2145833-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Remove the debug_enter/debug_leave helpers, as there are no DRM
drivers supporting debugging with kgdb. Remove code to keep track
of existing fbdev-emulation state. None of this required any longer.
Also remove mode_set_base_atomic from struct drm_crtc_helper_funcs,
which has no callers or implementations.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20251125130634.1080966-5-tzimmermann@suse.de
|
|
Remove the implementation of the CRTC helper mode_set_base_atomic
from radeon. It pretends to provide mode setting for kdb debugging,
but has been broken for some time.
Kdb output has been supported only for non-atomic mode setting since
commit 9c79e0b1d096 ("drm/fb-helper: Give up on kgdb for atomic drivers")
from 2017.
While radeon provides non-atomic mode setting, kdb assumes that the GEM
buffer object is at a fixed location in video memory. This assumption
currently blocks radeon from converting to generic fbdev emulation.
Fbdev-ttm helpers use a shadow buffer with a movable GEM buffer object.
Triggering kdb does therefore not update the display.
Another problem is that the current implementation does not handle
USB keyboard input. Therefore a serial terminal is required. Then when
continuing from the debugger, radeon fails with an error:
[7]kdb> go
[ 40.345523][ C7] BUG: scheduling while atomic: bash/1580/0x00110003
[...]
[ 40.345613][ C7] schedule+0x27/0xd0
[ 40.345615][ C7] schedule_timeout+0x7b/0x100
[ 40.345617][ C7] ? __pfx_process_timeout+0x10/0x10
[ 40.345619][ C7] msleep+0x31/0x50
[ 40.345621][ C7] radeon_crtc_load_lut+0x2e4/0xcb0 [radeon 31c1ee785de120fcfd0babcc09babb3770252b4e]
[ 40.345698][ C7] radeon_crtc_gamma_set+0xe/0x20 [radeon 31c1ee785de120fcfd0babcc09babb3770252b4e]
[ 40.345760][ C7] drm_fb_helper_debug_leave+0xd8/0x130
[ 40.345763][ C7] kgdboc_post_exp_handler+0x54/0x70
[...]
and the system hangs.
Support for kdb feels pretty much broken. Hence remove the whole kdb
support from radeon.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20251125130634.1080966-4-tzimmermann@suse.de
|
|
Remove the implementation of the CRTC helper mode_set_base_atomic
from nouveau. It pretends to provide mode setting for kdb debugging,
but has been broken for some time.
Kdb output has been supported only for non-atomic mode setting since
commit 9c79e0b1d096 ("drm/fb-helper: Give up on kgdb for atomic drivers")
from 2017.
While nouveau provides non-atomic mode setting for some devices, kdb
assumes that the GEM buffer object is at a fixed location in video
memory. This has not been the case since
commit 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers")
from 2022. Fbdev-ttm helpers use a shadow buffer with a movable GEM
buffer object. Triggering kdb does therefore not update the display.
Hence remove the whole kdb support from nouveau.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20251125130634.1080966-3-tzimmermann@suse.de
|
|
Remove all implementations of the CRTC helper mode_set_base_atomic
from amdgpu. It pretends to provide mode setting for kdb debugging,
but has been broken for some time.
Kdb output has been supported only for non-atomic mode setting since
commit 9c79e0b1d096 ("drm/fb-helper: Give up on kgdb for atomic drivers")
from 2017.
While amdgpu provides non-atomic mode setting for some devices, kdb
assumes that the GEM buffer object is at a fixed location in video
memory. This has not been the case since commit 087451f372bf ("drm/amdgpu:
use generic fb helpers instead of setting up AMD own's.") from 2021.
Fbdev-ttm helpers use a shadow buffer with a movable GEM buffer object.
Triggering kdb does not update the display.
Hence remove the whole kdb support from amdgpu.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20251125130634.1080966-2-tzimmermann@suse.de
|