path: root/drivers/gpu
Age | Commit message | Author | Files | Lines
2025-09-29 | drm/xe/stolen: use the same types as i915 interface | Jani Nikula | 2 | -8/+7
Unify the i915 and xe interfaces by switching to the same types as i915. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/f15d41bc232dfa957841f16d9a069c777af40194.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/{i915, xe}/stolen: convert stolen interface to struct drm_device | Jani Nikula | 5 | -21/+25
Make the stolen interface agnostic to i915/xe, and pass struct drm_device instead. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/bbfc2aeaeee3156e92d49c73983be05b6feeede2.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/{i915, xe}/stolen: use the stored i915/xe device pointer | Jani Nikula | 5 | -50/+33
Now that we store the i915/xe device pointer in struct intel_stolen_node, we can reduce parameter passing in a number of functions. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/6f31114c8113ce2254d422ca53992088b673fb2f.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/{i915, xe}/stolen: add device pointer to struct intel_stolen_node | Jani Nikula | 2 | -0/+8
Add backpointers to i915/xe to allow simplifying some interfaces in follow-up. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/321354d47f9e530159caefef510d5394f4177470.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/{i915, xe}/stolen: make struct intel_stolen_node opaque | Jani Nikula | 5 | -30/+89
Add i915_gem_stolen_node_alloc() and i915_gem_stolen_node_free(), returning struct intel_stolen_node pointer. Make struct intel_stolen_node an opaque pointer, with different implementations in i915 and xe. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/3fe71bbb4c75ee86b4d129fafa3d4cd6526363f4.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
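The opaque-struct approach described above can be sketched in plain userspace C. This is only an illustration of the pattern, not the driver code: `stolen_node`, `stolen_node_alloc()`, `stolen_node_free()` and `stolen_node_size()` are hypothetical stand-ins, and the field layout is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * "Header" part: callers only ever see a forward declaration, so they
 * cannot depend on the layout. All names here are hypothetical.
 */
struct stolen_node;

struct stolen_node *stolen_node_alloc(void *dev);
void stolen_node_free(struct stolen_node *node);
uint64_t stolen_node_size(const struct stolen_node *node);

/*
 * "Implementation" part: each backend is free to define its own layout.
 * In a real tree this definition would live in a separate .c file.
 */
struct stolen_node {
	void *dev;      /* back-pointer to the owning device (assumed) */
	uint64_t start; /* placeholder allocation range */
	uint64_t size;
};

struct stolen_node *stolen_node_alloc(void *dev)
{
	struct stolen_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;

	node->dev = dev;
	return node;
}

void stolen_node_free(struct stolen_node *node)
{
	free(node); /* free(NULL) is a no-op */
}

uint64_t stolen_node_size(const struct stolen_node *node)
{
	assert(node != NULL);
	return node->size;
}
```

Because the layout is hidden behind accessor functions, two backends can ship different definitions of the struct without touching the callers.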
2025-09-29 | drm/xe/stolen: convert compat static inlines to proper functions | Jani Nikula | 3 | -85/+119
Add display/xe_stolen.c as the implementation for the stolen interface exposed to display. This allows hiding the implementation details that shouldn't be exposed to display. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/8e807c6aafc6151b18df08dda20053516813e001.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/i915/stolen: convert intel_stolen_node into a real struct of its own | Jani Nikula | 2 | -32/+56
i915_gem_stolen.h simply defines intel_stolen_node as drm_mm_node. Make struct intel_stolen_node an actual struct in its own right, and embed struct drm_mm_node inside. This allows better unification between i915 and xe. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/36762f611566d81427e702369f4e8207ead5f26c.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
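Embedding one struct inside another, as this commit describes, is usually paired with a `container_of()`-style helper to get from the inner member back to the wrapper. A minimal userspace sketch follows; `mm_node`, `stolen_node` and `to_stolen_node()` are hypothetical stand-ins for the kernel types, and the macro is a simplified version of the kernel's `container_of()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct drm_mm_node (hypothetical, heavily simplified). */
struct mm_node {
	uint64_t start;
	uint64_t size;
};

/* Wrapper with its own identity; it can grow extra fields later. */
struct stolen_node {
	struct mm_node node; /* embedded, not a typedef or alias */
};

/* Simplified userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the wrapper from a pointer to its embedded member. */
struct stolen_node *to_stolen_node(struct mm_node *mm)
{
	assert(mm != NULL);
	return container_of(mm, struct stolen_node, node);
}
```

The wrapper type means the compiler rejects code that passes a bare `mm_node` where a `stolen_node` is expected, which a plain alias would not catch.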
2025-09-29 | drm/xe/stolen: switch from BUG_ON() to WARN_ON() in compat | Jani Nikula | 1 | -1/+2
We're pretty much never supposed to be using BUG_ON(). Switch to WARN_ON(). Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/d14c693a3387a5d89bb88e81349639b5ec5663fb.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/xe/stolen: convert compat stolen macros to inline functions | Jani Nikula | 2 | -8/+33
Improve type safety. Allows getting rid of a __maybe_unused annotation too. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/1ec1fa59e0e54da49a1ec4fd1d535288066db502.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/xe/stolen: rename fb to node in stolen compat header | Jani Nikula | 1 | -12/+12
It's more about node than fb, and this makes more sense now that the struct is also named intel_stolen_node. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/71a7872e47da5f3fbe61cc21723bfcf8ff6518b8.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/{i915, xe}/stolen: rename i915_stolen_fb to intel_stolen_node | Jani Nikula | 3 | -7/+7
Use a more generic name than one that refers to "i915" and "fb". Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/925fd07d3f2a6115c71984f5a40a06c9eb46a539.1758732183.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | Merge drm/drm-next into drm-intel-next | Jani Nikula | 634 | -11515/+20949
Backmerge to sync with drm/xe changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-09-29 | drm/bridge: imx: add driver for HDMI TX Parallel Audio Interface | Shengjiu Wang | 4 | -5/+230
The HDMI TX Parallel Audio Interface (HTX_PAI) is a digital module that acts as the bridge between the Audio Subsystem and the HDMI TX Controller. This IP block is found in the HDMI subsystem of the i.MX8MP SoC. Data received from the audio subsystem can have an arbitrary component ordering. The HTX_PAI block has integrated muxing options to select which sections of the 32-bit input data word will be mapped to each IEC60958 field. The HTX_PAI_FIELD_CTRL register contains mux selects to individually select P, C, U, V, Data, and Preamble. Use the component helper so that imx8mp-hdmi-tx is the aggregate driver and imx8mp-hdmi-pai is the component driver; imx8mp-hdmi-pai can then use its bind() op to get the plat_data from the imx8mp-hdmi-tx device. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Reviewed-by: Liu Ying <victor.liu@nxp.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://lore.kernel.org/r/20250923053001.2678596-6-shengjiu.wang@nxp.com
2025-09-29 | drm/bridge: dw-hdmi: Add API dw_hdmi_set_sample_iec958() for iec958 format | Shengjiu Wang | 2 | -1/+16
Add API dw_hdmi_set_sample_iec958() for IEC958 format because audio device driver needs IEC958 information to configure this specific setting. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Acked-by: Liu Ying <victor.liu@nxp.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://lore.kernel.org/r/20250923053001.2678596-5-shengjiu.wang@nxp.com
2025-09-29 | drm/bridge: dw-hdmi: Add API dw_hdmi_to_plat_data() to get plat_data | Shengjiu Wang | 1 | -0/+6
Add API dw_hdmi_to_plat_data() to fetch plat_data because audio device driver needs it to enable(disable)_audio(). Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Acked-by: Liu Ying <victor.liu@nxp.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://lore.kernel.org/r/20250923053001.2678596-4-shengjiu.wang@nxp.com
2025-09-26 | drm/xe/hw_engine_group: Fix double write lock release in error path | Shuicheng Lin | 1 | -5/+1
In xe_hw_engine_group_get_mode(), a write lock is acquired before calling switch_mode(), which in turn invokes xe_hw_engine_group_suspend_faulting_lr_jobs(). On failure inside xe_hw_engine_group_suspend_faulting_lr_jobs(), the write lock is released there, and then again in xe_hw_engine_group_get_mode(), leading to a double release. Fix this by keeping both the acquire and release operations in xe_hw_engine_group_get_mode(). Fixes: 770bd1d34113 ("drm/xe/hw_engine_group: Ensure safe transition between execution modes") Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250925023145.1203004-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
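The rule the fix applies, that the function which takes a lock is also the only one that drops it, can be sketched with a mock lock in userspace C. Everything below is hypothetical (`mock_rwsem`, `suspend_jobs()`, `get_mode()`), and the sketch releases the lock on both paths for simplicity; the real function's success path may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mock write lock that traps on unbalanced use (hypothetical). */
struct mock_rwsem {
	int write_held;
};

void mock_down_write(struct mock_rwsem *sem)
{
	assert(sem->write_held == 0);
	sem->write_held = 1;
}

void mock_up_write(struct mock_rwsem *sem)
{
	assert(sem->write_held == 1); /* a double release would trip here */
	sem->write_held = 0;
}

/* Callee: runs with the lock held and never drops it, even on failure. */
int suspend_jobs(struct mock_rwsem *sem, bool fail)
{
	assert(sem->write_held == 1);
	return fail ? -EIO : 0;
}

/* Caller: acquire and release live in the same function. */
int get_mode(struct mock_rwsem *sem, bool fail)
{
	int err;

	mock_down_write(sem);
	err = suspend_jobs(sem, fail);
	mock_up_write(sem); /* single release point for both outcomes */

	return err;
}
```

If `suspend_jobs()` also called `mock_up_write()` on its error path, the assertion in `mock_up_write()` would fire on the second call, which mirrors the double release described in the commit message.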
2025-09-26 | drm/solomon: Enforce one assignment per line | Iker Pedrosa | 1 | -4/+8
The code contains several instances of chained assignments. The Linux kernel coding style generally favors clarity and simplicity over terse syntax. Refactor the code to use a separate line for each assignment. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Iker Pedrosa <ikerpedrosam@gmail.com> Link: https://lore.kernel.org/r/20250920-improve-ssd130x-v2-5-77721e87ae08@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-09-26 | drm/solomon: Simplify get_modes() using DRM helper | Iker Pedrosa | 1 | -13/+1
The ssd130x_connector_get_modes function contains a manual implementation to manage modes. This pattern is common for simple displays, and the DRM core already provides the drm_connector_helper_get_modes_fixed() helper for this exact use case. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Iker Pedrosa <ikerpedrosam@gmail.com> Link: https://lore.kernel.org/r/20250920-improve-ssd130x-v2-4-77721e87ae08@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-09-26 | drm/solomon: Simplify mode_valid() using DRM helper | Iker Pedrosa | 1 | -9/+1
The ssd130x_crtc_mode_valid() function contains a manual implementation to validate the display mode against the panel's single fixed resolution. This pattern is common for simple displays, and the DRM core already provides the drm_crtc_helper_mode_valid_fixed() helper for this exact use case. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Iker Pedrosa <ikerpedrosam@gmail.com> Link: https://lore.kernel.org/r/20250920-improve-ssd130x-v2-3-77721e87ae08@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-09-26 | drm/solomon: Use drm_WARN_ON_ONCE instead of WARN_ON | Iker Pedrosa | 1 | -4/+4
To prevent log spam, convert all instances to the DRM-specific drm_WARN_ON_ONCE() macro. This ensures that a warning is emitted only the first time the condition is met for a given device instance, which is the desired behavior within the graphics subsystem. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Iker Pedrosa <ikerpedrosam@gmail.com> Link: https://lore.kernel.org/r/20250920-improve-ssd130x-v2-2-77721e87ae08@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-09-26 | drm/solomon: Move calls to drm_gem_fb_end_cpu*() | Iker Pedrosa | 1 | -18/+18
Calls to drm_gem_fb_end_cpu*() should be between the calls to drm_dev*(), and not hidden inside some other function. This way the critical section code is visible at a glance, keeping it short and improving maintainability. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Iker Pedrosa <ikerpedrosam@gmail.com> Link: https://lore.kernel.org/r/20250920-improve-ssd130x-v2-1-77721e87ae08@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-09-26 | drm/i915/psr: Deactivate PSR only on LNL and when selective fetch enabled | Jouni Högander | 1 | -2/+10
Using intel_psr_exit in frontbuffer flush on older platforms seems to be causing problems. Sending a single full frame update using intel_psr_force_update is in any case better than PSR deactivate/activate, so move back to this approach on PSR1, PSR HW tracking and Panel Replay full frame update, and use deactivate/activate only on LunarLake and only when selective fetch is enabled. Tested-by: Lemen <lemen@lemen.xyz> Tested-by: Koos Vriezen <koos.vriezen@gmail.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14946 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Link: https://lore.kernel.org/r/20250922102725.2752742-1-jouni.hogander@intel.com
2025-09-26 | Merge tag 'drm-xe-fixes-2025-09-25' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes | Dave Airlie | 3 | -4/+4
- Don't expose sysfs attributes not applicable for VFs (Michal) - Fix build with CONFIG_MODULES=n (Lucas) - Don't copy pinned kernel bos twice on suspend (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aNU-FkJEcA3T4aDB@intel.com
2025-09-26 | Merge tag 'drm-misc-fixes-2025-09-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes | Dave Airlie | 3 | -9/+3
A CPU stall fix for ast, a NULL pointer dereference fix for gma500, OOB and overflow fixes for fbcon, and a race condition fix for panthor. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250925-smilodon-of-luxurious-genius-4ebee7@penduick
2025-09-26 | Merge tag 'drm-intel-fixes-2025-09-25' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes | Dave Airlie | 2 | -2/+10
- Set O_LARGEFILE in __create_shmem() (Taotao Chen) - Guard reg_val against an INVALID_TRANSCODER [ddi] (Suraj Kandpal) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://lore.kernel.org/r/aNTxWfhsMkFZ3Q-a@linux
2025-09-26 | Merge tag 'drm-misc-next-fixes-2025-09-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next | Dave Airlie | 2 | -3/+3
Short summary of fixes pull: bridge: - waveshare-dsi: Fix error handling in probe function pixpaper: - select GEM SHMEM helpers Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250925064257.GA9107@linux.fritz.box
2025-09-26 | drm/i915: i915_pmu: Use sysfs_emit() instead of sprintf() | Madhur Kumar | 1 | -2/+2
Follow the advice in Documentation/filesystems/sysfs.rst: show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: Madhur Kumar <madhurkumar004@gmail.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250923195051.1277855-1-madhurkumar004@gmail.com
2025-09-26 | drm/i915/gvt: Improve intel_vgpu_ioctl hdr error handling | Jonathan Cavitt | 1 | -6/+12
Add error handling for the following VFIO_DEVICE_SET_IRQS cases with respect to the hdr struct: - More than one VFIO_IRQ_DATA_TYPE_MASK flag is set in hdr.flags - More than one VFIO_IRQ_ACTION_TYPE_MASK flag is set in hdr.flags - hdr.count is not specified Note that since hdr.count != 0, data_size != 0 is guaranteed unless vfio_set_irqs_validate_and_prepare fails and returns an error. So, we no longer need to check data_size before running memdup_user because checking the return value of the function is sufficient. v2: Use correct name for mask v3: Use is_power_of_2 over hweight32 as it's more efficient (Andi) Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Zhenyu Wang <zhenyuw.linux@gmail.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250923212332.112137-2-jonathan.cavitt@intel.com
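The "exactly one flag from each mask" validation mentioned above can be sketched with a power-of-two test. This is an illustrative userspace version, not the gvt code: the mask values and the helper names `is_power_of_2_u32()` and `hdr_flags_valid()` are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask layout: three data-type bits, three action bits. */
#define DATA_TYPE_MASK   0x07u
#define ACTION_TYPE_MASK 0x38u

/* Userspace stand-in for the kernel's is_power_of_2(). */
bool is_power_of_2_u32(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Accept the header only if count is non-zero and exactly one bit is set
 * within each mask. A power-of-two test is true for a single set bit, so
 * it rejects "more than one flag"; in this sketch it also rejects "none".
 */
bool hdr_flags_valid(uint32_t flags, uint32_t count)
{
	return count != 0 &&
	       is_power_of_2_u32(flags & DATA_TYPE_MASK) &&
	       is_power_of_2_u32(flags & ACTION_TYPE_MASK);
}
```

Compared with counting bits, the `n & (n - 1)` form needs a single AND and a comparison, which is the efficiency argument noted in the v3 changelog.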
2025-09-25 | drm/amd: Add name to modes from amdgpu_connector_add_common_modes() | Mario Limonciello | 1 | -12/+14
[Why] When DC adds common modes it adds modes with a string to match what they are. Non-DC doesn't. This can be inconsistent when turning on/off DC support. [How] Add a name member to common_modes[] and copy it into the drm display mode. Cc: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Link: https://lore.kernel.org/r/20250924161624.1975819-6-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amd: Drop some common modes from amdgpu_connector_add_common_modes() | Mario Limonciello | 1 | -6/+0
[Why] DC and non-DC codepaths have different sets of common modes that are added for eDP and LVDS cases. This can cause different behaviors for turning on DC on hardware that can support both. [How] Drop extra modes from amdgpu_connector_add_common_modes() not present in amdgpu_dm_connector_add_common_modes(). Cc: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250924161624.1975819-5-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amdgpu: update MODULE_PARM_DESC for freesync_video | Alex Deucher | 1 | -1/+1
To better describe what it does. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3756 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amd: Use dynamic array size declaration for amdgpu_connector_add_common_modes() | Mario Limonciello | 1 | -2/+5
[Why] Adding or removing a mode from common_modes[] can be fragile if a user forgot to update the for loop boundaries. [How] Use ARRAY_SIZE() to detect size of the array and use that instead. Cc: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Link: https://lore.kernel.org/r/20250924161624.1975819-4-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
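A small userspace sketch of the ARRAY_SIZE() idea follows. The table contents, the `mode_size` struct and the `count_modes_fitting()` helper are made up for illustration; only the loop-bound technique is the point.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct mode_size {
	int w;
	int h;
};

/* Illustrative table; entries can be added or removed freely. */
static const struct mode_size common_modes[] = {
	{  640,  480 },
	{  800,  600 },
	{ 1024,  768 },
	{ 1280,  720 },
	{ 1920, 1080 },
};

/* Count the table entries that fit within the given native size. */
size_t count_modes_fitting(int max_w, int max_h)
{
	size_t i, n = 0;

	/* The bound follows the table, so no hard-coded count can go stale. */
	for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
		if (common_modes[i].w <= max_w && common_modes[i].h <= max_h)
			n++;
	}

	return n;
}
```

With a literal loop bound, dropping an entry from the table would read past the end of the array; deriving the bound from the array itself removes that failure mode.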
2025-09-25 | drm/amd/display: Share dce100_validate_global with DCE6-8 | Timur Kristóf | 4 | -63/+7
The dce100_validate_global function was exactly the same as dce60_validate_global and dce80_validate_global. Share dce100_validate_global between DCE6-10 to save code size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amd/display: Share dce100_validate_bandwidth with DCE6-8 | Timur Kristóf | 4 | -77/+18
DCE6-8 have very similar capabilities to DCE10, they support the same DP and HDMI versions and work similarly. Share dce100_validate_bandwidth between DCE6-10 to reduce code duplication in the DC driver. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amdgpu: Fix fence signaling race condition in userqueue | Jesse.Zhang | 1 | -1/+1
This commit fixes a potential race condition in the userqueue fence signaling mechanism by replacing dma_fence_is_signaled_locked() with dma_fence_is_signaled(). The issue occurred because: 1. dma_fence_is_signaled_locked() should only be used when holding the fence's individual lock, not just the fence list lock 2. Using the locked variant without the proper fence lock could lead to double-signaling scenarios: - Hardware completion signals the fence - Software path also tries to signal the same fence By using dma_fence_is_signaled() instead, we properly handle the locking hierarchy and avoid the race condition while still maintaining the necessary synchronization through the fence_list_lock. v2: drop the comment (Christian) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | amd/amdkfd: enhance kfd process check in switch partition | Yifan Zhang | 3 | -0/+16
The current partition switch only checks whether kfd_processes_table is empty. The kfd_processes_table entry is deleted in kfd_process_notifier_release, but the kfd_process teardown happens in kfd_process_wq_release. Consider two processes: Process A (workqueue) -> kfd_process_wq_release -> access kfd_node member. Process B switch partition -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw -> kfd_node teardown. Processes A and B may race, as shown in the dmesg log. This patch resolves the race by adding an atomic kfd_process counter, kfd_processes_count, which is incremented when a kfd process is created and decremented when kfd_process_wq_release finishes. v2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds and bug fix. (Philip Yang)
[3966658.307702] divide error: 0000 [#1] SMP NOPTI [3966658.350818] i10nm_edac [3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted [3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu] [3966658.362839] nfit [3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu] [3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 <f7> be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00 [3966658.380967] x86_pkg_temp_thermal [3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246 [3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000 [3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00 [3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4 [3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000 [3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800 [3966658.391533] FS: 0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000 [3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3966658.391534] CR2:
0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0 [3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [3966658.391536] PKRU: 55555554 [3966658.391536] Call Trace: [3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu] [3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu] [3966658.399754] intel_powerclamp [3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu] [3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu] [3966658.410516] coretemp [3966658.434016] process_one_work+0x1ad/0x380 [3966658.434021] worker_thread+0x49/0x310 [3966658.438963] kvm_intel [3966658.446041] ? process_one_work+0x380/0x380 [3966658.446045] kthread+0x118/0x140 [3966658.446047] ? __kthread_bind_mask+0x60/0x60 [3966658.446050] ret_from_fork+0x1f/0x30 [3966658.446053] Modules linked in: kpatch_20765354(OEK) [3966658.455310] kvm [3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK) [3966658.473462] idxd_mdev [3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqos_hotfixes(OE) rfkill uio_pci_generic uio cuse fuse nf_tables nfnetlink intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common 
i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm idxd_mdev [3966658.491237] vfio_pci [3966658.501196] vfio_pci vfio_virqfd mdev vfio_iommu_type1 vfio iax_crypto intel_pmt_telemetry iTCO_wdt intel_pmt_class iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_seq [3966658.508537] vfio_virqfd [3966658.517569] snd_seq_device ipmi_ssif isst_if_mbox_pci isst_if_mmio pcspkr snd_pcm idxd intel_uncore ses isst_if_common intel_vsec idxd_bus enclosure snd_timer mei_me snd i2c_i801 i2c_smbus mei i2c_ismt soundcore joydev acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad vfat fat [3966658.526851] mdev [3966658.536096] nfsd auth_rpcgss nfs_acl lockd grace slb_vtoa(OE) sunrpc dm_mod hookers mlx5_ib(OE) ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ttm mlx5_core(OE) mlxfw(OE) [3966658.540381] vfio_iommu_type1 [3966658.544341] nvme mpt3sas tls drm nvme_core pci_hyperv_intf raid_class psample libcrc32c crc32c_intel mlxdevm(OE) i2c_core [3966658.551254] vfio [3966658.558742] scsi_transport_sas wmi pinctrl_emmitsburg sd_mod t10_pi sg ahci libahci libata rdma_ucm(OE) ib_uverbs(OE) rdma_cm(OE) iw_cm(OE) ib_cm(OE) ib_umad(OE) ib_core(OE) ib_ucm(OE) mlx_compat(OE) [3966658.563004] iax_crypto [3966658.570988] [last unloaded: diagnose] [3966658.571027] ---[ end trace cc9dbb180f9ae537 ]--- Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip.Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
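The counter-based guard described in this commit can be sketched in userspace with C11 atomics. All names here (`mock_kfd_dev`, `mock_process_created()`, `mock_process_torn_down()`, `mock_switch_partition()`) are hypothetical, and returning -EBUSY is an assumption about how a refused switch might be reported.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical per-device state; the real struct has far more in it. */
struct mock_kfd_dev {
	atomic_int processes_count; /* processes not yet fully torn down */
	int partition_mode;
};

void mock_kfd_init(struct mock_kfd_dev *kfd)
{
	atomic_init(&kfd->processes_count, 0);
	kfd->partition_mode = 1;
}

/* Called when a process is created. */
void mock_process_created(struct mock_kfd_dev *kfd)
{
	atomic_fetch_add(&kfd->processes_count, 1);
}

/* Called only once teardown has finished, not when the table entry goes. */
void mock_process_torn_down(struct mock_kfd_dev *kfd)
{
	atomic_fetch_sub(&kfd->processes_count, 1);
}

/* Refuse to switch while any process still has teardown pending. */
int mock_switch_partition(struct mock_kfd_dev *kfd, int mode)
{
	if (atomic_load(&kfd->processes_count) != 0)
		return -EBUSY;

	kfd->partition_mode = mode;
	return 0;
}
```

The key design point is where the decrement sits: tying it to the end of teardown rather than to removal from the lookup table closes the window in which the table is empty but a worker still touches device state. The sketch does not model whatever serialization prevents a new process from appearing between the check and the switch.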
2025-09-25 | amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw | Yifan Zhang | 1 | -1/+9
There is a race between amdgpu_amdkfd_device_fini_sw and the interrupt handler: a KGD interrupt can be generated while amdgpu_amdkfd_device_fini_sw is running between kfd_cleanup_nodes and kfree(kfd). kernel panic log: BUG: kernel NULL pointer dereference, address: 0000000000000098 amdgpu 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP PGD d78c68067 P4D d78c68067 kfd kfd: amdgpu: Allocated 3969056 bytes on gart PUD 1465b8067 PMD @ Oops: @002 [#1] SMP NOPTI kfd kfd: amdgpu: Total number of KFD nodes to be created: 4 CPU: 115 PID: @ Comm: swapper/115 Kdump: loaded Tainted: G S W OE K RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40 Code: 89 e@ 41 5c c3 cc cc cc cc 66 66 2e Of 1f 84 00 00 00 00 00 OF 1f 40 00 Of 1f 44% 00 00 41 54 9c 41 5c fa 31 cO ba 01 00 00 00 <fO> OF b1 17 75 Ba 4c 89 e@ 41 Sc 89 c6 e8 07 38 5d RSP: 0018: ffffc90@1a6b0e28 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098 ffff8883bb000000 RO8: ffff888100055020 ROO: ffff888100055020 0000000000000000 R11: 0000000000000000 R12: 0900000000000002 ffff888F2b97da0@ R14: @000000000000098 R15: ffff8883babdfo00 CS: 010 DS: 0000 ES: 0000 CRO: 0000000080050033 CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 0000000000000000 DR6: 00000000fffeO7FO DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> kgd2kfd_interrupt+@x6b/0x1f@ [amdgpu] ?
amdgpu_fence_process+0xa4/0x150 [amdgpu] kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3 YcpxFl Rant tErace amdgpu_irq_dispatch+0x165/0x210 [amdgpu] amdgpu_ih_process+0x80/0x100 [amdgpu] amdgpu: Virtual CRAT table created for GPU amdgpu_irq_handler+0x1f/@x60 [amdgpu] __handle_irq_event_percpu+0x3d/0x170 amdgpu: Topology: Add dGPU node [0x74a2:0x1002] handle_irq_event+0x5a/@xcO handle_edge_irq+0x93/0x240 kfd kfd: amdgpu: KFD node 1 partition @ size 49148M asm_call_irq_on_stack+0xf/@x20 </IRQ> common_interrupt+0xb3/0x130 asm_common_interrupt+0x1le/0x40 5.10.134-010.a1i5000.a18.x86_64 #1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amd/display: Reject modes with too high pixel clock on DCE6-10 | Timur Kristóf | 5 | -3/+35
Reject modes with a pixel clock higher than the maximum display clock. Use 400 MHz as a fallback value when the maximum display clock is not known. Pixel clocks that are higher than the display clock just won't work and are not supported. With the addition of the YUV422 fallback, DC can now accidentally select a mode requiring higher pixel clock than actually supported when the DP version supports the required bandwidth but the clock is otherwise too high for the display engine. DCE 6-10 don't support these modes but they don't have a bandwidth calculation to reject them properly. Fixes: db291ed1732e ("drm/amd/display: Add fallback path for YCBCR422") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
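The rejection rule stated above reduces to one comparison with a fallback limit. The sketch below is a userspace illustration only: the function name `pixel_clock_supported()`, the use of kHz units, and treating 0 as "maximum display clock not known" are all assumptions; the 400 MHz fallback comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* 400 MHz fallback from the commit message, expressed in kHz (assumed unit). */
#define FALLBACK_MAX_DISPCLK_KHZ 400000u

/*
 * Return true if a mode's pixel clock is usable.
 * max_dispclk_khz == 0 is taken to mean "not known" (an assumption).
 */
bool pixel_clock_supported(uint32_t pix_clk_khz, uint32_t max_dispclk_khz)
{
	uint32_t limit = max_dispclk_khz ? max_dispclk_khz
					 : FALLBACK_MAX_DISPCLK_KHZ;

	/* Only clocks strictly above the limit are rejected. */
	return pix_clk_khz <= limit;
}
```

For example, under these assumptions a 148.5 MHz mode passes with an unknown display clock, while a 594 MHz mode is rejected unless the reported maximum is at least that high.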
2025-09-25 | drm/amd: Drop unnecessary check in amdgpu_connector_add_common_modes() | Mario Limonciello | 1 | -2/+0
[Why] amdgpu_connector_add_common_modes() has a check for the width and height of common modes being too small, but the array of common_modes[] has fixed values. The check is dead code. [How] Drop unnecessary check. Cc: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250924161624.1975819-3-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amd/display: Only enable common modes for eDP and LVDS | Mario Limonciello | 1 | -0/+4
[Why] The main reason common modes are added is for compatibility with clone mode when a laptop is connected to a projector or external monitor. Since commit 978fa2f6d0b12 ("drm/amd/display: Use scaling for non-native resolutions on eDP") when non-native modes are picked for eDP the GPU scaler will be used. This is because it is inconsistent whether eDP panels have the capability to actually drive non-native resolutions. With panels connected to other connectors this limitation generally doesn't exist as the EDID will advertise support for a number of resolutions and monitors will use built-in scaling hardware. Comparing DC and non-DC code paths, the non-DC code path only adds common modes for LVDS and eDP whereas the DC code path does it for all connector types. In the past there was an experiment done to disable common mode adding for eDP and LVDS from commit 6d396e7ac1ce3 ("drm/amd/display: Disable common modes for LVDS") and commit 7948afb46af92 ("drm/amd/display: Disable common modes for eDP") but this was reverted in commit a8b79b09185de ("drm/amd: Re-enable common modes for eDP and LVDS") because it caused problems with Xorg. [How] Only add common modes for eDP and LVDS for DC, matching the behavior of non-DC. Suggested-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250924161624.1975819-2-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amdgpu: remove the redeclaration of variable i | Sunil Khatri | 1 | -1/+0
The variable "i" is redeclared as an integer later in the function, which is wrong and serves no purpose. Fixes: 899fbde14646 ("drm/amdgpu: replace get_user_pages with HMM mirror helpers") Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25 | drm/amdgpu/userq: assign an error code for invalid userq va | Prike Liang | 1 | -0/+2
It should return an error code if userq VA validation fails. Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25drm/amdgpu: revert "rework reserved VMID handling" v2Christian König4-41/+50
This reverts commit e44a0fe630c58b0a87d8281f5c1077a3479e5fce. Initially we used VMID reservation to enforce isolation between processes. That has now been replaced by proper fence handling. OpenGL, RADV and ROCm developers all requested a way to reserve a VMID for SPM, so restore that approach by reverting to only allowing a single process to use the reserved VMID. Only compile tested for now. v2: use -ENOENT instead of -EINVAL if VMID is not available Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25drm/amdgpu: remove leftover from enforcing isolation by VMIDChristian König1-5/+0
Initially we enforced isolation by reserving a VMID, but that practice has now been removed. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25drm/amdgpu: Add fallback to pipe reset if KCQ ring reset failsJesse.Zhang1-0/+12
Add a fallback mechanism to attempt pipe reset when KCQ reset fails to recover the ring. After performing the KCQ reset and queue remapping, test the ring functionality. If the ring test fails, initiate a pipe reset as an additional recovery step. v2: fix the typo (Lijo) v3: try pipeline reset when kiq mapping fails (Lijo) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
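The recovery order described above can be sketched as a small control-flow function. This is an illustration under assumptions: the callback type, the function name and the example callbacks are all hypothetical, and the real driver code may retest the ring or report errors differently.

```c
/* Hypothetical callback type: each step returns 0 on success. */
typedef int (*step_fn)(void);

/*
 * Try the KCQ reset (including queue remapping) first, then test the
 * ring. If either step fails, fall back to a pipe reset and return
 * its result.
 */
int recover_ring(step_fn kcq_reset_and_remap, step_fn ring_test,
                 step_fn pipe_reset)
{
    int r = kcq_reset_and_remap();

    if (!r)
        r = ring_test();
    if (r)
        r = pipe_reset(); /* additional recovery step */

    return r;
}

/* Example callbacks used to exercise the flow. */
int step_ok(void)   { return 0; }
int step_fail(void) { return -1; }
```

The pipe reset runs only when the lighter-weight path did not recover the ring, so the common case is unchanged.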
2025-09-25drm/amd: Fix hybrid sleepMario Limonciello (AMD)1-1/+1
[Why] commit 530694f54dd5e ("drm/amdgpu: do not resume device in thaw for normal hibernation") optimized the flow for systems that are going into S4 where the power would be turned off. Basically the thaw() callback wouldn't resume the device if the hibernation image was successfully created since the system would be powered off. This however isn't the correct flow for a system entering into s0i3 after the hibernation image is created. Some of the amdgpu callbacks have different behavior depending upon the intended state of the suspend. [How] Use pm_hibernation_mode_is_suspend() as an input to decide whether to run resume during thaw() callback. Reported-by: Ionut Nechita <ionut_n2001@yahoo.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4573 Tested-by: Ionut Nechita <ionut_n2001@yahoo.com> Fixes: 530694f54dd5e ("drm/amdgpu: do not resume device in thaw for normal hibernation") Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Kenneth Crudup <kenny@panix.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Cc: 6.17+ <stable@vger.kernel.org> # 6.17+: 495c8d35035e: PM: hibernate: Add pm_hibernation_mode_is_suspend() Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
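The decision described in [How] can be sketched as a predicate. This is a simplified model, not the driver code: both inputs are plain booleans here, the function name is hypothetical, and the second parameter stands for the result of pm_hibernation_mode_is_suspend() named in the commit message.

```c
#include <stdbool.h>

/*
 * Decide whether thaw() should resume the device.
 *
 * image_creation_failed:       the hibernation image was not created, so
 *                              the system keeps running and needs the GPU.
 * hibernation_mode_is_suspend: the system will suspend (for example to
 *                              s0i3) after the image is written, instead
 *                              of powering off.
 */
bool thaw_should_resume(bool image_creation_failed,
                        bool hibernation_mode_is_suspend)
{
    return image_creation_failed || hibernation_mode_is_suspend;
}
```

Only the plain S4 case, where the image was created and the machine powers off, skips the resume.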
2025-09-25drm/i915/gvt: Simplify case switch in intel_vgpu_ioctlJonathan Cavitt1-9/+4
We do not need a case switch to check cap_type_id in intel_vgpu_ioctl for various reasons (it's impossible to hit the default case in the current code, there's only one valid case to check, the error handling code overlaps in both cases, etc.). Simplify the case switch into a single if statement. This has the additional effect of simplifying the error handling code. Note that it is still currently impossible for 'if (cap_type_id == VFIO_REGION_INFO_CAP_SPARSE_MMAP)' to fail, but we should still guard against the possibility of this changing in the future. Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Zhenyu Wang <zhenyuw.linux@gmail.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250918214515.66926-2-jonathan.cavitt@intel.com
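The shape of this simplification can be sketched as follows. The constant, the helper and the function names are hypothetical stand-ins; the point is only that a single if/else lets both failure cases share one cleanup path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for VFIO_REGION_INFO_CAP_SPARSE_MMAP. */
#define CAP_SPARSE_MMAP 1

/* Hypothetical helper that registers the capability; 0 on success. */
int add_capability(void *sparse)
{
    (void)sparse;
    return 0;
}

/* Single if statement replacing a switch with one case and a default. */
int handle_caps(int cap_type_id, void *sparse)
{
    int ret;

    if (cap_type_id == CAP_SPARSE_MMAP)
        ret = add_capability(sparse);
    else
        ret = -EINVAL; /* unreachable today, kept as a guard */

    if (ret)
        free(sparse); /* one shared error path */

    return ret;
}
```

In this sketch the caller keeps ownership of `sparse` on success and the function frees it on failure.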
2025-09-25drm/i915/dsb: Inline dsb_vblank_delay() into intel_dsb_wait_for_delayed_vblank()Ankit Nautiyal1-33/+26
Drop the now single-use dsb_vblank_delay() helper and inline its logic directly into intel_dsb_wait_for_delayed_vblank(). This will help to keep all VRR related wait stuff in one place. v2: Use intel_scanlines_to_usecs() in intel_dsb_wait_usec(). (Ville) Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250925022352.3129859-1-ankit.k.nautiyal@intel.com
2025-09-25drm/i915/display: Drop intel_vrr_vblank_delay and use set_context_latencyAnkit Nautiyal4-12/+3
The helper intel_vrr_vblank_delay() was used to keep track of the SCL lines plus the extra vblank delay required for ICL/TGL. This was used to wait for sufficient lines for two purposes: for the push send bit to clear in the VRR case, and for evasion to delay the commit. For the first case we use the safe window scanline wait, and with that we only need to wait for the SCL lines; we do not need to wait for the extra vblank delay required for ICL/TGL. For the second case, we do not need to wait for extra lines before the undelayed vblank if we are already in the safe window. To sum up, the SCL lines are sufficient for both cases. So drop the helper intel_vrr_vblank_delay() and just use crtc_state->set_context_latency instead. Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250924141542.3122126-10-ankit.k.nautiyal@intel.com
2025-09-25drm/i915/vrr: Clamp guardband as per hardware and timing constraintsAnkit Nautiyal1-12/+35
The maximum guardband value is constrained by two factors: - The actual vblank length minus set context latency (SCL) - The hardware register field width: - 8 bits for ICL/TGL (VRR_CTL_PIPELINE_FULL_MASK -> max 255) - 16 bits for ADL+ (XELPD_VRR_CTL_VRR_GUARDBAND_MASK -> max 65535) Remove the #FIXME and clamp the guardband to the maximum allowed value. v2: - Use REG_FIELD_MAX(). (Ville) - Separate out functions for intel_vrr_max_guardband(), intel_vrr_max_vblank_guardband(). (Ville) v3: - Fix Typo: Add the missing adjusted_mode->crtc_vdisplay in guardband computation. (Ville) - Refactor intel_vrr_max_hw_guardband() and use else for consistency. (Ville) v4: - Drop max_guardband from intel_vrr_max_hw_guardband(). (Ville) Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (#v2) Link: https://lore.kernel.org/r/20250924141542.3122126-9-ankit.k.nautiyal@intel.com
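The two constraints above combine into a simple clamp. The sketch below is illustrative only: the function names and parameters are hypothetical, the vblank length is assumed to be vtotal minus vdisplay, and the field limits are hard-coded from the widths quoted in the commit message rather than derived with REG_FIELD_MAX().

```c
#include <stdbool.h>

/* Largest value the hardware register field can hold. */
int max_hw_guardband(bool has_16bit_field)
{
    return has_16bit_field ? 65535 : 255;
}

/* Vblank length (assumed vtotal - vdisplay) minus set context latency. */
int max_vblank_guardband(int vtotal, int vdisplay, int scl)
{
    return vtotal - vdisplay - scl;
}

/* Clamp the requested guardband to the smaller of the two limits. */
int clamp_guardband(int guardband, int vtotal, int vdisplay, int scl,
                    bool has_16bit_field)
{
    int max_hw = max_hw_guardband(has_16bit_field);
    int max_vblank = max_vblank_guardband(vtotal, vdisplay, scl);
    int max = max_hw < max_vblank ? max_hw : max_vblank;

    return guardband < max ? guardband : max;
}
```

For example, with vtotal 1125, vdisplay 1080 and an SCL of 1, the vblank limit is 44 lines, so a requested guardband of 300 is clamped to 44 regardless of the field width.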