aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2025-09-05drm/radeon/pm: Remove redundant ternary operatorsLiao Yuanhong1-2/+1
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/radeon/radeon_legacy_encoders: Remove redundant ternary operatorsLiao Yuanhong1-10/+10
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/radeon/dpm: Remove redundant ternary operatorsLiao Yuanhong2-5/+5
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/radeon/atom: Remove redundant ternary operatorsLiao Yuanhong1-1/+1
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/pm/powerplay/smumgr: remove redundant ternary operatorsLiao Yuanhong4-12/+8
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Swap variable positions on either side of '==' to enhance readability. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/pm/powerplay/hwmgr/ppatomctrl: Remove redundant ternary operatorsLiao Yuanhong1-2/+2
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Swap variable positions on either side of '!=' to enhance readability. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05amdgpu/pm/legacy: remove redundant ternary operatorsLiao Yuanhong1-2/+1
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/display: Remove redundant ternary operatorsLiao Yuanhong5-7/+6
For ternary operators in the form of "a ? true : false" or "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/userq: add a detect and reset callbackJesse.Zhang2-0/+51
Add a detect and reset callback and add the implementation for mes. The callback will detect all hung queues of a particular ip type (e.g., GFX or compute or SDMA) and reset them. v2: increase reset counter and set fence force completion v3: Removed userq_mutex in mes_userq_detect_and_reset since the driver holds it when calling Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdkfd: Fix error code sign for EINVAL in svm_ioctl()Qianfeng Rong1-1/+1
Use negative error code -EINVAL instead of positive EINVAL in the default case of svm_ioctl() to conform to Linux kernel error code conventions. Fixes: 42de677f7999 ("drm/amdkfd: register svm range") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: don't enable SMU on cyan skillfishAlex Deucher1-1/+4
Cyan skillfish uses different SMU firmware. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: add support for cyan skillfish gpu_infoAlex Deucher1-0/+4
Some SOCs which are part of the cyan skillfish family rely on an explicit firmware for IP discovery. Add support for the gpu_info firmware. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: add support for cyan skillfish without IP discoveryAlex Deucher1-0/+30
For platforms without an IP discovery table. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: add ip offset support for cyan skillfishAlex Deucher3-1/+59
For chips that don't have IP discovery tables. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: Fix function header names in amdgpu_connectors.cSrinivasan Shanmugam1-3/+12
Align the function headers for `amdgpu_max_hdmi_pixel_clock` and `amdgpu_connector_dvi_mode_valid` with the function implementations so they match the expected kdoc style. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1199: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Returns the maximum supported HDMI (TMDS) pixel clock in KHz. drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1212: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Validates the given display mode on DVI and HDMI connectors. Fixes: 585b2f685c56 ("drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05amd/amdkfd: correct mem limit calculation for small APUsYifan Zhang1-12/+32
Current mem limit check leaks some GTT memory (reserved_for_pt reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three case regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check below limitation in case 1 (default case, small carveout) because the values in the below expression are mixed with carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to gtt domain, skip vram check since ttm_mem_limit check already cover this allocation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/display: remove oem i2c adapter on finishGeoffrey McRae1-1/+12
Fixes a bug where unbinding of the GPU would leave the oem i2c adapter registered resulting in a null pointer dereference when applications try to access the invalid device. Fixes: 3d5470c97314 ("drm/amd/display/dm: add support for OEM i2c bus") Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/userq: add force completion helpersAlex Deucher2-0/+43
Add support for forcing completion of userq fences. This is needed for userq resets and asic resets so that we can set the error on the fence and force completion. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: add user queue reset sourceAlex Deucher2-0/+4
Track resets from user queues. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/mes12: implement detect and reset callbackJesse.Zhang1-0/+31
Implement support for the hung queue detect and reset functionality. v2: Always use AMDGPU_MES_SCHED_PIPE Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/mes11: implement detect and reset callbackJesse.Zhang1-0/+31
Implement support for the hung queue detect and reset functionality. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/mes: add front end for detect and reset hung queueJesse.Zhang2-0/+86
Helper function to detect and reset hung queues. MES will return an array of doorbell indices of which queues are hung and were optionally reset. v2: Clear the doorbell array before detection Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/amdgpu: Implement MES suspend/resume gang functionality for v12Jesse.Zhang1-2/+30
This commit implements the actual MES (Micro Engine Scheduler) suspend and resume gang operations for version 12 hardware. Previously these functions were just stubs returning success. v2: Always use AMDGPU_MES_SCHED_PIPE Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
2025-09-05drm/amdgpu: Add preempt and restore callbacks to userq funcsJesse.Zhang1-0/+4
Add two new function pointers to struct amdgpu_userq_funcs: - preempt: To handle preemption of user mode queues - restore: To restore preempted user mode queues These callbacks will allow the driver to properly manage queue preemption and restoration when needed, such as during context switching or priority changes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/xe/guc: Fix badly worded error messageJohn Harrison1-1/+1
If a GuC id lookup failed, the error message was 'Not engine present', which is bad in multiple ways - incorrect English and 'engines' are now called 'exec queues' in this context. So fix it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://lore.kernel.org/r/20250904195752.3846138-3-John.C.Harrison@Intel.com
2025-09-05drm/xe/guc: Clean up of GuC 'CTL' definesJohn Harrison2-34/+15
All the field generation for the CTL defines (used for GuC init data) were hand-rolled rather than using FIELD_PREP/REG_GENMASK/BIT macros. Also, there were a bunch of macros defined for verbosity settings that were never used. So fix that all up. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250904195752.3846138-2-John.C.Harrison@Intel.com
2025-09-05drm/amdgpu: fix the formating for debugfs printSunil Khatri1-1/+1
Fix the format of debugfs print in the mqd. Need to add a colon so parser can parse it properly. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd: add more cyan skillfish PCI idsAlex Deucher1-0/+5
Add additional PCI IDs to the cyan skillfish family. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: add more information in debugfs to pagetable dumpSunil Khatri1-0/+6
Add more information in the debugfs which is needed to dump a pagetable correctly for userqueues where vmid is not known in the kernel. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu: Correct info field of bad page threshold exceed CPERXiang Liu1-1/+3
Correct valid_bits and ms_chk_bits of section info field for bad page threshold exceed CPER to match OOB's behavior. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdkfd: fix p2p links bug in topologyEric Huang1-1/+2
When creating p2p links, KFD needs to check XGMI link with two conditions, hive_id and is_sharing_enabled, but it is missing to check is_sharing_enabled, so add it to fix the error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/radeon/ci_dpm: Use int type to store negative error codesQianfeng Rong1-2/+4
Change the 'ret' variable in ci_populate_all_graphic_levels() and ci_populate_all_memory_levels() from u32 to int, as it needs to store either negative error codes or zero returned by other functions. Storing the negative error codes in unsigned type, doesn't cause an issue at runtime but can be confusing. Additionally, assigning negative error codes to unsigned type may trigger a GCC warning when the -Wsign-conversion flag is enabled. No effect on runtime. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/vcn: Remove redundant ternary operatorsLiao Yuanhong2-2/+2
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/jpeg: Remove redundant ternary operatorsLiao Yuanhong3-3/+3
For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/ih: Remove redundant ternary operatorsLiao Yuanhong3-6/+3
For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/gmc: Remove redundant ternary operatorsLiao Yuanhong3-6/+3
For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/gfx: Remove redundant ternary operatorsLiao Yuanhong2-4/+2
For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amdgpu/amdgpu_cper: Remove redundant ternary operatorsLiao Yuanhong1-1/+1
For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/amd/amdgpu: Fix a less than zero check on a uint32_t struct fieldColin Ian King1-2/+4
Currently the error check from the call to mes_v12_inv_tlb_convert_hub_id is always false because a uint32_t struct field hub_id is being used to to perform the less than zero error check. Fix this by using the int variable ret to perform the check. Fixes: 87e65052616c ("drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12") Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-05drm/xe: Extend Wa_13011645652 to PTL-H, WCLJulia Filipchuk1-1/+2
Expand workaround to additional graphics architectures. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.17+ Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250903190122.1028373-2-julia.filipchuk@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-05drm/vkms: Add P01* formatsLouis Chauvet2-1/+9
The formats NV 12/16/24/21/61/42 were already supported. Add support for: - P010 - P012 - P016 Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-8-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Create helper macro for YUV formatsLouis Chauvet1-27/+48
The callback functions for line conversion are almost identical for semi-planar formats. The generic READ_LINE_YUV_SEMIPLANAR macro generate all the required boilerplate to process a line from a semi-planar format. Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-7-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Change YUV helpers to support u16 inputs for conversionLouis Chauvet3-84/+85
Some YUV format uses 16 bit values, so change the helper function for conversion to support those new formats. Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-6-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Add support for RGB888 formatsLouis Chauvet2-0/+9
Add the support for: - RGB888 - BGR888 Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-5-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Add support for RGB565 formatsLouis Chauvet2-0/+14
The format RGB565 was already supported. Add the support for: - BGR565 Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-4-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Add support for ARGB16161616 formatsLouis Chauvet2-0/+8
The formats XRGB16161616 and ARGB16161616 were already supported. Add the support for: - ABGR16161616 - XBGR16161616 Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-3-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Add support for ARGB8888 formatsLouis Chauvet2-3/+15
The formats XRGB8888 and ARGB8888 were already supported. Add the support for: - XBGR8888 - ABGR8888 - RGBA8888 - BGRA8888 Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-2-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Create helpers macro to avoid code duplication in format callbacksLouis Chauvet1-127/+65
The callback functions for line conversion are almost identical for some format. The generic READ_LINE macro generate all the required boilerplate to process a line. Two overrides of this macro have been added to avoid duplication of the same arguments every time. Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Daniel Stone <daniels@collabora.com> Link: https://lore.kernel.org/r/20250703-b4-new-color-formats-v7-1-15fd8fd2e15c@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/vkms: Assert if vkms_config_create_*() failsJosé Expósito1-5/+46
Check that the value returned by the vkms_config_create_*() functions is valid. Otherwise, assert and finish the KUnit test. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aJTL6IFEBaI8gqtH@stanley.mountain/ Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20250811101529.150716-1-jose.exposito89@gmail.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-09-05drm/xe: Block exec and rebind worker while evicting for suspend / hibernateThomas Hellström6-1/+88
When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or get stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker waiting for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com