path: root/drivers/accel
Age | Commit message | Author | Files | Lines
2025-10-06 | accel/qaic: Replace snprintf() with sysfs_emit() in sysfs show functions | Chelsy Ratnawat | 1 | -3/+3
Documentation/filesystems/sysfs.rst mentions that show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. So replace snprintf() with sysfs_emit(). Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> [jhugo: Fix commit text typos] Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250822112804.1726592-1-chelsyratnawat2001@gmail.com
2025-10-06 | accel/qaic: Replace kcalloc + copy_from_user with memdup_array_user | Thorsten Blum | 1 | -25/+9
Replace kcalloc() followed by copy_from_user() with memdup_array_user() to improve and simplify both __qaic_execute_bo_ioctl() and qaic_perf_stats_bo_ioctl(). In __qaic_execute_bo_ioctl(), return early if an error occurs and remove the obsolete 'free_exec' label. Since memdup_array_user() already checks for multiplication overflow, remove the manual check in __qaic_execute_bo_ioctl(). Remove any unused local variables accordingly. Since 'ret = copy_from_user()' has been removed, initialize 'ret = 0' to preserve the same return value on success. No functional changes intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250917124805.90395-4-thorsten.blum@linux.dev
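The overflow check that this commit delegates to memdup_array_user() can be illustrated in user space. The sketch below is not the kernel helper: dup_array() is a hypothetical name, and malloc()/memcpy() stand in for the kernel allocation and copy_from_user(). It only shows why the element count and element size must be checked before they are multiplied.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the "duplicate an array" pattern.
 * Returns a heap copy of n elements of `size` bytes each, or NULL if
 * n * size would overflow size_t or the allocation fails.
 */
void *dup_array(const void *src, size_t n, size_t size)
{
    size_t bytes;
    void *copy;

    /* Reject the request before n * size can wrap to a small value. */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    bytes = n * size;

    /* Allocate at least one byte so a zero-length copy is still valid. */
    copy = malloc(bytes ? bytes : 1);
    if (!copy)
        return NULL;

    memcpy(copy, src, bytes);
    return copy;
}
```

Without the check, a huge element count could wrap the product to a small number, so the allocation would succeed while later indexing ran past its end.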
2025-10-06 | accel/qaic: Replace kzalloc + copy_from_user with memdup_user | Thorsten Blum | 1 | -9/+4
Replace kzalloc() followed by copy_from_user() with memdup_user() to improve and simplify qaic_attach_slice_bo_ioctl(). No functional changes intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250917124805.90395-2-thorsten.blum@linux.dev
2025-10-02 | accel/ivpu: Fix DCT active percent format | Karol Wachowski | 3 | -4/+9
The pcode MAILBOX STATUS register PARAM2 field expects DCT active percent in U1.7 value format. Convert percentage value to this format before writing to the register. Fixes: a19bffb10c46 ("accel/ivpu: Implement DCT handling") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20251001104322.1249896-1-karol.wachowski@linux.intel.com
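As a rough illustration of the format conversion, U1.7 is read here as unsigned fixed point with 1 integer bit and 7 fractional bits, so raw 0x80 represents 1.0. The helper name percent_to_u1_7(), the mapping of 100% to 1.0, the clamping, and the round-to-nearest choice are all assumptions made for this sketch; the driver's actual conversion is not shown in the commit message.

```c
#include <stdint.h>

/* U1.7: 1 integer bit, 7 fractional bits, so raw 0x80 represents 1.0. */
#define U1_7_ONE 0x80u

/*
 * Hypothetical helper: map a percentage (0..100) to U1.7, assuming that
 * 100% corresponds to 1.0. Inputs above 100 are clamped, and the result
 * is rounded to the nearest representable step of 1/128.
 */
uint8_t percent_to_u1_7(unsigned int percent)
{
    if (percent > 100)
        percent = 100;
    return (uint8_t)((percent * U1_7_ONE + 50) / 100);
}
```

Under these assumptions 50% becomes 0x40 and 100% becomes 0x80, whereas writing the plain integers 50 or 100 would be interpreted as roughly 0.39 and 0.78 in U1.7.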
2025-10-01 | accel/ivpu: Improve BO alloc/free warnings | Jacek Lawrynowicz | 1 | -2/+7
Add additional warnings related to allocation and deallocation of buffer objects to better track possible memory leaks and, more generally, the BO's lifecycle. Introduce checks for handle_count to ensure it is zero before creating a new handle, and exactly one after successfully creating a handle. Also introduce a check to warn if the VMA node is not empty when freeing the buffer object. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925145154.1446427-1-maciej.falkowski@linux.intel.com
2025-10-01 | accel/ivpu: Fix doc description of job structure | Andrzej Kacprowski | 1 | -17/+27
Fix doc description of job structure as it is improperly formatted. Align order of job structure's fields according to the documentation. Fixes: 0bf37f45d5c4 ("accel/ivpu: Add support for user-managed preemption buffer") Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925145131.1446323-1-maciej.falkowski@linux.intel.com
2025-10-01 | accel/ivpu: Fix page fault in ivpu_bo_unbind_all_bos_from_context() | Jacek Lawrynowicz | 1 | -6/+16
Don't add BO to the vdev->bo_list in ivpu_gem_create_object(). When failure happens inside drm_gem_shmem_create(), the BO is not fully created and ivpu_gem_bo_free() callback will not be called causing a deleted BO to be left on the list. Fixes: 8d88e4cdce4f ("accel/ivpu: Use GEM shmem helper for all buffers") Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925145114.1446283-1-maciej.falkowski@linux.intel.com
2025-10-01 | accel/ivpu: Rework bind/unbind of imported buffers | Jacek Lawrynowicz | 3 | -34/+60
Ensure that imported buffers are mapped and unmapped in the same way as regular buffers, so that they are handled correctly during the device's bind and unbind operations and do not cause resource leaks or inconsistent buffer states. Imported buffers are now dma_mapped before submission and dma_unmapped in ivpu_bo_unbind(), guaranteeing they are unmapped when the device is unbound. Also add imported buffers to vdev->bo_list for consistent unmapping on device unbind. The bo->ctx_id is set in open() so imported buffers have a valid context ID. Debug logs have been updated to match the new code structure. The function ivpu_bo_pin() has been renamed to ivpu_bo_bind() to better reflect its purpose, and unbind tests have been refactored for improved coverage and clarity. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925145059.1446243-1-maciej.falkowski@linux.intel.com
2025-10-01 | accel/ivpu: Enable MCA ECC signalling based on MSR | Tomasz Rusinowicz | 3 | -0/+28
Add new boot parameter for NPU5+ that enables ECC signalling for on-chip memory based on the value of MSR_INTEGRITY_CAPS register. Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925145020.1446208-1-maciej.falkowski@linux.intel.com
2025-09-25 | accel/ivpu: Split FW runtime and global memory buffers | Karol Wachowski | 9 | -75/+152
Split firmware boot parameters (4KB) and FW version (4KB) into dedicated buffer objects, separating them from the FW runtime memory buffer. This creates three distinct buffers with independent allocation control. This enables future modifications, particularly allowing the FW image memory to be moved into a read-only buffer. Fix user range starting address from incorrect 0x88000000 (2GB + 128MB) overlapping global aperture on 37XX to intended 0xa0000000 (2GB + 512MB). This caused no issues as FW aligned this range to 512MB anyway, but corrected for consistency. Convert ivpu_hw_range_init() from inline helper to function with overflow validation to prevent potential security issues from address range arithmetic overflows. Improve readability of ivpu_fw_parse() function, remove unrelated constant defines and validate firmware header values based on vdev->hw->ranges. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250925074209.1148924-1-karol.wachowski@linux.intel.com
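The overflow validation described for ivpu_hw_range_init() can be sketched in user space as follows. The struct addr_range type, the addr_range_init() function, and the rejection of zero-sized ranges are illustrative assumptions, not the driver's code; only the two addresses and their 2GB + 128MB / 2GB + 512MB decomposition come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (1ULL << 20)
#define SZ_1G (1ULL << 30)

/* Addresses quoted in the commit message, spelled out as sums. */
#define OLD_USER_START (2 * SZ_1G + 128 * SZ_1M) /* 0x88000000 */
#define NEW_USER_START (2 * SZ_1G + 512 * SZ_1M) /* 0xa0000000 */

/* Hypothetical address range covering [start, end). */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Initialize a range from a start address and a size. Fails if the size
 * is zero or if start + size would wrap around the 64-bit address space.
 */
bool addr_range_init(struct addr_range *r, uint64_t start, uint64_t size)
{
    if (size == 0 || start > UINT64_MAX - size)
        return false;

    r->start = start;
    r->end = start + size;
    return true;
}
```

Checking `start > UINT64_MAX - size` before adding avoids ever computing a wrapped sum, which is the usual way to validate unsigned address arithmetic.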
2025-09-25 | accel/habanalabs: add Infineon version check | Pavan S | 1 | -2/+9
On HL338 ASICs, the Infineon first‑stage firmware is not present and the reported version is 0. In this case printing a version number is misleading, as it suggests valid firmware when it does not exist. Fix this by printing the first‑stage Infineon firmware version only if the reported value is non‑zero. This avoids confusing or incorrect log messages on devices where the first stage is not applicable. Signed-off-by: Pavan S <pavan.sreenivas@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs/gaudi2: read preboot status after recovering from dirty state | Konstantin Sinyuk | 1 | -1/+7
Dirty state can occur when the host VM undergoes a reset while the device does not. In such a case, the driver must reset the device before it can be used again. As part of this reset, the device capabilities are zeroed. Therefore, the driver must read the Preboot status again to learn the Preboot state, capabilities, and security configuration. Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: add HL_GET_P_STATE passthrough type | Ariel Aviad | 1 | -0/+3
Add a new passthrough type HL_GET_P_STATE to the cpucp generic ioctl to allow userspace to read the device performance state via firmware. Signed-off-by: Ariel Aviad <ariel.aviad@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: add debugfs interface for HLDIO testing | Konstantin Sinyuk | 2 | -0/+214
Add debugfs files for NVMe Direct I/O (HLDIO) functionality. This interface allows userspace access to direct SSD ↔ device transfers through debugfs nodes.

Four debugfs files are created under /sys/kernel/debug/habanalabs/hlN/:
- dio_ssd2hl : trigger SSD-to-device transfers
- dio_hl2ssd : trigger device-to-SSD transfers (placeholder, not yet implemented)
- dio_stats : show transfer statistics
- dio_reset : reset statistics counters

Usage examples:
  # Perform SSD → device transfer
  echo "fd=3 va=0x10000 off=0 len=4096" > \
    /sys/kernel/debug/habanalabs/hl0/dio_ssd2hl

  # View statistics
  cat /sys/kernel/debug/habanalabs/hl0/dio_stats

  # Reset counters
  echo 1 > /sys/kernel/debug/habanalabs/hl0/dio_reset

This interface provides access to HLDIO functionality for validation and diagnostics.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Farah Kassabri <farah.kassabri@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: add NVMe Direct I/O (HLDIO) infrastructure | Konstantin Sinyuk | 6 | -2/+624
Introduce NVMe Direct I/O (HLDIO) infrastructure to support peer‑to‑peer DMA in the habanalabs driver. This adds internal helpers and data structures to enable direct transfers between NVMe storage and device memory. The feature is built only when CONFIG_HL_HLDIO is enabled. A debugfs interface is also provided for functional validation. Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com> Reviewed-by: Farah Kassabri <farah.kassabri@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: support mapping cb with vmalloc-backed coherent memory | Moti Haimovski | 2 | -0/+26
When IOMMU is enabled, dma_alloc_coherent() with GFP_USER may return addresses from the vmalloc range. If such an address is mapped without VM_MIXEDMAP, vm_insert_page() will trigger a BUG_ON due to the VM_PFNMAP restriction. Fix this by checking for vmalloc addresses and setting VM_MIXEDMAP in the VMA before mapping. This ensures safe mapping and avoids kernel crashes. The memory is still driver-allocated and cannot be accessed directly by userspace. Signed-off-by: Moti Haimovski <moti.haimovski@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: remove old interface variation of 'access_ok()' | Ilia Levi | 1 | -5/+0
The access_ok() API no longer requires the VERIFY_WRITE argument, and the use of the old interface with VERIFY_WRITE is deprecated. Clean up the habanalabs memory manager to use the modern access_ok() interface consistently. This removes old #ifdef guards and aligns the driver with current upstream kernel APIs. Signed-off-by: Ilia Levi <ilia.levi@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs/gaudi2: use the CPLD_SHUTDOWN event handler | Konstantin Sinyuk | 2 | -4/+2
After a CPLD shutdown event, the device is no longer usable. The common CPLD_SHUTDOWN event handler disables any subsequent device access. Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: disable device access after CPLD_SHUTDOWN | Konstantin Sinyuk | 2 | -0/+28
After a CPLD shutdown event the device becomes unusable. Prevent further device access once this event is received. Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: clarify ctx use after hl_ctx_put() in dmabuf release | Tomer Tayar | 1 | -1/+6
In hl_release_dmabuf(), ctx is dereferenced after calling hl_ctx_put() to obtain the compute device file. This is safe because the dma-buf object holds a file reference taken in export_dmabuf(), and the file release (which drops another ctx reference) can only happen after we drop that file reference via fput(). Thus, this hl_ctx_put() call cannot be the last one at this point. Add a comment explaining this to avoid confusion. Signed-off-by: Tomer Tayar <tomer.tayar@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs/gaudi2: add support for logging register accesses from debugfs | Sharley Calzolari | 3 | -1/+148
Add infrastructure for logging the last configuration register accesses that occur via debugfs read/write operations. At interrupt time, these log entries can be dumped to dmesg, which helps in diagnosing the cause of RAZWI and ADDR_DEC interrupts. The logging is implemented as a ring buffer of access entries, with each entry recording timestamp and access details. To ensure correctness under concurrent access, operations are now protected using spinlocks. Entries are copied under lock and then printed after releasing it, which minimizes time spent in the critical section. Signed-off-by: Sharley Calzolari <sharley.calzolari@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
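The "copy under lock, print after unlocking" pattern described above can be sketched in user space. Everything here is illustrative: the entry fields, LOG_CAPACITY, and the function names are assumptions, and a pthread mutex stands in for the kernel spinlock the commit mentions.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_CAPACITY 4 /* illustrative size, not the driver's */

/* Hypothetical log entry: a timestamp plus access details. */
struct access_entry {
    uint64_t timestamp;
    uint32_t addr;
    uint32_t value;
    int is_write;
};

struct access_log {
    pthread_mutex_t lock;
    struct access_entry entries[LOG_CAPACITY];
    size_t head;  /* next slot to overwrite */
    size_t count; /* number of valid entries, at most LOG_CAPACITY */
};

void access_log_init(struct access_log *log)
{
    memset(log, 0, sizeof(*log));
    pthread_mutex_init(&log->lock, NULL);
}

/* Record one access, overwriting the oldest entry once the ring is full. */
void access_log_record(struct access_log *log, const struct access_entry *e)
{
    pthread_mutex_lock(&log->lock);
    log->entries[log->head] = *e;
    log->head = (log->head + 1) % LOG_CAPACITY;
    if (log->count < LOG_CAPACITY)
        log->count++;
    pthread_mutex_unlock(&log->lock);
}

/*
 * Copy the valid entries, oldest first, into out[] (which must have room
 * for LOG_CAPACITY entries) and return how many were copied. Only the
 * copy happens under the lock; the caller formats and prints out[]
 * afterwards without holding it.
 */
size_t access_log_snapshot(struct access_log *log, struct access_entry *out)
{
    size_t i, n, start;

    pthread_mutex_lock(&log->lock);
    n = log->count;
    start = (log->head + LOG_CAPACITY - n) % LOG_CAPACITY;
    for (i = 0; i < n; i++)
        out[i] = log->entries[(start + i) % LOG_CAPACITY];
    pthread_mutex_unlock(&log->lock);

    return n;
}
```

Printing is comparatively slow, so snapshotting first keeps the critical section down to a bounded copy and lets writers continue while the dump is being formatted.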
2025-09-25 | accel/habanalabs/gaudi2: stringify engine/queue ids | Ariel Suller | 2 | -8/+367
Print engine/queue names instead of numerical engine/queue IDs to make logs and debug output more readable. Signed-off-by: Ariel Suller <ariel.suller@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: add generic message type to get error counters | Vitaly Margolin | 1 | -0/+3
Add a new CPUCP generic message type to retrieve HBM, SRAM and critical error counters from the device. Signed-off-by: Vitaly Margolin <vitaly.margolin@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs/gaudi2: fix BMON disable configuration | Vered Yavniely | 1 | -1/+1
Change the BMON_CR register value back to its original state before enabling, so that BMON does not continue to collect information after being disabled. Signed-off-by: Vered Yavniely <vered.yavniely@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 | accel/habanalabs: return ENOMEM if less than requested pages were pinned | Tomer Tayar | 1 | -1/+1
EFAULT is currently returned if fewer user pages than requested are pinned. This value means "bad address", which might be confusing to the user, as the address of the given user memory is not necessarily "bad". Modify the return value to ENOMEM, as "out of memory" is more suitable in this case. Signed-off-by: Tomer Tayar <tomer.tayar@intel.com> Reviewed-by: Koby Elbaz <koby.elbaz@intel.com> Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-24 | accel/amdxdna: Enhance runtime power management | Lizhi Hou | 12 | -160/+262
Currently, pm_runtime_resume_and_get() is invoked in the driver's open callback, and pm_runtime_put_autosuspend() is called in the close callback. As a result, the device remains active whenever an application opens it, even if no I/O is performed, leading to unnecessary power consumption. Move the runtime PM calls to the AIE2 callbacks that actually interact with the hardware. The device will automatically suspend after 5 seconds of inactivity (no hardware accesses and no pending commands), and it will be resumed on the next hardware access. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250923152229.1303625-1-lizhi.hou@amd.com
2025-09-18 | accel/ivpu: Add support for user-managed preemption buffer | Andrzej Kacprowski | 6 | -44/+130
Allow user mode drivers to manage preemption buffers, enabling memory savings by sharing a single buffer across multiple command queues within the same memory context. Introduce DRM_IVPU_PARAM_PREEMPT_BUFFER_SIZE to report the required preemption buffer size as specified by the firmware. The preemption buffer is now passed from user space as an entry in the BO list of DRM_IVPU_CMDQ_SUBMIT. The buffer must be non-mappable and large enough to hold preemption data. For backward compatibility, the kernel will allocate an internal preemption buffer if user space does not provide one. User space can only provide a single preemption buffer, simplifying the ioctl interface and parameter validation. A separate secondary preemption buffer is only needed to save below 4GB address space on 37xx and only if preemption buffers are not shared. Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250915103437.830086-1-karol.wachowski@linux.intel.com
2025-09-18 | accel/ivpu: Update JSM firmware API to latest 3.32.5 version | Karol Wachowski | 1 | -187/+326
Synchronize the JSM API header file with the latest 3.32.5 version to reflect all changes introduced in the new firmware API. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250916084131.848988-1-karol.wachowski@linux.intel.com
2025-09-18 | accel/ivpu: Ensure rpm_runtime_put in case of engine reset/resume fail | Karol Wachowski | 1 | -2/+2
Previously, the abort work could return early after an engine reset or resume failure, skipping the necessary runtime_put cleanup. This left the device with an incorrect reference count and broke the runtime power management state. Replace early returns with goto statements to ensure runtime_put is always executed. Fixes: a47e36dc5d90 ("accel/ivpu: Trigger device recovery on engine reset/resume failure") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250916084809.850073-1-karol.wachowski@linux.intel.com
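The "goto instead of early return" fix follows a common cleanup idiom, sketched below with a user-space model. The struct npu_model type and every function here are hypothetical stand-ins; the real driver calls runtime PM helpers and engine reset/resume routines that are not reproduced.

```c
#include <errno.h>

/* Hypothetical device model with a usage counter and injectable failures. */
struct npu_model {
    int usage_count;
    int reset_fails;
    int resume_fails;
};

static int model_get(struct npu_model *d)
{
    d->usage_count++;
    return 0;
}

static void model_put(struct npu_model *d)
{
    d->usage_count--;
}

static int model_engine_reset(struct npu_model *d)
{
    return d->reset_fails ? -EIO : 0;
}

static int model_engine_resume(struct npu_model *d)
{
    return d->resume_fails ? -EIO : 0;
}

/*
 * Every path taken after model_get() succeeds leaves through out_put,
 * so the usage counter is balanced whether or not a step fails.
 */
int model_abort_work(struct npu_model *d)
{
    int ret;

    ret = model_get(d);
    if (ret)
        return ret; /* nothing acquired yet, a plain return is fine */

    ret = model_engine_reset(d);
    if (ret)
        goto out_put; /* an early return here would leak the reference */

    ret = model_engine_resume(d);

out_put:
    model_put(d);
    return ret;
}
```

With a single exit label, adding another failure point later only requires another `goto out_put`, rather than remembering to duplicate the cleanup call.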
2025-09-18 | accel/ivpu: Remove unused firmware boot parameters | Andrzej Kacprowski | 1 | -9/+0
Remove references to firmware boot parameters that were never used by any production version of device firmware. Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250915103553.830151-1-karol.wachowski@linux.intel.com
2025-09-18 | accel/ivpu: Refactor priority_bands_show for readability | Jacek Lawrynowicz | 1 | -24/+14
Reduce code duplication and improve the overall readability of the debugfs output for job scheduling priority bands. Additionally fix clang-tidy warning about missing default case in the switch statement. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250915103401.830045-1-karol.wachowski@linux.intel.com
2025-09-18 | accel/ivpu: Reset cmdq->db_id on register failure | Karol Wachowski | 1 | -2/+4
Ensure that cmdq->db_id is reset to 0 if ivpu_jsm_register_db fails, preventing potential reuse of invalid command queue with unregistered doorbell. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://lore.kernel.org/r/20250915103421.830065-1-karol.wachowski@linux.intel.com
2025-09-17 | accel/amdxdna: Call dma_buf_vmap_unlocked() for imported object | Lizhi Hou | 1 | -27/+20
In amdxdna_gem_obj_vmap(), calling dma_buf_vmap() triggers a kernel warning if LOCKDEP is enabled. So for imported objects, use dma_buf_vmap_unlocked(), and use drm_gem_vmap() for other objects. A similar change applies to the vunmap code. Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250916174842.234709-1-lizhi.hou@amd.com
2025-09-15 | Merge drm/drm-next into drm-misc-next | Thomas Zimmermann | 4 | -5/+5
Backmerging to drm-misc-next to get fixes from v6.17-rc6. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-09-15 | Merge tag 'v6.17-rc6' into drm-next | Dave Airlie | 4 | -5/+5
This is a backmerge of Linux 6.17-rc6, needed for msm, also requested by misc. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-09-11 | accel/amdxdna: Fix an integer overflow in aie2_query_ctx_status_array() | Lizhi Hou | 1 | -0/+6
The unpublished smatch static checker reported a warning: drivers/accel/amdxdna/aie2_pci.c:904 aie2_query_ctx_status_array() warn: potential user controlled sizeof overflow 'args->num_element * args->element_size' '1-u32max(user) * 1-u32max(user)'. Even though this will not cause a real issue, it is better to put a reasonable limit on element_size and num_element. Add a condition to make sure the input element_size <= 4K and num_element <= 1K. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aL56ZCLyl3tLQM1e@stanley.mountain/ Fixes: 2f509fe6a42c ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250909154531.3469979-1-lizhi.hou@amd.com
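A minimal sketch of the kind of bound the commit describes, assuming a hypothetical check_array_args() helper and -EINVAL as the rejection code (neither is confirmed by the commit message). Capping both factors first means the 32-bit product can no longer wrap.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ELEMENT_SIZE 4096u /* 4K, the limit named in the commit */
#define MAX_NUM_ELEMENT  1024u /* 1K, the limit named in the commit */

/*
 * Hypothetical validation of user-supplied array dimensions. On success
 * stores the total byte count in *total and returns 0. With both factors
 * capped, the product is at most 4096 * 1024 = 4194304 bytes, which is
 * far below UINT32_MAX.
 */
int check_array_args(uint32_t num_element, uint32_t element_size,
                     uint32_t *total)
{
    if (element_size > MAX_ELEMENT_SIZE || num_element > MAX_NUM_ELEMENT)
        return -EINVAL;

    *total = num_element * element_size;
    return 0;
}
```

For comparison, two unchecked inputs of 0x10000 each would multiply to 2^32 and wrap to 0 in 32-bit arithmetic, which is the class of problem the static checker flagged.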
2025-09-04 | accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY | Lizhi Hou | 3 | -26/+118
Add interface for applications to get information array. The application provides a buffer pointer along with information type, maximum number of entries and maximum size of each entry. The buffer may also contain match conditions based on the information type. After the ioctl completes, the actual number of entries and entry size are returned. (see [1], used by driver runtime library) [1] https://github.com/amd/xdna-driver/blob/main/src/shim/host/platform_host.cpp#L337 Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250903053402.2103196-1-lizhi.hou@amd.com
2025-09-01 | accel/ivpu: Prevent recovery work from being queued during device removal | Karol Wachowski | 3 | -4/+4
Use disable_work_sync() instead of cancel_work_sync() in ivpu_dev_fini() to ensure that no new recovery work items can be queued after device removal has started. Previously, recovery work could be scheduled even after canceling existing work, potentially leading to use-after-free bugs if recovery accessed freed resources. Rename ivpu_pm_cancel_recovery() to ivpu_pm_disable_recovery() to better reflect its new behavior. Fixes: 58cde80f45a2 ("accel/ivpu: Use dedicated work for job timeout detection") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250808110939.328366-1-jacek.lawrynowicz@linux.intel.com
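The difference between cancelling and disabling a work item can be modelled with two flags. This is a simplified user-space model with hypothetical names, not the kernel workqueue API: it ignores the "sync" part (waiting for a running instance to finish) and only shows why a cancelled item can be queued again while a disabled one cannot.

```c
#include <stdbool.h>

/* Hypothetical model of a work item. */
struct model_work {
    bool pending;  /* queued but not yet run */
    bool disabled; /* set by model_disable_work(); blocks new queueing */
};

/* Returns true if the work was queued, false if rejected or already pending. */
bool model_queue_work(struct model_work *w)
{
    if (w->disabled || w->pending)
        return false;
    w->pending = true;
    return true;
}

/* Models cancel: drops a pending item but allows it to be queued again. */
void model_cancel_work(struct model_work *w)
{
    w->pending = false;
}

/* Models disable: drops a pending item and rejects later queue attempts. */
void model_disable_work(struct model_work *w)
{
    w->pending = false;
    w->disabled = true;
}
```

In the scenario from the commit message, a late trigger arriving after teardown has started corresponds to a model_queue_work() call after the cancel or disable step; only the disabled item refuses it.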
2025-09-01 | accel/ivpu: Make function parameter names consistent | Jacek Lawrynowicz | 2 | -2/+2
Make ivpu_hw_btrs_dct_set_status() and ivpu_fw_boot_params_setup() declaration and definition parameter names consistent. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250808111014.328607-1-jacek.lawrynowicz@linux.intel.com
2025-09-01 | accel/ivpu: Remove unused PLL_CONFIG_DEFAULT | Jacek Lawrynowicz | 1 | -2/+1
This change removes the unnecessary condition, makes the code clearer, and silences clang-tidy warning. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250808111044.328800-1-jacek.lawrynowicz@linux.intel.com
2025-09-01 | accel/rocket: Fix some error checking in rocket_core_init() | Dan Carpenter | 1 | -1/+1
The problem is that pm_runtime_get_sync() can return 1 on success so checking for zero doesn't work. Use the pm_runtime_resume_and_get() function instead. The pm_runtime_resume_and_get() function also does additional cleanup, which is a bonus. Fixes: 0810d5ad88a1 ("accel/rocket: Add job submission IOCTL") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Link: https://lore.kernel.org/r/aKcRW6fsRP_o5C_y@stanley.mountain
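The return-value pitfall can be reproduced with a small user-space model. The struct pm_model type and both functions are hypothetical stand-ins that only mimic the return conventions of pm_runtime_get_sync() (negative on error, 0 when the device was resumed, 1 when it was already active) and pm_runtime_resume_and_get() (0 on success, negative on error with the usage count dropped again).

```c
#include <errno.h>

/* Hypothetical device model for runtime PM bookkeeping. */
struct pm_model {
    int usage_count;
    int active; /* 1 if already resumed */
    int broken; /* 1 to simulate a resume failure */
};

/*
 * Mimics the pm_runtime_get_sync() convention: the usage count is bumped
 * even on failure, and 1 (not 0) is returned if the device was already
 * active.
 */
int model_get_sync(struct pm_model *d)
{
    d->usage_count++;
    if (d->broken)
        return -EIO;
    if (d->active)
        return 1;
    d->active = 1;
    return 0;
}

/*
 * Mimics the pm_runtime_resume_and_get() convention: 0 on any success,
 * and on failure the usage count taken above is dropped again.
 */
int model_resume_and_get(struct pm_model *d)
{
    int ret = model_get_sync(d);

    if (ret < 0) {
        d->usage_count--;
        return ret;
    }
    return 0;
}
```

A caller that writes `if (ret)` after the first function treats the "already active" case as an error, while the same test after the second function is correct.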
2025-09-01 | accel/rocket: Check the correct DMA irq status to warn about | Heiko Stuebner | 1 | -1/+1
Right now, the code checks the DMA_READ_ERROR state 2 times, while I guess it was supposed to warn about both read and write errors. Change the 2nd check to look at the write-error flag. Fixes: 0810d5ad88a1 ("accel/rocket: Add job submission IOCTL") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Link: https://lore.kernel.org/r/20250818185658.2585696-1-heiko@sntech.de
2025-09-01 | accel/rocket: Fix usages of kfree() and sizeof() | Brigham Campbell | 1 | -3/+4
Replace usages of kfree() with kvfree() for pointers which were allocated using kvmalloc(), as required by the kernel memory management API. Use sizeof() on the type that a pointer references instead of the pointer itself. In this case, scheds and *scheds both happen to be pointers, so sizeof() will expand to the same value in either case, but using *scheds is more technically correct since scheds is an array of drm_gpu_scheduler *. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202508120730.PLbjlKbI-lkp@intel.com/ Signed-off-by: Brigham Campbell <me@brighamcampbell.com> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Link: https://lore.kernel.org/r/20250813-rocket-free-fix-v1-1-51f00a7a1271@brighamcampbell.com Fixes: 0810d5ad88a1 ("accel/rocket: Add job submission IOCTL")
2025-09-01 | accel/rocket: Depend on DRM_ACCEL not just DRM | Heiko Stuebner | 1 | -1/+1
With the current dependency on only DRM, a config of CONFIG_DRM_ACCEL_ROCKET=y is possible, but of course wrong, because without DRM_ACCEL the build system will never even enter drivers/accel/*. So depend on DRM_ACCEL instead of just DRM. Fixes: ed98261b4168 ("accel/rocket: Add a new driver for Rockchip's NPU") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Link: https://lore.kernel.org/r/20250814113519.1551855-3-heiko@sntech.de
2025-09-01 | accel/rocket: Fix indentation of Kconfig entry | Heiko Stuebner | 1 | -8/+8
The general indentation for the Kconfig lines is one tab, so adapt the lines accordingly. The description is correctly indented (1 tab + 2 spaces) so doesn't need changes. Fixes: ed98261b4168 ("accel/rocket: Add a new driver for Rockchip's NPU") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Link: https://lore.kernel.org/r/20250814113519.1551855-2-heiko@sntech.de
2025-08-29 | accel/amdxdna: Use int instead of u32 to store error codes | Qianfeng Rong | 1 | -3/+3
Change the 'ret' variable from u32 to int to store -EINVAL. Storing negative error codes in an unsigned type doesn't cause an issue at runtime, but it's ugly as pants. Additionally, assigning -EINVAL to u32 ret (i.e., u32 ret = -EINVAL) may trigger a GCC warning when the -Wsign-conversion flag is enabled. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250828033917.113364-1-rongqianfeng@vivo.com
2025-08-26 | accel/amdxdna: Fix incorrect type used for a local variable | Lizhi Hou | 1 | -1/+2
drivers/accel/amdxdna/aie2_pci.c:794:13: sparse: sparse: incorrect type in assignment (different address spaces) Fixes: c8cea4371e5e ("accel/amdxdna: Add a function to walk hardware contexts") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508230855.0b9efFl6-lkp@intel.com/ Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250826171951.801585-1-lizhi.hou@amd.com
2025-08-20 | Merge drm/drm-fixes into drm-misc-fixes | Maxime Ripard | 1 | -16/+7
Update drm-misc-fixes to -rc2. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-08-20 | Merge drm/drm-next into drm-misc-next | Maxime Ripard | 1 | -16/+7
Bring in v6.17-rc2 to unstick for-linux-next. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-08-19 | Merge tag 'drm-misc-next-2025-08-14' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next | Dave Airlie | 27 | -166/+6370
drm-misc-next for v6.18:

UAPI Changes:
- Add DRM_IOCTL_GEM_CHANGE_HANDLE for reassigning GEM handles
- Document DRM_MODE_PAGE_FLIP_EVENT

Cross-subsystem Changes:

fbcon:
- Add missing declarations in fbcon.h

Core Changes:

bridge:
- Fix ref counting

panel:
- Replace and remove mipi_dsi_generic_write_{seq/_chatty}()

sched:
- Fixes

Rust:
- Drop Opaque<> from ioctl arguments

Driver Changes:

amdxdna:
- Support buffers allocated by user space
- Streamline PM interfaces
- Fixes

bridge:
- cdns-dsi: Various improvements to mode setting
- Support Solomon SSD2825 plus DT bindings
- Support Waveshare DSI2DPI plus DT bindings

gud:
- Fixes

ivpu:
- Fixes

nouveau:
- Use GSP firmware by default
- Fixes

panel:
- panel-edp: Support mt8189 Chromebooks; Support BOE NV140WUM-N64; Support SHP LQ134Z1; Fixes
- panel-simple: Support Olimex LCD-OLinuXino-5CTS plus DT bindings
- Support Samsung AMS561RA01
- Support Hydis HV101HD1 plus DT bindings

panthor:
- Print task/pid on errors
- Fixes

renesas:
- convert to RUNTIME_PM_OPS

repaper:
- Use shadow-plane helpers

rocket:
- Add driver for Rockchip NPU plus DT bindings

sharp-memory:
- Use shadow-plane helpers

simpledrm:
- Use of_reserved_mem_region_to_resource() helper

tidss:
- Use crtc_ fields for programming display mode
- Remove other drivers from aperture

v3d:
- Support querying number of GPU resets for KHR_robustness

vmwgfx:
- Fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250814072454.GA18104@linux.fritz.box