aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2026-01-27Revert "rnbd-clt: fix refcount underflow in device unmap path"Jens Axboe1-0/+1
This reverts commit ec19ed2b3e2af8ec5380400cdee9cb6560144506. This commit relies on changes queued for 7.0, and isn't safe in its current form for the 6.19 release. Revert it for now for 6.19. Link: https://lore.kernel.org/linux-block/aXhLQmRudk7cSAnT@shinmob/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-28regmap: reg_default_cb for flat cache defaultsMark Brown456-2951/+4738
Merge series from "Sheetal ." <sheetal@nvidia.com>: This series adds a reg_default_cb callback for REGCACHE_FLAT to provide defaults for registers not listed in reg_defaults. Defaults are loaded eagerly during regcache init and the callback can use writeable_reg to filter valid addresses and avoid holes.
2026-01-28Merge tag 'drm-rust-next-2026-01-26' of ↵Dave Airlie25-369/+768
https://gitlab.freedesktop.org/drm/rust/kernel into drm-next DRM Rust changes for v7.0-rc1 DRM: - Fix documentation for Registration constructors. - Use pin_init::zeroed() for fops initialization. - Annotate DRM helpers with __rust_helper. - Improve safety documentation for gem::Object::new(). - Update AlwaysRefCounted imports. MM: - Prevent integer overflow in page_align(). Nova (Core): - Prepare for Turing support. This includes parsing and handling Turing-specific firmware headers and sections as well as a Turing Falcon HAL implementation. - Get rid of the Result<impl PinInit<T, E>> anti-pattern. - Relocate initializer-specific code into the appropriate initializer. - Use CStr::from_bytes_until_nul() to remove custom helpers. - Improve handling of unexpected firmware values. - Clean up redundant debug prints. - Replace c_str!() with native Rust C-string literals. - Update nova-core task list. Nova (DRM): - Align GEM object size to system page size. Tyr: - Use generated uAPI bindings for GpuInfo. - Replace manual sleeps with read_poll_timeout(). - Replace c_str!() with native Rust C-string literals. - Suppress warnings for unread fields. - Fix incorrect register name in print statement. Signed-off-by: Dave Airlie <airlied@redhat.com> From: "Danilo Krummrich" <dakr@kernel.org> Link: https://patch.msgid.link/DFYW1WV6DUCG.3K8V2DAVD1Q4A@kernel.org
2026-01-28wifi: rtw89: regd: update regulatory map to R73-R54Zong-Zhe Yang1-10/+10
Sync Realtek Channel Plan R73 and Realtek Regulatory R54. Configure 6 GHz field of Realtek regd for the following countries. PY NA BD ID VN TN GL GP YT EH Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-12-pkshih@realtek.com
2026-01-28wifi: rtw89: pci: validate release report content before using for RTL8922DEPing-Ke Shih1-2/+4
The commit 957eda596c76 ("wifi: rtw89: pci: validate sequence number of TX release report") does validation on existing chips, which somehow a release report of SKB becomes malformed. As no clear cause found, add rules ahead for RTL8922DE to avoid crash if it happens. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-11-pkshih@realtek.com
2026-01-28wifi: rtw89: get designated link to replace link instance 0Zong-Zhe Yang3-11/+8
Clean up some places where still to get link instance 0 directly. Since now MLSR switch is supported, it's not guaranteed to always run on link instance 0. So, prefer to get designated link in most cases. For now, the only exception is MCC (multi-channel concurrency) case. How to fill content of its H2C command depends on how to choose link instance, so cannot simply change it as above. Will handle MCC case separately afterwards. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-10-pkshih@realtek.com
2026-01-28wifi: rtw89: 8922a: configure FW version for SIM_SER_L0L1_BY_HALT_H2CZong-Zhe Yang1-0/+1
After FW version 0.35.97.0, 8922A supports SIM_SER_L0L1_BY_HALT_H2C FW feature. It allows to simulate FW L0/L1 crash under PS mode. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-9-pkshih@realtek.com
2026-01-28wifi: rtw89: phy: add PHY C2H event dummy handler for func 1-7 and 2-10Ping-Ke Shih4-2/+19
The two functions aren't implemented and hard necessary by driver. Implement dummy handler to avoid messages: rtw89_8922de 0000:03:00.0: PHY c2h class 1 func 7 not support rtw89_8922de 0000:03:00.0: PHY c2h class 2 func 10 not support Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-8-pkshih@realtek.com
2026-01-28wifi: rtw89: fw: correct content of DACK H2C commandPing-Ke Shih2-6/+6
The fields of command should be u8 instead of __le32. However, current firmware doesn't really use the data for now, so this mistake doesn't impact performance. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-7-pkshih@realtek.com
2026-01-28wifi: rtw89: rfk: update RFK report format of IQK, DACK and TXGAPKPing-Ke Shih2-14/+40
The report formats of IQK, DACK and TXGAPK are changed. Update them accordingly. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-6-pkshih@realtek.com
2026-01-28wifi: rtw89: rfk: add to print debug log of CIM3KPing-Ke Shih3-0/+41
Add calibration report of CIM3K, which does calibration in firmware and send a C2H event as debug purpose. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-5-pkshih@realtek.com
2026-01-28wifi: rtw89: rfk: add firmware command to do CIM3KPing-Ke Shih4-0/+77
CIM is short for counter intermodulation products 3rd-order. Due to non-linearity in transmit path, need a calibration to yield performance for RF system. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-4-pkshih@realtek.com
2026-01-28wifi: rtw89: rfk: add to print debug log of TX IQKPing-Ke Shih3-1/+35
Add report format for TX IQK, which do calibration in firmware and send a C2H event as debug purpose. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-3-pkshih@realtek.com
2026-01-28Merge tag 'drm-msm-next-2026-01-23' of ↵Dave Airlie53-2143/+2090
https://gitlab.freedesktop.org/drm/msm into drm-next Changes for v6.20 GPU: - Document a612/RGMU dt bindings - UBWC 6.0 support (for A840 / Kaanapali) - a225 support - Fixes DPU: - Switched to use virtual planes by default - Fixed DSI CMD panels on DPU 3.x - Rewrote format handling to remove intermediate representation - Fixed watchdog on DPU 8.x+ - Fixed TE / Vsync source setting on DPU 8.x+ - Added 3D_Mux on SC7280 - Kaanapali platform support - Fixed UBWC register programming - Made RM reserve DSPP-enabled mixers for CRTCs with LMs. - Gamma correction support DP: - Enabled support for eDP 1.4+ link rate tables - Fixed MDSS1 DP indices on SA8775P, making them to work - Fixed msm_dp_ctrl_config_msa() to work with LLVM 20 DSI: - Documented QCS8300 as compatible with SA8775P - Kaanapali platform support DSI PHY: - switched to divider_determine_rate() MDP5: - Dropped support for MSM8998, SDM660 and SDM630 (switched over to DPU) MDSS: - Kaanapali platform support - Fixed UBWC register programming Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <rob.clark@oss.qualcomm.com> Link: https://patch.msgid.link/CACSVV03Sbeca93A+gGh-TKpzFYVabbkWVgPCCicG0_NQG+5Y2A@mail.gmail.com
2026-01-28wifi: rtw89: rfk: add firmware command to do TX IQKPing-Ke Shih4-0/+79
TX IQK is a RF calibration, which driver call this H2C command to trigger the calibration. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20260123013957.16418-2-pkshih@realtek.com
2026-01-28BackMerge tag 'v6.19-rc7' into drm-nextDave Airlie554-3200/+5278
Linux 6.19-rc7 This is needed for msm and rust trees. Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-01-27gve: fix probe failure if clock read failsJordan Rhee5-15/+14
If timestamping is supported, GVE reads the clock during probe, which can fail for various reasons. Previously, this failure would abort the driver probe, rendering the device unusable. This behavior has been observed on production GCP VMs, causing driver initialization to fail completely. This patch allows the driver to degrade gracefully. If gve_init_clock() fails, it logs a warning and continues loading the driver without PTP support. Cc: stable@vger.kernel.org Fixes: a479a27f4da4 ("gve: Move gve_init_clock to after AQ CONFIGURE_DEVICE_RESOURCES call") Signed-off-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Shachar Raindel <shacharr@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260127010210.969823-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net/mlx5e: Account for netdev stats in ndo_get_stats64Gal Pressman1-9/+11
The driver's ndo_get_stats64 callback is only reporting mlx5 counters, without accounting for the netdev stats, causing errors from the network stack to be invisible in statistics. Add netdev_stats_to_stats64() call to first populate the counters, then add mlx5 counters on top, ensuring both are accounted for (where appropriate). Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1769411695-18820-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net/mlx5e: TC, delete flows only for existing peersMark Bloch1-6/+13
When deleting TC steering flows, iterate only over actual devcom peers instead of assuming all possible ports exist. This avoids touching non-existent peers and ensures cleanup is limited to devices the driver is currently connected to. BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 133c8a067 P4D 0 Oops: Oops: 0002 [#1] SMP CPU: 19 UID: 0 PID: 2169 Comm: tc Not tainted 6.18.0+ #156 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5e_tc_del_fdb_peers_flow+0xbe/0x200 [mlx5_core] Code: 00 00 a8 08 74 a8 49 8b 46 18 f6 c4 02 74 9f 4c 8d bf a0 12 00 00 4c 89 ff e8 0e e7 96 e1 49 8b 44 24 08 49 8b 0c 24 4c 89 ff <48> 89 41 08 48 89 08 49 89 2c 24 49 89 5c 24 08 e8 7d ce 96 e1 49 RSP: 0018:ff11000143867528 EFLAGS: 00010246 RAX: 0000000000000000 RBX: dead000000000122 RCX: 0000000000000000 RDX: ff11000143691580 RSI: ff110001026e5000 RDI: ff11000106f3d2a0 RBP: dead000000000100 R08: 00000000000003fd R09: 0000000000000002 R10: ff11000101c75690 R11: ff1100085faea178 R12: ff11000115f0ae78 R13: 0000000000000000 R14: ff11000115f0a800 R15: ff11000106f3d2a0 FS: 00007f35236bf740(0000) GS:ff110008dc809000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 0000000157a01001 CR4: 0000000000373eb0 Call Trace: <TASK> mlx5e_tc_del_flow+0x46/0x270 [mlx5_core] mlx5e_flow_put+0x25/0x50 [mlx5_core] mlx5e_delete_flower+0x2a6/0x3e0 [mlx5_core] tc_setup_cb_reoffload+0x20/0x80 fl_reoffload+0x26f/0x2f0 [cls_flower] ? mlx5e_tc_reoffload_flows_work+0xc0/0xc0 [mlx5_core] ? mlx5e_tc_reoffload_flows_work+0xc0/0xc0 [mlx5_core] tcf_block_playback_offloads+0x9e/0x1c0 tcf_block_unbind+0x7b/0xd0 tcf_block_setup+0x186/0x1d0 tcf_block_offload_cmd.isra.0+0xef/0x130 tcf_block_offload_unbind+0x43/0x70 __tcf_block_put+0x85/0x160 ingress_destroy+0x32/0x110 [sch_ingress] __qdisc_destroy+0x44/0x100 qdisc_graft+0x22b/0x610 tc_get_qdisc+0x183/0x4d0 rtnetlink_rcv_msg+0x2d7/0x3d0 ? rtnl_calcit.isra.0+0x100/0x100 netlink_rcv_skb+0x53/0x100 netlink_unicast+0x249/0x320 ? __alloc_skb+0x102/0x1f0 netlink_sendmsg+0x1e3/0x420 __sock_sendmsg+0x38/0x60 ____sys_sendmsg+0x1ef/0x230 ? copy_msghdr_from_user+0x6c/0xa0 ___sys_sendmsg+0x7f/0xc0 ? ___sys_recvmsg+0x8a/0xc0 ? __sys_sendto+0x119/0x180 __sys_sendmsg+0x61/0xb0 do_syscall_64+0x55/0x640 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f35238bb764 Code: 15 b9 86 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 44 00 00 f3 0f 1e fa 80 3d e5 08 0d 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 4c c3 0f 1f 00 55 48 89 e5 48 83 ec 20 89 55 RSP: 002b:00007ffed4c35638 EFLAGS: 00000202 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000055a2efcc75e0 RCX: 00007f35238bb764 RDX: 0000000000000000 RSI: 00007ffed4c356a0 RDI: 0000000000000003 RBP: 00007ffed4c35710 R08: 0000000000000010 R09: 00007f3523984b20 R10: 0000000000000004 R11: 0000000000000202 R12: 00007ffed4c35790 R13: 000000006947df8f R14: 000055a2efcc75e0 R15: 00007ffed4c35780 Fixes: 9be6c21fdcf8 ("net/mlx5e: Handle offloads flows per peer") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1769411695-18820-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net/mlx5: Fix Unbinding uplink-netdev in switchdev modeShay Drory5-0/+46
It is possible to unbind the uplink ETH driver while the E-Switch is in switchdev mode. This leads to netdevice reference counting issues[1], as the driver removal path was not designed to clean up from this state. During uplink ETH driver removal (_mlx5e_remove), the code now waits for any concurrent E-Switch mode transition to finish. It then removes the REPs auxiliary device, if exists. This ensures a graceful cleanup. [1] unregister_netdevice: waiting for eth2 to become free. Usage count = 2 ref_tracker: netdev@00000000c912e04b has 1/1 users at ib_device_set_netdev+0x130/0x270 [ib_core] mlx5_ib_vport_rep_load+0xf4/0x3e0 [mlx5_ib] mlx5_esw_offloads_rep_load+0xc7/0xe0 [mlx5_core] esw_offloads_enable+0x583/0x900 [mlx5_core] mlx5_eswitch_enable_locked+0x1b2/0x290 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x107/0x3e0 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x183/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 __x64_sys_sendto+0x20/0x30 Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1769411695-18820-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: usb: int51x1: use usbnet_cdc_update_filterEthan Nelson-Moore2-36/+4
The int51x1 driver uses the same requests as USB CDC to handle packet filtering, but provides its own definitions and function to handle it. The chip datasheet says the requests are CDC compliant. Replace this unnecessary code with a reference to usbnet_cdc_update_filter. Also fix the broken datasheet link and remove an empty comment. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Link: https://patch.msgid.link/20260126044049.40359-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27Merge branch '100GbE' of ↵Jakub Kicinski13-207/+396
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-01-26 (ice, idpf) For ice: Jake converts ring stats to utilize u64_stats APIs and performs some cleanups along the way. Alexander reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts. For idpf: YiFei Zhu adds support for BPF kfunc reporting of hardware Rx timestamps. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: idpf: export RX hardware timestamping information to XDP ice: reshuffle and group Rx and Tx queue fields by cachelines ice: convert all ring stats to u64_stats_t ice: shorten ring stat names and add accessors ice: use u64_stats API to access pkts/bytes in dim sample ice: remove ice_q_stats struct and use struct_group ice: pass pointer to ice_fetch_u64_stats_per_ring ==================== Link: https://patch.msgid.link/20260126224313.3847849-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: aquantia: Remove redundant UDP length adjustment with GSO_PARTIALGal Pressman1-3/+0
GSO_PARTIAL now takes care of updating the UDP header length, remove the redundant assignment in aq_nic_map_skb(). Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260125121649.778086-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net/mlx5e: Remove redundant UDP length adjustment with GSO_PARTIALGal Pressman1-17/+0
GSO_PARTIAL now takes care of updating the UDP header length, mlx5e_udp_gso_handle_tx_skb() is redundant, remove it. Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260125121649.778086-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27net: stmmac: don't pass ioaddr to fix_soc_reset() methodRussell King (Oracle)3-6/+10
As the stmmac_priv struct is passed to the fix_soc_reset() method which has the ioaddr, there is no need to pass ioaddr separately. Pass just the stmmac_priv struct. Fix up the glues that use it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1vkLmM-00000005vE1-0nop@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-28ASoC: codec: Remove ak4641/pxa2xx-ac97 and convert toMark Brown198-1012/+1938
Merge series from "Peng Fan (OSS)" <peng.fan@oss.nxp.com>: The main goal is to convert drivers to use GPIO descriptors. While reading the code, I think it is time to remove ak4641 and pxa2xx-ac97 driver, more info could be found in commit log of each patch. Then only need to convert sound/arm/pxa2xx-ac97-lib.c to use GPIO descriptors. Not have hardware to test the pxa2xx ac97.
2026-01-27net: usb: sr9700: replace magic numbers with register bit macrosEthan Nelson-Moore1-2/+2
The first byte of the Rx frame is a copy of the Rx status register, so 0x40 corresponds to RSR_MF (meaning the frame is multicast). Replace 0x40 with RSR_MF for clarity. (All other bits of the RSR indicate errors. The fact that the driver ignores these errors will be fixed by a later patch.) The first byte of the status URB is a copy of the NSR, so 0x40 corresponds to NSR_LINKST. Replace 0x40 with NSR_LINKST for clarity. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260124032248.26807-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27drm/amd/pm: fix race in power state check before mutex lockYang Wang1-3/+4
The power state check in amdgpu_dpm_set_powergating_by_smu() is done before acquiring the pm mutex, leading to a race condition where: 1. Thread A checks state and thinks no change is needed 2. Thread B acquires mutex and modifies the state 3. Thread A returns without updating state, causing inconsistency Fix this by moving the mutex lock before the power state check, ensuring atomicity of the state check and modification. Fixes: 6ee27ee27ba8 ("drm/amd/pm: avoid duplicate powergate/ungate setting") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7a3fbdfd19ec5992c0fc2d0bd83888644f5f2f38)
2026-01-27drm/amdgpu: fix NULL pointer dereference in amdgpu_gmc_filter_faults_removeJon Doron1-1/+6
On APUs such as Raven and Renoir (GC 9.1.0, 9.2.2, 9.3.0), the ih1 and ih2 interrupt ring buffers are not initialized. This is by design, as these secondary IH rings are only available on discrete GPUs. See vega10_ih_sw_init() which explicitly skips ih1/ih2 initialization when AMD_IS_APU is set. However, amdgpu_gmc_filter_faults_remove() unconditionally uses ih1 to get the timestamp of the last interrupt entry. When retry faults are enabled on APUs (noretry=0), this function is called from the SVM page fault recovery path, resulting in a NULL pointer dereference when amdgpu_ih_decode_iv_ts_helper() attempts to access ih->ring[]. The crash manifests as: BUG: kernel NULL pointer dereference, address: 0000000000000004 RIP: 0010:amdgpu_ih_decode_iv_ts_helper+0x22/0x40 [amdgpu] Call Trace: amdgpu_gmc_filter_faults_remove+0x60/0x130 [amdgpu] svm_range_restore_pages+0xae5/0x11c0 [amdgpu] amdgpu_vm_handle_fault+0xc8/0x340 [amdgpu] gmc_v9_0_process_interrupt+0x191/0x220 [amdgpu] amdgpu_irq_dispatch+0xed/0x2c0 [amdgpu] amdgpu_ih_process+0x84/0x100 [amdgpu] This issue was exposed by commit 1446226d32a4 ("drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1") which changed the default for Renoir APU from noretry=1 to noretry=0, enabling retry fault handling and thus exercising the buggy code path. Fix this by adding a check for ih1.ring_size before attempting to use it. Also restore the soft_ih support from commit dd299441654f ("drm/amdgpu: Rework retry fault removal"). This is needed if the hardware doesn't support secondary HW IH rings. v2: additional updates (Alex) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3814 Fixes: dd299441654f ("drm/amdgpu: Rework retry fault removal") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6ce8d536c80aa1f059e82184f0d1994436b1d526) Cc: stable@vger.kernel.org
2026-01-27drm/amd/pm: fix smu v14 soft clock frequency setting issueYang Wang2-0/+2
v1: resolve the issue where some freq frequencies cannot be set correctly due to insufficient floating-point precision. v2: patch this convert on 'max' value only. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 53868dd8774344051999c880115740da92f97feb) Cc: stable@vger.kernel.org
2026-01-27drm/amd/pm: fix smu v13 soft clock frequency setting issueYang Wang2-0/+2
v1: resolve the issue where some freq frequencies cannot be set correctly due to insufficient floating-point precision. v2: patch this convert on 'max' value only. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6194f60c707e3878e120adeb36997075664d8429) Cc: stable@vger.kernel.org
2026-01-27drm/amdkfd: add extended capabilities to device snapshotJonathan Kim1-0/+1
Add additional capabilities reporting. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: James Zhu <james.zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amdgpu: Send RMA CPER at bad page loadingKent Russell1-0/+4
Some older builds weren't sending RMA CPERs when the bad page threshold was exceeded. Newer builds have resolved this, but there could be systems out there with bad page numbers higher than the threshold, that haven't sent out an RMA CPER. To be thorough and safe, send an RMA CPER when we load the table, if the threshold is met or exceeded, instead of waiting for the next UE to trigger the CPER. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amdgpu: Fix jpeg ring test order in vcn_v4_0_3Jesse.Zhang1-1/+1
Fix the vcn reset sequence in vcn_v4_0_3_ring_reset() to restore JPEG power state and unlock the JPEG powergating mutex before running the JPEG ring post-reset helper. Fixes: d25c67fd9d6f ("drm/amdgpu/vcn4.0.3: rework reset handling") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/pm: fix race in power state check before mutex lockYang Wang1-3/+4
The power state check in amdgpu_dpm_set_powergating_by_smu() is done before acquiring the pm mutex, leading to a race condition where: 1. Thread A checks state and thinks no change is needed 2. Thread B acquires mutex and modifies the state 3. Thread A returns without updating state, causing inconsistency Fix this by moving the mutex lock before the power state check, ensuring atomicity of the state check and modification. Fixes: 6ee27ee27ba8 ("drm/amd/pm: avoid duplicate powergate/ungate setting") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Promote DC to 3.2.367Taimur Hassan1-1/+1
* Fw release 0.1.44.0 * Fixes for corruption on platforms older than DCN4x. * Bug fixes related to USB4 link training * Fixes related to FP guard * Debug helpers and other stability fixes. * Some refactors to improve code quality Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: [FW Promotion] Release 0.1.44.0Taimur Hassan1-6/+49
* Panel Replay related features/bugfixes * BootCRC feature Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Migrate HUBBUB register access from hwseq to hubbub component.Bhuvanachandra Pinninti8-4/+29
[why] Direct HUBBUB register access in the hwseq layer was creating register conflicts. [how] Migrated HUBBUB registers from hwseq to the hubbub component. Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: mouse event trigger to boost RR when idleMuaaz Nisar1-0/+13
[WHY+HOW] Add trigger event to boost refresh rate on mouse movement. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Muaaz Nisar <muanisar@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Add debug flag to override min dispclkMichael Strauss1-0/+1
[WHY] Enable dynamic ODM testing without needing a valid dispclk table [HOW] Create a debug flag to specify an override value for min dispclk Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: avoid dig reg access timeout on usb4 link training failZhongwei1-2/+10
[Why] When usb4 link training fails, the dpia sym clock will be disabled and SYMCLK source should be changed back to phy clock. In enable_streams, it is assumed that link training succeeded and will switch from refclk to phy clock. But phy clk here might not be on. Dig reg access timeout will occur. [How] When enable_stream is hit, check if link training failed for usb4. If it did, fall back to the ref clock to avoid reg access timeout. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Remove unnecessary DC FP guardWayne Lin2-4/+0
[Why & How] For dcn2x_fast_validate_bw(), not only populate_dml_pipes needs FP guard but also dml_get_voltage_level(). Remove unnecessary DC_FP_START/DC_FP_END guard in dcn20_fast_validate_bw and dcn21_fast_validate_bw. FP guard is already there before calling dcn2x_validate_bandwidth_fp(). Reviewed-by: ChiaHsuan (Tom) Chung <chiahsuan.chung@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: add setup_stereo for dcn4x or laterCharlene Liu2-1/+3
[why] stereo_sync pin is removed, but we still support display stereo Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: perform clear update flags for all DCN asicsAurabindo Pillai1-1/+1
Existing version check that limits the sequence to clear update flags should be performed for all asics. Exclude DCE asics for now. Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Enable bootcrc on FW sideWayne Lin1-0/+3
[Why] The bootcrc feature is controlled on the FW side. [How] Pass the control bits in boot options to FW. Reviewed-by: ChiaHsuan (Tom) Chung <chiahsuan.chung@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Add FR skipping CTS functionsJack Chang1-0/+1
1. To check whether Sink reaches maximum skipping number Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Jack Chang <jack.chang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Fix GFX12 family constant checksMatthew Stewart2-3/+3
Using >=, <= for checking the family is not always correct. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/display: Enable vstateup hook for DCN401 to be reusedCharlene Liu2-1/+3
Add the hook to the DCN401 header file so that it can be reused in other files Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amdgpu: fix NULL pointer dereference in amdgpu_gmc_filter_faults_removeJon Doron1-1/+6
On APUs such as Raven and Renoir (GC 9.1.0, 9.2.2, 9.3.0), the ih1 and ih2 interrupt ring buffers are not initialized. This is by design, as these secondary IH rings are only available on discrete GPUs. See vega10_ih_sw_init() which explicitly skips ih1/ih2 initialization when AMD_IS_APU is set. However, amdgpu_gmc_filter_faults_remove() unconditionally uses ih1 to get the timestamp of the last interrupt entry. When retry faults are enabled on APUs (noretry=0), this function is called from the SVM page fault recovery path, resulting in a NULL pointer dereference when amdgpu_ih_decode_iv_ts_helper() attempts to access ih->ring[]. The crash manifests as: BUG: kernel NULL pointer dereference, address: 0000000000000004 RIP: 0010:amdgpu_ih_decode_iv_ts_helper+0x22/0x40 [amdgpu] Call Trace: amdgpu_gmc_filter_faults_remove+0x60/0x130 [amdgpu] svm_range_restore_pages+0xae5/0x11c0 [amdgpu] amdgpu_vm_handle_fault+0xc8/0x340 [amdgpu] gmc_v9_0_process_interrupt+0x191/0x220 [amdgpu] amdgpu_irq_dispatch+0xed/0x2c0 [amdgpu] amdgpu_ih_process+0x84/0x100 [amdgpu] This issue was exposed by commit 1446226d32a4 ("drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1") which changed the default for Renoir APU from noretry=1 to noretry=0, enabling retry fault handling and thus exercising the buggy code path. Fix this by adding a check for ih1.ring_size before attempting to use it. Also restore the soft_ih support from commit dd299441654f ("drm/amdgpu: Rework retry fault removal"). This is needed if the hardware doesn't support secondary HW IH rings. v2: additional updates (Alex) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3814 Fixes: dd299441654f ("drm/amdgpu: Rework retry fault removal") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-27drm/amd/pm: fix smu v14 soft clock frequency setting issueYang Wang2-0/+2
v1: resolve the issue where some freq frequencies cannot be set correctly due to insufficient floating-point precision. v2: patch this convert on 'max' value only. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>