aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
4 daysMerge branch '100GbE' of ↵Jakub Kicinski5-8/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-01-20 (ice, idpf) For ice: Cody Haas breaks dependency of needing both RSS key and LUT for ice_get_rxfh() as ethtool ioctls do not always supply both. Paul fixes issues related to devlink reload; adding missing deinit HW call and moving hwmon exit function to the proper call chain. For idpf: Mina Almasry moves a register read call into the time sandwich to ensure the register is properly flushed. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: idpf: read lower clock bits inside the time sandwich ice: fix devlink reload call trace ice: add missing ice_deinit_hw() in devlink reinit path ice: Fix persistent failure in ice_get_rxfh ==================== Link: https://patch.msgid.link/20260120224430.410377-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysidpf: Fix data race in idpf_net_dimDavid Yang1-5/+11
In idpf_net_dim(), some statistics protected by u64_stats_sync, are read and accumulated in ignorance of possible u64_stats_fetch_retry() events. The correct way to copy statistics is already illustrated by idpf_add_queue_stats(). Fix this by reading them into temporary variables first. Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support") Fixes: 3a8845af66ed ("idpf: add RX splitq napi poll support") Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260119162720.1463859-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 daysidpf: read lower clock bits inside the time sandwichMina Almasry1-1/+1
PCIe reads need to be done inside the time sandwich because PCIe writes may get buffered in the PCIe fabric and posted to the device after the _postts completes. Doing the PCIe read inside the time sandwich guarantees that the write gets flushed before the _postts timestamp is taken. Cc: lrizzo@google.com Cc: namangulati@google.com Cc: willemb@google.com Cc: intel-wired-lan@lists.osuosl.org Cc: milena.olech@intel.com Cc: jacob.e.keller@intel.com Fixes: 5cb8805d2366 ("idpf: negotiate PTP capabilities and get PTP clock") Suggested-by: Shachar Raindel <shacharr@google.com> Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
5 daysice: fix devlink reload call tracePaul Greenwalt1-2/+1
Commit 4da71a77fc3b ("ice: read internal temperature sensor") introduced internal temperature sensor reading via HWMON. ice_hwmon_init() was added to ice_init_feature() and ice_hwmon_exit() was added to ice_remove(). As a result if devlink reload is used to reinit the device and then the driver is removed, a call trace can occur. BUG: unable to handle page fault for address: ffffffffc0fd4b5d Call Trace: string+0x48/0xe0 vsnprintf+0x1f9/0x650 sprintf+0x62/0x80 name_show+0x1f/0x30 dev_attr_show+0x19/0x60 The call trace repeats approximately every 10 minutes when system monitoring tools (e.g., sadc) attempt to read the orphaned hwmon sysfs attributes that reference freed module memory. The sequence is: 1. Driver load, ice_hwmon_init() gets called from ice_init_feature() 2. Devlink reload down, flow does not call ice_remove() 3. Devlink reload up, ice_hwmon_init() gets called from ice_init_feature() resulting in a second instance 4. Driver unload, ice_hwmon_exit() called from ice_remove() leaving the first hwmon instance orphaned with dangling pointer Fix this by moving ice_hwmon_exit() from ice_remove() to ice_deinit_features() to ensure proper cleanup symmetry with ice_hwmon_init(). Fixes: 4da71a77fc3b ("ice: read internal temperature sensor") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
5 daysice: add missing ice_deinit_hw() in devlink reinit pathPaul Greenwalt1-0/+1
devlink-reload results in ice_init_hw failed error, and then removing the ice driver causes a NULL pointer dereference. [ +0.102213] ice 0000:ca:00.0: ice_init_hw failed: -16 ... [ +0.000001] Call Trace: [ +0.000003] <TASK> [ +0.000006] ice_unload+0x8f/0x100 [ice] [ +0.000081] ice_remove+0xba/0x300 [ice] Commit 1390b8b3d2be ("ice: remove duplicate call to ice_deinit_hw() on error paths") removed ice_deinit_hw() from ice_deinit_dev(). As a result ice_devlink_reinit_down() no longer calls ice_deinit_hw(), but ice_devlink_reinit_up() still calls ice_init_hw(). Since the control queues are not uninitialized, ice_init_hw() fails with -EBUSY. Add ice_deinit_hw() to ice_devlink_reinit_down() to correspond with ice_init_hw() in ice_devlink_reinit_up(). Fixes: 1390b8b3d2be ("ice: remove duplicate call to ice_deinit_hw() on error paths") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
5 daysice: Fix persistent failure in ice_get_rxfhCody Haas3-5/+30
Several ioctl functions have the ability to call ice_get_rxfh, however all of these ioctl functions do not provide all of the expected information in ethtool_rxfh_param. For example, ethtool_get_rxfh_indir does not provide an rss_key. This previously caused ethtool_get_rxfh_indir to always fail with -EINVAL. This change draws inspiration from i40e_get_rss to handle this situation, by only calling the appropriate rss helpers when the necessary information has been provided via ethtool_rxfh_param. Fixes: b66a972abb6b ("ice: Refactor ice_set/get_rss into LUT and key specific functions") Signed-off-by: Cody Haas <chaas@riotgames.com> Closes: https://lore.kernel.org/intel-wired-lan/CAH7f-UKkJV8MLY7zCdgCrGE55whRhbGAXvgkDnwgiZ9gUZT7_w@mail.gmail.com/ Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysigc: Reduce TSN TX packet buffer from 7KB to 5KB per queueChwee-Lin Choong1-2/+3
The previous 7 KB per queue caused TX unit hangs under heavy timestamping load. Reducing to 5 KB avoids these hangs and matches the TSN recommendation in I225/I226 SW User Manual Section 7.5.4. The 8 KB "freed" by this change is currently unused. This reduction is not expected to impact throughput, as the i226 is PCIe-limited for small TSN packets rather than TX-buffer-limited. Fixes: 0d58cdc902da ("igc: optimize TX packet buffer utilization for TSN mode") Reported-by: Zdenek Bouska <zdenek.bouska@siemens.com> Closes: https://lore.kernel.org/netdev/AS1PR10MB5675DBFE7CE5F2A9336ABFA4EBEAA@AS1PR10MB5675.EURPRD10.PROD.OUTLOOK.COM/ Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysigc: fix race condition in TX timestamp read for register 0Chwee-Lin Choong1-18/+25
The current HW bug workaround checks the TXTT_0 ready bit first, then reads TXSTMPL_0 twice (before and after reading TXSTMPH_0) to detect whether a new timestamp was captured by timestamp register 0 during the workaround. This sequence has a race: if a new timestamp is captured after checking the TXTT_0 bit but before the first TXSTMPL_0 read, the detection fails because both the "old" and "new" values come from the same timestamp. Fix by reading TXSTMPL_0 first to establish a baseline, then checking the TXTT_0 bit. This ensures any timestamp captured during the race window will be detected. Old sequence: 1. Check TXTT_0 ready bit 2. Read TXSTMPL_0 (baseline) 3. Read TXSTMPH_0 (interrupt workaround) 4. Read TXSTMPL_0 (detect changes vs baseline) New sequence: 1. Read TXSTMPL_0 (baseline) 2. Check TXTT_0 ready bit 3. Read TXSTMPH_0 (interrupt workaround) 4. Read TXSTMPL_0 (detect changes vs baseline) Fixes: c789ad7cbebc ("igc: Work around HW bug causing missing timestamps") Suggested-by: Avi Shalev <avi.shalev@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Co-developed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysigc: Restore default Qbv schedule when changing channelsKurt Kanzenbach2-2/+7
The Multi-queue Priority (MQPRIO) and Earliest TxTime First (ETF) offloads utilize the Time Sensitive Networking (TSN) Tx mode. This mode is always coupled to IEEE 802.1Qbv time aware shaper (Qbv). Therefore, the driver sets a default Qbv schedule of all gates opened and a cycle time of 1s. This schedule is set during probe. However, the following sequence of events lead to Tx issues: - Boot a dual core system igc_probe(): igc_tsn_clear_schedule(): -> Default Schedule is set Note: At this point the driver has allocated two Tx/Rx queues, because there are only two CPUs. - ethtool -L enp3s0 combined 4 igc_ethtool_set_channels(): igc_reinit_queues() -> Default schedule is gone, per Tx ring start and end time are zero - tc qdisc replace dev enp3s0 handle 100 parent root mqprio \ num_tc 4 map 3 3 2 2 0 1 1 1 3 3 3 3 3 3 3 3 \ queues 1@0 1@1 1@2 1@3 hw 1 igc_tsn_offload_apply(): igc_tsn_enable_offload(): -> Writes zeros to IGC_STQT(i) and IGC_ENDQT(i), causing Tx to stall/fail Therefore, restore the default Qbv schedule after changing the number of channels. Furthermore, add a restriction to not allow queue reconfiguration when TSN/Qbv is enabled, because it may lead to inconsistent states. Fixes: c814a2d2d48f ("igc: Use default cycle 'start' and 'end' values for queues") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysice: Fix incorrect timeout ice_release_res()Ding Hui1-1/+1
The commit 5f6df173f92e ("ice: implement and use rd32_poll_timeout for ice_sq_done timeout") converted ICE_CTL_Q_SQ_CMD_TIMEOUT from jiffies to microseconds. But the ice_release_res() function was missed, and its logic still treats ICE_CTL_Q_SQ_CMD_TIMEOUT as a jiffies value. So correct the issue by usecs_to_jiffies(). Found by inspection of the DDP downloading process. Compile and modprobe tested only. Fixes: 5f6df173f92e ("ice: implement and use rd32_poll_timeout for ice_sq_done timeout") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysice: Avoid detrimental cleanup for bond during interface stopDave Ertman1-8/+17
When the user issues an administrative down to an interface that is the primary for an aggregate bond, the prune lists are being purged. This breaks communication to the secondary interface, which shares a prune list on the main switch block while bonded together. For the primary interface of an aggregate, avoid deleting these prune lists during stop, and since they are hardcoded to specific values for the default vlan and QinQ vlans, the attempt to re-add them during the up phase will quietly fail without any additional problem. Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
12 daysice: initialize ring_stats->syncpJacob Keller1-0/+4
The u64_stats_sync structure is empty on 64-bit systems. However, on 32-bit systems it contains a seqcount_t which needs to be initialized. While the memory is zero-initialized, a lack of u64_stats_init means that lockdep won't get initialized properly. Fix this by adding u64_stats_init() calls to the rings just after allocation. Fixes: 2b245cb29421 ("ice: Implement transmit and NAPI support") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix aux device unplugging when rdma is not supported by vportLarysa Zaremba1-1/+1
If vport flags do not contain VIRTCHNL2_VPORT_ENABLE_RDMA, driver does not allocate vdev_info for this vport. This leads to kernel NULL pointer dereference in idpf_idc_vport_dev_down(), which references vdev_info for every vport regardless. Check, if vdev_info was ever allocated before unplugging aux device. Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy") Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: cap maximum Rx buffer sizeJoshua Hay2-3/+6
The HW only supports a maximum Rx buffer size of 16K-128. On systems using large pages, the libeth logic can configure the buffer size to be larger than this. The upper bound is PAGE_SIZE while the lower bound is MTU rounded up to the nearest power of 2. For example, ARM systems with a 64K page size and an mtu of 9000 will set the Rx buffer size to 16K, which will cause the config Rx queues message to fail. Initialize the bufq/fill queue buf_len field to the maximum supported size. This will trigger the libeth logic to cap the maximum Rx buffer size by reducing the upper bound. Fixes: 74d1412ac8f37 ("idpf: use libeth Rx buffer management for payload buffer") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: David Decotigny <ddecotig@google.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: Fix error handling in idpf_vport_open()Sreedevi Joshi1-2/+2
Fix error handling to properly cleanup interrupts when idpf_vport_queue_ids_init() or idpf_rx_bufs_init_all() fail. Jump to 'intr_deinit' instead of 'queues_rel' to ensure interrupts are cleaned up before releasing other resources. Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: Fix RSS LUT NULL ptr issue after soft resetSreedevi Joshi3-17/+6
During soft reset, the RSS LUT is freed and not restored unless the interface is up. If an ethtool command that accesses the rss lut is attempted immediately after reset, it will result in NULL ptr dereference. Also, there is no need to reset the rss lut if the soft reset does not involve queue count change. After soft reset, set the RSS LUT to default values based on the updated queue count only if the reset was a result of a queue count change and the LUT was not configured by the user. In all other cases, don't touch the LUT. Steps to reproduce: ** Bring the interface down (if up) ifconfig eth1 down ** update the queue count (eg., 27->20) ethtool -L eth1 combined 20 ** display the RSS LUT ethtool -x eth1 [82375.558338] BUG: kernel NULL pointer dereference, address: 0000000000000000 [82375.558373] #PF: supervisor read access in kernel mode [82375.558391] #PF: error_code(0x0000) - not-present page [82375.558408] PGD 0 P4D 0 [82375.558421] Oops: Oops: 0000 [#1] SMP NOPTI <snip> [82375.558516] RIP: 0010:idpf_get_rxfh+0x108/0x150 [idpf] [82375.558786] Call Trace: [82375.558793] <TASK> [82375.558804] rss_prepare.isra.0+0x187/0x2a0 [82375.558827] rss_prepare_data+0x3a/0x50 [82375.558845] ethnl_default_doit+0x13d/0x3e0 [82375.558863] genl_family_rcv_msg_doit+0x11f/0x180 [82375.558886] genl_rcv_msg+0x1ad/0x2b0 [82375.558902] ? __pfx_ethnl_default_doit+0x10/0x10 [82375.558920] ? __pfx_genl_rcv_msg+0x10/0x10 [82375.558937] netlink_rcv_skb+0x58/0x100 [82375.558957] genl_rcv+0x2c/0x50 [82375.558971] netlink_unicast+0x289/0x3e0 [82375.558988] netlink_sendmsg+0x215/0x440 [82375.559005] __sys_sendto+0x234/0x240 [82375.559555] __x64_sys_sendto+0x28/0x30 [82375.560068] x64_sys_call+0x1909/0x1da0 [82375.560576] do_syscall_64+0x7a/0xfa0 [82375.561076] ? clear_bhb_loop+0x60/0xb0 [82375.561567] entry_SYSCALL_64_after_hwframe+0x76/0x7e <snip> Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: Fix RSS LUT configuration on down interfacesSreedevi Joshi1-7/+11
RSS LUT provisioning and queries on a down interface currently return silently without effect. Users should be able to configure RSS settings even when the interface is down. Fix by maintaining RSS configuration changes in the driver's soft copy and deferring HW programming until the interface comes up. Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: Fix RSS LUT NULL pointer crash on early ethtool operationsSreedevi Joshi5-79/+66
The RSS LUT is not initialized until the interface comes up, causing the following NULL pointer crash when ethtool operations like rxhash on/off are performed before the interface is brought up for the first time. Move RSS LUT initialization from ndo_open to vport creation to ensure LUT is always available. This enables RSS configuration via ethtool before bringing the interface up. Simplify LUT management by maintaining all changes in the driver's soft copy and programming zeros to the indirection table when rxhash is disabled. Defer HW programming until the interface comes up if it is down during rxhash and LUT configuration changes. Steps to reproduce: ** Load idpf driver; interfaces will be created modprobe idpf ** Before bringing the interfaces up, turn rxhash off ethtool -K eth2 rxhash off [89408.371875] BUG: kernel NULL pointer dereference, address: 0000000000000000 [89408.371908] #PF: supervisor read access in kernel mode [89408.371924] #PF: error_code(0x0000) - not-present page [89408.371940] PGD 0 P4D 0 [89408.371953] Oops: Oops: 0000 [#1] SMP NOPTI <snip> [89408.372052] RIP: 0010:memcpy_orig+0x16/0x130 [89408.372310] Call Trace: [89408.372317] <TASK> [89408.372326] ? idpf_set_features+0xfc/0x180 [idpf] [89408.372363] __netdev_update_features+0x295/0xde0 [89408.372384] ethnl_set_features+0x15e/0x460 [89408.372406] genl_family_rcv_msg_doit+0x11f/0x180 [89408.372429] genl_rcv_msg+0x1ad/0x2b0 [89408.372446] ? __pfx_ethnl_set_features+0x10/0x10 [89408.372465] ? __pfx_genl_rcv_msg+0x10/0x10 [89408.372482] netlink_rcv_skb+0x58/0x100 [89408.372502] genl_rcv+0x2c/0x50 [89408.372516] netlink_unicast+0x289/0x3e0 [89408.372533] netlink_sendmsg+0x215/0x440 [89408.372551] __sys_sendto+0x234/0x240 [89408.372571] __x64_sys_sendto+0x28/0x30 [89408.372585] x64_sys_call+0x1909/0x1da0 [89408.372604] do_syscall_64+0x7a/0xfa0 [89408.373140] ? clear_bhb_loop+0x60/0xb0 [89408.373647] entry_SYSCALL_64_after_hwframe+0x76/0x7e [89408.378887] </TASK> <snip> Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix issue with ethtool -n command displayErik Gabriel Carrillo2-22/+40
When ethtool -n is executed on an interface to display the flow steering rules, "rxclass: Unknown flow type" error is generated. The flow steering list maintained in the driver currently stores only the location and q_index but other fields of the ethtool_rx_flow_spec are not stored. This may be enough for the virtchnl command to delete the entry. However, when the ethtool -n command is used to query the flow steering rules, the ethtool_rx_flow_spec returned is not complete causing the error below. Resolve this by storing the flow spec (fsp) when rules are added and returning the complete flow spec when rules are queried. Also, change the return value from EINVAL to ENOENT when flow steering entry is not found during query by location or when deleting an entry. Add logic to detect and reject duplicate filter entries at the same location and change logic to perform upfront validation of all error conditions before adding flow rules through virtchnl. This avoids the need for additional virtchnl delete messages when subsequent operations fail, which was missing in the original upstream code. Example: Before the fix: ethtool -n eth1 2 RX rings available Total 2 rules rxclass: Unknown flow type rxclass: Unknown flow type After the fix: ethtool -n eth1 2 RX rings available Total 2 rules Filter: 0 Rule Type: TCP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Filter: 1 Rule Type: UDP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Fixes: ada3e24b84a0 ("idpf: add flow steering support") Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Co-developed-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix memory leak of flow steer list on rmmodSreedevi Joshi3-3/+42
The flow steering list maintains entries that are added and removed as ethtool creates and deletes flow steering rules. Module removal with active entries causes memory leak as the list is not properly cleaned up. Prevent this by iterating through the remaining entries in the list and freeing the associated memory during module removal. Add a spinlock (flow_steer_list_lock) to protect the list access from multiple threads. Fixes: ada3e24b84a0 ("idpf: add flow steering support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix error handling in the init_task on loadEmil Tantilov1-4/+12
If the init_task fails during a driver load, we end up without vports and netdevs, effectively failing the entire process. In that state a subsequent reset will result in a crash as the service task attempts to access uninitialized resources. Following trace is from an error in the init_task where the CREATE_VPORT (op 501) is rejected by the FW: [40922.763136] idpf 0000:83:00.0: Device HW Reset initiated [40924.449797] idpf 0000:83:00.0: Transaction failed (op 501) [40958.148190] idpf 0000:83:00.0: HW reset detected [40958.161202] BUG: kernel NULL pointer dereference, address: 00000000000000a8 ... [40958.168094] Workqueue: idpf-0000:83:00.0-vc_event idpf_vc_event_task [idpf] [40958.168865] RIP: 0010:idpf_vc_event_task+0x9b/0x350 [idpf] ... [40958.177932] Call Trace: [40958.178491] <TASK> [40958.179040] process_one_work+0x226/0x6d0 [40958.179609] worker_thread+0x19e/0x340 [40958.180158] ? __pfx_worker_thread+0x10/0x10 [40958.180702] kthread+0x10f/0x250 [40958.181238] ? __pfx_kthread+0x10/0x10 [40958.181774] ret_from_fork+0x251/0x2b0 [40958.182307] ? __pfx_kthread+0x10/0x10 [40958.182834] ret_from_fork_asm+0x1a/0x30 [40958.183370] </TASK> Fix the error handling in the init_task to make sure the service and mailbox tasks are disabled if the error happens during load. These are started in idpf_vc_core_init(), which spawns the init_task and has no way of knowing if it failed. If the error happens on reset, following successful driver load, the tasks can still run, as that will allow the netdevs to attempt recovery through another reset. Stop the PTP callbacks either way as those will be restarted by the call to idpf_vc_core_init() during a successful reset. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Reported-by: Vivek Kumar <iamvivekkumar@google.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix memory leak in idpf_vc_core_deinit()Emil Tantilov1-0/+4
Make sure to free hw->lan_regs. Reported by kmemleak during reset: unreferenced object 0xff1b913d02a936c0 (size 96): comm "kworker/u258:14", pid 2174, jiffies 4294958305 hex dump (first 32 bytes): 00 00 00 c0 a8 ba 2d ff 00 00 00 00 00 00 00 00 ......-......... 00 00 40 08 00 00 00 00 00 00 25 b3 a8 ba 2d ff ..@.......%...-. backtrace (crc 36063c4f): __kmalloc_noprof+0x48f/0x890 idpf_vc_core_init+0x6ce/0x9b0 [idpf] idpf_vc_event_task+0x1fb/0x350 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: fix memory leak in idpf_vport_rel()Emil Tantilov1-0/+2
Free vport->rx_ptype_lkup in idpf_vport_rel() to avoid leaking memory during a reset. Reported by kmemleak: unreferenced object 0xff450acac838a000 (size 4096): comm "kworker/u258:5", pid 7732, jiffies 4296830044 hex dump (first 32 bytes): 00 00 00 00 00 10 00 00 00 10 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 ................ backtrace (crc 3da81902): __kmalloc_cache_noprof+0x469/0x7a0 idpf_send_get_rx_ptype_msg+0x90/0x570 [idpf] idpf_init_task+0x1ec/0x8d0 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: detach and close netdevs while handling a resetEmil Tantilov1-49/+72
Protect the reset path from callbacks by setting the netdevs to detached state and close any netdevs in UP state until the reset handling has completed. During a reset, the driver will de-allocate resources for the vport, and there is no guarantee that those will recover, which is why the existing vport_ctrl_lock does not provide sufficient protection. idpf_detach_and_close() is called right before reset handling. If the reset handling succeeds, the netdevs state is recovered via call to idpf_attach_and_open(). If the reset handling fails the netdevs remain down. The detach/down calls are protected with RTNL lock to avoid racing with callbacks. On the recovery side the attach can be done without holding the RTNL lock as there are no callbacks expected at that point, due to detach/close always being done first in that flow. The previous logic restoring the netdevs state based on the IDPF_VPORT_UP_REQUESTED flag in the init task is not needed anymore, hence the removal of idpf_set_vport_state(). The IDPF_VPORT_UP_REQUESTED is still being used to restore the state of the netdevs following the reset, but has no use outside of the reset handling flow. idpf_init_hard_reset() is converted to void, since it was used as such and there is no error handling being done based on its return value. Before this change, invoking hard and soft resets simultaneously will cause the driver to lose the vport state: ip -br a <inf> UP echo 1 > /sys/class/net/ens801f0/device/reset& \ ethtool -L ens801f0 combined 8 ip -br a <inf> DOWN ip link set <inf> up ip -br a <inf> DOWN Also in case of a failure in the reset path, the netdev is left exposed to external callbacks, while vport resources are not initialized, leading to a crash on subsequent ifup/down: [408471.398966] idpf 0000:83:00.0: HW reset detected [408471.411744] idpf 0000:83:00.0: Device HW Reset initiated [408472.277901] idpf 0000:83:00.0: The driver was unable to contact the device's firmware. Check that the FW is running. Driver state= 0x2 [408508.125551] BUG: kernel NULL pointer dereference, address: 0000000000000078 [408508.126112] #PF: supervisor read access in kernel mode [408508.126687] #PF: error_code(0x0000) - not-present page [408508.127256] PGD 2aae2f067 P4D 0 [408508.127824] Oops: Oops: 0000 [#1] SMP NOPTI ... [408508.130871] RIP: 0010:idpf_stop+0x39/0x70 [idpf] ... [408508.139193] Call Trace: [408508.139637] <TASK> [408508.140077] __dev_close_many+0xbb/0x260 [408508.140533] __dev_change_flags+0x1cf/0x280 [408508.140987] netif_change_flags+0x26/0x70 [408508.141434] dev_change_flags+0x3d/0xb0 [408508.141878] devinet_ioctl+0x460/0x890 [408508.142321] inet_ioctl+0x18e/0x1d0 [408508.142762] ? _copy_to_user+0x22/0x70 [408508.143207] sock_do_ioctl+0x3d/0xe0 [408508.143652] sock_ioctl+0x10e/0x330 [408508.144091] ? find_held_lock+0x2b/0x80 [408508.144537] __x64_sys_ioctl+0x96/0xe0 [408508.144979] do_syscall_64+0x79/0x3d0 [408508.145415] entry_SYSCALL_64_after_hwframe+0x76/0x7e [408508.145860] RIP: 0033:0x7f3e0bb4caff Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-06idpf: keep the netdev when a reset failsEmil Tantilov1-11/+6
During a successful reset the driver would re-allocate vport resources while keeping the netdevs intact. However, in case of an error in the init task, the netdev of the failing vport will be unregistered, effectively removing the network interface: [ 121.211076] idpf 0000:83:00.0: enabling device (0100 -> 0102) [ 121.221976] idpf 0000:83:00.0: Device HW Reset initiated [ 124.161229] idpf 0000:83:00.0 ens801f0: renamed from eth0 [ 124.163364] idpf 0000:83:00.0 ens801f0d1: renamed from eth1 [ 125.934656] idpf 0000:83:00.0 ens801f0d2: renamed from eth2 [ 128.218429] idpf 0000:83:00.0 ens801f0d3: renamed from eth3 ip -br a ens801f0 UP ens801f0d1 UP ens801f0d2 UP ens801f0d3 UP echo 1 > /sys/class/net/ens801f0/device/reset [ 145.885537] idpf 0000:83:00.0: resetting [ 145.990280] idpf 0000:83:00.0: reset done [ 146.284766] idpf 0000:83:00.0: HW reset detected [ 146.296610] idpf 0000:83:00.0: Device HW Reset initiated [ 211.556719] idpf 0000:83:00.0: Transaction timed-out (op:526 cookie:7700 vc_op:526 salt:77 timeout:60000ms) [ 272.996705] idpf 0000:83:00.0: Transaction timed-out (op:502 cookie:7800 vc_op:502 salt:78 timeout:60000ms) ip -br a ens801f0d1 DOWN ens801f0d2 DOWN ens801f0d3 DOWN Re-shuffle the logic in the error path of the init task to make sure the netdevs remain intact. This will allow the driver to attempt recovery via subsequent resets, provided the FW is still functional. The main change is to make sure that idpf_decfg_netdev() is not called should the init task fail during a reset. The error handling is consolidated under unwind_vports, as the removed labels had the same cleanup logic split depending on the point of failure. Fixes: ce1b75d0635c ("idpf: add ptypes and MAC filter support") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-27Merge branch '40GbE' of ↵Paolo Abeni8-18/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-12-17 (i40e, iavf, idpf, e1000) For i40e: Przemyslaw immediately schedules service task following changes to filters to ensure timely setup for PTP. Gregory Herrero adjusts VF descriptor size checks to be device specific. For iavf: Kohei Enju corrects a couple of condition checks which caused off-by-one issues. For idpf: Larysa fixes LAN memory region call to follow expected requirements. Brian Vazquez reduces mailbox wait time during init to avoid lengthy delays. For e1000: Guangshuo Li adds validation of data length to prevent out-of-bounds access. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000: fix OOB in e1000_tbi_should_accept() idpf: reduce mbx_task schedule delay to 300us idpf: fix LAN memory regions command on some NVMs iavf: fix off-by-one issues in iavf_config_rss_reg() i40e: validate ring_len parameter against hardware-specific values i40e: fix scheduling in set_rx_mode ==================== Link: https://patch.msgid.link/ Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-17e1000: fix OOB in e1000_tbi_should_accept()Guangshuo Li1-1/+9
In e1000_tbi_should_accept() we read the last byte of the frame via 'data[length - 1]' to evaluate the TBI workaround. If the descriptor- reported length is zero or larger than the actual RX buffer size, this read goes out of bounds and can hit unrelated slab objects. The issue is observed from the NAPI receive path (e1000_clean_rx_irq): ================================================================== BUG: KASAN: slab-out-of-bounds in e1000_tbi_should_accept+0x610/0x790 Read of size 1 at addr ffff888014114e54 by task sshd/363 CPU: 0 PID: 363 Comm: sshd Not tainted 5.18.0-rc1 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x5a/0x74 print_address_description+0x7b/0x440 print_report+0x101/0x200 kasan_report+0xc1/0xf0 e1000_tbi_should_accept+0x610/0x790 e1000_clean_rx_irq+0xa8c/0x1110 e1000_clean+0xde2/0x3c10 __napi_poll+0x98/0x380 net_rx_action+0x491/0xa20 __do_softirq+0x2c9/0x61d do_softirq+0xd1/0x120 </IRQ> <TASK> __local_bh_enable_ip+0xfe/0x130 ip_finish_output2+0x7d5/0xb00 __ip_queue_xmit+0xe24/0x1ab0 __tcp_transmit_skb+0x1bcb/0x3340 tcp_write_xmit+0x175d/0x6bd0 __tcp_push_pending_frames+0x7b/0x280 tcp_sendmsg_locked+0x2e4f/0x32d0 tcp_sendmsg+0x24/0x40 sock_write_iter+0x322/0x430 vfs_write+0x56c/0xa60 ksys_write+0xd1/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f511b476b10 Code: 73 01 c3 48 8b 0d 88 d3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 2b 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 8e 9b 01 00 48 89 04 24 RSP: 002b:00007ffc9211d4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000004024 RCX: 00007f511b476b10 RDX: 0000000000004024 RSI: 0000559a9385962c RDI: 0000000000000003 RBP: 0000559a9383a400 R08: fffffffffffffff0 R09: 0000000000004f00 R10: 0000000000000070 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc9211d57f R14: 0000559a9347bde7 R15: 0000000000000003 </TASK> Allocated by task 1: __kasan_krealloc+0x131/0x1c0 krealloc+0x90/0xc0 add_sysfs_param+0xcb/0x8a0 kernel_add_sysfs_param+0x81/0xd4 param_sysfs_builtin+0x138/0x1a6 param_sysfs_init+0x57/0x5b do_one_initcall+0x104/0x250 do_initcall_level+0x102/0x132 do_initcalls+0x46/0x74 kernel_init_freeable+0x28f/0x393 kernel_init+0x14/0x1a0 ret_from_fork+0x22/0x30 The buggy address belongs to the object at ffff888014114000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1620 bytes to the right of 2048-byte region [ffff888014114000, ffff888014114800] The buggy address belongs to the physical page: page:ffffea0000504400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14110 head:ffffea0000504400 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x100000000010200(slab|head|node=0|zone=1) raw: 0100000000010200 0000000000000000 dead000000000001 ffff888013442000 raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected ================================================================== This happens because the TBI check unconditionally dereferences the last byte without validating the reported length first: u8 last_byte = *(data + length - 1); Fix by rejecting the frame early if the length is zero, or if it exceeds adapter->rx_buffer_len. This preserves the TBI workaround semantics for valid frames and prevents touching memory beyond the RX buffer. Fixes: 2037110c96d5 ("e1000: move tbi workaround code into helper function") Cc: stable@vger.kernel.org Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-17idpf: reduce mbx_task schedule delay to 300usBrian Vazquez1-1/+1
During the IDPF init phase, the mailbox runs in poll mode until it is configured to properly handle interrupts. The previous delay of 300ms is excessively long for the mailbox polling mechanism, which causes a slow initialization of ~2s: echo 0000:06:12.4 > /sys/bus/pci/drivers/idpf/bind [ 52.444239] idpf 0000:06:12.4: enabling device (0000 -> 0002) [ 52.485005] idpf 0000:06:12.4: Device HW Reset initiated [ 54.177181] idpf 0000:06:12.4: PTP init failed, err=-EOPNOTSUPP [ 54.206177] idpf 0000:06:12.4: Minimum RX descriptor support not provided, using the default [ 54.206182] idpf 0000:06:12.4: Minimum TX descriptor support not provided, using the default Changing the delay to 300us avoids the delays during the initial mailbox transactions, making the init phase much faster: [ 83.342590] idpf 0000:06:12.4: enabling device (0000 -> 0002) [ 83.384402] idpf 0000:06:12.4: Device HW Reset initiated [ 83.518323] idpf 0000:06:12.4: PTP init failed, err=-EOPNOTSUPP [ 83.547430] idpf 0000:06:12.4: Minimum RX descriptor support not provided, using the default [ 83.547435] idpf 0000:06:12.4: Minimum TX descriptor support not provided, using the default Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request") Signed-off-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-17idpf: fix LAN memory regions command on some NVMsLarysa Zaremba1-0/+5
IPU SDK versions 1.9 through 2.0.5 require send buffer to contain a single empty memory region. Set number of regions to 1 and use appropriate send buffer size to satisfy this requirement. Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions") Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-17iavf: fix off-by-one issues in iavf_config_rss_reg()Kohei Enju1-2/+2
There are off-by-one bugs when configuring RSS hash key and lookup table, causing out-of-bounds reads to memory [1] and out-of-bounds writes to device registers. Before commit 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS"), the loop upper bounds were: i <= I40E_VFQF_{HKEY,HLUT}_MAX_INDEX which is safe since the value is the last valid index. That commit changed the bounds to: i <= adapter->rss_{key,lut}_size / 4 where `rss_{key,lut}_size / 4` is the number of dwords, so the last valid index is `(rss_{key,lut}_size / 4) - 1`. Therefore, using `<=` accesses one element past the end. Fix the issues by using `<` instead of `<=`, ensuring we do not exceed the bounds. [1] KASAN splat about rss_key_size off-by-one BUG: KASAN: slab-out-of-bounds in iavf_config_rss+0x619/0x800 Read of size 4 at addr ffff888102c50134 by task kworker/u8:6/63 CPU: 0 UID: 0 PID: 63 Comm: kworker/u8:6 Not tainted 6.18.0-rc2-enjuk-tnguy-00378-g3005f5b77652-dirty #156 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Workqueue: iavf iavf_watchdog_task Call Trace: <TASK> dump_stack_lvl+0x6f/0xb0 print_report+0x170/0x4f3 kasan_report+0xe1/0x1a0 iavf_config_rss+0x619/0x800 iavf_watchdog_task+0x2be7/0x3230 process_one_work+0x7fd/0x1420 worker_thread+0x4d1/0xd40 kthread+0x344/0x660 ret_from_fork+0x249/0x320 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 63: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 __kmalloc_noprof+0x246/0x6f0 iavf_watchdog_task+0x28fc/0x3230 process_one_work+0x7fd/0x1420 worker_thread+0x4d1/0xd40 kthread+0x344/0x660 ret_from_fork+0x249/0x320 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff888102c50100 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 0 bytes to the right of allocated 52-byte region [ffff888102c50100, ffff888102c50134) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x102c50 flags: 0x200000000000000(node=0|zone=2) page_type: f5(slab) raw: 0200000000000000 ffff8881000418c0 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080200020 00000000f5000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888102c50000: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc ffff888102c50080: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc >ffff888102c50100: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc ^ ffff888102c50180: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc ffff888102c50200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-17i40e: validate ring_len parameter against hardware-specific valuesGregory Herrero3-14/+13
The maximum number of descriptors supported by the hardware is hardware-dependent and can be retrieved using i40e_get_max_num_descriptors(). Move this function to a shared header and use it when checking for valid ring_len parameter rather than using hardcoded value. By fixing an over-acceptance issue, behavior change could be seen where ring_len could now be rejected while configuring rx and tx queues if its size is larger than the hardware-dependent maximum number of descriptors. Fixes: 55d225670def ("i40e: add validation for ring_len param") Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-12-17i40e: fix scheduling in set_rx_modePrzemyslaw Korba1-0/+1
Add service task schedule to set_rx_mode. In some cases there are error messages printed out in PTP application (ptp4l): ptp4l[13848.762]: port 1 (ens2f3np3): received SYNC without timestamp ptp4l[13848.825]: port 1 (ens2f3np3): received SYNC without timestamp ptp4l[13848.887]: port 1 (ens2f3np3): received SYNC without timestamp This happens when service task would not run immediately after set_rx_mode, and we need it for setup tasks. This service task checks, if PTP RX packets are hung in firmware, and propagate correct settings such as multicast address for IEEE 1588 Precision Time Protocol. RX timestamping depends on some of these filters set. Bug happens only with high PTP packets frequency incoming, and not every run since