aboutsummaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
3 dayswireguard: send: append trailer after expanding headJason A. Donenfeld1-10/+10
With how this is currently written, we add the trailer, zero it out, and then add the header space on. If that header space requires a reallocation + copy, the zeros in the trailer aren't copied, because the skb len hasn't actually been yet expanded to cover that. Instead add the padding at the end of the process rather than at the beginning. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20260529173134.3080773-2-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysnet: pcs: pcs-mtk-lynxi: fix bpi-r3 serdes configurationFrank Wunderlich1-0/+3
Commit 8871389da151 introduces common pcs dts properties which writes rx=normal,tx=normal polarity to register SGMSYS_QPHY_WRAP_CTRL of switch. This is initialized with tx-bit set and so change inverts polarity compared to before. It looks like mt7531 has tx polarity inverted in hardware and set tx-bit by default to restore the normal polarity. The MT7531 datasheet quite clearly states: Register 000050EC QPHY_WRAP_CTRL -- QPHY wrapper control Reset value: 0x00000501 BIT 1 RX_BIT_POLARITY -- RX bit polarity control 1'b0: normal 1'b1: inverted BIT 0 TX_BIT_POLARITY -- TX bit polarity control (TX default inversed in MT7531) 1'b0: normal 1'b1: inverted Till this patch the register write was only called when mediatek,pnswap property was set which cannot be done for switch because the fw-node param was always NULL from switch driver in the mtk_pcs_lynxi_create call. Do not configure switch side like it's done before. Fixes: 8871389da151 ("net: pcs: pcs-mtk-lynxi: deprecate "mediatek,pnswap"") Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260526153239.30194-1-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysnet: mana: Skip redundant detach on already-detached portDipayaan Roy1-0/+6
When mana_per_port_queue_reset_work_handler() runs after a previous detach succeeded but attach failed, the port is left in a detached state with apc->tx_qp and apc->rxqs already freed. Calling mana_detach() again unconditionally leads to NULL pointer dereferences during queue teardown. Add an early exit in mana_detach() when the port is already in detached state (!netif_device_present) for non-close callers, making it safe to call idempotently. This allows the queue reset handler and other recovery paths to simply retry mana_attach() without redundant teardown. Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Link: https://patch.msgid.link/20260525081129.1230035-3-dipayanroy@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysnet: mana: Add NULL guards in teardown path to prevent panic on attach failureDipayaan Roy1-29/+41
When queue allocation fails partway through, the error cleanup frees and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as mana_remove(), mana_change_mtu() recovery, and internal error handling in mana_alloc_queues() can subsequently call into functions that dereference these pointers without NULL checks: - mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26). - mana_destroy_vport() iterates apc->rxqs without a NULL check. - mana_fence_rqs() iterates apc->rxqs without a NULL check. - mana_dealloc_queues() iterates apc->tx_qp without a NULL check. Add NULL guards for apc->rxqs in mana_fence_rqs(), mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue draining when TX queues were never allocated or already freed. Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Link: https://patch.msgid.link/20260525081129.1230035-2-dipayanroy@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysMerge tag 'net-7.1-rc6' of ↵Linus Torvalds13-26/+216
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "This is again significantly bigger than the same point into the previous cycle, but at least smaller than last week. I'm not aware of any pending regression for the current cycle. Including fixes from netfilter. Current release - regressions: - netfilter: walk fib6_siblings under RCU Previous releases - regressions: - netlink: fix sending unassigned nsid after assigned one - bridge: fix sleep in atomic context in netlink path - sched: fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop - ipv4: fix net->ipv4.sysctl_local_reserved_ports UaF - eth: tun: free page on short-frame rejection in tun_xdp_one() Previous releases - always broken: - skbuff: fix missing zerocopy reference in pskb_carve helpers - handshake: drain pending requests at net namespace exit - ethtool: - rss: avoid modifying the RSS context response - module: avoid leaking a netdev ref on module flash errors - coalesce: cap profile updates at NET_DIM_PARAMS_NUM_PROFILES - netfilter: fix dst corruption in same register operation - nfc: hci: fix out-of-bounds read in HCP header parsing - ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo() - eth: - vti: use ip6_tnl.net in vti6_changelink(). - vxlan: do not reuse cached ip_hdr() value after skb_tunnel_check_pmtu()" * tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits) dpll: zl3073x: make frequency monitor a per-device attribute dpll: zl3073x: use __dpll_device_change_ntf() and remove change_work dpll: export __dpll_device_change_ntf() for use under dpll_lock net/handshake: Drain pending requests at net namespace exit net/handshake: Verify file-reference balance in submit paths net/handshake: Close the submit-side sock_hold race net/handshake: hand off the pinned file reference to accept_doit net/handshake: Take a long-lived file reference at submit net/handshake: Pass negative errno through handshake_complete() nvme-tcp: store negative errno in queue->tls_err net/handshake: Use spin_lock_bh for hn_lock net: skbuff: fix missing zerocopy reference in pskb_carve helpers net: hibmcge: move dma_rmb() after dma_sync_single_for_cpu() in RX path net: hibmcge: disable Relaxed Ordering to fix RX packet corruption selftests/tc-testing: Add netem test case exercising loops selftests/tc-testing: Add mirred test cases exercising loops net/sched: act_mirred: Fix return code in early mirred redirect error paths net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop net/sched: fix packet loop on netem when duplicate is on ...
5 daysnet: hibmcge: move dma_rmb() after dma_sync_single_for_cpu() in RX pathJijie Shao1-3/+3
The dma_rmb() barrier was placed before dma_sync_single_for_cpu(), which is incorrect. DMA sync must complete first to make the buffer accessible to the CPU, then the rmb barrier ensures subsequent descriptor reads observe the latest data written by the hardware. Reorder the operations so dma_sync_single_for_cpu() is called before dma_rmb() to guarantee the driver reads consistent data from the DMA buffer. Fixes: f72e25594061 ("net: hibmcge: Implement rx_poll function to receive packets") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260525144525.94884-3-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 daysnet: hibmcge: disable Relaxed Ordering to fix RX packet corruptionJijie Shao1-0/+3
When SMMU is disabled, the hibmcge driver may receive corrupted packets. The hardware writes packet data and descriptors to the same page, but with Relaxed Ordering enabled, PCI write transactions may not be strictly ordered. This can cause the driver to observe a valid descriptor before the corresponding packet data is fully written. Fix this by clearing PCI_EXP_DEVCTL_RELAX_EN in the PCI bridge control register to ensure strict write ordering between packet data and descriptors. Fixes: f72e25594061 ("net: hibmcge: Implement rx_poll function to receive packets") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260525144525.94884-2-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 daysbonding: refuse to enslave CAN devicesOliver Hartkopp1-0/+6
syzbot reported a kernel paging request crash in can_rx_unregister() inside net/can/af_can.c. The crash occurs because a virtual CAN device (vxcan) is being enslaved to a bonding master. During the enslavement process, the bonding driver mutates and modifies the network device states to fit an Ethernet-like aggregation model. However, CAN devices operate on a completely different Layer 2 architecture, relying on the CAN mid-layer private data structure (can_ml_priv) instead of standard Ethernet structures. Since bonding does not initialize or maintain these CAN structures, subsequent operations on the half-enslaved interface (such as closing associated sockets via isotp_release) lead to a null-pointer dereference when accessing the CAN receiver lists. Bonding CAN interfaces is architecturally invalid as CAN lacks MAC addresses, ARP capabilities, and standard Ethernet link-layer mechanisms. While generic loopback devices are blocked globally in net/core/dev.c, virtual CAN devices bypass this check because they do not carry the IFF_LOOPBACK flag, despite acting as local software-loopbacks. Fix this by explicitly blocking network devices of type ARPHRD_CAN from being enslaved at the very beginning of bond_enslave(). This prevents illegal state mutations, eliminates the resulting KASAN crashes, and avoids potential memory leaks from incomplete socket cleanups. As the CAN support has been added a long time after bonding the Fixes-tag points to the introduction of ARPHRD_CAN that would have needed a specific handling in bonding_main.c. Fixes: cd05acfe65ed ("[CAN]: Allocate protocol numbers for PF_CAN") Reported-by: syzbot+8ed98cbd0161632bce95@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8ed98cbd0161632bce95 Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20260526-bonding-candev-v1-1-ba1df400918a@hartkopp.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysvxlan: do not reuse cached ip_hdr() value after skb_tunnel_check_pmtu()Eric Dumazet1-2/+2
skb_tunnel_check_pmtu() can change skb->head. Reusing old_iph afer skb_tunnel_check_pmtu() can cause an UAF. Use instead ip_hdr(skb) as done in drivers/net/bareudp.c and drivers/net/geneve.c. Found by Sashiko. Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Link: https://patch.msgid.link/20260525203642.2389723-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: phy: air_en8811h: add AN8811HB MCU assert/deassert supportLucien.Jheng1-4/+149
AN8811HB needs a MCU soft-reset cycle before firmware loading begins. Assert the MCU (hold it in reset) and immediately deassert (release) via a dedicated PBUS register pair (0x5cf9f8 / 0x5cf9fc), accessed through a registered mdio_device at PHY-addr+8. Add __air_pbus_reg_write() as a low-level helper taking a struct mdio_device *, create and register the PBUS mdio_device in an8811hb_probe() and store it in priv->pbusdev, then implement an8811hb_mcu_assert() / _deassert() on top of it. Add an8811hb_remove() to unregister the PBUS device on teardown. Wire both calls into an8811hb_load_firmware() and en8811h_restart_mcu() so every firmware load or MCU restart on AN8811HB correctly sequences the reset control registers. Fixes: 5afda1d734ed ("net: phy: air_en8811h: add Airoha AN8811HB support") Signed-off-by: Lucien Jheng <lucienzx159@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260524063915.47961-1-lucienzx159@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysMerge tag 'mm-hotfixes-stable-2026-05-25-16-22' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "13 hotfixes. 9 are for MM. 9 are cc:stable and the remaining 4 address post-7.1 issues or aren't considered suitable for backporting. All patches are singletons - please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: Revert "mm: introduce a new page type for page pool in page type" mm/vmalloc: do not trigger BUG() on BH disabled context MAINTAINERS, mailmap: change email for Eugen Hristev mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page kernel/fork: validate exit_signal in kernel_clone() mm: memcontrol: propagate NMI slab stats to memcg vmstats mm/damon/sysfs-schemes: delete tried region in regions_rmdirs() mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one zram: fix use-after-free in zram_writeback_endio memfd: deny writeable mappings when implying SEAL_WRITE ipc: limit next_id allocation to the valid ID range Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare" MAINTAINERS: .mailmap: update after GEHC spin-off
7 daysnet: team: fix NULL pointer dereference in team_xmit during mode changeWeiming Shi1-13/+32
__team_change_mode() clears team->ops with memset() before restoring safe dummy handlers via team_adjust_ops(). A concurrent team_xmit() running under RCU on another CPU can read team->ops.transmit during this window and call a NULL function pointer, crashing the kernel. The race requires a mode change (CAP_NET_ADMIN) concurrent with transmit on the team device. BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: 0010 [#1] SMP KASAN NOPTI RIP: 0010:0x0 Call Trace: team_xmit (drivers/net/team/team_core.c:1853) dev_hard_start_xmit (net/core/dev.c:3904) __dev_queue_xmit (net/core/dev.c:4871) packet_sendmsg (net/packet/af_packet.c:3109) __sys_sendto (net/socket.c:2265) The original code assumed that no ports means no traffic, so mode changes could freely memset()/memcpy() the ops. AF_PACKET with forced carrier breaks that assumption. Prevent the race instead of making it safe: replace memset()/memcpy() with per-field updates that never touch transmit or receive. Those two handlers are managed solely by team_adjust_ops(), which already installs dummies when tx_en_port_count == 0 (always true during mode change since no ports are present). WRITE_ONCE/READ_ONCE prevent store/load tearing on the handler pointers. synchronize_net() before exit_op() drains in-flight readers that may still reference old mode state from before port removal switched the handlers to dummies. Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://patch.msgid.link/20260521081159.1491563-3-bestswngs@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 daysocteontx2-af: validate body pcifunc in rvu_mbox_handler_rep_event_notifyMichael Bommarito3-1/+10
rvu_mbox_handler_rep_event_notify() in drivers/net/ethernet/marvell/ octeontx2/af/rvu_rep.c queues a sender-controlled REP_EVENT_NOTIFY request body verbatim, and rvu_rep_up_notify() then forwards event->pcifunc (the nested body field, distinct from the AF-normalised header pcifunc) into rvu_get_pfvf(), rvu_get_pf() and the AF->PF mailbox device index without any bounds check. A VF attached to a PF that has been put into switchdev representor mode reaches this path: the VF mailbox handler otx2_pfvf_mbox_handler() forwards every message id including MBOX_MSG_REP_EVENT_NOTIFY to AF without an allowlist, and the AF dispatcher rewrites only msg->pcifunc, leaving struct rep_event::pcifunc attacker-controlled. The sibling rvu_mbox_handler_esw_cfg() refuses requests whose header pcifunc is not rvu->rep_pcifunc; this handler has no equivalent gate. An out-of-range body pcifunc selects an &rvu->pf[]/&rvu->hwvf[] element past the allocated array and, for RVU_EVENT_MAC_ADDR_CHANGE, turns into a six-byte attacker-chosen OOB ether_addr_copy() target inside the queued worker; KASAN reports a slab-out-of-bounds write in rvu_rep_wq_handler. Reject malformed requests at the handler entry by gating on is_pf_func_valid(), which is already the canonical PF/VF range check in this driver; expose it via rvu.h so callers in rvu_rep.c can use it instead of open-coding the same range arithmetic. Fixes: b8fea84a0468 ("octeontx2-pf: Add support to sync link state between representor and VFs") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260520154157.1439319-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysmacsec: fix replay protection at XPN lower-PN wrapJunrui Luo1-1/+2
In macsec_post_decrypt(), when pn is U32_MAX, pn + 1 overflows u32 to 0 and the first branch never fires. If next_pn_halves.lower is also in the upper half, pn_same_half(pn, lower) is true and the XPN else-if does not fire either, leaving next_pn_halves unchanged. An attacker that captures the legitimate frame carrying pn == 0xFFFFFFFF on an XPN association can then replay it indefinitely, since lowest_pn never rises above the captured pn and macsec_decrypt() reconstructs the same IV. Extend the XPN else-if to also fire when pn + 1 wraps to 0, so receipt of pn == U32_MAX advances next_pn_halves to (upper + 1, 0). Fixes: a21ecf0e0338 ("macsec: Support XPN frame handling - IEEE 802.1AEbw") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB78813FD49E58F253B989F197AF012@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet/mlx5: HWS: Reject unsupported remove-header actionPrathamesh Deshpande1-1/+3
mlx5_cmd_hws_packet_reformat_alloc() handles MLX5_REFORMAT_TYPE_REMOVE_HDR by looking up a matching HWS remove-header action. If mlx5_fs_get_action_remove_header_vlan() returns NULL, the code only logs an error and continues. The function then returns success with a NULL HWS action stored in the packet-reformat object. Return an error when no matching remove-header action is available. Fixes: aecd9d1020e3 ("net/mlx5: fs, add HWS packet reformat API function") Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260506000054.51797-1-prathameshdeshpande7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daystun: free page on build_skb failure in tun_xdp_one()Weiming Shi1-0/+1
When build_skb() fails in tun_xdp_one(), the function sets ret to -ENOMEM and jumps to the out label, which returns without freeing the page that vhost_net_build_xdp() allocated for the frame. As with the short-frame rejection path, tun_sendmsg() discards the per-buffer error and still returns total_len, so vhost_tx_batch() takes the success path and never frees the page. Each build_skb() failure in a batch leaks one page-frag chunk. Free the page before taking the error path, matching the put_page() the other error exits of tun_xdp_one() already perform. Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daystap: free page on error paths in tap_get_user_xdp()Weiming Shi1-0/+2
tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, and returns -ENOMEM when build_skb() fails. Both paths jump to the err label without freeing the page that vhost_net_build_xdp() allocated for the frame. tap_sendmsg() discards the per-buffer return value and always returns 0, so vhost_tx_batch() takes the success path and never frees the page; each rejected frame in a batch leaks one page-frag chunk. Free the page on both error paths, before the skb is built. This is the tap counterpart of the same leak in tun_xdp_one(). Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daystun: free page on short-frame rejection in tun_xdp_one()Weiming Shi1-1/+3
tun_xdp_one() returns -EINVAL on a frame shorter than ETH_HLEN without freeing the page that vhost_net_build_xdp() allocated for it. tun_sendmsg() discards that -EINVAL and still returns total_len, so vhost_tx_batch() takes the success path and never frees the page; each short frame in a batch leaks one page-frag chunk. A local process that can open /dev/net/tun and /dev/vhost-net can hit this path: it attaches a tun/tap device as the vhost-net backend and feeds TX descriptors whose length minus the virtio-net header is below ETH_HLEN. Each kick leaks the page-frag chunks for that batch, and a tight submission loop exhausts host memory and triggers an OOM panic. Free the page before returning -EINVAL, matching the XDP-program error path in the same function. Fixes: 049584807f1d ("tun: add missing verification for short frame") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260520160020.375349-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysRevert "mm: introduce a new page type for page pool in page type"Byungchul Park1-1/+1
This reverts commit db359fccf212 ("mm: introduce a new page type for page pool in page type") and a part of 735a309b4bfb9e ("net: add net_iov_init() and use it to initialize ->page_type"). Netpp page_type'ed pages might be used in mapping so as to use @_mapcount. However, since @page_type and @_mapcount are union'ed in struct page, these two can't be used at the same time. Revert the commit introducing page_type for Netpp for now. The patch will be retried once @page_type and @_mapcount get allowed to be used at the same time. The revert also includes removal of @page_type initialization part introduced by commit 735a309b4bfb9e ("net: add net_iov_init() and use it to initialize ->page_type"), which will be restored on the retry. Link: https://lore.kernel.org/20260515034701.17027-1-byungchul@sk.com Fixes: db359fccf212 ("mm: introduce a new page type for page pool in page type") Signed-off-by: Byungchul Park <byungchul@sk.com> Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Closes: https://lore.kernel.org/all/982b9bc1-0a0a-4fc5-8e3a-3672db2b29a1@nvidia.com Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Harry Yoo (Oracle) <harry@kernel.org> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam R. Howlett <liam@infradead.org> Cc: Mark Bloch <mbloch@nvidia.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Simon Horman <horms@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Toke Hoiland-Jorgensen <toke@redhat.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 daysMerge tag 'wireless-2026-05-21' of ↵Jakub Kicinski15-76/+194
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Quite a few more updates: - cfg80211/mac80211: - various security(-ish) fixes - fix A-MSDU subframe handling - fix multi-link element parsing - ath10: avoid sending commands to dead device - ath11k: - fix WMI buffer leaks on error conditions - fix UAF in RX MSDU coalesce path - allow peer ID 0 on RX path (legal for mobile devices) - reinitialize shared SRNG pointers on restart - ath12k: - fix 20 MHz-only parsing of EHT-MCS map - iwlwifi: - fix TSO segmentation explosion - don't TX to dead device - fix warning in WoWLAN - fix TX rates on old devices - disconnect on beacon loss only if also no other traffic - fill NULL-ptr deref - fix STEP_URM hardware access * tag 'wireless-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (24 commits) wifi: cfg80211: wext: validate chandef in monitor mode wifi: mac80211: consume only present negotiated TTLM maps wifi: wilc1000: fix dma_buffer leak on bus acquire failure wifi: mac80211: capture fast-RX rate before mesh reuses skb->cb wifi: mac80211: fix multi-link element inheritance wifi: mac80211: fix MLE defragmentation wifi: mac80211: don't override max_amsdu_subframes wifi: mac80211: bounds-check link_id in ieee80211_ml_epcs wifi: ath12k: fix EHT TX MCS limitation due to wrong 20 MHz-only parsing wifi: ath11k: clear shared SRNG pointer state on restart wifi: ath11k: fix use after free in ath11k_dp_rx_msdu_coalesce() wifi: ath11k: fix peer resolution on rx path when peer_id=0 wifi: iwlwifi: mld: disconnect only after 6 beacons without Rx wifi: iwlwifi: mld: don't WARN on WoWLAN suspend w/o BSS vif wifi: iwlwifi: use correct function to read STEP_URM register wifi: iwlwifi: mvm: fix driver-set TX rates on old devices wifi: iwlwifi: mld: don't dereference a pointer before NULL checking it wifi: iwlwifi: mld: stop TX during firmware restart wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabled wifi: ath10k: skip WMI and beacon transmission when device is wedged ... ==================== Link: https://patch.msgid.link/20260521152903.374070-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: avoid VF->PF mailbox timeout during SR-IOV teardownWei Fang1-1/+1
During SR-IOV teardown, enetc_msg_psi_free() disables the MR interrupt before pci_disable_sriov() removes the VFs. If a VF sends a mailbox message during this window, the PF cannot receive it, causing the VF to timeout waiting for a reply. Since the timeout occurs during SR-IOV teardown when the VF is about to be removed anyway, it has no functional impact on operation. However, more messages will be added in the future, some visible error logs may confuse users. So fix it by calling pci_disable_sriov() first to remove all VFs, then safely clean up the mailbox resources. This eliminates the race window where VFs could send messages to an unresponsive PF. Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-10-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix init and teardown order to prevent use of unsafe resourcesWei Fang1-17/+18
Sashiko reported a potential issue in enetc_msg_psi_init() where the IRQ handler is registered before DMA resources are fully initialized [1]. The current initialization sequence is: 1. request_irq(enetc_msg_psi_msix) <- IRQ handler registered 2. INIT_WORK(&pf->msg_task, ...) <- work_struct initialized 3. enetc_msg_alloc_mbx() <- mailbox DMA allocated This ordering is unsafe because if a spurious interrupt or pending interrupt from a previous device state fires immediately after request_irq() returns, the registered ISR enetc_msg_psi_msix() will execute and unconditionally call: schedule_work(&pf->msg_task) At this point, pf->msg_task has not been initialized by INIT_WORK(), so the work_struct contains garbage values in its internal linked list pointers (work_struct->entry). Passing an uninitialized work_struct to schedule_work() could corrupt the kernel's workqueue linked lists, potentially leading to: - Kernel panic in __queue_work() - Memory corruption in workqueue data structures - System deadlock or undefined behavior Additionally, even if the work_struct was initialized, the mailbox DMA buffers (pf->rxmsg[]) may not yet be allocated when the work handler enetc_msg_task() runs, resulting in NULL pointer dereference. Fix by reordering the initialization sequence to ensure all resources are properly initialized before the interrupt handler can execute: 1. enetc_msg_alloc_mbx() <- Allocate all mailboxes 2. INIT_WORK(&pf->msg_task, ...) <- Initialize work first 3. request_irq(enetc_msg_psi_msix) <- Register IRQ last 4. Configure hardware & enable MR interrupts This guarantees that when enetc_msg_psi_msix() runs: - pf->msg_task is properly initialized (safe for schedule_work) - pf->rxmsg[] buffers are allocated (safe for work handler access) - Hardware is configured appropriately As the inverse of enetc_msg_psi_init(), enetc_msg_psi_free() also has similar problems. For example, if a pending interrupt fires between enetc_msg_free_mbx() and free_irq(), the ISR enetc_msg_psi_msix() may schedule the work handler again via schedule_work(), which could then access already-freed DMA buffers (pf->rxmsg[]), leading to use-after-free and potential memory corruption. Therefore, the order of enetc_msg_psi_free() is adjusted: 1. enetc_msg_disable_mr_int() <- Stop new interrupts first 2. free_irq() <- Ensure no IRQ handler can run 3. cancel_work_sync() <- Wait for any pending work 4. enetc_msg_disable_mr_int() <- Re-disable in case work re-enabled it 5. enetc_msg_free_mbx() <- Safe to free DMA buffers now Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1 Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-9-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix unbounded loop and interrupt handling in VF-to-PF messagingWei Fang2-31/+49
The enetc_msg_task() function has several issues that need to be addressed: 1. Unbounded loop causing potential DoS: enetc_msg_task() processes VF-to-PF mailbox messages in an unbounded for(;;) loop that keeps polling ENETC_PSIMSGRR until no MR bits are set. A malicious guest VM can exploit this by continuously sending messages at a high rate - immediately sending a new message as soon as the PF acknowledges the previous one. Since the worker thread never yields or enforces a processing budget, the mr_mask check frequently evaluates to non-zero, causing the PF to spin indefinitely and starving other tasks. Fix this by replacing the unbounded loop with a single snapshot read at task entry. The task processes only the VFs whose MR bits were set at that point, then re-enables message interrupts before returning. This bounds work per invocation to at most num_vfs iterations. No messages are lost because the message interrupt is disabled in enetc_msg_psi_msix() before scheduling enetc_msg_task(), so any new messages arriving during processing will trigger a fresh interrupt once re-enabled, scheduling another task invocation. 2. Write order of ENETC_PSIIDR and ENETC_PSIMSGRR: Both ENETC_PSIIDR and ENETC_PSIMSGRR contain MR bits indicating messages have been received from VSIs, but only ENETC_PSIIDR trigger the CPU interrupt. Previously, ENETC_PSIMSGRR was written before ENETC_PSIIDR. Writing ENETC_PSIMSGRR returns the message code to the VSI in its upper 16 bits, signaling to the VF that message processing is complete and it may send the next message. If the VF sends a new message before ENETC_PSIIDR is written, the subsequent w1c write to ENETC_PSIIDR would inadvertently clear the MR bit set by the new message, causing the interrupt to be lost and the new message to go unprocessed. Therefore, write ENETC_PSIIDR first to clear the interrupt source, then write ENETC_PSIMSGRR to acknowledge the message to the VSI. 3. Check both ENETC_PSIMSGRR and ENETC_PSIIDR for mr_status: The write order change above introduces a potential race: if a VF sends a new message in the window between the ENETC_PSIIDR w1c and the ENETC_PSIMSGRR w1c, the ENETC_PSIMSGRR MR bit for the new message may not be set. If mr_status was derived solely from ENETC_PSIMSGRR, this message would never be detected despite ENETC_PSIIDR retaining its MR bit, leading to an unacknowledged interrupt storm. Fix this by computing mr_status as the union of both ENETC_PSIMSGRR and ENETC_PSIIDR MR bits, ensuring all pending messages are detected regardless of which register reflects the new message state. Additionally, rename the per-register MR macros (ENETC_PSI*_MR_MASK, ENETC_PSI*_MR) to register-agnostic names (ENETC_PSIMR_MASK, ENETC_PSIMR_BIT) since the MR bit layout is shared across ENETC_PSIMSGRR, ENETC_PSIIER, and ENETC_PSIIDR. Make the mask macro dynamic based on the actual number of active VFs rather than hardcoded. Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260520064421.91569-8-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()Wei Fang1-3/+3
The teardown sequence in enetc_msg_psi_free() frees the DMA buffer before clearing the device's DMA address registers. If a VF sends a message or a pending DMA transfer completes within this window, the hardware will perform a DMA write into the kernel memory that has already been returned to the allocator. The result is silent memory corruption that can affect arbitrary kernel data structures. Therefore, clear the DMA address registers before the DMA buffer is freed. Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-7-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix race condition in VF MAC address configurationWei Fang2-0/+11
Sashiko reported a potential race condition between the VF message handler and administrative VF MAC configuration from the host [1]. The VF message handler (enetc_msg_pf_set_vf_primary_mac_addr) runs asynchronously in a workqueue context and accesses vf_state->flags without any locking. Concurrently, the host can administratively change the VF MAC address via enetc_pf_set_vf_mac(), which executes under RTNL lock and modifies both vf_state->flags and hardware registers. This creates two race windows: 1) TOCTOU race on vf_state->flags: The check of ENETC_VF_FLAG_PF_SET_MAC and subsequent MAC programming are not atomic, allowing the flag state to change between check and use. 2) Torn MAC address writes: Hardware MAC programming requires multiple non-atomic register writes (__raw_writel for lower 32 bits and __raw_writew for upper 16 bits). Concurrent updates from VF mailbox and PF admin paths can interleave these operations, resulting in a corrupted MAC address being programmed into the hardware. Fix by introducing a per-VF mutex to serialize access to vf_state and hardware MAC register updates. Both enetc_pf_set_vf_mac() and enetc_msg_pf_set_vf_primary_mac_addr() now acquire this lock before accessing vf_state->flags or programming the MAC address, ensuring atomic read-modify-write sequences and preventing register write interleaving. Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1 Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-6-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix TOCTOU race and validate VF MAC addressWei Fang1-9/+30
Sashiko reported that the PF driver accepts arbitrary MAC address from from VF mailbox messages without proper validation, creating a security vulnerability [1]. In enetc_msg_pf_set_vf_primary_mac_addr(), the MAC address is extracted directly from the message buffer (cmd->mac.sa_data) and programmed into hardware via pf->ops->set_si_primary_mac() without any validity checks. A malicious VF can configure a multicast, broadcast, or all-zero MAC address. Therefore, a validation to check the MAC address provided by VF is required. However, simply checking the MAC address is not enough, because it also has the potential TOCTOU race [2]: The code reads the MAC address from the DMA buffer to validate it via is_valid_ether_addr(), if validation passes, reads the same DMA buffer a second time when calling enetc_pf_set_primary_mac_addr() to program the hardware. A malicious VF can exploit this window by overwriting the MAC address in the DMA buffer between the validation check and the hardware programming, bypassing the validation entirely. Therefore, allocate a local buffer in enetc_msg_handle_rxmsg() and copy the message content from the DMA buffer via memcpy() before processing. This ensures the PF operates on a stable snapshot that the VF cannot modify. Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1 Link: https://sashiko.dev/#/patchset/20260513103021.2190593-1-wei.fang%40nxp.com #2 Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-5-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: add ratelimiting to VF mailbox error messagesWei Fang1-4/+6
Sashiko reported that a buggy or malicious guest VM can flood the host kernel log by repeatedly sending VF-to-PF messages at a high rate, degrading host performance and hiding important system logs [1]. Fix by replacing dev_err()/dev_warn() with dev_err_ratelimited(), limiting output to the default kernel ratelimit. This ensures errors are still logged for debugging while preventing log flooding attacks. Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com #1 Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix missing error code when pf->vf_state allocation failsWei Fang1-1/+3
In enetc_pf_probe(), when the memory allocation for pf->vf_state fails, the code jumps to the error handling label but the variable 'err' is not assigned an appropriate error code beforehand. This causes the function to return 0 (success) on an allocation failure path, misleading the caller into thinking the probe succeeded. So set err to -ENOMEM before jumping to the error handling label when the allocation for pf->vf_state returns NULL. Fixes: e15c5506dd39 ("net: enetc: allocate vf_state during PF probes") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: enetc: fix incorrect mailbox message status returned to VFsWei Fang1-4/+6
There are two cases where VFs receive an incorrect success status from the PF mailbox message handler, misleading them into believing their requests have been fulfilled: In enetc_msg_handle_rxmsg(), *status is pre-initialized to ENETC_MSG_CMD_STATUS_OK. When an unsupported command type is received, the default case only logs an error without updating *status, so it remains as ENETC_MSG_CMD_STATUS_OK. In enetc_msg_pf_set_vf_primary_mac_addr(), when the PF has already assigned a MAC address for the VF (ENETC_VF_FLAG_PF_SET_MAC is set), the function rejects the request but returns ENETC_MSG_CMD_STATUS_OK instead of ENETC_MSG_CMD_STATUS_FAIL. Therefore, correct the status value for the two cases mentioned above. Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260520064421.91569-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: bcmgenet: keep RBUF EEE/PM disabledNicolai Buchwitz1-5/+4
Setting RBUF_EEE_EN | RBUF_PM_EN in RBUF_ENERGY_CTRL breaks the RX path on GENET hardware once MAC EEE becomes active. RX traffic stops flowing while the link stays up and the usual descriptor/RX error counters remain quiet. In that state the MAC still accepts frames (rbuf_ovflow_cnt keeps climbing) but RBUF no longer forwards them to DMA, so rx_packets is no longer incremented at the netdev level. On some boards the corruption ends up as a paging fault in skb_release_data via bcmgenet_rx_poll on an LPI exit. Reproduced on Pi 4B (BCM2711 + BCM54213PE) and confirmed by Florian Fainelli on an internal Broadcom 4908-family board with the same crash signature. RBUF_PM_EN is not publicly documented. This shows up more often now that phy_support_eee() enables EEE by default, but it also affects older kernels as soon as TX LPI is turned on via ethtool, so it is not specific to recent changes. Always clear RBUF_EEE_EN | RBUF_PM_EN in bcmgenet_eee_enable_set so the bits stay off across resets. UMAC and TBUF setup is left alone so TX-side EEE keeps working. Link: https://github.com/raspberrypi/linux/issues/7304 Fixes: 6ef398ea60d9 ("net: bcmgenet: add EEE support") Cc: stable@vger.kernel.org Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260520184320.652053-1-nb@tipi-net.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysethernet: 3c509: Fix most coding style issuesMaciej W. Rozycki1-376/+469
Update the driver for our current coding style according to output from `checkpatch.pl' and manual code review, where no change to binary code results, as indicated by `objdump -dr'. Exceptions are as follows: - incomplete reverse xmas tree in set_multicast_list(), as that would change binary output, - referring el3_start_xmit() verbatim rather than via `__func__' with pr_debug(), likewise, - a bunch of pr_cont() calls, likewise, - a long udelay() call in el3_netdev_set_ecmd() made under a spinlock, likewise plus it's not eligible for conversion to a sleep in the first place, - a blank line at the start of a block in el3_interrupt(), to improve readability where the first statement would otherwise visually merge with the controlling expression of the enclosing `while' statement. These issues are benign and depending on circumstances may be adressed with suitable code refactoring later on. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2605201208280.1450@angie.orcam.me.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysethernet: 3c509: Add GPL 2.0 SPDX license identifierMaciej W. Rozycki1-0/+1
This driver has landed with Linux 0.99.13k, which was covered by the GNU General Public License version 2, and no further conditions as to licensing terms have been specified within the copyright notice included with the driver itself. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2605201206370.1450@angie.orcam.me.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysethernet: 3c509: Fix AUI transceiver type selectionMaciej W. Rozycki1-0/+1
The transceiver type is held in bits 15:14 of the Address Configuration Register, with the values of 0b00, 0b01, and 0b11 denoting TP, AUI, and BNC types respectively. Therefore switching from BNC to AUI requires bits to be cleared before setting bit 14 or the setting won't change. NB this has always been wrong ever since this code was added in 2.5.42. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2605201205160.1450@angie.orcam.me.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysRevert "drivers: net: 3com: 3c509: Remove this driver"Maciej W. Rozycki3-0/+1463
This reverts commit 91f3a27ae9f66d81a5906461762c37c8a2bcab06. Contrary to the assumption stated with the original commit description this driver is in use and I'm going to maintain it for the foreseeable future. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2605201204260.1450@angie.orcam.me.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 dayspds_core: ensure null-termination for firmware version stringsNikhil P. Rao1-2/+4
The driver passes fw_version directly to devlink_info_version_stored_put() without ensuring null-termination. While curre