aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
6 daysMerge tag 'nf-26-06-10' of ↵Paolo Abeni18-51/+150
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Revalidate bridge ports, add missing NULL checks to fetch the bridge device by the port. From Florian Westphal. 2) Fix netdevice refcount leak in the error path of nft_fwd hardware offload function, also from Florian. 3) Unregister helper expectfn callback on conntrack helper module removal, otherwise dangling pointer remains in place, from Weiming Shi. 4) Fix possible pointer infoleak in getsockopt() IPT_SO_GET_ENTRIES, From Kyle Zeng. 5) Validate that device MAC header is present before nf_syslog accesses it. From Xiang Mei. 6-8) Three patches to address a possible infoleak of stale stack data in three nf_tables expressions, due to mismatch in the _init() and _eval() function which is possible since 14fb07130c7d. From Davide Ornaghi and Florian Westphal. netfilter pull request 26-06-10 * tag 'nf-26-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_meta_bridge: fix stale stack leak via IIFHWADDR register netfilter: nft_fib: fix stale stack leak via the OIFNAME register netfilter: nft_exthdr: fix register tracking for F_PRESENT flag netfilter: nf_log: validate MAC header was set before dumping it netfilter: x_tables: avoid leaking percpu counter pointers netfilter: nf_conntrack: destroy stale expectfn expectations on unregister netfilter: nf_tables_offload: drop device refcount on error netfilter: revalidate bridge ports ==================== Link: https://patch.msgid.link/20260610161629.214092-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
6 daysMerge tag 'ipsec-2026-06-10' of ↵Paolo Abeni5-27/+35
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2026-06-10 1) xfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags() Propagate SKBFL_SHARED_FRAG when paged fragments are moved between skbs so ESP can decide whether in-place crypto is safe. 2) xfrm: iptfs: fix use-after-free on first_skb in __input_process_payload Replace the unlocked read of xtfs->ra_newskb with a local flag so a concurrent reassembly can no longer free first_skb between spin_unlock and the post-loop check. 3) xfrm: policy: fix use-after-free on inexact bin in xfrm_policy_bysel_ctx() Prune the inexact bin under xfrm_policy_lock so a concurrent xfrm_hash_rebuild() can no longer free it before xfrm_policy_kill() dereferences it. 4) xfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state() Move hrtimer_cancel() for the output and drop timers ahead of their spinlocks, breaking the softirq/lock cycle that could deadlock against the timer callbacks on SMP. 5) xfrm: espintcp: do not reuse an in-progress partial send Fail a new send when espintcp_push_msgs() returns with emsg->len still set, so a blocking caller can no longer overwrite ctx->partial while a previous transfer still owns it. 6) esp: fix page frag reference leak on skb_to_sgvec failure Add a flag to esp_ssg_unref() to unconditionally unref the source scatterlist, releasing the old page references that are otherwise leaked when the second skb_to_sgvec() in esp_output_tail() fails. Please pull or let me know if there are problems. ipsec-2026-06-10 * tag 'ipsec-2026-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: esp: fix page frag reference leak on skb_to_sgvec failure xfrm: espintcp: do not reuse an in-progress partial send xfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state() xfrm: policy: fix use-after-free on inexact bin in xfrm_policy_bysel_ctx() xfrm: iptfs: fix use-after-free on first_skb in __input_process_payload xfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags() ==================== Link: https://patch.msgid.link/20260610140800.2562818-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
6 daysipv6: Fix a potential NPD in cleanup_prefix_route()Ido Schimmel1-2/+4
addrconf_get_prefix_route() can return the fib6_null_entry sentinel entry which has a NULL fib6_table pointer. Therefore, before setting the route's expiration time, check that we are not working with this entry, as otherwise a NPD will be triggered [1]. Note that the other callers of addrconf_get_prefix_route() are not susceptible to this bug: 1. addrconf_prefix_rcv(): Requests a route with the 'RTF_ADDRCONF | RTF_PREFIX_RT' flags which are not set on fib6_null_entry. 2. modify_prefix_route(): Fixed by commit a747e02430df ("ipv6: avoid possible NULL deref in modify_prefix_route()"). 3. __ipv6_ifa_notify(): Calls ip6_del_rt() which specifically checks for fib6_null_entry and returns an error. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] [...] Call Trace: <TASK> __kasan_check_byte (mm/kasan/common.c:573) lock_acquire.part.0 (kernel/locking/lockdep.c:5842 (discriminator 1)) _raw_spin_lock_bh (kernel/locking/spinlock.c:182 (discriminator 1)) cleanup_prefix_route (net/ipv6/addrconf.c:1280) ipv6_del_addr (net/ipv6/addrconf.c:1342) inet6_addr_del.isra.0 (net/ipv6/addrconf.c:3119) inet6_rtm_deladdr (net/ipv6/addrconf.c:4812) rtnetlink_rcv_msg (net/core/rtnetlink.c:6997) netlink_rcv_skb (net/netlink/af_netlink.c:2555) netlink_unicast (net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1899) __sock_sendmsg (net/socket.c:802 (discriminator 4)) ____sys_sendmsg (net/socket.c:2698) ___sys_sendmsg (net/socket.c:2752) __sys_sendmsg (net/socket.c:2784) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121) Fixes: 5eb902b8e719 ("net/ipv6: Remove expired routes with a separated list of routes.") Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com> Reviewed-by: David Ahern <dahern@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260609145448.768318-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 daysnetfilter: nft_meta_bridge: fix stale stack leak via IIFHWADDR registerDavide Ornaghi1-0/+2
NFT_META_BRI_IIFHWADDR declares its destination register with len = ETH_ALEN (6 bytes), which the register-init tracking rounds up to two 32-bit registers (8 bytes). nft_meta_bridge_get_eval() then does memcpy(dest, br_dev->dev_addr, ETH_ALEN), writing only 6 bytes and leaving the upper 2 bytes of the second register as uninitialised nft_do_chain() stack. A downstream load of that register span leaks those stale bytes to userspace. Zero the second register before the memcpy so the full declared span is written. Fixes: cbd2257dc96e ("netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR support") Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: nft_fib: fix stale stack leak via the OIFNAME registerDavide Ornaghi3-2/+8
For NFT_FIB_RESULT_OIFNAME the destination register is declared with len = IFNAMSIZ (four 32-bit registers), but on the lookup-fail, RTN_LOCAL and oif-mismatch paths nft_fib{4,6}_eval() only writes one register via "*dest = 0". The remaining three registers are left as whatever was on the stack in nft_do_chain()'s struct nft_regs, and a downstream expression that loads the register span can leak that uninitialised kernel stack to userspace. The NFTA_FIB_F_PRESENT existence check has the same shape: it is only meaningful for NFT_FIB_RESULT_OIF, yet it was accepted for any result type while the eval stores a single byte via nft_reg_store8(), leaving the rest of the declared span stale. Fix both: - replace the bare "*dest = 0" in the eval with nft_fib_store_result(), which strscpy_pad()s the whole IFNAMSIZ for OIFNAME (and is already used on the other early-return path), and - restrict NFTA_FIB_F_PRESENT to NFT_FIB_RESULT_OIF and declare its destination as a single u8, so the marked span matches the one byte the eval writes. Fixes: f6d0cbcf09c5 ("netfilter: nf_tables: add fib expression") Suggested-by: Florian Westphal <fw@strlen.de> Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: nft_exthdr: fix register tracking for F_PRESENT flagFlorian Westphal1-0/+3
nft_exthdr_init() passes user-controlled priv->len to nft_parse_register_store(), which marks that many bytes in the register bitmap as initialized. However, when NFT_EXTHDR_F_PRESENT is set, the eval paths write only 1 byte (nft_reg_store8) or 4 bytes (*dest = 0 on TCP/DCCP error path). When len > 4, registers beyond the first are never written, retaining uninitialized stack data from nft_regs. Bail out if userspace requests too much data when F_PRESENT is set. Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com> Fixes: c078ca3b0c5b ("netfilter: nft_exthdr: Add support for existence check") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: nf_log: validate MAC header was set before dumping itXiang Mei1-2/+2
The fallback path of dump_mac_header() guards the MAC header access only with "skb->mac_header != skb->network_header", without checking skb_mac_header_was_set(). When the MAC header is unset, mac_header is 0xffff, so the test passes and skb_mac_header(skb) returns skb->head + 0xffff, ~64 KiB past the buffer; the loop then reads dev->hard_header_len bytes out of bounds into the kernel log. This is reachable via the netdev logger: nf_log_unknown_packet() calls dump_mac_header() unconditionally, and an skb sent through AF_PACKET with PACKET_QDISC_BYPASS reaches the egress hook with mac_header still unset (__dev_queue_xmit(), which would reset it, is bypassed). Add the skb_mac_header_was_set() check the ARPHRD_ETHER path already uses, and replace the open-coded MAC header length test with skb_mac_header_len(). Only skbs with an unset MAC header are affected; valid ones are dumped as before. BUG: KASAN: slab-out-of-bounds in dump_mac_header (net/netfilter/nf_log_syslog.c:831) Read of size 1 at addr ffff88800ea49d3f by task exploit/148 Call Trace: kasan_report (mm/kasan/report.c:595) dump_mac_header (net/netfilter/nf_log_syslog.c:831) nf_log_netdev_packet (net/netfilter/nf_log_syslog.c:938 net/netfilter/nf_log_syslog.c:963) nf_log_packet (net/netfilter/nf_log.c:260) nft_log_eval (net/netfilter/nft_log.c:60) nft_do_chain (net/netfilter/nf_tables_core.c:285) nft_do_chain_netdev (net/netfilter/nft_chain_filter.c:307) nf_hook_slow (net/netfilter/core.c:619) nf_hook_direct_egress (net/packet/af_packet.c:257) packet_xmit (net/packet/af_packet.c:280) packet_sendmsg (net/packet/af_packet.c:3114) __sys_sendto (net/socket.c:2265) Fixes: 7eb9282cd0ef ("netfilter: ipt_LOG/ip6t_LOG: add option to print decoded MAC header") Reported-by: Weiming Shi <bestswngs@gmail.com> Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: x_tables: avoid leaking percpu counter pointersKyle Zeng3-27/+18
The native and compat get-entries paths copy the fixed rule entry header from the kernelized rule blob to userspace before overwriting the entry's counter fields with a sanitized counter snapshot. On SMP kernels, entry->counters.pcnt contains the percpu allocation address used by x_tables rule counters. A caller can provide a userspace buffer that faults during the initial fixed-header copy after pcnt has been copied but before the later sanitized counter copy runs. The syscall then returns -EFAULT while leaving the raw percpu pointer in userspace. Copy only the fixed entry prefix before counters from the kernelized rule blob, then copy the sanitized counter snapshot into the counter field. Apply this ordering to the IPv4, IPv6, and ARP native and compat get-entries implementations so a fault cannot expose the internal percpu counter pointer. Fixes: 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") Signed-off-by: Kyle Zeng <kylebot@openai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: nf_conntrack: destroy stale expectfn expectations on unregisterWeiming Shi4-0/+24
NAT helpers such as nf_nat_h323 store a raw pointer to module text in exp->expectfn (e.g. ip_nat_q931_expect). nf_ct_helper_expectfn_unregister() only unlinks the callback descriptor and never walks the expectation table, so an expectation pending at module removal survives with a dangling exp->expectfn into freed module text. When the expected connection arrives, init_conntrack() invokes exp->expectfn(), now a stale pointer into the unloaded module. Reproduced on a KASAN build by loading the H.323 helpers, creating a Q.931 expectation, unloading nf_nat_h323, then connecting to the expected port: Oops: int3: 0000 [#1] SMP KASAN NOPTI RIP: 0010:0xffffffffa06102d1 init_conntrack.isra.0 (net/netfilter/nf_conntrack_core.c:1862) nf_conntrack_in (net/netfilter/nf_conntrack_core.c:2049) ipv4_conntrack_local (net/netfilter/nf_conntrack_proto.c:223) nf_hook_slow (net/netfilter/core.c:619) __ip_local_out (net/ipv4/ip_output.c:120) __tcp_transmit_skb (net/ipv4/tcp_output.c:1715) tcp_connect (net/ipv4/tcp_output.c:4374) tcp_v4_connect (net/ipv4/tcp_ipv4.c:345) __sys_connect (net/socket.c:2167) Modules linked in: nf_conntrack_h323 [last unloaded: nf_nat_h323] Reaching the dangling state requires CAP_SYS_MODULE in the initial user namespace to remove a NAT helper that still has live expectations, so this is a robustness fix; leaving an expectation pointing at freed text is wrong regardless. Add nf_ct_helper_expectfn_destroy(), which walks the expectation table and drops every expectation whose ->expectfn matches the descriptor being torn down. Call it from each NAT helper's exit path after the existing RCU grace period, so no expectation outlives the code it points at and no extra synchronize_rcu() is introduced. With the fix, the same reproducer runs to completion without the Oops. Fixes: f587de0e2feb ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port") Reported-by: Xiang Mei <xmei5@asu.edu> Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: nf_tables_offload: drop device refcount on errorFlorian Westphal1-2/+4
Reported by sashiko: If nft_flow_action_entry_next() returns NULL, dev reference leaks. Fixes: c6f85577584b ("netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use it") Reported-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysnetfilter: revalidate bridge portsFlorian Westphal4-18/+89
ebt_redirect_tg() dereferences br_port_get_rcu() return without a NULL check, causing a kernel panic when the bridge port has been removed between the original hook invocation and an NFQUEUE reinject. A mere NULL check isn't sufficient, however. As sashiko review points out userspace can not only remove the port from the bridge, it could also place the device in a different virtual device, e.g. macvlan. If this happens, we must drop the packet, there is no way for us to reinject it into the bridge path. Switch to _upper API, we don't need the bridge port structure. Also, this fix keeps another bug intact: Both nfnetlink_log and nfnetlink_queue use CONFIG_BRIDGE_NETFILTER too aggressive, which prevents certain logging features when queueing in bridge family: NETFILTER_FAMILY_BRIDGE can be enabled while the old CONFIG_BRIDGE_NETFILTER cruft is off. Fixes tag is a common ancestor, this was always broken. Fixes: f350a0a87374 ("bridge: use rx_handler_data pointer to store net_bridge_port pointer") Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com> Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7 daysrds: mark snapshot pages dirty in rds_info_getsockopt()Breno Leitao1-1/+1
rds_info_getsockopt() pins the destination user pages with FOLL_WRITE and the RDS_INFO_* producers memcpy the snapshot into them through kmap_atomic(). Because that copy goes through the kernel direct map, the dirty bit on the user PTE is never set, so unpin_user_pages() releases the pages without marking them dirty. A file-backed destination page can then be reclaimed without writeback, silently discarding the copied data. Use unpin_user_pages_dirty_lock() with make_dirty=true so the modified pages are marked dirty before they are unpinned. Fixes: a8c879a7ee98 ("RDS: Info and stats") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260608-rds_fix-v1-1-006c88543408@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysip6_vti: fix incorrect tunnel matching in vti6_tnl_lookup()Eric Dumazet1-0/+2
In vti6_tnl_lookup(), when an exact match for a tunnel fails, the code falls back to searching for wildcard tunnels: - Tunnels matching the packet's local address, with any remote address wildcard remote). - Tunnels matching the packet's remote address, with any local address (wildcard local). However, vti6 stores all these different types of tunnels in the same hash table (ip6n->tnls_r_l) prone to hash collisions. The bug is that the fallback search loops in vti6_tnl_lookup() were missing checks to ensure that the candidate tunnel actually has a wildcard address. Fixes: fbe68ee87522 ("vti6: Add a lookup method for tunnels with wildcard endpoints.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://patch.msgid.link/20260608164613.933023-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet/rds: fix NULL deref in rds_ib_send_cqe_handler() on masked atomic completionWeiming Shi1-0/+2
rds_ib_xmit_atomic() always programs a masked atomic opcode (IB_WR_MASKED_ATOMIC_CMP_AND_SWP or IB_WR_MASKED_ATOMIC_FETCH_AND_ADD) for every RDS atomic cmsg. But the completion-side switch in rds_ib_send_unmap_op() only handles the non-masked opcodes, so a masked atomic completion falls through to default and returns rm == NULL while send->s_op is left set. rds_ib_send_cqe_handler() then dereferences the NULL rm via rm->m_final_op, oopsing in softirq context. An unprivileged AF_RDS sendmsg() of an atomic cmsg over an active RDS/IB connection triggers it; on hardware that natively accepts masked atomics (mlx4, mlx5) no extra setup is needed. RDS/IB: rds_ib_send_unmap_op: unexpected opcode 0xd in WR! Oops: general protection fault [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000190-0x0000000000000197] RIP: rds_ib_send_cqe_handler+0x25c/0xb10 (net/rds/ib_send.c:282) Call Trace: <IRQ> rds_ib_send_cqe_handler (net/rds/ib_send.c:282) poll_scq (net/rds/ib_cm.c:274) rds_ib_tasklet_fn_send (net/rds/ib_cm.c:294) tasklet_action_common (kernel/softirq.c:943) handle_softirqs (kernel/softirq.c:573) run_ksoftirqd (kernel/softirq.c:479) </IRQ> Kernel panic - not syncing: Fatal exception in interrupt Handle the masked atomic opcodes in the same case as the non-masked ones: they map to the same struct rds_message.atomic union member, so the existing container_of()/rds_ib_send_unmap_atomic() body is correct for them. Fixes: 20c72bd5f5f9 ("RDS: Implement masked atomic operations") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260606192447.1179255-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: guard timestamp cmsgs to real error queue skbsKyle Zeng2-8/+9
skb_is_err_queue() treats PACKET_OUTGOING as the sole marker for an skb from sk_error_queue. That assumption is not true for AF_PACKET sockets: outgoing packet taps are also delivered to packet sockets with skb->pkt_type == PACKET_OUTGOING, but their skb->cb is owned by AF_PACKET instead of struct sock_exterr_skb. If such an skb is received with timestamping enabled, the generic timestamp cmsg path can read AF_PACKET control-buffer state as sock_exterr_skb::opt_stats. With SO_RXQ_OVFL enabled, the packet drop counter overlaps opt_stats. An odd drop count makes the path emit SCM_TIMESTAMPING_OPT_STATS with skb->len and skb->data. For non-linear skbs this copies past the linear head and can trigger hardened usercopy or disclose adjacent heap contents. Keep skb_is_err_queue() local to net/socket.c, but make it verify that the PACKET_OUTGOING marker is paired with the sock_rmem_free destructor installed by sock_queue_err_skb(). AF_PACKET receive skbs use normal receive ownership and no longer pass as error-queue skbs, while legitimate sk_error_queue entries keep the PACKET_OUTGOING marker and sock_rmem_free ownership. Fixes: 8605330aac5a ("tcp: fix SCM_TIMESTAMPING_OPT_STATS for normal skbs") Signed-off-by: Kyle Zeng <kylebot@openai.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260607021819.49698-1-kylebot@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 dayssctp: validate embedded INIT chunk and address list lengths in cookieXin Long2-3/+17
sctp_unpack_cookie() only checked that the embedded INIT chunk length did not exceed the remaining cookie payload, but did not ensure that the INIT chunk is large enough to contain a complete INIT header. A malformed COOKIE_ECHO can therefore carry a truncated INIT chunk whose length field is smaller than sizeof(struct sctp_init_chunk). Later, sctp_process_init() accesses INIT parameters unconditionally, which may lead to out-of-bounds reads. In addition, raw_addr_list_len is not fully validated against the remaining cookie payload. When cookie authentication is disabled, an attacker can supply an oversized raw_addr_list_len and cause sctp_raw_to_bind_addrs() to read beyond the end of the cookie. The address parser also lacks sufficient bounds checks for parameter headers and lengths, allowing malformed address parameters to trigger out-of-bounds reads. Fix this by: - requiring the embedded INIT chunk length to be at least sizeof(struct sctp_init_chunk); - validating that the INIT chunk and raw address list together fit within the cookie payload; - verifying sufficient data exists for each address parameter header and payload before parsing it. Note that sctp_verify_init() must be called after sctp_unpack_cookie() and before sctp_process_init() when cookie authentication is disabled. This will be addressed in a separate patch. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/75af23a89adf881a0895d511775e4770da367cbf.1780873427.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysip6_vti: set netns_immutable on the fallback device.Eric Dumazet1-0/+1
john1988 and Noam Rathaus reported that vti6_init_net() does not set the netns_immutable flag on the per-netns fallback tunnel device (ip6_vti0). Other similar tunnel drivers (like ip6_tunnel, sit, ip6_gre, and ip_tunnel) correctly set this flag during their fallback device initialization to prevent them from being moved to another network namespace. Fixes: 61220ab34948 ("vti6: Enable namespace changing") Reported-by: Noam Rathaus <noamr@ssd-disclosure.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://patch.msgid.link/20260608155918.787644-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 dayssctp: fix uninit-value in __sctp_rcv_asconf_lookup()Michael Bommarito1-0/+8
__sctp_rcv_asconf_lookup() in net/sctp/input.c only checks that the ASCONF chunk can hold the ADDIP header and a parameter header, then calls af->from_addr_param(), which reads the full address (16 bytes for IPv6) trusting the parameter's declared length. An unauthenticated peer can send a truncated trailing ASCONF chunk that declares an IPv6 address parameter but stops after the 4-byte parameter header; reached from the no-association lookup path, from_addr_param() then reads uninitialized bytes past the parameter. Impact: an unauthenticated SCTP peer makes the receive path read up to 16 bytes of uninitialized memory past a truncated ASCONF address parameter. The sibling __sctp_rcv_init_lookup() bounds parameters with sctp_walk_params(); this path open-codes the fetch and omits the bound. Verify the whole address parameter lies within the chunk before from_addr_param() reads it, the same class of fix as commit 51e5ad549c43 ("net: sctp: fix KMSAN uninit-value in sctp_inq_pop"). Fixes: df2185771439 ("[SCTP]: Update association lookup to look at ASCONF chunks as well") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20260608122234.459098-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 dayssctp: stream: fully roll back denied add-stream stateWyatt Feng1-1/+5
When ADD_OUT_STREAMS is denied, SCTP only shrinks the queued chunks and then lowers outcnt. That leaves removed stream metadata behind, so a later re-add can reuse a stale ext and hit a null-pointer dereference in the scheduler get path. Fix the rollback by tearing down the removed stream state the same way other stream resizes do. Unschedule the current scheduler state, drop the removed stream ext state with sctp_stream_outq_migrate(), and then reschedule the remaining streams. This keeps scheduler-private RR/FC/PRIO lists consistent while fully rolling back denied outgoing stream additions. Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/d78954ecd94954653ee299400e98d74a03a6f7d3.1780603399.git.bronzed_45_vested@icloud.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysesp: fix page frag reference leak on skb_to_sgvec failureAlessandro Schino2-12/+22
In esp_output_tail(), when esp->inplace is false, the old skb page frags are replaced with a new page from the xfrm page_frag cache The source scatterlist (sg) is built from the old frags before the replacement, and esp_ssg_unref() is responsible for releasing the old page references after the crypto operation completes However, if the second skb_to_sgvec() call (which builds the destination scatterlist from the new page) fails, the code jumps to error_free which only calls kfree(tmp). The old page frag references captured in the source scatterlist are never released: 1 sg[] is built from old frags via skb_to_sgvec() (no extra get_page) 2 nr_frags is set to 1 and frag[0] is replaced with the new page 3 Second skb_to_sgvec() fails -> goto error_free Fix this by adding a bool parameter to esp_ssg_unref() that, when true, unconditionally unrefs the source scatterlist frags. Since req->src is not yet initialized by aead_request_set_crypt() at the point of the error, the source scatterlist is obtained directly via esp_req_sg() Existing callers pass false to preserve the original behavior The same issue exists in both esp4 and esp6 as the code is identical Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: Alessandro Schino <7991aleschino@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
8 daysrxrpc: Fix the ACK parser to extract the SACK table for parsingDavid Howells1-9/+17
Fix modification of the received skbuff in rxrpc_input_soft_acks() and a potential incorrect access of the buffer in a fragmented UDP packet (the packet would probably have to be deliberately pre-generated as fragmented) when AF_RXRPC tries to extract the contents of the SACK table by copying out the contents of the SACK table into a buffer before attempting to parse AF_RXRPC assumes that it can just call skb_condense() and then validly access the SACK table from skb->data and that it will be a flat buffer - but skb_condense() can silently fail to do anything under some circumstances. Note that whilst rxrpc_input_soft_acks() should be able to parse extended ACKs, the rest of AF_RXRPC doesn't currently support that. Further, there's then no need to call skb_condense() in rxrpc_input_ack(), so don't. Fixes: d57a3a151660 ("rxrpc: Save last ACK's SACK table rather than marking txbufs") Reported-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://lore.kernel.org/r/20260513180907.2061972-1-michael.bommarito@gmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Eric Dumazet <edumazet@google.com> cc: "David S. Miller" <davem@davemloft.net> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org cc: stable@kernel.org Link: https://patch.msgid.link/105362.1780573560@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 daysnet: openvswitch: fix possible kfree_skb of ERR_PTRAdrian Moreno1-0/+1
After the patch in the "Fixes" tag, the allocation of the "reply" skb can happen either before or after locking the ovs_mutex. However, error cleanups still follow the classical reversed order, assuming "reply" is allocated before locking: it is freed after unlocking. If "reply" allocation happens after locking the mutex and it fails, "reply" is left with an ERR_PTR, and execution jumps to the correspondent cleanup stage which will try to free an invalid pointer. Fix this by setting the pointer to NULL after having saved its error value. Fixes: 893f139b9a6c ("openvswitch: Minimize ovs_flow_cmd_new|set critical sections.") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://patch.msgid.link/20260604121946.942164-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysipv6: sit: reload inner IPv6 header after GSO offloadsKyle Zeng1-0/+1
ipip6_tunnel_xmit() caches the inner IPv6 header pointer at function entry and continues using it after iptunnel_handle_offloads(). For GSO skbs, iptunnel_handle_offloads() calls skb_header_unclone(). When the skb header is cloned, skb_header_unclone() can call pskb_expand_head(), which may move the skb head. The pskb_expand_head() contract requires pointers into the skb header to be reloaded after the call. If the later skb_realloc_headroom() branch is not taken, SIT uses the stale iph6 pointer to read the inner hop limit and DS field. That can read from a freed skb head after the old head's remaining clone is released. Reload iph6 after the offload helper succeeds and before subsequent reads from the inner IPv6 header. Keep the existing reload after skb_realloc_headroom(), since that branch can also replace the skb. Fixes: 14909664e4e1 ("sit: Setup and TX path for sit/UDP foo-over-udp encapsulation") Signed-off-by: Kyle Zeng <kylebot@openai.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+6eb9ca986d80f6f88cf9@syzkaller.appspotmail.com Link: https://patch.msgid.link/20260605073448.6524-1-kylebot@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: qrtr: fix refcount saturation and potential UAF in qrtr_port_removeMingyu Wang1-2/+2
In qrtr_port_remove(), the socket reference count is decremented via __sock_put() before the port is removed from the qrtr_ports XArray and before the RCU grace period elapses. This breaks the fundamental RCU update paradigm. It exposes a race window where a concurrent RCU reader (such as qrtr_reset_ports() or qrtr_port_lookup()) can obtain a pointer to the socket from the XArray, and attempt to call sock_hold() on a socket whose reference count has already dropped to zero. This exact race condition was hit during syzkaller fuzzing, leading to the following refcount saturation warning and a potential Use-After-Free: refcount_t: saturated; leaking memory. WARNING: CPU: 3 PID: 1273 at lib/refcount.c:22 refcount_warn_saturate+0xae/0x1d0 Modules linked in: qrtr(+) bochs drm_shmem_helper ... Call Trace: <TASK> qrtr_reset_ports net/qrtr/af_qrtr.c:768 [inline] [qrtr] __qrtr_bind.isra.0+0x48b/0x570 net/qrtr/af_qrtr.c:805 [qrtr] qrtr_bind+0x17d/0x210 net/qrtr/af_qrtr.c:901 [qrtr] kernel_bind+0xe4/0x120 net/socket.c:3592 qrtr_ns_init+0x1a6/0x380 net/qrtr/ns.c:715 [qrtr] qrtr_proto_init+0x3b/0xff0 net/qrtr/af_qrtr.c:169 [qrtr] do_one_initcall+0xf5/0x5e0 init/main.c:1283 ... </TASK> Fix this by deferring the reference count decrement until after the xa_erase() and the synchronize_rcu() complete. (Note: The v1 of this patch incorrectly replaced __sock_put() with sock_put(). As Simon Horman pointed out, the callers of qrtr_port_remove() still hold a reference to the socket, so freeing the socket memory here would lead to a subsequent UAF in the caller. Thus, the __sock_put() is kept, but only repositioned to close the RCU race.) Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260604064801.1180388-1-w15303746062@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnetdev: fix double-free in netdev_nl_bind_rx_doit()Jakub Kicinski1-3/+1
Sashiko flags that genlmsg_reply() always consumes the skb. The error path calls nlmsg_free(rsp) so we can't jump directly to it. Let's not unbind, just propagate the error to the user. This is the typical way of handling genlmsg_reply() failures. They shouldn't happen unless user does something silly like calling the kernel with an already-full rcvbuf. Reported-by: Sashiko <sashiko-bot@kernel.org> Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: phonet: free phonet_device after RCU grace periodSantosh Kalluri1-1/+1
phonet_device_destroy() removes a phonet_device from the per-net device list with list_del_rcu(), but frees it immediately. RCU readers walking the same list can still hold a pointer to the object after it has been removed, leading to a slab-use-after-free. Use kfree_rcu(), matching the lifetime rule already used by phonet_address_del() for the same object type. Fixes: eeb74a9d45f7 ("Phonet: convert devices list to RCU") Cc: stable@vger.kernel.org Signed-off-by: Santosh Kalluri <santosh.kalluri129@gmail.com> Acked-by: Rémi Denis-Courmont <remi@remlab.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: add pskb_may_pull() to skb_gro_receive_list()HanQuan1-0/+5
skb_gro_receive_list() calls skb_pull(skb, skb_gro_offset(skb)) without first ensuring the data is in the linear area via pskb_may_pull(). When the skb arrives via napi_gro_frags(), skb_headlen can be 0 (all data in page fragments) while skb_gro_offset is non-zero (after IP+TCP header parsing). The skb_pull() then decrements skb->len by skb_gro_offset but skb->data_len stays unchanged, hitting BUG_ON(skb->len < skb->data_len) in __skb_pull(). The UDP fraglist GRO path already contains this guard at udp_offload.c:749. Adding it to skb_gro_receive_list() itself provides centralized protection for all callers (TCP, UDP, and any future protocols), and ensures the precondition of skb_pull() is satisfied before it is called. On pskb_may_pull() failure, set NAPI_GRO_CB(skb)->flush = 1 so the skb is not held as a new GRO head and is instead delivered through the normal receive path, matching the UDP handling. Fixes: 8d95dc474f85 ("net: add code for TCP fraglist GRO") Reported-by: HanQuan <eilaimemedsnaimel@gmail.com> Reported-by: MingXuan <bwnie0730@outlook.com> Signed-off-by: HanQuan <eilaimemedsnaimel@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 daystcp: restrict SO_ATTACH_FILTER to priv usersEric Dumazet1-0/+5
This patch restricts the use of SO_ATTACH_FILTER (cBPF) on TCP sockets to users with CAP_NET_ADMIN capability. This blocks potential side-channel attack where an unprivileged application attaches a filter to leak TCP sequence/acknowledgment numbers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Tamir Shahar <tamirthesis@gmail.com> Reported-by: Amit Klein <aksecurity@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Cc: Song Liu <song@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnetlabel: validate unlabeled address and mask attribute lengthsChenguang Zhao1-20/+10
netlbl_unlabel_addrinfo_get() used the address attribute length to determine whether the attribute data could be read as an IPv4 or IPv6 address, but did not independently validate the corresponding mask attribute length. A crafted Generic Netlink request could therefore provide a valid IPv4/IPv6 address attribute with a shorter mask attribute, which would later be read as a full struct in_addr or struct in6_addr. NLA_BINARY policy lengths are maximum lengths by default, so use NLA_POLICY_EXACT_LEN() for the unlabeled IPv4/IPv6 address and mask attributes. This rejects short attributes during policy validation and also exposes the exact length requirements through policy introspection. Fixes: 8cc44579d1bd ("NetLabel: Introduce static network labels for unlabeled connections") Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 daysxfrm: espintcp: do not reuse an in-progress partial sendWyatt Feng1-0/+4
espintcp keeps a single in-flight transmit in ctx->partial. Before building a new sk_msg, espintcp_sendmsg() first tries to flush that state through espintcp_push_msgs(). For blocking callers, espintcp_push_msgs() may return success even when the previous partial send is still pending. espintcp_sendmsg() would then reinitialize emsg->skmsg and reuse ctx->partial while the old transfer still owns that state. Do not rebuild the send message when ctx->partial is still in progress. If espintcp_push_msgs() returns with emsg->len still set, fail the new send instead of overwriting the live partial state. This is a memory-safety fix: reusing the live partial-send state can leave a stale offset attached to a new sk_msg and lead to an out-of- bounds read in the send path. tcp_sendmsg_locked() already handles waiting for send buffer memory, so the fix here is just to preserve espintcp's one-message-at-a-time transmit state. Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Assisted-by: Codex:GPT-5.4 Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
12 daysxfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state()Tristan Madani1-3/+2
iptfs_destroy_state() calls hrtimer_cancel() while holding a spinlock that the timer callback also acquires, leading to an ABBA deadlock on SMP systems. For the output timer (iptfs_timer): - iptfs_destroy_state() holds x->lock, calls hrtimer_cancel() - iptfs_delay_timer() callback takes x->lock For the drop timer (drop_timer): - iptfs_destroy_state() holds drop_lock, calls hrtimer_cancel() - iptfs_drop_timer() callback takes drop_lock Both timers use HRTIMER_MODE_REL_SOFT, so their callbacks run in softirq context. When hrtimer_cancel() is called for a soft timer that is currently executing on another CPU, hrtimer_cancel_wait_running() spins on softirq_expiry_lock -- the same lock held by the softirq running the callback. If the callback is blocked waiting for the spinlock held by the caller of hrtimer_cancel(), a circular dependency forms: CPU 0: holds lock_A -> waits for softirq_expiry_lock CPU 1: holds softirq_expiry_lock -> waits for lock_A Fix by calling hrtimer_cancel() before acquiring the respective locks. hrtimer_cancel() is safe to call without holding any lock and will wait for any in-progress callback to complete. For the output timer, the lock is still acquired afterwards to drain the packet queue. For the drop timer, the lock/unlock pair is removed entirely since it only existed to serialize with the timer callback, which hrtimer_cancel() already guarantees. Found by source code audit. Fixes: 4b3faf610cc6 ("xfrm: iptfs: add new iptfs xfrm mode impl") Cc: Christian Hopps <chopps@labn.net> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: stable@vger.kernel.org Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
12 daysinet: frags: fix use-after-free caused by the fqdir_pre_exit() flushHyunwoo Kim2-3/+3
On netns teardown, fqdir_pre_exit() walks the fqdir rhashtable and flushes every fragment queue that is not yet complete using inet_frag_queue_flush(). That helper frees all the skbs queued on the fragment queue but does not set INET_FRAG_COMPLETE, and leaves q->fragments_tail and q->last_run_head pointing at the freed skbs. The queue itself stays in the rhashtable. fqdir_pre_exit() first lowers high_thresh to 0 to stop new queue lookups, but it cannot stop a fragment that already obtained the queue through inet_frag_find() earlier and stalled just before taking the queue lock. Once that fragment resumes after the flush and takes the queue lock, it passes the INET_FRAG_COMPLETE check and then dereferences the freed fragments_tail. inet_frag_queue_insert() reads FRAG_CB() and ->len of that pointer and, on the append path, writes ->next_frag, causing a slab use-after-free. IPv6, nf_conntrack_reasm6 and 6lowpan reassembly share the same flush path and are affected as well. Reset rb_fragments, fragments_tail and last_run_head in inet_frag_queue_flush() so a flushed queue no longer points at the freed skbs. A fragment that resumes after the flush and takes the queue lock then finds an empty queue and starts a new run instead of dereferencing the freed fragments_tail. ip_frag_reinit() already performed this reset after its own flush, so drop the now duplicate code there. Cc: stable@vger.kernel.org Fixes: 006a5035b495 ("inet: frags: flush pending skbs in fqdir_pre_exit()") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Link: https://patch.msgid.link/ah6ukYq5G98LshdA@v4bel Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysudp: clear skb->dev before running a sockmap verdictSechang Lim1-0/+8
On the UDP receive path skb->dev is repurposed as dev_scratch (the truesize/state cache set by udp_set_dev_scratch()), through the union { struct net_device *dev; unsigned long dev_scratch; } in sk_buff. When a UDP socket is in a sockmap, sk_data_ready is sk_psock_verdict_data_ready(), which calls udp_read_skb() -> recv_actor() (sk_psock_verdict_recv) to run the attached SK_SKB verdict program in softirq. If that program calls a socket-lookup helper (bpf_sk_lookup_tcp/udp, bpf_skc_lookup_tcp), bpf_skc_lookup() does: if (skb->dev) caller_net = dev_net(skb->dev); skb->dev still holds the dev_scratch value (a non-NULL integer), so dev_net() dereferences it as a struct net_device * and the kernel takes a general protection fault on a non-canonical address in softirq: Oops: general protection fault, probably for non-canonical address 0x1010000800004a0 CPU: 1 UID: 0 PID: 1406 Comm: syz.2.19 Not tainted 7.1.0-rc6 #1 PREEMPT(full) RIP: 0010:bpf_skc_lookup net/core/filter.c:7033 [inline] RIP: 0010:bpf_sk_lookup+0x45/0x160 net/core/filter.c:7047 Call Trace: <IRQ> bpf_prog_4675cb904b7071f8+0x12e/0x14e bpf_prog_run_pin_on_cpu+0xc6/0x1f0 sk_psock_verdict_recv+0x1ba/0x350 udp_read_skb+0x31a/0x370 sk_psock_verdict_data_ready+0x2e3/0x600 __udp_enqueue_schedule_skb+0x4c8/0x650 udpv6_queue_rcv_one_skb+0x3ec/0x740 udp6_unicast_rcv_skb+0x11d/0x140 ip6_protocol_deliver_rcu+0x61e/0x950 ip6_input_finish+0xa9/0x150 NF_HOOK+0x286/0x2f0 ip6_input+0x117/0x220 NF_HOOK+0x286/0x2f0 __netif_receive_skb+0x85/0x200 process_backlog+0x374/0x9a0 __napi_poll+0x4f/0x1c0 net_rx_action+0x3b0/0x770 handle_softirqs+0x15a/0x460 do_softirq+0x57/0x80 </IRQ> The rmem charge that dev_scratch accounted for is released by skb_recv_udp() on dequeue, just above, so the scratch is dead by the time recv_actor() runs. Clear skb->dev so bpf_skc_lookup() falls back to sock_net(skb->sk), which skb_set_owner_sk_safe() set just above. Fixes: 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()") Cc: stable@vger.kernel.org Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260603162737.697215-1-rhkrqnwk98@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 dayssctp: purge outqueue on stale COOKIE-ECHO handlingXin Long1-5/+1
sctp_stream_update() is only invok