aboutsummaryrefslogtreecommitdiff
path: root/net/qrtr
AgeCommit message (Collapse)AuthorFilesLines
2026-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-10/+69
Merge in late fixes in preparation for the net-next PR. Conflicts: include/net/sch_generic.h a6bd339dbb351 ("net_sched: fix skb memory leak in deferred qdisc drops") ff2998f29f390 ("net: sched: introduce qdisc-specific drop reason tracing") https://lore.kernel.org/adz0iX85FHMz0HdO@sirena.org.uk drivers/net/ethernet/airoha/airoha_eth.c 1acdfbdb516b ("net: airoha: Fix VIP configuration for AN7583 SoC") bf3471e6e6c0 ("net: airoha: Make flow control source port mapping dependent on nbq parameter") Adjacent changes: drivers/net/ethernet/airoha/airoha_ppe.c f44218cd5e6a ("net: airoha: Reset PPE cpu port configuration in airoha_ppe_hw_init()") 7da62262ec96 ("inet: add ip_local_port_step_width sysctl to improve port usage distribution") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-13net: qrtr: ns: Fix use-after-free in driver remove()Manivannan Sadhasivam1-0/+11
In the remove callback, if a packet arrives after destroy_workqueue() is called, but before sock_release(), the qrtr_ns_data_ready() callback will try to queue the work, causing use-after-free issue. Fix this issue by saving the default 'sk_data_ready' callback during qrtr_ns_init() and use it to replace the qrtr_ns_data_ready() callback at the start of remove(). This ensures that even if a packet arrives after destroy_workqueue(), the work struct will not be dereferenced. Note that it is also required to ensure that the RX threads are completed before destroying the workqueue, because the threads could be using the qrtr_ns_data_ready() callback. Cc: stable@vger.kernel.org Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260409-qrtr-fix-v3-5-00a8a5ff2b51@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-13net: qrtr: ns: Limit the total number of nodesManivannan Sadhasivam1-2/+14
Currently, the nameserver doesn't limit the number of nodes it handles. This can be an attack vector if a malicious client starts registering random nodes, leading to memory exhaustion. Hence, limit the maximum number of nodes to 64. Note that, limit of 64 is chosen based on the current platform requirements. If requirement changes in the future, this limit can be increased. Cc: stable@vger.kernel.org Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260409-qrtr-fix-v3-4-00a8a5ff2b51@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-13net: qrtr: ns: Free the node during ctrl_cmd_bye()Manivannan Sadhasivam1-5/+15
A node sends the BYE packet when it is about to go down. So the nameserver should advertise the removal of the node to all remote and local observers and free the node finally. But currently, the nameserver doesn't free the node memory even after processing the BYE packet. This causes the node memory to leak. Hence, remove the node from Xarray list and free the node memory during both success and failure case of ctrl_cmd_bye(). Cc: stable@vger.kernel.org Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260409-qrtr-fix-v3-3-00a8a5ff2b51@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-13net: qrtr: ns: Limit the maximum number of lookupsManivannan Sadhasivam1-2/+12
Current code does no bound checking on the number of lookups a client can perform. Though the code restricts the lookups to local clients, there is still a possibility of a malicious local client sending a flood of NEW_LOOKUP messages over the same socket. Fix this issue by limiting the maximum number of lookups to 64 globally. Since the nameserver allows only atmost one local observer, this global lookup count will ensure that the lookups stay within the limit. Note that, limit of 64 is chosen based on the current platform requirements. If requirement changes in the future, this limit can be increased. Cc: stable@vger.kernel.org Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260409-qrtr-fix-v3-2-00a8a5ff2b51@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-13net: qrtr: ns: Limit the maximum server registration per nodeManivannan Sadhasivam1-5/+21
Current code does no bound checking on the number of servers added per node. A malicious client can flood NEW_SERVER messages and exhaust memory. Fix this issue by limiting the maximum number of server registrations to 256 per node. If the NEW_SERVER message is received for an old port, then don't restrict it as it will get replaced. While at it, also rate limit the error messages in the failure path of qrtr_ns_worker(). Note that the limit of 256 is chosen based on the current platform requirements. If requirement changes in the future, this limit can be increased. Cc: stable@vger.kernel.org Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Reported-by: Yiming Qian <yimingqian591@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260409-qrtr-fix-v3-1-00a8a5ff2b51@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-18/+13
Cross-merge networking fixes after downstream PR (net-7.0-rc7). Conflicts: net/vmw_vsock/af_vsock.c b18c83388874 ("vsock: initialize child_ns_mode_locked in vsock_net_init()") 0de607dc4fd8 ("vsock: add G2H fallback for CIDs not owned by H2G transport") Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c ceee35e5674a ("bnxt_en: Refactor some basic ring setup and adjustment logic") 57cdfe0dc70b ("bnxt_en: Resize RSS contexts on channel count change") drivers/net/wireless/intel/iwlwifi/mld/mac80211.c 4d56037a02bd ("wifi: iwlwifi: mld: block EMLSR during TDLS connections") 687a95d204e7 ("wifi: iwlwifi: mld: correctly set wifi generation data") drivers/net/wireless/intel/iwlwifi/mld/scan.h b6045c899e37 ("wifi: iwlwifi: mld: Refactor scan command handling") ec66ec6a5a8f ("wifi: iwlwifi: mld: Fix MLO scan timing") drivers/net/wireless/intel/iwlwifi/mvm/fw.c 078df640ef05 ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v 2") 323156c3541e ("wifi: iwlwifi: mvm: don't send a 6E related command when not supported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-27net: qrtr: fix endian handling of confirm_rx fieldAlexander Wilhelm1-2/+2
Convert confirm_rx to little endian when enqueueing and convert it back on receive. This fixes control flow on big endian hosts, little endian is unaffected. On transmit, store confirm_rx as __le32 using cpu_to_le32(). On receive, apply le32_to_cpu() before using the value. !! ensures the value is 0 or 1 in native endianness, so the conversion isn’t strictly required here, but it is kept for consistency and clarity. Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2026-03-26net: qrtr: replace qrtr_tx_flow radix_tree with xarray to fix memory leakJiayuan Chen1-18/+13
__radix_tree_create() allocates and links intermediate nodes into the tree one by one. If a subsequent allocation fails, the already-linked nodes remain in the tree with no corresponding leaf entry. These orphaned internal nodes are never reclaimed because radix_tree_for_each_slot() only visits slots containing leaf values. The radix_tree API is deprecated in favor of xarray. As suggested by Matthew Wilcox, migrate qrtr_tx_flow from radix_tree to xarray instead of fixing the radix_tree itself [1]. xarray properly handles cleanup of internal nodes — xa_destroy() frees all internal xarray nodes when the qrtr_node is released, preventing the leak. [1] https://lore.kernel.org/all/20260225071623.41275-1-jiayuan.chen@linux.dev/T/ Reported-by: syzbot+006987d1be3586e13555@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000bfba3a060bf4ffcf@google.com/T/ Fixes: 5fdeb0d372ab ("net: qrtr: Implement outgoing flow control") Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260324080645.290197-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds3-6/+6
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook3-6/+6
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2025-12-31net: qrtr: Drop the MHI auto_queue feature for IPCR DL channelsManivannan Sadhasivam1-11/+58
MHI stack offers the 'auto_queue' feature, which allows the MHI stack to auto queue the buffers for the RX path (DL channel). Though this feature simplifies the client driver design, it introduces race between the client drivers and the MHI stack. For instance, with auto_queue, the 'dl_callback' for the DL channel may get called before the client driver is fully probed. This means, by the time the dl_callback gets called, the client driver's structures might not be initialized, leading to NULL ptr dereference. Currently, the drivers have to workaround this issue by initializing the internal structures before calling mhi_prepare_for_transfer_autoqueue(). But even so, there is a chance that the client driver's internal code path may call the MHI queue APIs before mhi_prepare_for_transfer_autoqueue() is called, leading to similar NULL ptr dereference. This issue has been reported on the Qcom X1E80100 CRD machines affecting boot. So to properly fix all these races, drop the MHI 'auto_queue' feature altogether and let the client driver (QRTR) manage the RX buffers manually. In the QRTR driver, queue the RX buffers based on the ring length during probe and recycle the buffers in 'dl_callback' once they are consumed. This also warrants removing the setting of 'auto_queue' flag from controller drivers. Currently, this 'auto_queue' feature is only enabled for IPCR DL channel. So only the QRTR client driver requires the modification. Fixes: 227fee5fc99e ("bus: mhi: core: Add an API for auto queueing buffers for DL channel") Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation") Reported-by: Johan Hovold <johan@kernel.org> Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com Suggested-by: Chris Lew <quic_clew@quicinc.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Acked-by: Jeff Johnson <jjohnson@kernel.org> # drivers/net/wireless/ath/... Acked-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251218-qrtr-fix-v2-1-c7499bfcfbe0@oss.qualcomm.com
2025-11-04net: Convert proto_ops connect() callbacks to use sockaddr_unsizedKees Cook1-1/+1
Update all struct proto_ops connect() callback function prototypes from "struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04net: Convert proto_ops bind() callbacks to use sockaddr_unsizedKees Cook2-2/+2
Update all struct proto_ops bind() callback function prototypes from "struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-24net: qrtr: Update packets cloning when broadcastingYoussef Samir1-1/+1
When broadcasting data to multiple nodes via MHI, using skb_clone() causes all nodes to receive the same header data. This can result in packets being discarded by endpoints, leading to lost data. This issue occurs when a socket is closed, and a QRTR_TYPE_DEL_CLIENT packet is broadcasted. All nodes receive the same destination node ID, causing the node connected to the client to discard the packet and remain unaware of the client's deletion. Replace skb_clone() with pskb_copy(), to create a separate copy of the header for each sk_buff. Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com> Reviewed-by: Jeffery Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Chris Lew <quic_clew@quicinc.com> Link: https://patch.msgid.link/20240916170858.2382247-1-quic_yabdulra@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-14net: qrtr: ns: Ignore ENODEV failures in nsChris Lew1-7/+10
Ignore the ENODEV failures returned by kernel_sendmsg(). These errors indicate that either the local port has been closed or the remote has gone down. Neither of these scenarios are fatal and will eventually be handled through packets that are later queued on the control port. Signed-off-by: Chris Lew <quic_clew@quicinc.com> Signed-off-by: Sarannya Sasikumar <quic_sarannya@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240612063156.1377210-1-quic_sarannya@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-16net: qrtr: ns: Fix module refcntChris Lew1-0/+27
The qrtr protocol core logic and the qrtr nameservice are combined into a single module. Neither the core logic or nameservice provide much functionality by themselves; combining the two into a single module also prevents any possible issues that may stem from client modules loading inbetween qrtr and the ns. Creating a socket takes two references to the module that owns the socket protocol. Since the ns needs to create the control socket, this creates a scenario where there are always two references to the qrtr module. This prevents the execution of 'rmmod' for qrtr. To resolve this, forcefully put the module refcount for the socket opened by the nameservice. Fixes: a365023a76f2 ("net: qrtr: combine nameservice into main module") Reported-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Tested-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Chris Lew <quic_clew@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-09net: qrtr: support suspend/hibernationBaochen Qiang1-0/+46
MHI devices may not be destroyed during suspend/hibernation, so need to unprepare/prepare MHI channels throughout the transition, this is done by adding suspend/resume callbacks. The suspend callback is called in the late suspend stage, this means MHI channels are still alive at suspend stage, and that makes it possible for an MHI controller driver to communicate with others over those channels at suspend stage. While the resume callback is called in the early resume stage, for a similar reason. Also note that we won't do unprepare/prepare when MHI device is in suspend state because it's pointless if MHI is only meant to go through a suspend/resume transition, instead of a complete power cycle. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240305021320.3367-3-quic_bqiang@quicinc.com
2024-01-01net: qrtr: ns: Return 0 if server port is not presentSarannya S1-1/+3
When a 'DEL_CLIENT' message is received from the remote, the corresponding server port gets deleted. A DEL_SERVER message is then announced for this server. As part of handling the subsequent DEL_SERVER message, the name- server attempts to delete the server port which results in a '-ENOENT' error. The return value from server_del() is then propagated back to qrtr_ns_worker, causing excessive error prints. To address this, return 0 from control_cmd_del_server() without checking the return value of server_del(), since the above scenario is not an error case and hence server_del() doesn't have any other error return value. Signed-off-by: Sarannya Sasikumar <quic_sarannya@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-17net: qrtr: Handle IPCR control port format of older targetsVignesh Viswanathan1-0/+5
The destination port value in the IPCR control buffer on older targets is 0xFFFF. Handle the same by updating the dst_port to QRTR_PORT_CTRL. Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-17net: qrtr: ns: Change nodes radix tree to xarrayVignesh Viswanathan1-3/+3
There is a use after free scenario while iterating through the nodes radix tree despite the ns being a single threaded process. This can happen when the radix tree APIs are not synchronized with the rcu_read_lock() APIs. Convert the radix tree for nodes to xarray to take advantage of the built in rcu lock usage provided by xarray. Signed-off-by: Chris Lew <quic_clew@quicinc.com> Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-17net: qrtr: ns: Change servers radix tree to xarrayVignesh Viswanathan1-109/+24
There is a use after free scenario while iterating through the servers radix tree despite the ns being a single threaded process. This can happen when the radix tree APIs are not synchronized with the rcu_read_lock() APIs. Convert the radix tree for servers to xarray to take advantage of the built in rcu lock usage provided by xarray. Signed-off-by: Chris Lew <quic_clew@quicinc.com> Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-28Merge tag 'net-next-6.5' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer *should* be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" * tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ...
2023-06-24sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)David Howells1-1/+0
Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: mptcp@lists.linux.dev cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25net: qrtr: Use alloc_ordered_workqueue() to create ordered workqueuesTejun Heo1-1/+1
BACKGROUND ========== When multiple work items are queued to a workqueue, their execution order doesn't match the queueing order. They may get executed in any order and simultaneously. When fully serialized execution - one by one in the queueing order - is needed, an ordered workqueue should be used which can be created with alloc_ordered_workqueue(). However, alloc_ordered_workqueue() was a later addition. Before it, an ordered workqueue could be obtained by creating an UNBOUND workqueue with @max_active==1. This originally was an implementation side-effect which was broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered"). Because there were users that depended on the ordered execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") made workqueue allocation path to implicitly promote UNBOUND workqueues w/ @max_active==1 to ordered workqueues. While this has worked okay, overloading the UNBOUND allocation interface this way creates other issues. It's difficult to tell whether a given workqueue actually needs to be ordered and users that legitimately want a min concurrency level wq unexpectedly gets an ordered one instead. With planned UNBOUND workqueue updates to improve execution locality and more prevalence of chiplet designs which can benefit from such improvements, this isn't a state we wanna be in forever. This patch series audits all callsites that create an UNBOUND workqueue w/ @max_active==1 and converts them to alloc_ordered_workqueue() as necessary. WHAT TO LOOK FOR ================ The conversions are from alloc_workqueue(WQ_UNBOUND | flags, 1, args..) to alloc_ordered_workqueue(flags, args...) which don't cause any functional changes. If you know that fully ordered execution is not necessary, please let me know. I'll drop the conversion and instead add a comment noting the fact to reduce confusion while conversion is in progress. If you aren't fully sure, it's completely fine to let the conversion through. The behavior will stay exactly the same and we can always reconsider later. As there are follow-up workqueue core changes, I'd really appreciate if the patch can be routed through the workqueue tree w/ your acks. Thanks. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Manivannan Sadhasivam <mani@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: linux-arm-msm@vger.kernel.org Cc: netdev@vger.kernel.org
2023-04-13net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()Ziyang Xuan1-3/+5
Syzbot reported a bug as following: ===================================================== BUG: KMSAN: uninit-value in qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230 qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230 qrtr_endpoint_post+0xf85/0x11b0 net/qrtr/af_qrtr.c:519 qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108 call_write_iter include/linux/fs.h:2189 [inline] aio_write+0x63a/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was created at: slab_post_alloc_hook mm/slab.h:766 [inline] slab_alloc_node mm/slub.c:3452 [inline] __kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491 __do_kmalloc_node mm/slab_common.c:967 [inline] __kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988 kmalloc_reserve net/core/skbuff.c:492 [inline] __alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565 __netdev_alloc_skb+0x120/0x7d0 net/core/skbuff.c:630 qrtr_endpoint_post+0xbd/0x11b0 net/qrtr/af_qrtr.c:446 qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108 call_write_iter include/linux/fs.h:2189 [inline] aio_write+0x63a/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd It is because that skb->len requires at least sizeof(struct qrtr_ctrl_pkt) in qrtr_tx_resume(). And skb->len equals to size in qrtr_endpoint_post(). But size is less than sizeof(struct qrtr_ctrl_pkt) when qrtr_cb->type equals to QRTR_TYPE_RESUME_TX in qrtr_endpoint_post() under the syzbot scenario. This triggers the uninit variable access bug. Add size check when qrtr_cb->type equals to QRTR_TYPE_RESUME_TX in qrtr_endpoint_post() to fix the bug. Fixes: 5fdeb0d372ab ("net: qrtr: Implement outgoing flow control") Reported-by: syzbot+4436c9630a45820fda76@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=c14607f0963d27d5a3d5f4c8639b500909e43540 Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230410012352.3997823-1-william.xuanziyang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-03net: qrtr: Do not do DEL_SERVER broadcast after DEL_CLIENTSricharan Ramabadhran1-6/+9
On the remote side, when QRTR socket is removed, af_qrtr will call qrtr_port_remove() which broadcasts the DEL_CLIENT packet to all neighbours including local NS. NS upon receiving the DEL_CLIENT packet, will remove the lookups associated with the node:port and broadcasts the DEL_SERVER packet. But on the host side, due to the arrival of the DEL_CLIENT packet, the NS would've already deleted the server belonging to that port. So when the remote's NS again broadcasts the DEL_SERVER for that port, it throws below error message on the host: "failed while handling packet from 2:-2" So fix this error by not broadcasting the DEL_SERVER packet when the DEL_CLIENT packet gets processed." Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Ram Kumar Dharuman <quic_ramd@quicinc.com> Signed-off-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-31net: qrtr: Fix a refcount bug in qrtr_recvmsg()Ziyang Xuan1-0/+2
Syzbot reported a bug as following: refcount_t: addition on 0; use-after-free. ... RIP: 0010:refcount_warn_saturate+0x17c/0x1f0 lib/refcount.c:25 ... Call Trace: <TASK> __refcount_add include/linux/refcount.h:199 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] kref_get include/linux/kref.h:45 [inline] qrtr_node_acquire net/qrtr/af_qrtr.c:202 [inline] qrtr_node_lookup net/qrtr/af_qrtr.c:398 [inline] qrtr_send_resume_tx net/qrtr/af_qrtr.c:1003 [inline] qrtr_recvmsg+0x85f/0x990 net/qrtr/af_qrtr.c:1070 sock_recvmsg_nosec net/socket.c:1017 [inline] sock_recvmsg+0xe2/0x160 net/socket.c:1038 qrtr_ns_worker+0x170/0x1700 net/qrtr/ns.c:688 process_one_work+0x991/0x15c0 kernel/workqueue.c:2390 worker_thread+0x669/0x1090 kernel/workqueue.c:2537 It occurs in the concurrent scenario of qrtr_recvmsg() and qrtr_endpoint_unregister() as following: cpu0 cpu1 qrtr_recvmsg qrtr_endpoint_unregister qrtr_send_resume_tx qrtr_node_release qrtr_node_lookup mutex_lock(&qrtr_node_lock) spin_lock_irqsave(&qrtr_nodes_lock, ) refcount_dec_and_test(&node->ref) [node->ref == 0] radix_tree_lookup [node != NULL] __qrtr_node_release qrtr_node_acquire spin_lock_irqsave(&qrtr_nodes_lock, ) kref_get(&node->ref) [WARNING] ... mutex_unlock(&qrtr_node_lock) Use qrtr_node_lock to protect qrtr_node_lookup() implementation, this is actually improving the protection of node reference. Fixes: 0a7e0d0ef054 ("net: qrtr: Migrate node lookup tree to spinlock") Reported-by: syzbot+a7492efaa5d61b51db23@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=a7492efaa5d61b51db23 Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+4
net/core/gro.c 7d2c89b32587 ("skb: Do mix page pool and page referenced frags in GRO") b1a78b9b9886 ("net: add support for ipv4 big tcp") https://lore.kernel.org/all/20230203094454.5766f160@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-28net: qrtr: free memory on error path in radix_tree_insert()Natalia Petrova1-1/+4
Function radix_tree_insert() returns errors if the node hasn't been initialized and added to the tree. "kfree(node)" and return value "NULL" of node_get() help to avoid using unclear node in other calls. Found by Linux Verification Center (linuxtesting.org) with SVACE. Cc: <stable@vger.kernel.org> # 5.7 Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://lore.kernel.org/r/20230125134831.8090-1-n.petrova@fintech.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-23net/sock: Introduce trace_sk_data_ready()Peilin Ye1-0/+3
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready() callback implementations. For example: <...> iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable <...> Suggested-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15net: qrtr: start MHI channel after endpoit creationMaxim Kochetkov1-5/+7
MHI channel may generates event/interrupt right after enabling. It may leads to 2 race conditions issues. 1) Such event may be dropped by qcom_mhi_qrtr_dl_callback() at check: if (!qdev || mhi_res->transaction_status) return; Because dev_set_drvdata(&mhi_dev->dev, qdev) may be not performed at this moment. In this situation qrtr-ns will be unable to enumerate services in device. --------------------------------------------------------------- 2) Such event may come at the moment after dev_set_drvdata() and before qrtr_endpoint_register(). In this case kernel will panic with accessing wrong pointer at qcom_mhi_qrtr_dl_callback(): rc = qrtr_endpoint_post(&qdev->ep, mhi_res->buf_addr, mhi_res->bytes_xferd); Because endpoint is not created yet. -------------------------------------------------------------- So move mhi_prepare_for_transfer_autoqueue after endpoint creation to fix it. Fixes: a2e2cc0dbb11 ("net: qrtr: Start MHI channels during init") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Hemant Kumar <quic_hemantk@quicinc.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: remove noblock parameter from skb_recv_datagram()Oliver Hartkopp1-2/+1
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-17bus: mhi: core: Add an API for auto queueing buffers for DL channelManivannan Sadhasivam1-1/+1
Add a new API "mhi_prepare_for_transfer_autoqueue" for using with client drivers like QRTR to request MHI core to autoqueue buffers for the DL channel along with starting both UL and DL channels. So far, the "auto_queue" flag specified by the controller drivers in channel definition served this purpose but this will be removed at some point in future. Cc: netdev@vger.kernel.org Cc: Jakub Kicinski <kuba@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Co-developed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20211216081227.237749-9-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-28net: qrtr: combine nameservice into main moduleLuca Weiss2-1/+2
Previously with CONFIG_QRTR=m a separate ns.ko would be built which wasn't done on purpose and should be included in qrtr.ko. Rename qrtr.c to af_qrtr.c so we can build a qrtr.ko with both af_qrtr.c and ns.c. Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Tested-By: Steev Klimaszewski <steev@kali.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://lore.kernel.org/r/20210928171156.6353-1-luca@z3ntu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-02net: qrtr: revert check in qrtr_endpoint_post()Dan Carpenter1-1/+1
I tried to make this check stricter as a hardenning measure but it broke audo and wifi on these devices so revert it. Fixes: aaa8e4922c88 ("net: qrtr: make checks in qrtr_endpoint_post() stricter") Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-30net: qrtr: make checks in qrtr_endpoint_post() stricterDan Carpenter1-2/+6
These checks are still not strict enough. The main problem is that if "cb->type == QRTR_TYPE_NEW_SERVER" is true then "len - hdrlen" is guaranteed to be 4 but we need to be at least 16 bytes. In fact, we can reject everything smaller than sizeof(*pkt) which is 20 bytes. Also I don't like the ALIGN(size, 4). It's better to just insist that data is needs to be aligned at the start. Fixes: 0baa99ee353c ("net: qrtr: Allow non-immediate node routing") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-16/+2
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26Revert "net: really fix the build..."Kalle Valo1-15/+1
This reverts commit ce78ffa3ef1681065ba451cfd545da6126f5ca88. Wren and Nicol