| Age | Commit message (Collapse) | Author | Files | Lines |
|
Shivam Kumar found via vulnerability testing:
When data digest is enabled on an NVMe/TCP connection and a digest
mismatch occurs on a non-final H2C_DATA PDU during an R2T-based
data transfer, the digest error handler in nvmet_tcp_try_recv_ddgst()
calls nvmet_req_uninit() — which performs percpu_ref_put() on the
submission queue — but does NOT mark the command as completed. It
does not set cqe->status, does not modify rbytes_done, and does not
clear any flag. When the subsequent fatal error triggers queue
teardown, nvmet_tcp_uninit_data_in_cmds() iterates all commands,
checks nvmet_tcp_need_data_in() for each one, and finds that the
already-uninited command still appears to need data (because
rbytes_done < transfer_len and cqe->status == 0). It therefore calls
nvmet_req_uninit() a second time on the same command — a double
percpu_ref_put against a single percpu_ref_get.
Reported-by: Shivam Kumar <kumar.shivam43666@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
nvmet_tcp_handle_icreq() updates queue->state after sending an
Initialization Connection Response (ICResp), but it does so without
serializing against target-side queue teardown.
If an NVMe/TCP host sends an Initialization Connection Request
(ICReq) and immediately closes the connection, target-side teardown
may start in softirq context before io_work drains the already
buffered ICReq. In that case, nvmet_tcp_schedule_release_queue()
sets queue->state to NVMET_TCP_Q_DISCONNECTING and drops the queue
reference under state_lock.
If io_work later processes that ICReq, nvmet_tcp_handle_icreq() can
still overwrite the state back to NVMET_TCP_Q_LIVE. That defeats the
DISCONNECTING-state guard in nvmet_tcp_schedule_release_queue() and
allows a later socket state change to re-enter teardown and issue a
second kref_put() on an already released queue.
The ICResp send failure path has the same problem. If teardown has
already moved the queue to DISCONNECTING, a send error can still
overwrite the state with NVMET_TCP_Q_FAILED, again reopening the
window for a second teardown path to drop the queue reference.
Fix this by serializing both post-send state transitions with
state_lock and bailing out if teardown has already started.
Use -ESHUTDOWN as an internal sentinel for that bail-out path rather
than propagating it as a transport error like -ECONNRESET. Keep
nvmet_tcp_socket_error() setting rcv_state to NVMET_TCP_RECV_ERR before
honoring that sentinel so receive-side parsing stays quiesced until the
existing release path completes.
Fixes: c46a6465bac2 ("nvmet-tcp: add NVMe over TCP target driver")
Cc: stable@vger.kernel.org
Reported-by: Shivam Kumar <skumar47@syr.edu>
Tested-by: Shivam Kumar <kumar.shivam43666@gmail.com>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Executing nvmet_tcp_fatal_error() is generally the responsibility
of the caller (nvmet_tcp_try_recv); all other functions should
just return the error code.
Remove the nvmet_tcp_fatal_error() function, it's not needed
anymore.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Currently, when nvmet_tcp_build_pdu_iovec() detects an out-of-bounds
PDU length or offset, it triggers nvmet_tcp_fatal_error(cmd->queue)
and returns early. However, because the function returns void, the
callers are entirely unaware that a fatal error has occurred and
that the cmd->recv_msg.msg_iter was left uninitialized.
Callers such as nvmet_tcp_handle_h2c_data_pdu() proceed to blindly
overwrite the queue state with queue->rcv_state = NVMET_TCP_RECV_DATA
Consequently, the socket receiving loop may attempt to read incoming
network data into the uninitialized iterator.
Fix this by shifting the error handling responsibility to the callers.
Fixes: 52a0a9854934 ("nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Yunje Shin <ioerts@kookmin.ac.kr>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
In nvmet_tcp_try_recv_ddgst(), when a data digest mismatch is detected,
nvmet_req_uninit() is called unconditionally. However, if the command
arrived via the nvmet_tcp_handle_req_failure() path, nvmet_req_init()
had returned false and percpu_ref_tryget_live() was never executed. The
unconditional percpu_ref_put() inside nvmet_req_uninit() then causes a
refcount underflow, leading to a WARNING in
percpu_ref_switch_to_atomic_rcu, a use-after-free diagnostic, and
eventually a permanent workqueue deadlock.
Check cmd->flags & NVMET_TCP_F_INIT_FAILED before calling
nvmet_req_uninit(), matching the existing pattern in
nvmet_tcp_execute_request().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The refactoring is going to alter the default behavior of
alloc_workqueue() to be unbound by default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU. For more details see the Link tag below.
In order to keep alloc_workqueue() behavior identical, explicitly request
WQ_PERCPU.
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
nvmet_tcp_build_pdu_iovec() could walk past cmd->req.sg when a PDU
length or offset exceeds sg_cnt and then use bogus sg->length/offset
values, leading to _copy_to_iter() GPF/KASAN. Guard sg_idx, remaining
entries, and sg->length/offset before building the bvec.
Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Signed-off-by: YunJe Shin <ioerts@kookmin.ac.kr>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Joonkyo Jung <joonkyoj@yonsei.ac.kr>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
When the socket is closed while in TCP_LISTEN a callback is run to
flush all outstanding packets, which in turns calls
nvmet_tcp_listen_data_ready() with the sk_callback_lock held.
So we need to check if we are in TCP_LISTEN before attempting
to get the sk_callback_lock() to avoid a deadlock.
Link: https://lore.kernel.org/linux-nvme/CAHj4cs-zu7eVB78yUpFjVe2UqMWFkLk8p+DaS3qj+uiGCXBAoA@mail.gmail.com/
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Commit efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length")
added ttag bounds checking and data_offset
validation in nvmet_tcp_handle_h2c_data_pdu(), but it did not validate
whether the command's data structures (cmd->req.sg and cmd->iov) have
been properly initialized before processing H2C_DATA PDUs.
The nvmet_tcp_build_pdu_iovec() function dereferences these pointers
without NULL checks. This can be triggered by sending H2C_DATA PDU
immediately after the ICREQ/ICRESP handshake, before
sending a CONNECT command or NVMe write command.
Attack vectors that trigger NULL pointer dereferences:
1. H2C_DATA PDU sent before CONNECT → both pointers NULL
2. H2C_DATA PDU for READ command → cmd->req.sg allocated, cmd->iov NULL
3. H2C_DATA PDU for uninitialized command slot → both pointers NULL
The fix validates both cmd->req.sg and cmd->iov before calling
nvmet_tcp_build_pdu_iovec(). Both checks are required because:
- Uninitialized commands: both NULL
- READ commands: cmd->req.sg allocated, cmd->iov NULL
- WRITE commands: both allocated
Fixes: efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Pull NVMe updates from Keith:
"- Subsystem usage cleanups (Max)
- Endpoint device fixes (Shin'ichiro)
- Debug statements (Gerd)
- FC fabrics cleanups and fixes (Daniel)
- Consistent alloc API usages (Israel)
- Code comment updates (Chu)
- Authentication retry fix (Justin)"
* tag 'nvme-6.19-2025-12-04' of git://git.infradead.org/nvme:
nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
nvme-auth: use kvfree() for memory allocated with kvcalloc()
nvmet-tcp: use kvcalloc for commands array
nvmet-rdma: use kvcalloc for commands and responses arrays
nvme: fix typo error in nvme target
nvmet-fc: use pr_* print macros instead of dev_*
nvmet-fcloop: remove unused lsdir member.
nvmet-fcloop: check all request and response have been processed
nvme-fc: check all request and response have been processed
nvme-fc: don't hold rport lock when putting ctrl
nvme-pci: add debug message on fail to read CSTS
nvme-pci: print error message on failure in nvme_probe
nvmet: pci-epf: fix DMA channel debug print
nvmet: pci-epf: move DMA initialization to EPC init callback
nvmet: remove redundant subsysnqn field from ctrl
nvmet: add sanity checks when freeing subsystem
|
|
Replace kcalloc with kvcalloc for allocation of the commands
array. Each command structure is 712 bytes. The array typically
exceeds a single page, and grows much larger with high queue depths
(e.g., commands >182KB).
kvcalloc automatically falls back to vmalloc for large or fragmented
allocations, improving reliability. In our case, this memory is not
aimed for DMA operations and could be safely allocated by kvcalloc.
Using virtually contiguous memory helps to avoid allocation failures
and out-of-memory conditions common with kcalloc on large pools.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Update all struct proto_ops bind() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull block fixes from Jens Axboe:
- NVMe changes via Christoph:
- revert the cross-controller atomic write size validation
that caused regressions (Christoph Hellwig)
- fix endianness of command word printout in
nvme_log_err_passthru() (John Garry)
- fix callback lock for TLS handshake (Maurizio Lombardi)
- fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
- fix inconsistent RCU list manipulation in
nvme_ns_add_to_ctrl_list() (Zheng Qixing)
- Fix for a kobject leak in queue unregistration
- Fix for loop async file write start/end handling
* tag 'block-6.16-20250718' of git://git.kernel.dk/linux:
loop: use kiocb helpers to fix lockdep warning
nvmet-tcp: fix callback lock for TLS handshake
nvme: fix misaccounting of nvme-mpath inflight I/O
nvme: revert the cross-controller atomic write size validation
nvme: fix endianness of command word prints in nvme_log_err_passthru()
nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
block: fix kobject leak in blk_unregister_queue
|
|
When restoring the default socket callbacks during a TLS handshake, we
need to acquire a write lock on sk_callback_lock. Previously, a read
lock was used, which is insufficient for modifying sk_user_data and
sk_data_ready.
Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy
data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope,
under its own lock and RCU. The route control path is now 3x times
faster.
- Convert queue related netlink ops to instance lock, reducing again
the scope of the RTNL lock. This improves the control plane
scalability.
- Refactor the software crc32c implementation, removing unneeded
abstraction layers and improving significantly the related
micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10%
performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a
prep work to remove the current per-CPU lock in local_bh_disable()
on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds
verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving
considerably the dump performance. A lot of user-space tools still
use this interface.
- Implement support for wildcard netdevice in netdev basechain and
flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better
introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
programs and can be controlled in similar way to traditional qdiscs
using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues
WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default
upper bound for the receive buffer; overall this improves the
single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport
security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always
matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major
distros disable it by default. Its removal allows for better
organisation of TCP fields to reduce the number of cache lines hit
in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for
the drivers, avoiding the problem of drivers not rejecting new
unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for
dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that
ynl-c can now read, create and modify links, routes addresses and
qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output
known struct instead of raw binary data, clarifying the classic
netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to
the kernel-space, increasing the data transfer throughput WRT the
user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- refactor the steering table handling to significantly
reduce the amount of memory used
- add support for complex matches in H/W flow steering
- improve flow streeing error handling
- convert to netdev instance locking
- Intel (100G, ice, igb, ixgbe, idpf):
- ice: add switchdev support for LLDP traffic over VF
- ixgbe: add firmware manipulation and regions devlink support
- igb: introduce support for frame transmission premption
- igb: adds persistent NAPI configuration
- idpf: introduce RDMA support
- idpf: add initial PTP support
- Meta (fbnic):
- extend hardware stats coverage
- add devlink dev flash support
- Broadcom (bnxt):
- add support for RX-side device memory TCP
- Wangxun (txgbe):
- implement support for udp tunnel offload
- complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual:
- Google (gve):
- add device memory TCP TX support
- Amazon (ena):
- support persistent per-NAPI config
- Airoha:
- add H/W support for L2 traffic offload
- add per flow stats for flow offloading
- RealTek (rtl8211): add support for WoL magic packet
- Synopsys (stmmac):
- dwmac-socfpga 1000BaseX support
- add Loongson-2K3000 support
- introduce support for hardware-accelerated VLAN stripping
- Broadcom (bcmgenet):
- expose more H/W stats
- Freescale (enetc, dpaa2-eth):
- enetc: add MAC filter, VLAN filter RSS and loopback support
- dpaa2-eth: convert to H/W timestamping APIs
- vxlan: convert FDB table to rhashtable, for better scalabilty
- veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches:
- Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs:
- RealTek (rtl8211):
- add support for WoL magic packet
- add support for PHY LEDs
- CAN:
- Adds RZ/G3E CANFD support to the rcar_canfd driver.
- Preparatory work for CAN-XL support.
- Add self-tests framework with support for CAN physical interfaces.
- WiFi:
- mac80211:
- scan improvements with multi-link operation (MLO)
- Qualcomm (ath12k):
- enable AHB support for IPQ5332
- add monitor interface support to QCN9274
- add multi-link operation support to WCN7850
- add 802.11d scan offload support to WCN7850
- monitor mode for WCN7850, better 6 GHz regulatory
- Qualcomm (ath11k):
- restore hibernation support
- MediaTek (mt76):
- WiFi-7 improvements
- implement support for mt7990
- Intel (iwlwifi):
- enhanced multi-link single-radio (EMLSR) support on 5 GHz links
- rework device configuration
- RealTek (rtw88):
- improve throughput for RTL8814AU
- RealTek (rtw89):
- add multi-link operation support
- STA/P2P concurrency improvements
- support different SAR configs by antenna
- Bluetooth:
- introduce HCI Driver protocol
- btintel_pcie: do not generate coredump for diagnostic events
- btusb: add HCI Drv commands for configuring altsetting
- btusb: add RTL8851BE device 0x0bda:0xb850
- btusb: add new VID/PID 13d3/3584 for MT7922
- btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
selftests/bpf: Fix bpf selftest build warning
selftests: netfilter: Fix skip of wildcard interface test
net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
net: openvswitch: Fix the dead loop of MPLS parse
calipso: Don't call calipso functions for AF_INET sk.
selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
octeontx2-pf: QOS: Perform cache sync on send queue teardown
net: mana: Add support for Multi Vports on Bare metal
net: devmem: ncdevmem: remove unused variable
net: devmem: ksft: upgrade rx test to send 1K data
net: devmem: ksft: add 5 tuple FS support
net: devmem: ksft: add exit_wait to make rx test pass
net: devmem: ksft: add ipv4 support
net: devmem: preserve sockc_err
page_pool: fix ugly page_pool formatting
net: devmem: move list_add to net_devmem_bind_dmabuf.
selftests: netfilter: nft_queue.sh: include file transfer duration in log message
net: phy: mscc: Fix memory leak when using one step timestamping
...
|
|
All the callers of inet_addr_is_any() have a sockaddr_storage-backed
sockaddr. Avoid casts and switch prototype to the actual object being
used.
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250521204619.2301870-1-kees@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now that a submission queue holds a reference to its completion queue,
there is no need to pass the cq argument to nvmet_req_init(), so remove
it.
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The NVMe PCI transport specification allows for completion queues to be
shared by different submission queues.
This patch allows a submission queue to keep track of the completion queue
it is using with reference counting. As such, it can be ensured that a
completion queue is not deleted while a submission queue is actively
using it.
This patch enables completion queue sharing in the pci-epf target driver.
For fabrics drivers, completion queue sharing is not enabled as it is
not possible as per the fabrics specification. However, this patch
modifies the fabrics drivers to correctly integrate the new API that
supports completion queue sharing.
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
With struct nvmet_cq now having a reference count, this patch amends the
target fabrics call chain to initialize and destroy/put a completion
queue.
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Now that the crc32c() library function directly takes advantage of
architecture-specific optimizations, it is unnecessary to go through the
crypto API. Just use crc32c(). This is much simpler, and it improves
performance due to eliminating the crypto API overhead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
queue->state_change is set as part of nvmet_tcp_set_queue_sock(), but if
the TCP connection isn't established when nvmet_tcp_set_queue_sock() is
called then queue->state_change isn't set and sock->sk->sk_state_change
isn't replaced.
As such we don't need to restore sock->sk->sk_state_change if
queue->state_change is NULL.
This avoids NULL pointer dereferences such as this:
[ 286.462026][ C0] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 286.462814][ C0] #PF: supervisor instruction fetch in kernel mode
[ 286.463796][ C0] #PF: error_code(0x0010) - not-present page
[ 286.464392][ C0] PGD 8000000140620067 P4D 8000000140620067 PUD 114201067 PMD 0
[ 286.465086][ C0] Oops: Oops: 0010 [#1] SMP KASAN PTI
[ 286.465559][ C0] CPU: 0 UID: 0 PID: 1628 Comm: nvme Not tainted 6.15.0-rc2+ #11 PREEMPT(voluntary)
[ 286.466393][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 286.467147][ C0] RIP: 0010:0x0
[ 286.467420][ C0] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 286.467977][ C0] RSP: 0018:ffff8883ae008580 EFLAGS: 00010246
[ 286.468425][ C0] RAX: 0000000000000000 RBX: ffff88813fd34100 RCX: ffffffffa386cc43
[ 286.469019][ C0] RDX: 1ffff11027fa68b6 RSI: 0000000000000008 RDI: ffff88813fd34100
[ 286.469545][ C0] RBP: ffff88813fd34160 R08: 0000000000000000 R09: ffffed1027fa682c
[ 286.470072][ C0] R10: ffff88813fd34167 R11: 0000000000000000 R12: ffff88813fd344c3
[ 286.470585][ C0] R13: ffff88813fd34112 R14: ffff88813fd34aec R15: ffff888132cdd268
[ 286.471070][ C0] FS: 00007fe3c04c7d80(0000) GS:ffff88840743f000(0000) knlGS:0000000000000000
[ 286.471644][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 286.472543][ C0] CR2: ffffffffffffffd6 CR3: 000000012daca000 CR4: 00000000000006f0
[ 286.473500][ C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 286.474467][ C0] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 286.475453][ C0] Call Trace:
[ 286.476102][ C0] <IRQ>
[ 286.476719][ C0] tcp_fin+0x2bb/0x440
[ 286.477429][ C0] tcp_data_queue+0x190f/0x4e60
[ 286.478174][ C0] ? __build_skb_around+0x234/0x330
[ 286.478940][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.479659][ C0] ? __pfx_tcp_data_queue+0x10/0x10
[ 286.480431][ C0] ? tcp_try_undo_loss+0x640/0x6c0
[ 286.481196][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.482046][ C0] ? kvm_clock_get_cycles+0x14/0x30
[ 286.482769][ C0] ? ktime_get+0x66/0x150
[ 286.483433][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.484146][ C0] tcp_rcv_established+0x6e4/0x2050
[ 286.484857][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.485523][ C0] ? ipv4_dst_check+0x160/0x2b0
[ 286.486203][ C0] ? __pfx_tcp_rcv_established+0x10/0x10
[ 286.486917][ C0] ? lock_release+0x217/0x2c0
[ 286.487595][ C0] tcp_v4_do_rcv+0x4d6/0x9b0
[ 286.488279][ C0] tcp_v4_rcv+0x2af8/0x3e30
[ 286.488904][ C0] ? raw_local_deliver+0x51b/0xad0
[ 286.489551][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.490198][ C0] ? __pfx_tcp_v4_rcv+0x10/0x10
[ 286.490813][ C0] ? __pfx_raw_local_deliver+0x10/0x10
[ 286.491487][ C0] ? __pfx_nf_confirm+0x10/0x10 [nf_conntrack]
[ 286.492275][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.492900][ C0] ip_protocol_deliver_rcu+0x8f/0x370
[ 286.493579][ C0] ip_local_deliver_finish+0x297/0x420
[ 286.494268][ C0] ip_local_deliver+0x168/0x430
[ 286.494867][ C0] ? __pfx_ip_local_deliver+0x10/0x10
[ 286.495498][ C0] ? __pfx_ip_local_deliver_finish+0x10/0x10
[ 286.496204][ C0] ? ip_rcv_finish_core+0x19a/0x1f20
[ 286.496806][ C0] ? lock_release+0x217/0x2c0
[ 286.497414][ C0] ip_rcv+0x455/0x6e0
[ 286.497945][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.498550][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.499137][ C0] ? __pfx_ip_rcv_finish+0x10/0x10
[ 286.499763][ C0] ? lock_release+0x217/0x2c0
[ 286.500327][ C0] ? dl_scaled_delta_exec+0xd1/0x2c0
[ 286.500922][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.501480][ C0] __netif_receive_skb_one_core+0x166/0x1b0
[ 286.502173][ C0] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 286.502903][ C0] ? lock_acquire+0x2b2/0x310
[ 286.503487][ C0] ? process_backlog+0x372/0x1350
[ 286.504087][ C0] ? lock_release+0x217/0x2c0
[ 286.504642][ C0] process_backlog+0x3b9/0x1350
[ 286.505214][ C0] ? process_backlog+0x372/0x1350
[ 286.505779][ C0] __napi_poll.constprop.0+0xa6/0x490
[ 286.506363][ C0] net_rx_action+0x92e/0xe10
[ 286.506889][ C0] ? __pfx_net_rx_action+0x10/0x10
[ 286.507437][ C0] ? timerqueue_add+0x1f0/0x320
[ 286.507977][ C0] ? sched_clock_cpu+0x68/0x540
[ 286.508492][ C0] ? lock_acquire+0x2b2/0x310
[ 286.509043][ C0] ? kvm_sched_clock_read+0xd/0x20
[ 286.509607][ C0] ? handle_softirqs+0x1aa/0x7d0
[ 286.510187][ C0] handle_softirqs+0x1f2/0x7d0
[ 286.510754][ C0] ? __pfx_handle_softirqs+0x10/0x10
[ 286.511348][ C0] ? irqtime_account_irq+0x181/0x290
[ 286.511937][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.512510][ C0] do_softirq.part.0+0x89/0xc0
[ 286.513100][ C0] </IRQ>
[ 286.513548][ C0] <TASK>
[ 286.513953][ C0] __local_bh_enable_ip+0x112/0x140
[ 286.514522][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.515072][ C0] __dev_queue_xmit+0x872/0x3450
[ 286.515619][ C0] ? nft_do_chain+0xe16/0x15b0 [nf_tables]
[ 286.516252][ C0] ? __pfx___dev_queue_xmit+0x10/0x10
[ 286.516817][ C0] ? selinux_ip_postroute+0x43c/0xc50
[ 286.517433][ C0] ? __pfx_selinux_ip_postroute+0x10/0x10
[ 286.518061][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.518606][ C0] ? ip_output+0x164/0x4a0
[ 286.519149][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.519671][ C0] ? ip_finish_output2+0x17d5/0x1fb0
[ 286.520258][ C0] ip_finish_output2+0xb4b/0x1fb0
[ 286.520787][ C0] ? __pfx_ip_finish_output2+0x10/0x10
[ 286.521355][ C0] ? __ip_finish_output+0x15d/0x750
[ 286.521890][ C0] ip_output+0x164/0x4a0
[ 286.522372][ C0] ? __pfx_ip_output+0x10/0x10
[ 286.522872][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.523402][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.524031][ C0] ? __pfx_ip_finish_output+0x10/0x10
[ 286.524605][ C0] ? __ip_queue_xmit+0x999/0x2260
[ 286.525200][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.525744][ C0] ? ipv4_dst_check+0x16a/0x2b0
[ 286.526279][ C0] ? lock_release+0x217/0x2c0
[ 286.526793][ C0] __ip_queue_xmit+0x1883/0x2260
[ 286.527324][ C0] ? __skb_clone+0x54c/0x730
[ 286.527827][ C0] __tcp_transmit_skb+0x209b/0x37a0
[ 286.528374][ C0] ? __pfx___tcp_transmit_skb+0x10/0x10
[ 286.528952][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.529472][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.530152][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.530691][ C0] tcp_write_xmit+0xb81/0x88b0
[ 286.531224][ C0] ? mod_memcg_state+0x4d/0x60
[ 286.531736][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.532253][ C0] __tcp_push_pending_frames+0x90/0x320
[ 286.532826][ C0] tcp_send_fin+0x141/0xb50
[ 286.533352][ C0] ? __pfx_tcp_send_fin+0x10/0x10
[ 286.533908][ C0] ? __local_bh_enable_ip+0xab/0x140
[ 286.534495][ C0] inet_shutdown+0x243/0x320
[ 286.535077][ C0] nvme_tcp_alloc_queue+0xb3b/0x2590 [nvme_tcp]
[ 286.535709][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.536314][ C0] ? __pfx_nvme_tcp_alloc_queue+0x10/0x10 [nvme_tcp]
[ 286.536996][ C0] ? do_raw_spin_unlock+0x54/0x1e0
[ 286.537550][ C0] ? _raw_spin_unlock+0x29/0x50
[ 286.538127][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.538664][ C0] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 286.539249][ C0] ? nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.539892][ C0] ? __wake_up+0x40/0x60
[ 286.540392][ C0] nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.541047][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.541589][ C0] nvme_tcp_setup_ctrl+0x8b/0x7a0 [nvme_tcp]
[ 286.542254][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.542887][ C0] ? __pfx_nvme_tcp_setup_ctrl+0x10/0x10 [nvme_tcp]
[ 286.543568][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.544166][ C0] ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 286.544792][ C0] ? nvme_change_ctrl_state+0x196/0x2e0 [nvme_core]
[ 286.545477][ C0] nvme_tcp_create_ctrl+0x839/0xb90 [nvme_tcp]
[ 286.546126][ C0] nvmf_dev_write+0x3db/0x7e0 [nvme_fabrics]
[ 286.546775][ C0] ? rw_verify_area+0x69/0x520
[ 286.547334][ C0] vfs_write+0x218/0xe90
[ 286.547854][ C0] ? do_syscall_64+0x9f/0x190
[ 286.548408][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.549037][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.549659][ C0] ? __pfx_vfs_write+0x10/0x10
[ 286.550259][ C0] ? do_syscall_64+0x9f/0x190
[ 286.550840][ C0] ? syscall_exit_to_user_mode+0x8e/0x280
[ 286.551516][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.552180][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.552834][ C0] ? ksys_read+0xf5/0x1c0
[ 286.553386][ C0] ? __pfx_ksys_read+0x10/0x10
[ 286.553964][ C0] ksys_write+0xf5/0x1c0
[ 286.554499][ C0] ? __pfx_ksys_write+0x10/0x10
[ 286.555072][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.555698][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.556319][ C0] ? do_syscall_64+0x54/0x190
[ 286.556866][ C0] do_syscall_64+0x93/0x190
[ 286.557420][ C0] ? rcu_read_unlock+0x17/0x60
[ 286.557986][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.558526][ C0] ? lock_release+0x217/0x2c0
[ 286.559087][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.559659][ C0] ? count_memcg_events.constprop.0+0x4a/0x60
[ 286.560476][ C0] ? exc_page_fault+0x7a/0x110
[ 286.561064][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.561647][ C0] ? lock_release+0x217/0x2c0
[ 286.562257][ C0] ? do_user_addr_fault+0x171/0xa00
[ 286.562839][ C0] ? do_user_addr_fault+0x4a2/0xa00
[ 286.563453][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.564112][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.564677][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.565317][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.565922][ C0] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 286.566542][ C0] RIP: 0033:0x7fe3c05e6504
[ 286.567102][ C0] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 8b 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[ 286.568931][ C0] RSP: 002b:00007fff76444f58 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 286.569807][ C0] RAX: ffffffffffffffda RBX: 000000003b40d930 RCX: 00007fe3c05e6504
[ 286.570621][ C0] RDX: 00000000000000cf RSI: 000000003b40d930 RDI: 0000000000000003
[ 286.571443][ C0] RBP: 0000000000000003 R08: 00000000000000cf R09: 000000003b40d930
[ 286.572246][ C0] R10: 0000000000000000 R11: 0000000000000202 R12: 000000003b40cd60
[ 286.573069][ C0] R13: 00000000000000cf R14: 00007fe3c07417f8 R15: 00007fe3c073502e
[ 286.573886][ C0] </TASK>
Closes: https://lore.kernel.org/linux-nvme/5hdonndzoqa265oq3bj6iarwtfk5dewxxjtbjvn5uqnwclpwt6@a2n6w3taxxex/
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Pull block updates from Jens Axboe:
- Fixes for integrity handling
- NVMe pull request via Keith:
- Secure concatenation for TCP transport (Hannes)
- Multipath sysfs visibility (Nilay)
- Various cleanups (Qasim, Baruch, Wang, Chen, Mike, Damien, Li)
- Correct use of 64-bit BARs for pci-epf target (Niklas)
- Socket fix for selinux when used in containers (Peijie)
- MD pull request via Yu:
- fix recovery can preempt resync (Li Nan)
- fix md-bitmap IO limit (Su Yue)
- fix raid10 discard with REQ_NOWAIT (Xiao Ni)
- fix raid1 memory leak (Zheng Qixing)
- fix mddev uaf (Yu Kuai)
- fix raid1,raid10 IO flags (Yu Kuai)
- some refactor and cleanup (Yu Kuai)
- Series cleaning up and fixing bugs in the bad block handling code
- Improve support for write failure simulation in null_blk
- Various lock ordering fixes
- Fixes for locking for debugfs attributes
- Various ublk related fixes and improvements
- Cleanups for blk-rq-qos wait handling
- blk-throttle fixes
- Fixes for loop dio and sync handling
- Fixes and cleanups for the auto-PI code
- Block side support for hardware encryption keys in blk-crypto
- Various cleanups and fixes
* tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux: (105 commits)
nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi)
nvme-tcp: fix selinux denied when calling sock_sendmsg
nvmet: pci-epf: Always configure BAR0 as 64-bit
nvmet: Remove duplicate uuid_copy
nvme: zns: Simplify nvme_zone_parse_entry()
nvmet: pci-epf: Remove redundant 'flush_workqueue()' calls
nvmet-fc: Remove unused functions
nvme-pci: remove stale comment
nvme-fc: Utilise min3() to simplify queue count calculation
nvme-multipath: Add visibility for queue-depth io-policy
nvme-multipath: Add visibility for numa io-policy
nvme-multipath: Add visibility for round-robin io-policy
nvmet: add tls_concat and tls_key debugfs entries
nvmet-tcp: support secure channel concatenation
nvmet: Add 'sq' argument to alloc_ctrl_args
nvme-fabrics: reset admin connection for secure concatenation
nvme-tcp: request secure channel concatenation
nvme-keyring: add nvme_tls_psk_refresh()
nvme: add nvme_auth_derive_tls_psk()
nvme: add nvme_auth_generate_digest()
...
|
|
Evaluate the SC_C flag during DH-CHAP-HMAC negotiation to check if secure
concatenation as specified in the NVMe Base Specification v2.1, section
8.3.4.3: "Secure Channel Concatenationand" is requested. If requested the
generated PSK is inserted into the keyring once negotiation has finished
allowing for an encrypted connection once the admin queue is restarted.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add a function to refresh a generated PSK in the specified keyring.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The order in which queue->cmd and rcv_state are updated is crucial.
If these assignments are reordered by the compiler, the worker might not
get queued in nvmet_tcp_queue_response(), hanging the IO. to enforce the
the correct reordering, set rcv_state using smp_store_release().
Fixes: bdaf13279192 ("nvmet-tcp: fix a segmentation fault during io parsing error")
Signed-off-by: Meir Elisha <meir.elisha@volumez.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
If the commands allocation fails in nvmet_tcp_alloc_cmds()
the kernel crashes in nvmet_tcp_release_queue_work() because of
a NULL pointer dereference.
nvmet: failed to install queue 0 cntlid 1 ret 6
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000008
Fix the bug by setting queue->nr_cmds to zero in case
nvmet_tcp_alloc_cmd() fails.
Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Implement callback to display the host transport address.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
CDR/MORE/DNR fields are not belonging to SC in the NVMe spec, rename
them to NVME_STATUS_* to avoid confusion.
Signed-off-by: Weiwen Hu <huweiwen@linux.alibaba.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
When we teardown the controller, we wait for pending I/Os to complete
(sq->ref on all queues to drop to zero) and then we go over the commands,
and free their command buffers in case they are still fetching data from
the host (e.g. processing nvme writes) and have yet to take a reference
on the sq.
However, we may miss the case where commands have failed before executing
and are queued for sending a response, but will never occur because the
queue socket is already down. In this case we may miss deallocating command
buffers.
Solve this by freeing all commands buffers as nvmet_tcp_free_cmd_buffers is
idempotent anyways.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Pull NVMe fixes from Keith:
"nvme updates for Linux 6.9
- Make an informative message less ominous (Keith)
- Enhanced trace decoding (Guixin)
- TCP updates (Hannes, Li)
- Fabrics connect deadlock fix (Chunguang)
- Platform API migration update (Uwe)
- A new device quirk (Jiawei)"
* tag 'nvme-6.9-2024-03-21' of git://git.infradead.org/nvme:
nvmet-rdma: remove NVMET_RDMA_REQ_INVALIDATE_RKEY flag
nvme: remove redundant BUILD_BUG_ON check
nvme/tcp: Add wq_unbound modparam for nvme_tcp_wq
nvme-tcp: Export the nvme_tcp_wq to sysfs
drivers/nvme: Add quirks for device 126f:2262
nvme: parse format command's lbafu when tracing
nvme: add tracing of reservation commands
nvme: parse zns command's zsa and zrasf to string
nvme: use nvme_disk_is_ns_head helper
nvme: fix reconnection fail due to reserved tag allocation
nvmet: add tracing of zns commands
nvmet: add tracing of authentication commands
nvme-apple: Convert to platform remove callback returning void
nvmet-tcp: do not continue for invalid icreq
nvme: change shutdown timeout setting message
|
|
When the length check for an icreq sqe fails we should not
continue processing but rather return immediately as all
other contents of that sqe cannot be relied on.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
When draining a page_frag_cache, most user are doing
the similar steps, so introduce an API to avoid code
duplication.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|