linux.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
5 days	Merge tag 'for-7.0/block-20260206' of ↵	Linus Torvalds	1	-1/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...
2026-01-20	block: pass io_comp_batch to rq_end_io_fn callback	Ming Lei	1	-1/+2
	Add a third parameter 'const struct io_comp_batch *' to the rq_end_io_fn callback signature. This allows end_io handlers to access the completion batch context when requests are completed via blk_mq_end_request_batch(). The io_comp_batch is passed from blk_mq_end_request_batch(), while NULL is passed from __blk_mq_end_request() and blk_mq_put_rq_ref() which don't have batch context. This infrastructure change enables drivers to detect whether they're being called from a batched completion path (like iopoll) and access additional context stored in the io_comp_batch. Update all rq_end_io_fn implementations: - block/blk-mq.c: blk_end_sync_rq - block/blk-flush.c: flush_end_io, mq_flush_data_end_io - drivers/nvme/host/ioctl.c: nvme_uring_cmd_end_io - drivers/nvme/host/core.c: nvme_keep_alive_end_io - drivers/nvme/host/pci.c: abort_endio, nvme_del_queue_end, nvme_del_cq_end - drivers/nvme/target/passthru.c: nvmet_passthru_req_done - drivers/scsi/scsi_error.c: eh_lock_door_done - drivers/scsi/sg.c: sg_rq_end_io - drivers/scsi/st.c: st_scsi_execute_end - drivers/target/target_core_pscsi.c: pscsi_req_done - drivers/md/dm-rq.c: end_clone_request Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-16	scsi: core: Wake up the error handler when final completions race against ↵	David Jeffery	1	-1/+10
	each other The fragile ordering between marking commands completed or failed so that the error handler only wakes when the last running command completes or times out has race conditions. These race conditions can cause the SCSI layer to fail to wake the error handler, leaving I/O through the SCSI host stuck as the error state cannot advance. First, there is an memory ordering issue within scsi_dec_host_busy(). The write which clears SCMD_STATE_INFLIGHT may be reordered with reads counting in scsi_host_busy(). While the local CPU will see its own write, reordering can allow other CPUs in scsi_dec_host_busy() or scsi_eh_inc_host_failed() to see a raised busy count, causing no CPU to see a host busy equal to the host_failed count. This race condition can be prevented with a memory barrier on the error path to force the write to be visible before counting host busy commands. Second, there is a general ordering issue with scsi_eh_inc_host_failed(). By counting busy commands before incrementing host_failed, it can race with a final command in scsi_dec_host_busy(), such that scsi_dec_host_busy() does not see host_failed incremented but scsi_eh_inc_host_failed() counts busy commands before SCMD_STATE_INFLIGHT is cleared by scsi_dec_host_busy(), resulting in neither waking the error handler task. This needs the call to scsi_host_busy() to be moved after host_failed is incremented to close the race condition. Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq") Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20260113161036.6730-1-djeffery@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-01-04	scsi: core: Fix error handler encryption support	Brian Kao	1	-0/+24
	Some low-level drivers (LLD) access block layer crypto fields, such as rq->crypt_keyslot and rq->crypt_ctx within `struct request`, to configure hardware for inline encryption. However, SCSI Error Handling (EH) commands (e.g., TEST UNIT READY, START STOP UNIT) should not involve any encryption setup. To prevent drivers from erroneously applying crypto settings during EH, this patch saves the original values of rq->crypt_keyslot and rq->crypt_ctx before an EH command is prepared via scsi_eh_prep_cmnd(). These fields in the 'struct request' are then set to NULL. The original values are restored in scsi_eh_restore_cmnd() after the EH command completes. This ensures that the block layer crypto context does not leak into EH command execution. Signed-off-by: Brian Kao <powenkao@google.com> Link: https://patch.msgid.link/20251218031726.2642834-1-powenkao@google.com Cc: stable@vger.kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-19	Merge branch 6.18/scsi-fixes into 6.19/scsi-staging	Martin K. Petersen	1	-2/+2
	Pull in fixes branch to resolve UFS merge conflict. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-12	scsi: core: Make the budget map optional	Bart Van Assche	1	-0/+3
	Prepare for not allocating a budget map for pseudo SCSI devices by checking whether a budget map has been allocated before using it. Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.g.garry@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-21	scsi: core: Fix the unit attention counter implementation	Bart Van Assche	1	-2/+2
	scsi_decide_disposition() may call scsi_check_sense(). scsi_decide_disposition() calls are not serialized. Hence, counter updates by scsi_check_sense() must be serialized. Hence this patch that makes the counters updated by scsi_check_sense() atomic. Cc: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Fixes: a5d518cd4e3e ("scsi: core: Add counters for New Media and Power On/Reset UNIT ATTENTIONs") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Link: https://patch.msgid.link/20251014220244.3689508-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09	scsi: error: alua: I/O errors for ALUA state transitions	Rajashekhar M A	1	-1/+2
	When a host is configured with a few LUNs and I/O is running, injecting FC faults repeatedly leads to path recovery problems. The LUNs have 4 paths each and 3 of them come back active after say an FC fault which makes 2 of the paths go down, instead of all 4. This happens after several iterations of continuous FC faults. Reason here is that we're returning an I/O error whenever we're encountering sense code 06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION) instead of retrying. Signed-off-by: Rajashekhar M A <rajs@netapp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250606135924.27397-1-hare@kernel.org Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-03-03	scsi: scsi_error: Add comments to scsi_check_sense()	Damien Le Moal	1	-0/+7
	Add a comment block describing the COMPLETED case with ASC/ASCQ 0x55/0xA to mention that it relates to command duration limits very special policy 0xD command completion. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250228031751.12083-1-dlemoal@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03	scsi: core: Add counters for New Media and Power On/Reset UNIT ATTENTIONs	Kai Mäkisara	1	-0/+12
	The purpose of the counters is to enable all ULDs attached to a device to find out that a New Media or/and Power On/Reset Unit Attentions has/have been set, even if another ULD catches the Unit Attention as response to a SCSI command. The ULDs can read the counters and see if the values have changed from the previous check. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-3-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10	scsi: scsi_error: Add kernel-doc for exported functions	Randy Dunlap	1	-13/+13
	Convert scsi_report_bus_reset() and scsi_report_device_reset() to kernel-doc since they are exported. This allows them to be part of the driver-api/scsi.rst docbook. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-2-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-02	move asm/unaligned.h to linux/unaligned.h	Al Viro	1	-1/+1
	asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-02-05	scsi: core: Move scsi_host_busy() out of host lock if it is for per-command	Ming Lei	1	-1/+2
	Commit 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") intended to fix a hard lockup issue triggered by EH. The core idea was to move scsi_host_busy() out of the host lock when processing individual commands for EH. However, a suggested style change inadvertently caused scsi_host_busy() to remain under the host lock. Fix this by calling scsi_host_busy() outside the lock. Fixes: 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") Cc: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240203024521.2006455-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler	Ming Lei	1	-4/+4
	Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock every time for deciding if error handler kthread needs to be waken up. This can be too heavy in case of recovery, such as: - N hardware queues - queue depth is M for each hardware queue - each scsi_host_busy() iterates over (N * M) tag/requests If recovery is triggered in case that all requests are in-flight, each scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called for the last in-flight request, scsi_host_busy() has been run for (N * M - 1) times, and request has been iterated for (NM - 1) (N * M) times. If both N and M are big enough, hard lockup can be triggered on acquiring host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169). Fix the issue by calling scsi_host_busy() outside the host lock. We don't need the host lock for getting busy count because host the lock never covers that. [mkp: Drop unnecessary 'busy' variables pointed out by Bart] Cc: Ewan Milne <emilne@redhat.com> Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240112070000.4161982-1-ming.lei@redhat.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Tested-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-20	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	1	-3/+6
	Pull SCSI updates from James Bottomley: "Final round of fixes that came in too late to send in the first request. It's nine bug fixes and one version update (because of a bug fix) and one set of PCI ID additions. There's one bug fix in the core which is really a one liner (except that an additional sdev pointer was added for convenience) and the rest are in drivers" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: core: Add TMF to tmr_list handling scsi: core: Kick the requeue list after inserting when flushing scsi: fnic: unlock on error path in fnic_queuecommand() scsi: fcoe: Fix unsigned comparison with zero in store_ctlr_mode() scsi: mpi3mr: Fix mpi3mr_fw.c kernel-doc warnings scsi: smartpqi: Bump driver version to 2.1.26-030 scsi: smartpqi: Fix logical volume rescan race condition scsi: smartpqi: Add new controller PCI IDs scsi: ufs: qcom: Remove unnecessary goto statement from ufs_qcom_config_esi() scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan() scsi: ufs: core: Simplify power management during async scan
2024-01-11	scsi: core: Kick the requeue list after inserting when flushing	Niklas Cassel	1	-3/+6
	When libata calls ata_link_abort() to abort all ata queued commands, it calls blk_abort_request() on the SCSI command representing each QC. This causes scsi_timeout() to be called, which calls scsi_eh_scmd_add() for each SCSI command. scsi_eh_scmd_add() sets the SCSI host to state recovery, and then adds the command to shost->eh_cmd_q. This will wake up the SCSI EH, and eventually the libata EH strategy handler will be called, which calls scsi_eh_flush_done_q() to either flush retry or flush finish each failed command. The commands that are flush retried by scsi_eh_flush_done_q() are done so using scsi_queue_insert(). Before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() called blk_mq_requeue_request() with the second argument set to true, indicating that it should always kick/run the requeue list after inserting. After commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() does not kick/run the requeue list after inserting, if the current SCSI host state is recovery (which is the case in the libata example above). This optimization is probably fine in most cases, as I can only assume that most often someone will eventually kick/run the queues. However, that is not the case for scsi_eh_flush_done_q(), where we can see that the request gets inserted to the requeue list, but the queue is never started after the request has been inserted, leading to the block layer waiting for the completion of command that never gets to run. Since scsi_eh_flush_done_q() is called by SCSI EH context, the SCSI host state is most likely always in recovery when this function is called. Thus, let scsi_eh_flush_done_q() explicitly kick the requeue list after inserting a flush retry command, so that scsi_eh_flush_done_q() keeps the same behavior as before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"). Simple reproducer for the libata example above: $ hdparm -Y /dev/sda $ echo 1 > /sys/class/scsi_device/0\:0\:0\:0/device/delete Fixes: 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary") Reported-by: Kevin Locke <kevin@kevinlocke.name> Closes: https://lore.kernel.org/linux-scsi/ZZw3Th70wUUvCiCY@kevinlocke.name/ Signed-off-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240111120533.3612509-1-cassel@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-11	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	1	-0/+1
	Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, mpi3mr, mpt3sas, lpfc, fnic, hisi_sas, arcmsr, ) plus the usual assorted minor fixes and updates. This time around there's only a single line update to the core, so nothing major and barely anything minor" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (135 commits) scsi: ufs: core: Simplify ufshcd_auto_hibern8_update() scsi: ufs: core: Rename ufshcd_auto_hibern8_enable() and make it static scsi: ufs: qcom: Fix ESI vector mask scsi: ufs: host: Fix kernel-doc warning scsi: hisi_sas: Correct the number of global debugfs registers scsi: hisi_sas: Rollback some operations if FLR failed scsi: hisi_sas: Check before using pointer variables scsi: hisi_sas: Replace with standard error code return value scsi: hisi_sas: Set .phy_attached before notifing phyup event HISI_PHYE_PHY_UP_PM scsi: ufs: core: Add sysfs node for UFS RTC update scsi: ufs: core: Add UFS RTC support scsi: ufs: core: Add ufshcd_is_ufs_dev_busy() scsi: ufs: qcom: Remove unused definitions scsi: ufs: qcom: Use ufshcd_rmwl() where applicable scsi: ufs: qcom: Remove support for host controllers older than v2.0 scsi: ufs: qcom: Simplify ufs_qcom_{assert/deassert}_reset scsi: ufs: qcom: Initialize cycles_in_1us variable in ufs_qcom_set_core_clk_ctrl() scsi: ufs: qcom: Sort includes alphabetically scsi: ufs: qcom: Remove unused ufs_qcom_hosts struct array scsi: ufs: qcom: Use dev_err_probe() to simplify error handling of devm_gpiod_get_optional() ...
2023-12-18	scsi: core: Always send batch on reset or error handling command	Alexander Atanasov	1	-0/+2
	In commit 8930a6c20791 ("scsi: core: add support for request batching") the block layer bd->last flag was mapped to SCMD_LAST and used as an indicator to send the batch for the drivers that implement this feature. However, the error handling code was not updated accordingly. scsi_send_eh_cmnd() is used to send error handling commands and request sense. The problem is that request sense comes as a single command that gets into the batch queue and times out. As a result the device goes offline after several failed resets. This was observed on virtio_scsi during a device resize operation. [ 496.316946] sd 0:0:4:0: [sdd] tag#117 scsi_eh_0: requesting sense [ 506.786356] sd 0:0:4:0: [sdd] tag#117 scsi_send_eh_cmnd timeleft: 0 [ 506.787981] sd 0:0:4:0: [sdd] tag#117 abort To fix this always set SCMD_LAST flag in scsi_send_eh_cmnd() and scsi_reset_ioctl(). Fixes: 8930a6c20791 ("scsi: core: add support for request batching") Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com> Link: https://lore.kernel.org/r/20231215121008.2881653-1-alexander.atanasov@virtuozzo.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24	scsi: core: Add a precondition check in scsi_eh_scmd_add()	Bart Van Assche	1	-0/+1
	Calling scsi_eh_scmd_add() may cause the error handler never to be woken up because this may result in shost->host_failed to become larger than scsi_host_busy(shost). Hence complain if scsi_eh_scmd_add() is called after SCMD_STATE_INFLIGHT has been cleared. Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20231115193343.2262013-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22	scsi: sd: Handle read/write CDL timeout failures	Niklas Cassel	1	-0/+45
	Commands using a duration limit descriptor that has limit policies set to a value other than 0x0 may be failed by the device if one of the limits are exceeded. For such commands, since the failure is the result of the user duration limit configuration and workload, the commands should not be retried and terminated immediately. Furthermore, to allow the user to differentiate these "soft" failures from hard errors due to hardware problem, a different error code than EIO should be returned. There are 2 cases to consider: (1) The failure is due to a limit policy failing the command with a check condition sense key, that is, any limit policy other than 0xD. For this case, scsi_check_sense() is modified to detect failures with the ABORTED COMMAND sense key and the COMMAND TIMEOUT BEFORE PROCESSING or COMMAND TIMEOUT DURING PROCESSING or COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR RECOVERY additional sense code. For these failures, a SUCCESS disposition is returned so that scsi_finish_command() is called to terminate the command. (2) The failure is due to a limit policy set to 0xD, which result in the command being terminated with a GOOD status, COMPLETED sense key, and DATA CURRENTLY UNAVAILABLE additional sense code. To handle this case, the scsi_check_sense() is modified to return a SUCCESS disposition so that scsi_finish_command() is called to terminate the command. In addition, scsi_decide_disposition() has to be modified to see if a command being terminated with GOOD status has sense data. This is as defined in SCSI Primary Commands - 6 (SPC-6), so all according to spec, even if GOOD status commands were not checked before. If scsi_check_sense() detects sense data representing a duration limit, scsi_check_sense() will set the newly introduced SCSI ML byte SCSIML_STAT_DL_TIMEOUT. This SCSI ML byte is checked in scsi_noretry_cmd(), so that a command that failed because of a CDL timeout cannot be retried. The SCSI ML byte is also checked in scsi_result_to_blk_status() to complete the command request with the BLK_STS_DURATION_LIMIT status, which result in the user seeing ETIME errors for the failed commands. Co-developed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-12-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22	scsi: core: Allow libata to complete successful commands via EH	Niklas Cassel	1	-1/+2
	In SCSI, we get the sense data as part of the completion, for ATA however, we need to fetch the sense data as an extra step. For an aborted ATA command the sense data is fetched via libata's ->eh_strategy_handler(). For Command Duration Limits policy 0xD: The device shall complete the command without error with the additional sense code set to DATA CURRENTLY UNAVAILABLE. In order to handle this policy in libata, we intend to send a successful command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the sense data for the good command. This is similar to how we handle an aborted ATA command, just that we need to read the Successful NCQ Commands log instead of the NCQ Command Error log. When we get a SATA completion with successful commands, ATA_SENSE will be set, indicating that some commands in the completion have sense data. The sense_valid bitmask in the Sense Data for Successful NCQ Commands log will inform exactly which commands that had sense data, which might be a subset of all the commands that was completed in the same completion. (Yet all will have ATA_SENSE set, since the status is per completion.) The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will not set the scmd->result to DID_TIME_OUT for these commands. However, the successful commands that did not have sense data, must not get their result marked as DID_TIME_OUT by SCSI EH. Add a new flag SCMD_FORCE_EH_SUCCESS, which tells SCSI EH to not mark a command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. This will be used by libata in a subsequent commit. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-5-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-24	scsi: core: Declare most SCSI host template pointers const	Bart Van Assche	1	-8/+8
	Prepare for constifying most SCSI host template pointers by constifying the SCSI host template pointer arguments and variables in the SCSI core. Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230322195515.1267197-3-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-30	Merge branch '6.2/scsi-queue' into 6.2/scsi-fixes	Martin K. Petersen	1	-0/+5
	Pull in remaining patches from the 6.2 queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-12-14	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	1	-22/+20
	Pull SCSI updates from James Bottomley: "Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are some core changes, mostly around reworking some of our user context assumptions in device put and moving some code around. The remaining updates are bug fixes and minor changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits) scsi: sg: Fix get_user() in call sg_scsi_ioctl() scsi: megaraid_sas: Fix some spelling mistakes in comment scsi: core: Use SCSI_SCAN_INITIAL in do_scsi_scan_host() scsi: core: Use SCSI_SCAN_RESCAN in __scsi_add_device() scsi: ufs: ufs-mediatek: Remove unnecessary return code scsi: ufs: core: Fix the polling implementation scsi: libsas: Do not export sas_ata_wait_after_reset() scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset scsi: libsas: Add smp_ata_check_ready_type() scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset" scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()" scsi: ufs: ufs-mediatek: Modify the return value scsi: ufs: ufs-mediatek: Remove unneeded code scsi: device_handler: alua: Call scsi_device_put() from non-atomic context scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of alua_check_vpd()" scsi: snic: Fix possible UAF in snic_tgt_create() scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices ...
2022-12-14	scsi: core: scsi_error: Do not queue pointless abort workqueue functions	Hannes Reinecke	1	-0/+5
	If a host template doesn't implement the .eh_abort_handler() there is no point in queueing the abort workqueue function; all it does is invoking SCSI EH anyway. So return 'FAILED' from scsi_abort_command() if the .eh_abort_handler() is not implemented and save us from having to wait for the abort workqueue function to complete. Cc: Niklas Cassel <niklas.cassel@wdc.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> [niklas: moved the check to the top of scsi_abort_command()] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20221206131346.2045375-1-niklas.cassel@wdc.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-29	scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()	Uladzislau Rezki	1	-1/+1
	Earlier commits in this series allow battery-powered systems to build their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option. This Kconfig option causes call_rcu() to delay its callbacks in order to batch them. This means that a given RCU grace period covers more callbacks, thus reducing the number of grace periods, in turn reducing the amount of energy consumed, which increases battery lifetime which can be a very good thing. This is not a subtle effect: In some important use cases, the battery lifetime is increased by more than 10%. This CONFIG_RCU_LAZY=y option is available only for CPUs that offload callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y. Delaying callbacks is normally not a problem because most callbacks do nothing but free memory. If the system is short on memory, a shrinker will kick all currently queued lazy callbacks out of their laziness, thus freeing their memory in short order. Similarly, the rcu_barrier() function, which blocks until all currently queued callbacks are invoked, will also kick lazy callbacks, thus enabling rcu_barrier() to complete in a timely manner. However, there are some cases where laziness is not a good option. For example, synchronize_rcu() invokes call_rcu(), and blocks until the newly queued callback is invoked. It would not be a good for synchronize_rcu() to block for ten seconds, even on an idle system. Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a given CPU kicks any lazy callbacks that might be already queued on that CPU. After all, if there is going to be a grace period, all callbacks might as well get full benefit from it. Yes, this could be done the other way around by creating a call_rcu_lazy(), but earlier experience with this approach and feedback at the 2022 Linux Plumbers Conference shifted the approach to call_rcu() being lazy with call_rcu_hurry() for the few places where laziness is inappropriate. And another call_rcu() instance that cannot be lazy is the one in the scsi_eh_scmd_add() function. Leaving this instance lazy results in unacceptably slow boot times. Therefore, make scsi_eh_scmd_add() use call_rcu_hurry() in order to revert to the old behavior. [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ] Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: <linux-scsi@vger.kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-11-24	scsi: core: Increase scsi_device's iodone_cnt in scsi_timeout()	Wenchao Hao	1	-0/+1
	If a SCSI command times out and is going to be aborted, we should increase the iodone_cnt of the related scsi_device. Otherwise the iodone_cnt would be smaller than iorequest_cnt. Increasing iodone_cnt in scsi_timeout() would not cause a double accounting issue. Brief analysis follows: - We add the iodone_cnt when BLK_EH_DONE is returned in scsi_timeout(). The related command's timeout event would not happen. - If the abort succeeds and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort succeeds and the command is retried, it would be requeue. A scsi_dispatch_cmd() would be called and iorequest_cnt would be increased again. - If the abort fails, the error handler successfully recovers the device, and the command is not retried, the command would be completed with scsi_finish_command() which would not increase iodone_cnt. - If the abort fails, the error handler successfully recovers the device, and the command is retried, the iorequest_cnt would be increased again. Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221123122137.150776-2-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: core: Change the return type of .eh_timed_out()	Bart Van Assche	1	-14/+19
	Commit 6600593cbd93 ("block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE") made it impossible for .eh_timed_out() implementations to call scsi_done() without causing a crash. Restore support for SCSI timeout handlers to call scsi_done() as follows: * Change all .eh_timed_out() handlers as follows: - Change the return type into enum scsi_timeout_action. - Change BLK_EH_RESET_TIMER into SCSI_EH_RESET_TIMER. - Change BLK_EH_DONE into SCSI_EH_NOT_HANDLED. * In scsi_timeout(), convert the SCSI_EH_* values into BLK_EH_* values. Reviewed-by: Lee Duncan <lduncan@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-3-bvanassche@acm.org Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: core: Fix a race between scsi_done() and scsi_timeout()	Bart Van Assche	1	-11/+3
	If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 15f73f5b3e59 ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-07	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	1	-6/+12
	Pull SCSI updates from James Bottomley: "Updates to the usual drivers (qla2xxx, lpfc, ufs, hisi_sas, mpi3mr, mpt3sas, target). The biggest change (from my biased viewpoint) being that the mpi3mr now attached to the SAS transport class, making it the first fusion type device to do so. Beyond the usual bug fixing and security class reworks, there aren't a huge number of core changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits) scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() scsi: mpi3mr: Remove unnecessary cast scsi: stex: Properly zero out the passthrough command structure scsi: mpi3mr: Update driver version to 8.2.0.3.0 scsi: mpi3mr: Fix scheduling while atomic type bug scsi: mpi3mr: Scan the devices during resume time scsi: mpi3mr: Free enclosure objects during driver unload scsi: mpi3mr: Handle 0xF003 Fault Code scsi: mpi3mr: Graceful handling of surprise removal of PCIe HBA scsi: mpi3mr: Schedule IRQ kthreads only on non-RT kernels scsi: mpi3mr: Support new power management framework scsi: mpi3mr: Update mpi3 header files scsi: mpt3sas: Revert "scsi: mpt3sas: Fix ioc->base_readl() use" scsi: mpt3sas: Revert "scsi: mpt3sas: Fix writel() use" scsi: wd33c93: Remove dead code related to the long-gone config WD33C93_PIO scsi: core: Add I/O timeout count for SCSI device scsi: qedf: Populate sysfs attributes for vport scsi: pm8001: Replace one-element array with flexible-array member scsi: 3w-xxxx: Replace one-element array with flexible-array member scsi: hptiop: Replace one-element array with flexible-array member in struct hpt_iop_request_ioctl_command() ...
2022-09-30	block: change request end_io handler to pass back a return value	Jens Axboe	1	-1/+3
	Everything is just converted to returning RQ_END_IO_NONE, and there should be no functional changes with this patch. In preparation for allowing the end_io handler to pass ownership back to the block layer, rather than retain ownership of the request. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-25	scsi: core: Add I/O timeout count for SCSI device	Wu Bo	1	-0/+1
	Currently struct scsi_device maintains counters for requests, completions, and errors but is missing a counter for timeouts. For better tracking of timeouts, add a suitable counter. Link: https://lore.kernel.org/r/1663666339-17560-1-git-send-email-wubo40@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-09-06	scsi: core: Convert scsi_decide_disposition() to use SCSIML_STAT	Mike Christie	1	-6/+6
	Don't use: - DID_TARGET_FAILURE - DID_NEXUS_FAILURE - DID_ALLOC_FAILURE - DID_MEDIUM_ERROR Instead use the SCSI midlayer internal values. Link: https://lore.kernel.org/r/20220812010027.8251-10-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-09-06	scsi: core: Add error codes for internal SCSI midlayer use	Mike Christie	1	-0/+5
	If a driver returns: - DID_TARGET_FAILURE - DID_NEXUS_FAILURE - DID_ALLOC_FAILURE - DID_MEDIUM_ERROR we hit a couple bugs: 1. The SCSI error handler runs because scsi_decide_disposition() has no case statements for them and we return FAILED. 2. For SG IO the userspace app gets a success status instead of failed, because scsi_result_to_blk_status() clears those errors. This patch adds a new internal error code byte for use by the SCSI midlayer. This will be used instead of the above error codes, so we don't have to play that clearing the host code game in scsi_result_to_blk_status() and drivers cannot accidentally use them. A subsequent commit will then remove the internal users of the above codes and convert us to use the new ones. Link: https://lore.kernel.org/r/20220812010027.8251-9-michael.christie@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-08-04	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	1	-3/+1
	Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi, mpi3mr). The main driver change that might cause issues on down the road is the conversion of some of our oldest surviving drivers to the DMA API (should only affect m68k). The only major core change is the rework of async resume; the rest are either completely trivial or for updating deprecated APIs" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (195 commits) scsi: target: Remove XDWRITEREAD emulated support scsi: megaraid: Remove the static variable initialisation scsi: ch: Do not initialise statics to 0 scsi: ufs: core: Fix spelling mistake "Cannnot" -> "Cannot" scsi: target: iscsi: Do not require target authentication scsi: target: iscsi: Allow AuthMethod=None scsi: target: iscsi: Support base64 in CHAP scsi: target: iscsi: Add support for extended CDB AHS scsi: ufs: dt-bindings: Add SC8280XP binding scsi: target: iscsi: Fix clang -Wformat warnings scsi: ufs: core: Read device property for ref clock scsi: libsas: Resume SAS host for phy reset or enable via sysfs scsi: hisi_sas: Modify v3 HW SATA completion error processing scsi: hisi_sas: Relocate DMA unmap of SMP task scsi: hisi_sas: Remove unnecessary variable to hold DMA map elements scsi: hisi_sas: Call hisi_sas_slave_configure() from slave_configure_v3_hw() scsi: mpi3mr: Delete a stray tab scsi: mpi3mr: Unlock on error path scsi: mpi3mr: Reduce VD queue depth on detecting throttling scsi: mpi3mr: Resource Based Metering ...
2022-07-14	scsi/core: Change the return type of scsi_noretry_cmd() into bool	Bart Van Assche	1	-8/+8
	This patch prepares for introducing the new blk_opf_t type in the SCSI core. Since the value returned by scsi_noretry_cmd() is only used in boolean expressions, this patch does not change any functionality. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-41-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-07	scsi: core: Shorten long warning messages	Li Zhijian	1	-3/+1
	sdev_printk() will only accept messages up to 128 bytes. Shorten strings exceeding 128 bytes avoid printing an incomplete sentence like: [ 475.156955] sd 9:0:0:0: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical Link: https://lore.kernel.org/r/20220630024516.1571209-1-lizhijian@fujitsu.com Suggested-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-06	blk-mq: Drop blk_mq_ops.timeout 'reserved' arg	John Garry	1	-2/+1
	With new API blk_mq_is_reserved_rq() we can tell if a request