summaryrefslogtreecommitdiff
path: root/drivers/scsi/hisi_sas
AgeCommit message (Collapse)AuthorFilesLines
2021-05-22scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irqYang Yingliang1-4/+4
irqs allocated with devm_request_irq() should not be freed using free_irq(). Doing so causes a dangling pointer and a subsequent double free. Link: https://lore.kernel.org/r/20210519130519.2661938-1-yangyingliang@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Fix IRQ checksSergey Shtylyov1-3/+3
Commit df2d8213d9e3 ("hisi_sas: use platform_get_irq()") failed to take into account that irq_of_parse_and_map() and platform_get_irq() have a different way of indicating an error: the former returns 0 and the latter returns a negative error code. Fix up the IRQ checks! Link: https://lore.kernel.org/r/810f26d3-908b-1d6b-dc5c-40019726baca@omprussia.ru Fixes: df2d8213d9e3 ("hisi_sas: use platform_get_irq()") Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Print SATA device SAS address for soft reset failureLuo Jiaxing1-2/+4
Add (pseudo) SAS address for ATA software reset failure log to assist in debugging. Link: https://lore.kernel.org/r/1617709711-195853-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Warn in v3 hw channel interrupt handler when status reg clearedLuo Jiaxing1-2/+8
If a channel interrupt occurs without any status bit set, the handler will return directly. However, if such redundant interrupts are received, it's better to check what happen, so add logs for this. Link: https://lore.kernel.org/r/1617709711-195853-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Directly snapshot registers when executing a resetJianqin Xie3-20/+36
The debugfs snapshot should be executed before the reset occurs to ensure that the register contents are saved properly. As such, it is incorrect to queue the debugfs dump when running a reset as the reset will occur prior to the snapshot work item is handler. Therefore, directly snapshot registers in the reset work handler. Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.com Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com> Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() failsXiang Chen2-2/+6
Function sas_unregister_ha() needs to be called to roll back if hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or hisi_sas_v3_probe(). Make that change. Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Print SAS address for v3 hw erroneous completion printLuo Jiaxing1-1/+2
To help debugging efforts, print the device SAS address for v3 hw erroneous completion log. Here is an example print: hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=2193 task=000000002b0c13f8 dev id=17 addr=570fd45f9d17b001 Link: https://lore.kernel.org/r/1617709711-195853-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: hisi_sas: Delete some unused callbacksLuo Jiaxing1-2/+0
The debugfs code has been relocated to v3 hw driver, so delete unused struct hisi_sas_hw function pointers snapshot_{prepare, restore}. Link: https://lore.kernel.org/r/1617709711-195853-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-27scsi: hisi_sas: Add trace FIFO debugfs supportLuo Jiaxing2-0/+266
The controller provides trace FIFO DFX tool to assist link fault debugging and link optimization. This tool can be helpful when debugging link faults without SAS analyzers. Each PHY has an independent trace FIFO interface. The user can configure the trace FIFO tool of one PHY by using the following six interfaces: signal_sel: select signal group applies to different scenarios. 0x0: linkrate negotiation 0x1: Host 12G TX train 0x2: Disk 12G TX train 0x3: SAS PHY CTRL DFX 0 0x4: SAS PHY CTRL DFX 1 0x5: SAS PCS DFX other: linkrate negotiation dump_mask: The masked hardware status bit will not be updated. dump_mode: determines how to dump data after trigger signal is generated. 0x0: dump forever 0x1: dump 32 data after trigger signal is generated 0x2: no more dump after trigger signal is generated trigger_mode: determines the trigger mode, level or edge. 0x0: dump when trigger signal changed 0x1: dump when trigger signal's level equal to trigger_level 0x2: dump when trigger signal's level different from trigger_level trigger_level: determines the trigger level. trigger_msk: mask trigger signal The user can get 32-byte values from hardware by reading the rd_data. These values consitute the status record of the hardware at different time points. Link: https://lore.kernel.org/r/1611659068-131975-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-27scsi: hisi_sas: Flush workqueue in hisi_sas_v3_remove()Luo Jiaxing1-0/+1
If the controller reset occurs at the same time as driver removal, it may be possible that the interrupts have been released prior to the host softreset, and calling pci_irq_vector() there causes a WARN: WARNING: CPU: 37 PID: 1542 /pci/msi.c:1275 pci_irq_vector+0xc0/0xd0 Call trace: pci_irq_vector+0xc0/0xd0 disable_host_v3_hw+0x58/0x5b0 [hisi_sas_v3_hw] soft_reset_v3_hw+0x40/0xc0 [hisi_sas_v3_hw] hisi_sas_controller_reset+0x150/0x260 [hisi_sas_main] hisi_sas_rst_work_handler+0x3c/0x58 [hisi_sas_main] To fix, flush the driver workqueue prior to releasing the interrupts to ensure any resets have been completed. Link: https://lore.kernel.org/r/1611659068-131975-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-27scsi: hisi_sas: Enable debugfs support by defaultLuo Jiaxing2-2/+17
Add a config option to enable debugfs support by default. And if debugfs support is enabled by default, dump count default value is increased to 50 as generally users want something bigger than the current default in that situation. Link: https://lore.kernel.org/r/1611659068-131975-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-27scsi: hisi_sas: Don't check .nr_hw_queues in hisi_sas_task_prep()John Garry1-4/+2
Now that v2 and v3 hw expose their HW queues (and so shost.nr_hw_queues is set), remove the conditional checks in hisi_sas_task_prep(). This change would affect v1 HW performance (as it does not expose HW queues), but nobody uses it and support may be dropped soon. Link: https://lore.kernel.org/r/1611659068-131975-3-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-27scsi: hisi_sas: Remove deferred probe check in hisi_sas_v2_probe()John Garry1-12/+0
The platform_get_irq() check for -EPROBE_DEFER was to ensure that all the steps to add the SCSI host are not done and then only to realise that the probe needs to be deferred. However, since there is now an earlier check for this in hisi_sas_interrupt_preinit(), this check is superfluous and may be removed. Link: https://lore.kernel.org/r/1611659068-131975-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-23scsi: hisi_sas: Switch back to original libsas event notifiersAhmed S. Darwish4-11/+10
libsas event notifiers required an extension where gfp_t flags must be explicitly passed. For bisectability, a temporary _gfp() variant of such functions were added. All call sites then got converted use the _gfp() variants and explicitly pass GFP context. Having no callers left, the original libsas notifiers were then modified to accept gfp_t flags by default. Switch back to the original libas API, while still passing GFP context. The libsas _gfp() variants will be removed afterwards. Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-23scsi: hisi_sas: Pass gfp_t flags to libsas event notifiersAhmed S. Darwish5-18/+29
Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. Below are the context analysis for modified functions: => hisi_sas_bytes_dmaed(): Since it is invoked from both process and atomic contexts, let its callers pass the gfp_t flags: * hisi_sas_main.c: ------------------ hisi_sas_phyup_work(): workqueue context -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_controller_reset_done(): has an msleep() -> hisi_sas_rescan_topology() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) * hisi_sas_v1_hw.c: ------------------- int_abnormal_v1_hw(): irq handler -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) * hisi_sas_v[23]_hw.c: ---------------------- int_phy_updown_v[23]_hw(): irq handler -> phy_down_v[23]_hw() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) => int_bcast_v1_hw() and phy_bcast_v3_hw(): Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC. Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-23scsi: libsas: Remove notifier indirectionJohn Garry4-14/+7
LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-21Merge branch '5.11/scsi-fixes' into 5.12/scsi-queueMartin K. Petersen3-13/+68
Pull in the 5.11 SCSI fixes branch to provide an updated baseline for megaraid and hisi_sas. Both drivers received core changes in v5.11-rc3. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: hisi_sas: Remove auto_affine_msi_experimental module_paramJohn Garry1-5/+0
Now that the driver always uses managed interrupts, delete auto_affine_msi_experimental module param. Link: https://lore.kernel.org/r/1609763622-34119-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-04Merge branch '5.11/scsi-postmerge' into 5.11/scsi-fixesMartin K. Petersen3-13/+68
Merge two commits that had dependencies on other 5.11 trees (the block and the irq trees respectively). - We reverted a megaraid_sas change in 5.10 due to missing block layer plumbing. Now that this is in place, reinstate the change. - The hisi_sas driver had a dependency on a driver core irq change that went in through Thomas' tree. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-22scsi: hisi_sas: Expose HW queues for v2 hwJohn Garry3-13/+68
As a performance enhancement, make the completion queue interrupts managed. In addition, in commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline"), CPU hotplug for MQ devices using managed interrupts is made safe. So expose HW queues to blk-mq to take advantage of this. Flag Scsi_host.host_tagset is also set to ensure that the HBA is not sent more commands than it can handle. However the driver still does not use request tag for IPTT as there are many HW bugs means that special rules apply for IPTT allocation. Link: https://lore.kernel.org/r/1606905417-183214-6-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-17Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds3-1445/+1232
Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major power management rework and a load of assorted minor updates. There are a few core updates (formatting fixes being the big one) but nothing major this cycle" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: mpt3sas: Update driver version to 36.100.00.00 scsi: mpt3sas: Handle trigger page after firmware update scsi: mpt3sas: Add persistent MPI trigger page scsi: mpt3sas: Add persistent SCSI sense trigger page scsi: mpt3sas: Add persistent Event trigger page scsi: mpt3sas: Add persistent Master trigger page scsi: mpt3sas: Add persistent trigger pages support scsi: mpt3sas: Sync time periodically between driver and firmware scsi: qla2xxx: Update version to 10.02.00.104-k scsi: qla2xxx: Fix device loss on 4G and older HBAs scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry scsi: qla2xxx: Fix the call trace for flush workqueue scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines scsi: qla2xxx: Handle aborts correctly for port undergoing deletion scsi: qla2xxx: Fix N2N and NVMe connect retry failure scsi: qla2xxx: Fix FW initialization error on big endian machines scsi: qla2xxx: Fix crash during driver load on big endian machines scsi: qla2xxx: Fix compilation issue in PPC systems scsi: qla2xxx: Don't check for fw_started while posting NVMe command scsi: qla2xxx: Tear down session if FW say it is down ...
2020-12-08scsi: hisi_sas: Select a suitable queue for internal I/OsXiang Chen2-0/+11
For when managed interrupts are used (and shost->nr_hw_queues is set), a fixed queue - set per-device - is still used for internal I/Os. If all the CPUs mapped to that queue are offlined, then the completions for that queue are not serviced and any internal I/Os will time out. Fix by selecting a queue for internal I/Os from the queue mapped from the current CPU in this scenario. This is still not ideal as it does not deal with CPU hotplug for inflight internal I/Os, and needs proper support from [0]. [0] https://lore.kernel.org/linux-scsi/20200703130122.111448-1-hare@suse.de/T/#m7d77d049b18f33a24ef206af69ebb66d07440556 Link: https://lore.kernel.org/r/1607347855-59091-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-01scsi: hisi_sas: Remove preemptible()Ahmed S. Darwish1-7/+1
hisi_sas_task_exec() uses preemptible() to see if it's safe to block. This does not work for CONFIG_PREEMPT_COUNT=n kernels in which preemptible() always returns 0. The problem is masked when enabling some of the common Kconfig.debug options (like CONFIG_DEBUG_ATOMIC_SLEEP), as they implicitly enable the preemption counter. In general, driver leaf functions should not make logic decisions based on the context they're called from. The caller should be the entity responsible for explicitly indicating context. Since hisi_sas_task_exec() already has a gfp_t flags parameter, use it as the explicit context marker. Link: https://lore.kernel.org/r/20201126132952.2287996-3-bigeasy@linutronix.de Fixes: 214e702d4b70 ("scsi: hisi_sas: Adjust task reject period during host reset") Fixes: 550c0d89d52d ("scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()") Cc: Xiaofei Tan <tanxiaofei@huawei.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Cc: John Garry <john.garry@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-01scsi: hisi_sas: Move debugfs code to v3 hw driverLuo Jiaxing3-1387/+1212
Relocate all the debugfs code for DFX to v3 hw since no other versions support it. Link: https://lore.kernel.org/r/1606207594-196362-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-01scsi: hisi_sas: Fix up probe error handling for v3 hwXiang Chen1-15/+11
Fix some rollbacks in function hisi_sas_v3_probe() and interrupt_init_v3_hw(). Link: https://lore.kernel.org/r/1606207594-196362-3-git-send-email-john.garry@huawei.com Fixes: 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-01scsi: hisi_sas: Reduce some indirection in v3 hw driverJohn Garry1-3/+2
Sometimes local functions are called indirectly from the hw driver, which only makes the code harder to follow. Remove these. Method .hw_init is only called from platform driver probe, which is not relevant, so don't set this either. Link: https://lore.kernel.org/r/1606207594-196362-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-26scsi: hisi_sas_v3_hw: Remove extra function calls for runtime pmVaibhav Gupta1-17/+6
Both runtime_suspend_v3_hw() and runtime_resume_v3_hw() do nothing else but invoke suspend_v3_hw() and resume_v3_hw() respectively. This is the case of unnecessary function calls. To use those functions for runtime pm as well, simply use UNIVERSAL_DEV_PM_OPS. make -j$(nproc) W=1, with CONFIG_PM disabled, throws '-Wunused-function' warning for runtime_suspend_v3_hw() and runtime_resume_v3_hw(). After dropping those function definitions, the warning was thrown for suspend_v3_hw() and resume_v3_hw(). Hence, mark them as '__maybe_unused'. Link: https://lore.kernel.org/r/20201102164730.324035-15-vaibhavgupta40@gmail.com Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-26scsi: hisi_sas_v3_hw: Don't use PCI helper functionsVaibhav Gupta1-16/+1
Drivers using new-framework/generic-framework should not handle standard power management operations. These operations were performed by legacy framework through PCI helper functions like pci_save/restore_state(), pci_set_power_state(), etc. Drivers should not use them now. Link: https://lore.kernel.org/r/20201102164730.324035-14-vaibhavgupta40@gmail.com Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-26scsi: hisi_sas_v3_hw: Drop PCI Wakeup calls from .resumeVaibhav Gupta1-1/+0
The driver calls pci_enable_wake(...., false) in hisi_sas_v3_resume(), and there is no corresponding pci_enable_wake(...., true) in hisi_sas_v3_suspend(). Either it should do enable-wake the device in .suspend() or should not invoke pci_enable_wake() at all. Concluding that this driver doesn't support enable-wake and PCI core calls pci_enable_wake(pci_dev, PCI_D0, false) during resume, drop it from hisi_sas_v3_resume(). Link: https://lore.kernel.org/r/20201102164730.324035-13-vaibhavgupta40@gmail.com Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-27scsi: hisi_sas: Stop using queue #0 always for v2 hwJohn Garry1-1/+1
In commit 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ"), the dispatch function was changed to choose the delivery queue based on the request tag HW queue index. This heavily degrades performance for v2 hw, since the HW queues are not exposed there, and, as such, HW queue #0 is used for every command. Revert to previous behaviour for when nr_hw_queues is not set, that being to choose the HW queue based on target device index. Link: https://lore.kernel.org/r/1602750425-240341-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a55eb ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-15Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds6-104/+336
Pull SCSI updates from James Bottomley: "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes. There are only three core changes: adding sense codes, cleaning up noretry and adding an option for limitless retries" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits) scsi: hisi_sas: Recover PHY state according to the status before reset scsi: hisi_sas: Filter out new PHY up events during suspend scsi: hisi_sas: Add device link between SCSI devices and hisi_hba scsi: hisi_sas: Add check for methods _PS0 and _PR0 scsi: hisi_sas: Add controller runtime PM support for v3 hw scsi: hisi_sas: Switch to new framework to support suspend and resume scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() scsi: qedf: Remove redundant assignment to variable 'rc' scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store() scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed scsi: sun_esp: Use module_platform_driver to simplify the code scsi: sun3x_esp: Use module_platform_driver to simplify the code scsi: sni_53c710: Use module_platform_driver to simplify the code scsi: qlogicpti: Use module_platform_driver to simplify the code scsi: mac_esp: Use module_platform_driver to simplify the code scsi: jazz_esp: Use module_platform_driver to simplify the code scsi: mvumi: Fix error return in mvumi_io_attach() scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req() scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs() ...
2020-10-07scsi: hisi_sas: Recover PHY state according to the status before resetXiang Chen1-3/+1
Currently the PHY state is set according to the state of the PHYs after reset. This is invalid as the PHYs are already re-initialized. Set PHY state according to the state before the reset instead of after. Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Filter out new PHY up events during suspendXiang Chen1-0/+6
Currently sas_resume_ha() is called while resuming the controller to wait for all suspended PHYs to come up and all the libsas events to be completed. There is a scenario which will cause task hung: For direct attach with two disks connected with two PHYs, disable phy0 before suspending the disk on phy1 and the controller, then enable phy0 and resume the controller, and task hung occurs as follows: [ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0] [ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined [ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume [ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata) [ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200) [ 593.148350] sas: DOING DISCOVERY on port 0, pid:948 [ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found [ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0 [ 593.165663] sas: ata7: end_device-2:0: dev error handler [ 593.165730] sas: ata2: end_device-2:1: dev error handler [ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2 [ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down [ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133 [ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32) [ 593.514295] ata7.00: configured for UDMA/133 [ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 [ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225 [ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0 [ 593.543674] device_link_add 324 [ 593.546801] device_link_add 352 [ 593.549930] device_link_add 406 [ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0 [ 593.559208] device_link_add 444 [ 593.562335] device_link_add 455 [ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5 [ 620.057464] phy-2:1: resume timeout [ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds. [ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.862155] kworker/u256:0 D 0 8 2 0x00000028 [ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker [ 738.873693] Call trace: [ 738.876133] __switch_to+0xf4/0x148 [ 738.879613] __schedule+0x270/0x5d8 [ 738.883091] schedule+0x78/0x110 [ 738.886307] schedule_timeout+0x1ac/0x280 [ 738.890299] wait_for_completion+0x94/0x138 [ 738.894472] flush_workqueue+0x114/0x438 [ 738.898377] sas_porte_bytes_dmaed+0x400/0x500 [ 738.902801] sas_port_event_worker+0x28/0x40 [ 738.907053] process_one_work+0x1e8/0x360 [ 738.911046] worker_thread+0x44/0x478 [ 738.914698] kthread+0x150/0x158 [ 738.917915] ret_from_fork+0x10/0x1c [ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds. [ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.942408] kworker/u256:1 D 0 948 2 0x00000028 [ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain [ 738.953766] Call trace: [ 738.956203] __switch_to+0xf4/0x148 [ 738.959678] __schedule+0x270/0x5d8 [ 738.963152] schedule+0x78/0x110 [ 738.966368] rpm_resume+0xcc/0x550 [ 738.969757] __pm_runtime_resume+0x3c/0x88 [ 738.973836] rpm_get_suppliers+0x50/0x148 [ 738.977829] __pm_runtime_set_status+0x124/0x2f0 [ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8 [ 738.986679] scsi_probe_and_add_lun+0x888/0xab0 [ 738.991190] __scsi_scan_target+0xec/0x520 [ 738.995268] scsi_scan_target+0x11c/0x128 [ 738.999261] sas_rphy_add+0x15c/0x1e8 [ 739.002907] sas_probe_devices+0xe4/0x150 [ 739.006899] sas_discover_domain+0x33c/0x588 [ 739.011150] process_one_work+0x1e8/0x360 [ 739.015143] worker_thread+0x44/0x478 [ 739.018789] kthread+0x150/0x158 [ 739.022003] ret_from_fork+0x10/0x1c ... If an extra phy0 up happens during resume of the SAS controller, it will emit a new libsas event (event PORTE_BYTES_DMAED and event DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to resume supplier (host controller). For runtime PM core, if device is in the resuming state, the later resume request of the device will wait for previous resume request to complete synchronously. At that point in time the state of the controller is still resuming as it waits for all libsas events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked as the state of the controller is resuming which causes a deadlock. To avoid the issue, filter out new PHY up events while the controller is suspended. Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Add device link between SCSI devices and hisi_hbaXiang Chen1-1/+28
Runtime PM of SCSI devices is already supported in SCSI layer, we can suspend/resume every SCSI device separately. But if there is no link between hisi_hba and SCSI devices or SCSI targets it will cause issues if the controller is suspended while SCSI devices are still resuming. Only when all the SCSI devices under the controller are suspended, the controller can be suspended. Add the device link between SCSI devices and the controller. Link: https://lore.kernel.org/r/1601649038-25534-6-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Add check for methods _PS0 and _PR0Xiang Chen2-0/+5
To support system suspend/resume or runtime suspend/resume, need to use the function pci_set_power_state() to change the power state which requires at least method _PS0 or _PR0 be filled by platform for v3 hw. So check whether the method is supported, if not, print a warning. A Kconfig dependency is added as there is no stub for acpi_device_power_manageable(). Link: https://lore.kernel.org/r/1601649038-25534-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Add controller runtime PM support for v3 hwXiang Chen2-2/+56
Add controller runtime PM support for v3 hw. Link: https://lore.kernel.org/r/1601649038-25534-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Switch to new framework to support suspend and resumeXiang Chen1-5/+10
For v3 hw we will add support for runtime PM which is only supported in new framework. Legacy PM support and new framework are not allowed to be used together. Switch to new framework to support suspend and resume. Link: https://lore.kernel.org/r/1601649038-25534-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-07scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()Luo Jiaxing1-2/+3
A call trace is observed when running function level reset with online CPUs less than 16 and MSI auto-affinity enabled. [16538.348038] Call trace: [16538.348422] pci_irq_vector+0x98/0xc0 [16538.348947] disable_host_v3_hw+0x8c/0x288 [hisi_sas_v3_hw] [16538.349706] hisi_sas_reset_prepare_v3_hw+0x60/0x88 [hisi_sas_v3_hw] [16538.350631] pci_dev_save_and_disable+0x38/0x68 [16538.351290] pci_reset_function+0x44/0x88 [16538.351846] reset_store+0x6c/0xb8 [16538.352429] dev_attr_store+0x44/0x60 [16538.353035] sysfs_kf_write+0x58/0x80 [16538.353558] kernfs_fop_write+0x140/0x230 [16538.354175] __vfs_write+0x48/0x80 [16538.354675] vfs_write+0xb8/0x1d8 [16538.355145] ksys_write+0x74/0xf8 [16538.355615] __arm64_sys_write+0x24/0x30 [16538.356240] el0_svc_common.constprop.4+0x80/0x1f0 [16538.356905] do_el0_svc+0x2c/0x38 [16538.357408] el0_svc+0x14/0x40 [16538.357848] el0_sync_handler+0xbc/0x2ec [16538.358388] el0_sync+0x140/0x180 The reason is that if we use pci_alloc_irq_vectors_affinity() to allocate IRQs, the number of CQ IRQs can only be less than or equal to the number of online CPUs, but we use hisi_hba->queue_count (always 16) to iterate during interrupt_disable_v3_hw(). Use hisi_hba->cq_nvecs to replace hisi_hba->queue_count to avoid synchronize IRQ on a CPU which does not exist. Link: https://lore.kernel.org/r/1601649038-25534-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06scsi: hisi_sas: Switch v3 hw to MQJohn Garry3-70/+56
Now that the block layer provides a shared tag, we can switch the driver to expose all HW queues. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03scsi: hisi_sas: Code style cleanupLuo Jiaxing3-5/+2
Remove extra blank lines and add spaces around operators. Link: https://lore.kernel.org/r/1598958790-232272-9-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Add missing newlinesXiang Chen4-15/+15
Newline is missing from some printk() statements. Add them. Link: https://lore.kernel.org/r/1598958790-232272-8-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Add BIST support for fixed code patternLuo Jiaxing3-31/+59
Through the new debugfs interface the user can select fixed code patterns. Add two new interfaces fixed_code and fixed_code1. Link: https://lore.kernel.org/r/1598958790-232272-7-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Add BIST support for phy FFELuo Jiaxing3-3/+111
Add BIST support for phy FFE (Feed forward equalizer) setting. The user can configure FFE through the new debugfs interface. FFE is a parameter used for link layer control. It will affect the link quality between the SAS controller and the backplane. In the BIST test, the FFE interface is provided to assist board testers in optimizing link parameters. The modification of the FFE parameter will affect the test after BIST or the normal running of the board. The user should save the initial FFE values and restore them after BIST test is complete. Link: https://lore.kernel.org/r/1598958790-232272-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Make phy index variable name consistentLuo Jiaxing1-25/+25
We use "phy_id" to identify phy in the BIST code but the rest of code always uses "phy_no". Change it for consistency. Link: https://lore.kernel.org/r/1598958790-232272-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Do not modify upper fields of PROG_PHY_LINK_RATE regLuo Jiaxing1-11/+14
When updating PROG_PHY_LINK_RATE to set linkrate for a phy we used a hard-coded initial value instead of getting the current value from the register. The assumption was that this register would not be modified, but in fact it was partially modified in a new version of hardware. The hard-coded value we used changed the default value of the register to a an incorrect setting and as a result the SAS controller could not change linkrate for the phy. Delete hard-coded value and always read the latest value of register before updating it. Link: https://lore.kernel.org/r/1598958790-232272-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Modify macro name for OOB phy linkrateLuo Jiaxing1-8/+7
The macro for OOB phy linkrate is named CFG_PROG_PHY_LINK_RATE_* but that is inaccurate. For clarification, include OOB in macro name. Link: https://lore.kernel.org/r/1598958790-232272-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-03scsi: hisi_sas: Avoid accessing to SSP task for SMP I/OsXiang Chen1-4/+5
hisi_sas_slot_task_free() attempts to dereference SSP task for non-ATA tasks. If the task is SMP, the code may reference the wrong structure although this may not cause any problems. To avoid this, only access to SSP task when slot->n_elem_dif is not 0 which indicates this is an SSP task. Link: https://lore.kernel.org/r/1598958790-232272-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-24treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva1-1/+1
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-07Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds3-4/+10
Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes. We also have a huge docbook fix update like most other subsystems and no major update to the core (the few non trivial updates are either minor fixes or removing an unused feature [scsi_sdb_cache])" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits) scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM" scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged scsi: scsi_debug: Implement tur_ms_to_ready parameter scsi: scsi_debug: Fix request sense scsi: lpfc: Fix typo in comment for ULP scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC scsi: iscsi: Do not put host in iscsi_set_flashnode_param() scsi: hpsa: Correct ctrl queue depth scsi: target: tcmu: Make TMR notification optional scsi: target: tcmu: Implement tmr_notify callback scsi: target: tcmu: Fix and simplify timeout handling scsi: target: tcmu: Factor out new helper ring_insert_padding scsi: target: tcmu: Do not queue aborted commands scsi: target: tcmu: Use priv pointer in se_cmd scsi: target: Add tmr_notify backend function scsi: target: Modify core_tmr_abort_task() scsi: target: iscsi: Fix inconsistent debug message scsi: target: iscsi: Fix login error when receiving ...
2020-07-14scsi: hisi_sas: Remove one kerneldoc commentJohn Garry1-1/+1
The comment for interrupt_init_v2_hw() should not be a kerneldoc comment. Remove it. Link: https://lore.kernel.org/r/1594627471-235395-3-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>