path: root/drivers/ata
Age | Commit message | Author | Files | Lines
2024-06-16 | ata: pata_legacy: make legacy_exit() work again | Sergey Shtylyov | 1 | -4/+4
commit d4a89339f17c87c4990070e9116462d16e75894f upstream. Commit defc9cd826e4 ("pata_legacy: resychronize with upstream changes and resubmit") failed to update legacy_exit(), so that it now fails to do any cleanup -- the loop body there can never be entered. Fix that and finally remove the now useless nr_legacy_host variable... Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: defc9cd826e4 ("pata_legacy: resychronize with upstream changes and resubmit") Cc: stable@vger.kernel.org Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17 | ata: sata_gemini: Check clk_enable() result | Chen Ni | 1 | -1/+4
[ Upstream commit e85006ae7430aef780cc4f0849692e266a102ec0 ] The call to clk_enable() in gemini_sata_start_bridge() can fail. Add a check to detect such failure. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
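The pattern this entry describes, checking the return value of a clock-enable call and propagating the failure, can be sketched in user space as follows. This is only an illustration: `struct demo_clk`, `demo_clk_enable()` and `demo_start_bridge()` are hypothetical stand-ins, not the kernel's `struct clk`, `clk_enable()` or `gemini_sata_start_bridge()`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a clock handle; not the kernel's struct clk. */
struct demo_clk {
	bool fail_enable;   /* force the stub below to fail */
	int enable_count;   /* number of successful enables */
};

/* Stub using the same return convention as clk_enable(): 0 or -errno. */
static int demo_clk_enable(struct demo_clk *clk)
{
	if (clk->fail_enable)
		return -EIO;
	clk->enable_count++;
	return 0;
}

/*
 * Bring a bridge online. The clock-enable result is checked and returned
 * to the caller instead of being silently ignored.
 */
static int demo_start_bridge(struct demo_clk *pclk, bool *bridge_online)
{
	int ret = demo_clk_enable(pclk);

	if (ret)
		return ret;         /* leave the bridge untouched on failure */

	*bridge_online = true;
	return 0;
}
```

With the check in place, a failing clock leaves `*bridge_online` unchanged and the caller sees the negative error code.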
2024-04-17 | ata: libata-scsi: Fix ata_scsi_dev_rescan() error path | Damien Le Moal | 1 | -4/+5
commit 79336504781e7fee5ddaf046dcc186c8dfdf60b1 upstream. Commit 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume") incorrectly handles failures of scsi_resume_device() in ata_scsi_dev_rescan(), leading to a double call to spin_unlock_irqrestore() to unlock a device port. Fix this by redefining the goto labels used in case of errors and only unlock the port scsi_scan_mutex when scsi_resume_device() fails. Bug found with the Smatch static checker warning: drivers/ata/libata-scsi.c:4774 ata_scsi_dev_rescan() error: double unlocked 'ap->lock' (orig line 4757) Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
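The idea of using two goto labels, one for exits where the port lock is held and one for exits where only the scan mutex is held, can be sketched with counters standing in for the real locks. This is a simplified model under stated assumptions, not the code of ata_scsi_dev_rescan(); `struct demo_port`, `demo_rescan()` and the two failure flags are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a port: counters stand in for real locks. */
struct demo_port {
	int spin_depth;     /* 1 while the "port spinlock" is held */
	int mutex_depth;    /* 1 while the "scan mutex" is held */
	int bad_unlocks;    /* unlock attempts made while not locked */
};

static void demo_spin_lock(struct demo_port *p)
{
	p->spin_depth++;
}

static void demo_spin_unlock(struct demo_port *p)
{
	if (p->spin_depth == 0)
		p->bad_unlocks++;   /* this would be the double unlock */
	else
		p->spin_depth--;
}

static void demo_rescan(struct demo_port *p, bool resume_fails, bool rescan_fails)
{
	p->mutex_depth++;
	demo_spin_lock(p);

	/* The spinlock is dropped around calls that may sleep. */
	demo_spin_unlock(p);
	if (resume_fails)
		goto unlock_scan;   /* spinlock is NOT held here */

	demo_spin_lock(p);
	if (rescan_fails)
		goto unlock_port;   /* spinlock IS held here */

unlock_port:
	demo_spin_unlock(p);
unlock_scan:
	p->mutex_depth--;
}
```

Jumping to `unlock_port` from the point where the spinlock has already been released would record a bad unlock in this model; the separate `unlock_scan` label avoids that.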
2024-04-10 | ata: sata_mv: Fix PCI device ID table declaration compilation warning | Arnd Bergmann | 1 | -32/+31
[ Upstream commit 3137b83a90646917c90951d66489db466b4ae106 ] Building with W=1 shows a warning for an unused variable when CONFIG_PCI is disabled: drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable] static const struct pci_device_id mv_pci_tbl[] = { Move the table into the same block that contains the pci_driver definition. Fixes: 7bb3c5290ca0 ("sata_mv: Remove PCI dependency") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10 | ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit | Arnd Bergmann | 1 | -4/+2
[ Upstream commit 52f80bb181a9a1530ade30bc18991900bbb9697f ] gcc warns about a memcpy() with overlapping pointers because of an incorrect size calculation: In file included from include/linux/string.h:369, from drivers/ata/sata_sx4.c:66: In function 'memcpy_fromio', inlined from 'pdc20621_get_from_dimm.constprop' at drivers/ata/sata_sx4.c:962:2: include/linux/fortify-string.h:97:33: error: '__builtin_memcpy' accessing 4294934464 bytes at offsets 0 and [16, 16400] overlaps 6442385281 bytes at offset -2147450817 [-Werror=restrict] 97 | #define __underlying_memcpy __builtin_memcpy | ^ include/linux/fortify-string.h:620:9: note: in expansion of macro '__underlying_memcpy' 620 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ include/linux/fortify-string.h:665:26: note: in expansion of macro '__fortify_memcpy_chk' 665 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ include/asm-generic/io.h:1184:9: note: in expansion of macro 'memcpy' 1184 | memcpy(buffer, __io_virt(addr), size); | ^~~~~~ The problem here is the overflow of an unsigned 32-bit number to a negative that gets converted into a signed 'long', keeping a large positive number. Replace the complex calculation with a more readable min() variant that avoids the warning. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
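The pitfall described above can be reproduced in user space. The sketch below is an assumption-laden model, not the driver code: `int64_t` stands in for a 64-bit `long`, and `demo_dist_cast()` and `demo_dist_min()` are hypothetical names. The 32-bit subtraction wraps to a large unsigned value first; widening it to a 64-bit signed type then keeps it positive, so a ">= 0" test no longer detects the overrun.

```c
#include <stdint.h>

/*
 * Fragile variant: the unsigned 32-bit subtraction wraps before the cast,
 * so with a 64-bit signed type the "room >= 0" test is always true.
 */
static uint32_t demo_dist_cast(uint32_t window, uint32_t offset, uint32_t size)
{
	int64_t room = (int64_t)(window - (offset + size));

	return room >= 0 ? size : window - offset;
}

/*
 * min()-style variant: clamp the copy size to what is left in the window.
 * Assumes the caller guarantees offset <= window.
 */
static uint32_t demo_dist_min(uint32_t window, uint32_t offset, uint32_t size)
{
	uint32_t room = window - offset;

	return size < room ? size : room;
}
```

For a 32768-byte window, an offset of 32000 and a request of 1024 bytes, only 768 bytes remain in the window; the cast-based variant still returns 1024, while the clamped variant returns 768.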
2024-04-03 | scsi: sd: Fix TCG OPAL unlock on system resume | Damien Le Moal | 2 | -1/+13
commit 0c76106cb97548810214def8ee22700bbbb90543 upstream. Commit 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") introduced the manage_system_start_stop scsi_device flag to allow libata to indicate to the SCSI disk driver that nothing should be done when resuming a disk on system resume. This change turned the execution of sd_resume() into a no-op for ATA devices on system resume. While this solved deadlock issues during device resume, this change also wrongly removed the execution of opal_unlock_from_suspend(). As a result, devices with TCG OPAL locking enabled remain locked and inaccessible after a system resume from sleep. To fix this issue, introduce the SCSI driver resume method and implement it with the sd_resume() function calling opal_unlock_from_suspend(). The former sd_resume() function is renamed to sd_resume_common() and modified to call the new sd_resume() function. For non-ATA devices, this results in no functional changes. In order for libata to explicitly execute sd_resume() when a device is resumed during system restart, the function scsi_resume_device() is introduced. libata calls this function from the revalidation work executed on device resume, a state that is indicated with the new device flag ATA_DFLAG_RESUMING. Doing so, locked TCG OPAL enabled devices are unlocked on resume, allowing normal operation. Fixes: 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218538 Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240319071209.1179257-1-dlemoal@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-03 | ahci: asm1064: asm1166: don't limit reported ports | Conrad Kostecki | 1 | -13/+0
[ Upstream commit 6cd8adc3e18960f6e59d797285ed34ef473cc896 ] Previously, patches have been added to limit the reported count of SATA ports for asm1064 and asm1166 SATA controllers, as those controllers report more ports than they physically have. While it is allowed to report more ports than physically present in CAP.NP, it is not allowed to report more ports than physically present in the PI (Ports Implemented) register, which is what these HBAs do. (This is an AHCI spec violation.) Unfortunately, it seems that the PMP implementation in these ASMedia HBAs also violates the AHCI and SATA-IO PMP specification. These HBAs do not report that they support PMP (CAP.SPM (Supports Port Multiplier) is not set). Instead, they add extra "virtual" ports in the PI register that are used if a port multiplier is connected to any of the physical ports of the HBA. Enumerating the devices behind the PMP as specified in the AHCI and SATA-IO specifications, by using PMP READ and PMP WRITE commands to the physical ports of the HBA, is not possible; you have to use the "virtual" ports. This is of course bad, because this gives us no way to detect the device and vendor ID of the PMP actually connected to the HBA, which means that we can not apply the proper PMP quirks for the PMP that is connected to the HBA. Limiting the port map will thus stop these controllers from working with SATA Port Multipliers. This patch reverts both patches for asm1064 and asm1166, so old behavior is restored and SATA PMP will work again, but it will also reintroduce the (minutes long) extra boot time for the ASMedia controllers that do not have a PMP connected (either on the PCIe card itself, or an external PMP). However, a longer boot time for some is the lesser evil compared to some other users not being able to detect their drives at all. 
Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports") Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports") Cc: stable@vger.kernel.org Reported-by: Matt <cryptearth@googlemail.com> Signed-off-by: Conrad Kostecki <conikost@gentoo.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> [cassel: rewrote commit message] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03 | ahci: asm1064: correct count of reported ports | Andrey Jr. Melnikov | 1 | -3/+11
[ Upstream commit 9815e39617541ef52d0dfac4be274ad378c6dc09 ] The ASM1064 SATA host controller always wrongly reports that it has 24 ports, but in reality it only has four ports. before: ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xffff0f impl SATA mode ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst after: ahci 0000:04:00.0: ASM1064 has only four ports ahci 0000:04:00.0: forcing port_map 0xffff0f -> 0xf ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xf impl SATA mode ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst Signed-off-by: "Andrey Jr. Melnikov" <temnota.am@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org> Stable-dep-of: 6cd8adc3e189 ("ahci: asm1064: asm1166: don't limit reported ports") Signed-off-by: Sasha Levin <sashal@kernel.org>
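The "forcing port_map 0xffff0f -> 0xf" step in the log above amounts to masking the reported Ports Implemented bitmap down to the ports that physically exist. A minimal sketch, assuming hypothetical helper names (`demo_limit_port_map()`, `demo_count_ports()`) that are not part of the ahci driver:

```c
#include <stdint.h>

/* Keep only the lowest real_ports bits of a reported port bitmap. */
static uint32_t demo_limit_port_map(uint32_t reported, unsigned int real_ports)
{
	uint32_t mask = real_ports >= 32 ? UINT32_MAX
					 : ((UINT32_C(1) << real_ports) - 1);

	return reported & mask;
}

/* Count the ports marked as implemented in a port bitmap. */
static unsigned int demo_count_ports(uint32_t port_map)
{
	unsigned int n = 0;

	for (; port_map; port_map &= port_map - 1)   /* clear lowest set bit */
		n++;
	return n;
}
```

Applied to the values in this log, a reported map of 0xffff0f limited to four ports gives 0xf. As the 2024-04-03 "don't limit reported ports" entry explains, this limiting was later reverted because the extra bits are used as "virtual" ports for port multipliers.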
2024-03-01 | ahci: Extend ASM1061 43-bit DMA address quirk to other ASM106x parts | Lennert Buytenhek | 1 | -5/+5
commit 51af8f255bdaca6d501afc0d085b808f67b44d91 upstream. ASMedia have confirmed that all ASM106x parts currently listed in ahci_pci_tbl[] suffer from the 43-bit DMA address limitation that we ran into on the ASM1061, and therefore, we need to apply the quirk added by commit 20730e9b2778 ("ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers") to the other supported ASM106x parts as well. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-ide/ZbopwKZJAKQRA4Xv@x1-carbon/ Signed-off-by: Lennert Buytenhek <kernel@wantstofly.org> [cassel: add link to ASMedia confirmation email] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-01 | ata: ahci: add identifiers for ASM2116 series adapters | Szuying Chen | 1 | -0/+5
commit 3bf6141060948e27b62b13beb216887f2e54591e upstream. Add support for PCIe SATA adapter cards based on ASMedia 2116 controllers. These cards can provide up to 10 SATA ports on a single PCIe card. Signed-off-by: Szuying Chen <Chloe_Chen@asmedia.com.tw> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-01 | ata: ahci_ceva: fix error handling for Xilinx GT PHY support | Radhey Shyam Pandey | 1 | -46/+79
[ Upstream commit 26c8404e162b43dddcb037ba2d0cb58c0ed60aab ] Platform clock and phy resources are not cleaned up in the Xilinx GT PHY error path. To fix this, introduce the function ceva_ahci_platform_enable_resources(), which is a customized version of ahci_platform_enable_resources(). In line with the SATA IP programming sequence, it does the following: - Assert SATA reset - Program PS GTR phy - Bring SATA up by de-asserting the reset - Wait for GT lane PLL to be locked ceva_ahci_platform_enable_resources() is also used in the resume path as the same SATA programming sequence (as in probe) should be followed. Also clean up the mixed usage of ahci_platform_enable_resources() and custom implementation in the probe function as both are not required. Fixes: 9a9d3abe24bb ("ata: ahci: ceva: Update the driver to support xilinx GT phy") Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01 | ata: libata-core: Do not try to set sleeping devices to standby | Damien Le Moal | 1 | -0/+4
commit 4b085736e44dbbe69b5eea1a8a294f404678a1f4 upstream. In ata_dev_power_set_standby(), check that the target device is not sleeping. If it is, there is no need to do anything. Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-01 | ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers | Lennert Buytenhek | 2 | -6/+24
[ Upstream commit 20730e9b277873deeb6637339edcba64468f3da3 ] With one of the on-board ASM1061 AHCI controllers (1b21:0612) on an ASUSTeK Pro WS WRX80E-SAGE SE WIFI mainboard, a controller hang was observed that was immediately preceded by the following kernel messages: ahci 0000:28:00.0: Using 64-bit DMA addresses ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00000 flags=0x0000] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00300 flags=0x0000] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00380 flags=0x0000] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00400 flags=0x0000] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00680 flags=0x0000] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00700 flags=0x0000] The first message is produced by code in drivers/iommu/dma-iommu.c which is accompanied by the following comment that seems to apply: /* * Try to use all the 32-bit PCI addresses first. The original SAC vs. * DAC reasoning loses relevance with PCIe, but enough hardware and * firmware bugs are still lurking out there that it's safest not to * venture into the 64-bit space until necessary. * * If your device goes wrong after seeing the notice then likely either * its driver is not setting DMA masks accurately, the hardware has * some inherent bug in handling >32-bit addresses, or not all the * expected address bits are wired up between the device and the IOMMU. 
*/ Asking the ASM1061 on a discrete PCIe card to DMA from I/O virtual address 0xffffffff00000000 produces the following I/O page faults: vfio-pci 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0x7ff00000000 flags=0x0010] vfio-pci 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0x7ff00000500 flags=0x0010] Note that the upper 21 bits of the logged DMA address are zero. (When asking a different PCIe device in the same PCIe slot to DMA to the same I/O virtual address, we do see all the upper 32 bits of the DMA address as 1, so this is not an issue with the chipset or IOMMU configuration on the test system.) Also, hacking libahci to always set the upper 21 bits of all DMA addresses to 1 produces no discernible effect on the behavior of the ASM1061, and mkfs/mount/scrub/etc work as without this hack. This all strongly suggests that the ASM1061 has a 43 bit DMA address limit, and this commit therefore adds a quirk to deal with this limit. This issue probably applies to (some of) the other supported ASMedia parts as well, but we limit it to the PCI IDs known to refer to ASM1061 parts, as that's the only part we know for sure to be affected by this issue at this point. Link: https://lore.kernel.org/linux-ide/ZaZ2PIpEId-rl6jv@wantstofly.org/ Signed-off-by: Lennert Buytenhek <kernel@wantstofly.org> [cassel: drop date from error messages in commit log] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
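The address arithmetic behind the 43-bit conclusion can be checked with a small user-space sketch. `demo_dma_bit_mask()` below mirrors the idea of the kernel's DMA_BIT_MASK() macro but is re-implemented here for illustration, and `demo_truncate_addr()` is a hypothetical helper that models a device dropping every address bit above its limit.

```c
#include <stdint.h>

/* Mask with the lowest `bits` bits set, in the spirit of DMA_BIT_MASK(). */
static uint64_t demo_dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Model a device that ignores all address bits above its DMA limit. */
static uint64_t demo_truncate_addr(uint64_t iova, unsigned int bits)
{
	return iova & demo_dma_bit_mask(bits);
}
```

Truncating the I/O virtual address 0xffffffff00000000 to 43 bits yields 0x7ff00000000, which is the first faulting address logged by the IOMMU above: the upper 21 of the 64 address bits are zero.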
2024-03-01 | ahci: asm1166: correct count of reported ports | Conrad Kostecki | 1 | -0/+5
[ Upstream commit 0077a504e1a4468669fd2e011108db49133db56e ] The ASM1166 SATA host controller always wrongly reports that it has 32 ports, but in reality it only has six ports. This seems to be a hardware issue, as all tested ASM1166 SATA host controllers report such a high count of ports. Example output: ahci 0000:09:00.0: AHCI 0001.0301 32 slots 32 ports 6 Gbps 0xffffff3f impl SATA mode. By adjusting the port_map, the count is limited to six ports. New output: ahci 0000:09:00.0: AHCI 0001.0301 32 slots 32 ports 6 Gbps 0x3f impl SATA mode. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=211873 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218346 Signed-off-by: Conrad Kostecki <conikost@gentoo.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08 | scsi: sd: Fix system start for ATA devices | Damien Le Moal | 1 | -0/+5
commit b09d7f8fd50f6e93cbadd8d27fde178f745b42a1 upstream. It is not always possible to keep a device in the runtime suspended state when a system level suspend/resume cycle is executed. E.g. for ATA devices connected to AHCI adapters, system resume resets the ATA ports, which causes connected devices to spin up. In such case, a runtime suspended disk will incorrectly be seen with a suspended runtime state because the device is not resumed by sd_resume_system(). The power state seen by the user is different than the actual device physical power state. Fix this issue by introducing the struct scsi_device flag force_runtime_start_on_system_start. When set, this flag causes sd_resume_system() to request a runtime resume operation for runtime suspended devices. This results in the user seeing the device runtime_state as active after a system resume, thus correctly reflecting the device physical power state. Fixes: 9131bff6a9f1 ("scsi: core: pm: Only runtime resume if necessary") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20231120225631.37938-3-dlemoal@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08 | scsi: Change SCSI device boolean fields to single bit flags | Damien Le Moal | 1 | -2/+2
commit 6371be7aeb986905bb60ec73d002fc02343393b4 upstream. Commit 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") changed the single bit manage_start_stop flag into 2 boolean fields of the SCSI device structure. Commit 24eca2dce0f8 ("scsi: sd: Introduce manage_shutdown device flag") introduced the manage_shutdown boolean field for the same structure. Together, these 2 commits increase the size of struct scsi_device by 8 bytes by using booleans instead of defining the manage_xxx fields as single bit flags, similarly to other flags of this structure. Avoid this unnecessary structure size increase and be consistent with the definition of other flags by reverting the definitions of the manage_xxx fields as single bit flags. Fixes: 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") Fixes: 24eca2dce0f8 ("scsi: sd: Introduce manage_shutdown device flag") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20231120225631.37938-2-dlemoal@kernel.org Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
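Why boolean fields can grow a structure that already packs its flags into bits can be shown with two toy layouts. These are hypothetical structures, not struct scsi_device: the `existing` field stands in for flags already defined, and the exact sizes depend on the compiler and ABI (on common 64-bit GCC and Clang targets the first is typically 8 bytes and the second 4).

```c
#include <stdbool.h>
#include <stddef.h>

/* New flags added as booleans: each takes a full byte after the bit-field. */
struct demo_flags_bools {
	unsigned existing:29;            /* stand-in for pre-existing 1-bit flags */
	bool manage_system_start_stop;
	bool manage_runtime_start_stop;
	bool manage_shutdown;
};

/* New flags added as single bits: they share the same unsigned storage unit. */
struct demo_flags_bits {
	unsigned existing:29;
	unsigned manage_system_start_stop:1;
	unsigned manage_runtime_start_stop:1;
	unsigned manage_shutdown:1;
};

/* Size difference between the two layouts on the current ABI. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_flags_bools) - sizeof(struct demo_flags_bits);
}
```

The 8-byte figure quoted in the commit message refers to the real structure and its surrounding members; this sketch only shows the mechanism.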
2023-12-03 | ata: pata_isapnp: Add missing error check for devm_ioport_map() | Chen Ni | 1 | -0/+3
[ Upstream commit a6925165ea82b7765269ddd8dcad57c731aa00de ] Add missing error return check for devm_ioport_map() and return the error if this function call fails. Fixes: 0d5ff566779f ("libata: convert to iomap") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-02 | scsi: sd: Introduce manage_shutdown device flag | Damien Le Moal | 1 | -2/+3
commit 24eca2dce0f8d19db808c972b0281298d0bafe99 upstream. Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") changed the setting of the manage_system_start_stop flag to false for libata managed disks to enable libata internal management of disk suspend/resume. However, a side effect of this change is that on system shutdown, disks are no longer being stopped (set to standby mode with the heads unloaded). While this is not a critical issue, this unclean shutdown is not recommended and shows up with increased smart counters (e.g. the unexpected power loss counter "Unexpect_Power_Loss_Ct"). Instead of defining a shutdown driver method for all ATA adapter drivers (not all of them define that operation), this patch resolves this issue by further refining the sd driver start/stop control of disks using the new flag manage_shutdown. If this new flag is set to true by a low level driver, the function sd_shutdown() will issue a START STOP UNIT command with the start argument set to 0 when a disk needs to be powered off (suspended) on system power off, that is, when system_state is equal to SYSTEM_POWER_OFF. Similarly to the other manage_xxx flags, the new manage_shutdown flag is exposed through sysfs as a read-write device attribute. To avoid any confusion between manage_shutdown and manage_system_start_stop, the comments describing these flags in include/scsi/scsi.h are also improved. Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038 Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/ Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25 | ata: libata-eh: Fix compilation warning in ata_eh_link_report() | Damien Le Moal | 1 | -1/+1
[ Upstream commit 49728bdc702391902a473b9393f1620eea32acb0 ] The 6 bytes length of the tries_buf string in ata_eh_link_report() is too short and results in a gcc compilation warning with W=1: drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’: drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~ drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~ drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2372 | ap->eh_tries); | ~~~~~~~~~~~~~ Avoid this warning by increasing the string size to 16B. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
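The "between 4 and 14 bytes" range in the warning above can be reproduced with standard C snprintf(), whose return value is the length the full output would have had. The helpers below are hypothetical illustrations, and the 14-byte worst case assumes a 32-bit int.

```c
#include <limits.h>
#include <stdio.h>

/* Bytes needed for " t%d", including the terminating NUL. */
static int demo_tries_len(int tries)
{
	return snprintf(NULL, 0, " t%d", tries) + 1;
}

/* Format into buf; return 1 if the whole string fit, 0 if it was truncated. */
static int demo_format_tries(char *buf, size_t size, int tries)
{
	int n = snprintf(buf, size, " t%d", tries);

	return n >= 0 && (size_t)n < size;
}
```

A 6-byte buffer is enough for the small values the counter takes in practice, but the compiler only knows the type, so it has to assume the worst case; a 16-byte buffer covers the whole int range.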
2023-10-25 | ata: libata-core: Fix compilation warning in ata_dev_config_ncq() | Damien Le Moal | 1 | -1/+1
[ Upstream commit ed518d9ba980dc0d27c7d1dea1e627ba001d1977 ] The 24 bytes length allocated to the ncq_desc string in ata_dev_config_lba() for ata_dev_config_ncq() to use is too short, causing the following gcc compilation warnings when compiling with W=1: drivers/ata/libata-core.c: In function ‘ata_dev_configure’: drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=] 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~ In function ‘ata_dev_config_ncq’, inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8, inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9: drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32] 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~~~~~~~~~~~~~~~~~~~~ drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2379 | ddepth, aa_desc); | ~~~~~~~~~~~~~~~~ Avoid these warnings and the potential truncation by changing the size of the ncq_desc string to 32 characters. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-20 | ata: libata-scsi: Disable scsi device manage_system_start_stop | Damien Le Moal | 4 | -10/+152
commit aa3998dbeb3abce63653b7f6d4542e7dcd022590 upstream. The introduction of a device link to create a consumer/supplier relationship between the scsi device of an ATA device and the ATA port of that ATA device fixes the ordering of system suspend and resume operations. For suspend, the scsi device is suspended first and the ata port after it. This is fine as this allows the synchronize cache and START STOP UNIT commands issued by the scsi disk driver to be executed before the ata port is disabled. For resume operations, the ata port is resumed first, followed by the scsi device. This allows having the request queue of the scsi device to be unfrozen after the ata port resume is scheduled in EH, thus avoiding to see new requests prematurely issued to the ATA device. Since libata sets manage_system_start_stop to 1, the scsi disk resume operation also results in issuing a START STOP UNIT command to the device being resumed so that the device exits standby power mode. However, restoring the ATA device to the active power mode must be synchronized with libata EH processing of the port resume operation to avoid either 1) seeing the start stop unit command being received too early when the port is not yet resumed and ready to accept commands, or after the port resume process issues commands such as IDENTIFY to revalidate the device. In this last case, the risk is that the device revalidation fails with timeout errors as the drive is still spun down. Commit 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") disabled issuing the START STOP UNIT command to avoid issues with it. But this is incorrect as transitioning a device to the active power mode from the standby power mode set on suspend requires a media access command. The IDENTIFY, READ LOG and SET FEATURES commands executed in libata EH context triggered by the ata port resume operation may thus fail. 
Fix these synchronization issues by handling device power mode transitions for system suspend and resume directly in libata EH context, without relying on the scsi disk driver management triggered with the manage_system_start_stop flag. To do this, the following libata helper functions are introduced: 1) ata_dev_power_set_standby(): This function issues a STANDBY IMMEDIATE command to transition a device to the standby power mode. For HDDs, this spins down the disks. This function applies only to ATA and ZAC devices and does nothing otherwise. This function also does nothing for devices that have the ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag set. For suspend, call ata_dev_power_set_standby() in ata_eh_handle_port_suspend() before the port is disabled and frozen. ata_eh_unload() is also modified to transition all enabled devices to the standby power mode when the system is shut down or devices are removed. 2) ata_dev_power_set_active(): This function applies to ATA or ZAC devices and issues a VERIFY command for 1 sector at LBA 0 to transition the device to the active power mode. For HDDs, since this function will complete only once the disk has spun up, its execution uses the same timeouts as for reset, to give the drive enough time to complete spinup without triggering a command timeout. For resume, call ata_dev_power_set_active() in ata_eh_revalidate_and_attach() after the port has been enabled and before any other command is issued to the device. With these changes, the manage_system_start_stop and no_start_on_resume scsi device flags do not need to be set in ata_scsi_dev_config(). The flag manage_runtime_start_stop is still set to allow the sd driver to spinup/spindown a disk through the sd runtime operations. 
Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-10 | ata: libata-scsi: Fix delayed scsi_rescan_device() execution | Damien Le Moal | 2 | -18/+31
[ Upstream commit 8b4d9469d0b0e553208ee6f62f2807111fde18b9 ] Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") modified ata_scsi_dev_rescan() to check the scsi device "is_suspended" power field to ensure that the scsi device associated with an ATA device is fully resumed when scsi_rescan_device() is executed. However, this fix is problematic as: 1) It relies on a PM internal field that should not be used without PM device locking protection. 2) The check for is_suspended and the call to scsi_rescan_device() are not atomic and a suspend PM event may be triggered between them, causing scsi_rescan_device() to be called on a suspended device and in that function blocking while holding the scsi device lock. This would deadlock a following resume operation. These problems can trigger PM deadlocks on resume, especially with resume operations triggered quickly after or during suspend operations. E.g., a simple bash script like: for (( i=0; i<10; i++ )); do echo "+2" > /sys/class/rtc/rtc0/wakealarm; echo mem > /sys/power/state; done that triggers a resume 2 seconds after starting to suspend a system can quickly lead to a PM deadlock preventing the system from correctly resuming. Fix this by replacing the check on is_suspended with a check on the return value given by scsi_rescan_device() as that function will fail if called against a suspended device. Also make sure rescan tasks already scheduled are first cancelled before suspending an ata port. Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10 | scsi: core: Improve type safety of scsi_rescan_device() | Bart Van Assche | 1 | -1/+1
[ Upstream commit 79519528a180c64a90863db2ce70887de6c49d16 ] Most callers of scsi_rescan_device() have the scsi_device pointer readily available. Pass a struct scsi_device pointer to scsi_rescan_device() instead of a struct device pointer. This change prevents that a pointer to another struct device would be passed accidentally to scsi_rescan_device(). Remove the scsi_rescan_device() declaration from the scsi_priv.h header file since it duplicates the declaration in <scsi/scsi_host.h>. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230822153043.4046244-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 8b4d9469d0b0 ("ata: libata-scsi: Fix delayed scsi_rescan_device() execution") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10scsi: sd: Differentiate system and runtime start/stop managementDamien Le Moal1-1/+2
[ Upstream commit 3cc2ffe5c16dc65dfac354bc5b5bc98d3b397567 ] The underlying device and driver of a SCSI disk may have different system and runtime power mode control requirements. This is because runtime power management affects only the SCSI disk, while system level power management affects all devices, including the controller for the SCSI disk. For instance, issuing a START STOP UNIT command when a SCSI disk is runtime suspended and resumed is fine: the command is translated to a STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY command to wake it up. The SCSI disk runtime operations have no effect on the ata port device used to connect the ATA disk. However, for system suspend/resume operations, the ATA port used to connect the device will also be suspended and resumed, with the resume operation requiring re-validating the device link and the device itself. In this case, issuing a VERIFY command to spin up the disk must be done before starting to revalidate the device, when the ata port is being resumed. In such a case, we must not allow the SCSI disk driver to issue START STOP UNIT commands. Allow a low level driver to refine the SCSI disk start/stop management by differentiating system and runtime cases with two new SCSI device flags: manage_system_start_stop and manage_runtime_start_stop. These new flags replace the current manage_start_stop flag. Drivers setting the manage_start_stop flag are modified to set both new flags, thus preserving the existing start/stop management behavior. For backward compatibility, the old manage_start_stop sysfs device attribute is kept as a read-only attribute showing a value of 1 for devices enabling both new flags and 0 otherwise. Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 99398d2070ab ("scsi: sd: Do not issue commands to suspended disks on shutdown") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10ata,scsi: do not issue START STOP UNIT on resumeDamien Le Moal1-0/+7
[ Upstream commit 0a8589055936d8feb56477123a8373ac634018fa ] During system resume, ata_port_pm_resume() triggers ata EH to 1) Resume the controller 2) Reset and rescan the ports 3) Revalidate devices This EH execution is started asynchronously from ata_port_pm_resume(), which means that when sd_resume() is executed, none or only part of the above processing may have been executed. However, sd_resume() issues a START STOP UNIT to wake up the drive from sleep mode. This command is translated to ATA with ata_scsi_start_stop_xlat() and issued to the device. However, depending on the state of execution of the EH process and revalidation triggered by ata_port_pm_resume(), two things may happen: 1) The START STOP UNIT fails if it is received before the controller has been reenabled at the beginning of the EH execution. This is visible with error messages like: ata10.00: device reported invalid CHS sector 0 sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current] sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5 sd 9:0:0:0: PM: failed to resume async: error -5 2) The START STOP UNIT command is received while the EH process is on-going, which means that it is stopped and must wait for its completion, at which point the command is rather useless as the drive is already fully spun up. This case also results in a significant delay in sd_resume() which is observable by users as the entire system resume completion is delayed. Given that ATA devices will be woken up by libata activity on resume, sd_resume() has no need to issue a START STOP UNIT command, which solves the above mentioned problems. Do not issue this command by introducing the new scsi_device flag no_start_on_resume and setting this flag to 1 in ata_scsi_dev_config(). 
sd_resume() is modified to issue a START STOP UNIT command only if this flag is not set. Reported-by: Paul Ausbeck <paula@soe.ucsc.edu> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215880 Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Tanner Watkins <dalzot@gmail.com> Tested-by: Paul Ausbeck <paula@soe.ucsc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Stable-dep-of: 99398d2070ab ("scsi: sd: Do not issue commands to suspended disks on shutdown") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-06ata: libata-core: Do not register PM operations for SAS portsDamien Le Moal3-2/+11
commit 75e2bd5f1ede42a2bc88aa34b431e1ace8e0bea0 upstream. libsas does its own domain based power management of ports. For such ports, libata should not use a device type defining power management operations as executing these operations for suspend/resume in addition to libsas calls to ata_sas_port_suspend() and ata_sas_port_resume() is not necessary (and likely dangerous to do, even though problems are not seen currently). Introduce the new ata_port_sas_type device_type for ports managed by libsas. This new device type is used in ata_tport_add() and is defined without power management operations. Fixes: 2fcbdcb4c802 ("[SCSI] libata: export ata_port suspend/resume infrastructure for sas") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-06ata: libata-core: Fix port and device removalDamien Le Moal1-1/+20
commit 84d76529c650f887f1e18caee72d6f0589e1baf9 upstream. Whenever an ATA adapter driver is removed (e.g. rmmod), ata_port_detach() is called repeatedly for all the adapter ports to remove (unload) the devices attached to the port and delete the port device itself. Removing of devices is done using libata EH with the ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute ata_eh_unload() which disables all devices attached to the port. ata_port_detach() finishes by calling scsi_remove_host() to remove the scsi host associated with the port. This function will trigger the removal of all scsi devices attached to the host and in the case of disks, calls to sd_shutdown() which will flush the device write cache and stop the device. However, given that the devices were already disabled by ata_eh_unload(), the synchronize write cache command and start stop unit commands fail. E.g. running "rmmod ahci" with first removing sd_mod results in error messages like: ata13.00: disable device sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 0:0:0:0: [sda] Stopping disk sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Fix this by removing all scsi devices of the ata devices connected to the port before scheduling libata EH to disable the ATA devices. Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-06ata: libata-core: Fix ata_port_request_pm() lockingDamien Le Moal1-9/+9
commit 3b8e0af4a7a331d1510e963b8fd77e2fca0a77f1 upstream. The function ata_port_request_pm() checks the port flag ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to ensure that power management operations for a port are not scheduled simultaneously. However, this flag check is done without holding the port lock. Fix this by taking the port lock on entry to the function and checking the flag under this lock. The lock is released and re-taken if ata_port_wait_eh() needs to be called. The two WARN_ON() macros checking that the ATA_PFLAG_PM_PENDING flag was cleared are removed as the first call is racy and the second one done without holding the port lock. Fixes: 5ef41082912b ("ata: add ata port system PM callbacks") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-06ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODESNiklas Cassel1-1/+1
commit 3ef600923521616ebe192c893468ad0424de2afb upstream. For REPORT SUPPORTED OPERATION CODES command, the service action field is defined as bits 0-4 in the second byte in the CDB. Bits 5-7 in the second byte are reserved. Only look at the service action field in the second byte when determining if the MAINTENANCE IN opcode is a REPORT SUPPORTED OPERATION CODES command. This matches how we only look at the service action field in the second byte when determining if the SERVICE ACTION IN(16) opcode is a READ CAPACITY(16) command (reserved bits 5-7 in the second byte are ignored). Fixes: 7b2030942859 ("libata: Add support for SCT Write Same") Cc: stable@vger.kernel.org Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-06ata: libata-scsi: link ata port and scsi deviceDamien Le Moal1-5/+40
commit fb99ef17865035a6657786d4b2af11a27ba23f9b upstream. There is no direct device ancestry defined between an ata_device and its scsi device which prevents the power management code from correctly ordering suspend and resume operations. Create such ancestry with the ata device as the parent to ensure that the scsi device (child) is suspended before the ata device and that resume handles the ata device before the scsi device. The parent-child (supplier-consumer) relationship is established between the ata_port (parent) and the scsi device (child) with the function device_add_link(). The parent used is not the ata_device as the PM operations are defined per port and the status of all devices connected through that port is controlled from the port operations. The device link is established with the new function ata_scsi_slave_alloc(), and this function is used to define the ->slave_alloc callback of the scsi host template of all ata drivers. Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-06ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()Niklas Cassel1-10/+3
[ Upstream commit 80cc944eca4f0baa9c381d0706f3160e491437f2 ] ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING, before calling ap->ops->error_handler() (without holding the ap->lock). If an error IRQ is received while ap->ops->error_handler() is running, the irq handler will set ATA_PFLAG_EH_PENDING. Once ap->ops->error_handler() returns, ata_scsi_port_error_handler() checks if ATA_PFLAG_EH_PENDING is set, and if it is, another iteration of ATA EH is performed. The problem is that ATA_PFLAG_EH_PENDING is not only cleared by ata_scsi_port_error_handler(), it is also cleared by ata_eh_reset(). ata_eh_reset() is called by ap->ops->error_handler(). This additional clearing done by ata_eh_reset() breaks the whole retry logic in ata_scsi_port_error_handler(). Thus, if an error IRQ is received while ap->ops->error_handler() is running, the port will currently remain frozen and will never get re-enabled. The additional clearing in ata_eh_reset() was introduced in commit 1e641060c4b5 ("libata: clear eh_info on reset completion"). Looking at the original error report: https://marc.info/?l=linux-ide&m=124765325828495&w=2 We can see the following happening: [ 1.074659] ata3: XXX port freeze [ 1.074700] ata3: XXX hardresetting link, stopping engine [ 1.074746] ata3: XXX flipping SControl [ 1.411471] ata3: XXX irq_stat=400040 CONN|PHY [ 1.411475] ata3: XXX port freeze [ 1.420049] ata3: XXX starting engine [ 1.420096] ata3: XXX rc=0, class=1 [ 1.420142] ata3: XXX clearing IRQs for thawing [ 1.420188] ata3: XXX port thawed [ 1.420234] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) We are not supposed to be able to receive an error IRQ while the port is frozen (PxIE is set to 0, i.e. all IRQs for the port are disabled). AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) states: "Each bit location can be thought of as reporting a '1' if the virtual "interrupt line" for that port is indicating it wishes to generate an interrupt. 
That is, if a port has one or more interrupt status bit set, and the enables for those status bits are set, then this bit shall be set." Additionally, AHCI state P:ComInit clearly shows that the state machine will only jump to P:ComInitSetIS (which sets IS.IPS(x) to '1'), if PxIE.PCE is set to '1'. In our case, PxIE is set to 0, so IS.IPS(x) won't get set. So IS.IPS(x) only gets set if PxIS and PxIE is set. AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) also states: "The bits in this register are read/write clear. It is set by the level of the virtual interrupt line being a set, and cleared by a write of '1' from the software." So if IS.IPS(x) is set, you need to explicitly clear it by writing a 1 to IS.IPS(x) for that port. Since PxIE is cleared, the only way to get an interrupt while the port is frozen, is if IS.IPS(x) is set, and the only way IS.IPS(x) can be set when the port is frozen, is if it was set before the port was frozen. However, since commit 737dd811a3db ("ata: libahci: clear pending interrupt status"), we clear both PxIS and IS.IPS(x) after freezing the port, but before the COMRESET, so the problem that commit 1e641060c4b5 ("libata: clear eh_info on reset completion") fixed can no longer happen. Thus, revert commit 1e641060c4b5 ("libata: clear eh_info on reset completion"), so that the retry logic in ata_scsi_port_error_handler() works once again. (The retry logic is still needed, since we can still get an error IRQ _after_ the port has been thawed, but before ata_scsi_port_error_handler() takes the ap->lock in order to check if ATA_PFLAG_EH_PENDING is set.) Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-06ata: sata_mv: Fix incorrect string length computation in mv_dump_mem()Christophe JAILLET1-2/+2
[ Upstream commit e97eb65dd464e7f118a16a26337322d07eb653e2 ] snprintf() returns the "number of characters which *would* be generated for the given input", not the size *really* generated. In order to avoid too large values for 'o' (and potential negative values for "sizeof(linebuf) - o") use scnprintf() instead of snprintf(). Note that given the "w < 4" in the for loop, the buffer can NOT overflow, but using the *right* function is always better. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
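The difference between the two return conventions can be sketched in userspace C. The kernel's scnprintf() is not available outside the kernel, so `demo_scnprintf()` below is a hypothetical stand-in with the same return convention, and `demo_format_words()` with its "%08x " format is only an assumed shape for an accumulating loop, not the sata_mv code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's scnprintf(): returns the
 * number of characters actually stored in buf, excluding the trailing NUL.
 */
static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL);
	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

/*
 * Append count words to linebuf. Because the helper never reports more than
 * it stored, the offset o can never exceed size - 1 and "size - o" cannot
 * wrap around.
 */
static int demo_format_words(char *linebuf, size_t size,
			     const unsigned int *words, int count)
{
	int o = 0;

	for (int w = 0; w < count; w++)
		o += demo_scnprintf(linebuf + o, size - o, "%08x ", words[w]);

	return o;
}
```

Had the loop accumulated the snprintf() return value instead, a truncated write would advance o past the end of the buffer and the next "size - o" would be computed from an offset larger than the buffer.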
2023-09-23ata: libahci: clear pending interrupt statusSzuying Chen1-12/+23
commit 737dd811a3dbfd7edd4ad2ba5152e93d99074f83 upstream. When a CRC error occurs, the HBA asserts an interrupt to indicate an interface fatal error (PxIS.IFS). The ISR clears PxIE and PxIS, then does error recovery. If the adapter receives another SDB FIS with an error (PxIS.TFES) from the device before the start of the EH recovery process, the interrupt signaling the new SDB cannot be serviced as PxIE was cleared already. This in turn results in the HBA inability to issue any command during the error recovery process after setting PxCMD.ST to 1 because PxIS.TFES is still set. According to AHCI 1.3.1 specifications section 6.2.2, fatal errors notified by setting PxIS.HBFS, PxIS.HBDS, PxIS.IFS or PxIS.TFES will cause the HBA to enter the ERR:Fatal state. In this state, the HBA shall not issue any new commands. To avoid this situation, introduce the function ahci_port_clear_pending_irq() to clear pending interrupts before executing a COMRESET. This follows the AHCI 1.3.1 - section 6.2.2.2 specification. Signed-off-by: Szuying Chen <Chloe_Chen@asmedia.com.tw> Fixes: e0bfd149973d ("[PATCH] ahci: stop engine during hard reset") Cc: stable@vger.kernel.org Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-23ata: libata: disallow dev-initiated LPM transitions to unsupported statesNiklas Cassel2-3/+25
commit 24e0e61db3cb86a66824531989f1df80e0939f26 upstream. In AHCI 1.3.1, the register description for CAP.SSC: "When cleared to ‘0’, software must not allow the HBA to initiate transitions to the Slumber state via aggressive link power management nor the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port must be programmed to disallow device initiated Slumber requests." In AHCI 1.3.1, the register description for CAP.PSC: "When cleared to ‘0’, software must not allow the HBA to initiate transitions to the Partial state via aggressive link power management nor the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port must be programmed to disallow device initiated Partial requests." Ensure that we always set the corresponding bits in PxSCTL.IPM, such that a device is not allowed to initiate transitions to power states which are unsupported by the HBA. DevSleep is always initiated by the HBA, however, for completeness, set the corresponding bit in PxSCTL.IPM such that aggressive link power management cannot transition to DevSleep if DevSleep is not supported. sata_link_scr_lpm() is used by libahci, ata_piix and libata-pmp. However, only libahci has the ability to read the CAP/CAP2 register to see if these features are supported. Therefore, in order to not introduce any regressions on ata_piix or libata-pmp, create flags that indicate that the respective feature is NOT supported. This way, the behavior for ata_piix and libata-pmp should remain unchanged. This change is based on a patch originally submitted by Runa Guo-oc. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Fixes: 1152b2617a6e ("libata: implement sata_link_scr_lpm() and make ata_dev_set_feature() global") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
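The "flags that indicate the feature is NOT supported" approach from the commit above can be sketched as follows. This is only an illustration under stated assumptions: the `DEMO_` flag names and the helper are hypothetical, and the IPM encoding (bits 11:8 of SControl, with 1h, 2h and 4h disabling Partial, Slumber and DevSleep respectively) is taken from the SATA SControl register definition rather than from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "feature NOT supported" host flags, for illustration only. */
#define DEMO_HOST_NO_PART	(1u << 0)
#define DEMO_HOST_NO_SSC	(1u << 1)
#define DEMO_HOST_NO_DEVSLP	(1u << 2)

/* SControl IPM field (bits 11:8); a set bit disallows that transition. */
#define DEMO_IPM_MASK		(0xfu << 8)
#define DEMO_IPM_NO_PARTIAL	(0x1u << 8)
#define DEMO_IPM_NO_SLUMBER	(0x2u << 8)
#define DEMO_IPM_NO_DEVSLP	(0x4u << 8)

/*
 * Start from "no restrictions" and then disallow every transition the host
 * reports as unsupported. A host that sets no flags (as assumed here for
 * drivers that cannot read CAP/CAP2) ends up with an unrestricted IPM field.
 */
static uint32_t demo_restrict_ipm(uint32_t scontrol, unsigned int host_flags)
{
	uint32_t result = scontrol & ~DEMO_IPM_MASK;

	if (host_flags & DEMO_HOST_NO_PART)
		result |= DEMO_IPM_NO_PARTIAL;
	if (host_flags & DEMO_HOST_NO_SSC)
		result |= DEMO_IPM_NO_SLUMBER;
	if (host_flags & DEMO_HOST_NO_DEVSLP)
		result |= DEMO_IPM_NO_DEVSLP;

	/* Bits outside the IPM field must be left untouched. */
	assert((result & ~DEMO_IPM_MASK) == (scontrol & ~DEMO_IPM_MASK));
	return result;
}
```

Expressing the capability as negative flags means a driver that never sets them keeps an unrestricted IPM field in this sketch, which is the property the commit relies on to avoid regressions on ata_piix and libata-pmp.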
2023-09-19ata: pata_ftide010: Add missing MODULE_DESCRIPTIONDamien Le Moal1-0/+1
commit 7274eef5729037300f29d14edeb334a47a098f65 upstream. Add the missing MODULE_DESCRIPTION() to avoid warnings such as: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/pata_ftide010.o when compiling with W=1. Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: sata_gemini: Add missing MODULE_DESCRIPTIONDamien Le Moal1-0/+1
commit 8566572bf3b4d6e416a4bf2110dbb4817d11ba59 upstream. Add the missing MODULE_DESCRIPTION() to avoid warnings such as: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/sata_gemini.o when compiling with W=1. Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: pata_falcon: fix IO base selection for Q40Michael Schmitz1-21/+29
commit 8a1f00b753ecfdb117dc1a07e68c46d80e7923ea upstream. With commit 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide"), the Q40 IDE driver was replaced by pata_falcon.c. Both IO and memory resources were defined for the Q40 IDE platform device, but definition of the IDE register addresses was modeled after the Falcon case, both in use of the memory resources and in including register shift and byte vs. word offset in the address. This was correct for the Falcon case, which does not apply any address translation to the register addresses. In the Q40 case, all of device base address, byte access offset and register shift is included in the platform specific ISA access translation (in asm/mm_io.h). As a consequence, such address translation gets applied twice, and register addresses are mangled. Use the device base address from the platform IO resource for Q40 (the IO address translation will then add the correct ISA window base address and byte access offset), with register shift 1. Use MMIO base address and register shift 2 as before for Falcon. Encode PIO_OFFSET into IO port addresses for all registers for Q40 except the data transfer register. Encode the MMIO offset there (pata_falcon_data_xfer() directly uses raw IO with no address translation). 
Reported-by: William R Sowerbutts <will@sowerbutts.com> Closes: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com Link: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com Fixes: 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide") Cc: stable@vger.kernel.org Cc: Finn Thain <fthain@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: William R Sowerbutts <will@sowerbutts.com> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: ahci: Add Elkhart Lake AHCI controllerWerner Fischer1-0/+2
commit 2a2df98ec592667927b5c1351afa6493ea125c9f upstream. Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These CPUs and their PCHs are used in mobile and embedded environments. With this patch I suggest that Elkhart Lake SATA controllers [1] should use the default LPM policy for mobile chipsets. The disadvantage of missing hot-plug support with this setting should not be an issue, as those CPUs are used in embedded environments and not in servers with hot-plug backplanes. We discovered that the Elkhart Lake SATA controllers have been missing in ahci.c after a customer reported the throttling of his SATA SSD after a short period of higher I/O. We determined the high temperature of the SSD controller in idle mode as the root cause for that. Depending on the used SSD, we have seen up to 1.8 Watt lower system idle power usage and up to 30°C lower SSD controller temperatures in our tests, when we set med_power_with_dipm manually. I have provided a table showing seven different SATA SSDs from ATP, Intel/Solidigm and Samsung [2]. Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1] for those mobile PCHs. This commit just adds 0x4b63 as I do not have test systems with 0x4b60 and 0x4b62 SATA controllers. I have tested this patch with a system which uses 0x4b63 as SATA controller. [1] https://sata-io.org/product/8803 [2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4 Signed-off-by: Werner Fischer <devlists@wefi.net> Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer()Minjie Du1-1/+2
[ Upstream commit 4139f992c49356391fb086c0c8ce51f66c26d623 ] It is possible for dma_request_chan() to return EPROBE_DEFER, which means acdev->host->dev is not ready yet. At this point dev_err() will have no output. Use dev_err_probe() instead. Signed-off-by: Minjie Du <duminjie@vivo.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03ata: pata_ns87415: mark ns87560_tf_read staticArnd Bergmann1-1/+1
[ Upstream commit 3fc2febb0f8ffae354820c1772ec008733237cfa ] The global function triggers a warning because of the missing prototype drivers/ata/pata_ns87415.c:263:6: warning: no previous prototype for 'ns87560_tf_read' [-Wmissing-prototypes] 263 | void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf) There are no other references to this, so just make it static. Fixes: c4b5b7b6c4423 ("pata_ns87415: Initial cut at 87415/87560 IDE support") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28ata: libata-scsi: Avoid deadlock on rescan after device resumeDamien Le Moal3-3/+24
[ Upstream commit 6aa0365a3c8512587fffd42fe438768709ddef8e ] When an ATA port is resumed from sleep, the port is reset and a power management request is issued to libata EH to reset the port and rescan the device(s) attached to the port. Device rescanning is done by scheduling an ata_scsi_dev_rescan() work, which will execute scsi_rescan_device(). However, scsi_rescan_device() takes the generic device lock, which is also taken by dpm_resume() when the SCSI device is resumed as well. If a device rescan execution starts before the completion of the SCSI device resume, the rcu locking used to refresh the cached VPD pages of the device, combined with the generic device locking from scsi_rescan_device() and from dpm_resume() can cause a deadlock. Avoid this situation by changing struct ata_port scsi_rescan_task to be a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is modified to check if the SCSI device associated with the ATA device that must be rescanned is not suspended. If the SCSI device is still suspended, ata_scsi_dev_rescan() returns early and reschedules itself for execution after an arbitrary delay of 5ms. Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reported-by: Joe Breuer <linux-kernel@jmbreuer.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217530 Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Joe Breuer <linux-kernel@jmbreuer.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09ata: libata-scsi: Use correct device no in ata_find_dev()Damien Le Moal1-8/+26
commit 7f875850f20a42f488840c9df7af91ef7db2d576 upstream. For devices not attached to a port multiplier and managed directly by libata, the device number passed to ata_find_dev() must always be lower than the maximum number of devices returned by ata_link_max_devices(). That is 1 for SATA devices or 2 for an IDE link with master+slave devices. This device number is the SCSI device ID which matches these constraints as the IDs are generated per port and so never exceed the maximum number of devices for the link being used. However, for libsas managed devices, SCSI device IDs are assigned per struct scsi_host, leading to device IDs for SATA devices that can be well in excess of libata per-link maximum number of devices. This results in ata_find_dev() always returning NULL for libsas managed devices except for the first device of the target scsi_host with ID (device number) equal to 0. This issue is visible by executing the hdparm utility, which fails. E.g.: hdparm -i /dev/sdX /dev/sdX: HDIO_GET_IDENTITY failed: No message of desired type Fix this by rewriting ata_find_dev() to ignore the device number for non-PMP attached devices with a link with at most 1 device, that is SATA devices. For these, the device number 0 is always used to return the correct pointer to the struct ata_device of the port link. This change excludes IDE master/slave setups (maximum number of devices per link is 2) and port-multiplier attached devices. Also, to be consistent with the fact that SCSI device IDs and channel numbers used as device numbers are both unsigned int, change the devno argument of ata_find_dev() to unsigned int. Reported-by: Xingui Yang <yangxingui@huawei.com> Fixes: 41bda9c98035 ("libata-link: update hotplug to handle PMP links") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10ata: ahci: Revert "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller"Damien Le Moal1-1/+0
commit 6210038aeaf49c395c2da57572246d93ec67f6d4 upstream. Commit 104ff59af73a ("ata: ahci: Add Tiger Lake UP{3,4} AHCI controller") enabled low power mode for the Tiger Lake AHCI adapter in the author's system but created regressions for others. Revert this patch for now until a better solution is found to make this adapter eco-friendly. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217114 CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LHPatrick McLean1-0/+3
commit ead089577e0f55b238f980d9f62eaa90b7b64672 upstream. Samsung MZ7LH drives are spewing messages like this in to dmesg with AMD SATA controllers: ata1.00: exception Emask 0x0 SAct 0x7e0000 SErr 0x0 action 0x6 frozen ata1.00: failed command: SEND FPDMA QUEUED ata1.00: cmd 64/01:88:00:00:00/00:00:00:00:00/a0 tag 17 ncq dma 512 out res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Since this was seen previously with SSD 840 EVO drives in https://bugzilla.kernel.org/show_bug.cgi?id=203475 let's add the same fix for these drives as the EVOs have, since they likely have very similar firmwares. Signed-off-by: Patrick McLean <chutzpah@gentoo.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22ata: ahci: Add Tiger Lake UP{3,4} AHCI controllerSimon Gaiser1-0/+1
commit 104ff59af73aba524e57ae0fef70121643ff270e upstream. Mark the Tiger Lake UP{3,4} AHCI controller as "low_power". This enables S0ix to work out of the box. Otherwise this isn't working unless the user manually sets /sys/class/scsi_host/*/link_power_management_policy. Intel lists a total of 4 SATA controller IDs in [1] for those mobile PCHs. This commit just adds the "AHCI" variant since I only tested those. [1]: https://cdrdv2.intel.com/v1/dl/getContent/631119 Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09ata: libata: Fix sata_down_spd_limit() when no link speed is reportedDamien Le Moal1-1/+1
[ Upstream commit 69f2c9346313ba3d3dfa4091ff99df26c67c9021 ] Commit 2dc0b46b5ea3 ("libata: sata_down_spd_limit should return if driver has not recorded sstatus speed") changed the behavior of sata_down_spd_limit() to return doing nothing if a drive does not report a current link speed, to avoid reducing the link speed to the lowest 1.5 Gbps speed. However, the change assumed that a speed was recorded before probing (e.g. before a suspend/resume) and set in link->sata_spd. This causes problems with adapter/drive combinations that fail to establish a link speed during probe autonegotiation. One reported example of this problem is an mvebu adapter with a 3Gbps port-multiplier box: autonegotiation fails, leaving no recorded link speed and no reported current link speed. Probe retries also fail as no action is taken by sata_set_spd() after each retry. Fix this by returning early in sata_down_spd_limit() only if we do have a recorded link speed, that is, if link->sata_spd is not 0. With this fix, a failed probe not leading to a recorded link speed is retried at the lower 1.5 Gbps speed, with the link speed potentially increased later on the second revalidate of the device if the device reports that it supports higher link speeds. Reported-by: Marius Dinu <marius@psihoexpert.ro> Fixes: 2dc0b46b5ea3 ("libata: sata_down_spd_limit should return if driver has not recorded sstatus speed") Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Marius Dinu <marius@psihoexpert.ro> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
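The early-return rule described above can be modelled in a few lines of user-space C. This is only a simplified sketch, not the kernel function: the name down_spd_limit_sketch(), its parameters and the highest_set_bit() helper are hypothetical, the speed limit is assumed to be a bitmask with bit 0 standing for 1.5 Gbps, and all SControl/SStatus register handling is left out.

```c
#include <errno.h>

/* Index of the highest set bit in mask, or -1 if mask is 0. */
static int highest_set_bit(unsigned int mask)
{
	int bit = -1;

	while (mask) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Hypothetical model of the speed-down decision.
 * spd_limit_mask: bitmask of allowed speeds (bit 0 = 1.5 Gbps, bit 1 = 3 Gbps, ...)
 * reported_spd:   speed currently reported by the link, 0 if none
 * recorded_spd:   speed recorded earlier (stand-in for link->sata_spd), 0 if none
 */
int down_spd_limit_sketch(unsigned int *spd_limit_mask,
			  unsigned int reported_spd,
			  unsigned int recorded_spd)
{
	unsigned int spd = reported_spd ? reported_spd : recorded_spd;
	unsigned int mask = *spd_limit_mask;

	/* Nothing left to lower. */
	if (mask <= 1)
		return -EINVAL;

	/* Always drop the highest allowed speed. */
	mask &= ~(1U << highest_set_bit(mask));

	if (spd > 1)
		/* Keep only speeds strictly below the current one. */
		mask &= (1U << (spd - 1)) - 1;
	else if (recorded_spd)
		/* Bail out only when a speed was recorded before. */
		return -EINVAL;

	if (!mask)
		return -EINVAL;

	*spd_limit_mask = mask;
	return 0;
}
```

With a 3 Gbps limit (mask 0x3) and neither a reported nor a recorded speed, the sketch lowers the limit to 0x1 so that a retry can run at 1.5 Gbps. With a recorded speed of 1 it returns -EINVAL and leaves the mask untouched.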
2023-02-01ata: pata_cs5535: Don't build on UMLPeter Foley1-0/+1
[ Upstream commit 22eebaa631c40f3dac169ba781e0de471b83bf45 ] This driver uses MSR functions that aren't implemented under UML. Avoid building it to prevent tripping up allyesconfig. e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3a3): undefined reference to `__tracepoint_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3d2): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x457): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x481): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4d5): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4f5): undefined reference to `do_trace_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x51c): undefined reference to `do_trace_write_msr' Signed-off-by: Peter Foley <pefoley2@pefoley.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-07ata: ahci: fix enum constants for gcc-13Arnd Bergmann1-122/+123
commit f07788079f515ca4a681c5f595bdad19cfbd7b1d upstream. gcc-13 slightly changes the type of constant expressions that are defined in an enum, which triggers a compile time sanity check in libata: linux/drivers/ata/libahci.c: In function 'ahci_led_store': linux/include/linux/compiler_types.h:357:45: error: call to '__compiletime_assert_302' declared with attribute error: BUILD_BUG_ON failed: sizeof(_s) > sizeof(long) 357 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) The new behavior is that sizeof() returns the same value for the constant as it does for the enum type, which is generally more sensible and consistent. The problem in libata is that it contains a single enum definition for lots of unrelated constants, some of which are large positive (unsigned) integers like 0xffffffff, while others like (1<<31) are interpreted as negative integers, and this forces the enum type to become 64 bit wide even though most constants would still fit into a signed 32-bit 'int'. Fix this by changing the entire enum definition to use BIT(x) in place of (1<<x), which results in all values being seen as 'unsigned' and fitting into an unsigned 32-bit type. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107917 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107405 Reported-by: Luis Machado <luis.machado@arm.com> Cc: linux-ide@vger.kernel.org Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: stable@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Luis Machado <luis.machado@arm.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
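The effect of mixing negative and large unsigned constants in one enum can be reproduced outside the kernel. The sketch below is an illustration under assumptions, not the libata definition: the enum and function names are made up, 1U << 31 stands in for the kernel's BIT(31), and the literal (-2147483647 - 1) stands in for the negative value that (1<<31) is treated as. Enumerator values outside the range of int rely on a GNU C extension before C23, so the sizes depend on the compiler (gcc or clang on Linux is assumed).

```c
#include <stddef.h>

/* One negative and one large positive constant: no 32-bit type holds both. */
enum mixed_sketch {
	MIXED_SIGN_BIT = (-2147483647 - 1),	/* what (1<<31) ends up as */
	MIXED_ALL_ONES = 0xffffffff,
};

/* All constants unsigned: an unsigned 32-bit type is enough. */
enum unsigned_sketch {
	UNSIGNED_SIGN_BIT = 1U << 31,		/* stand-in for BIT(31) */
	UNSIGNED_ALL_ONES = 0xffffffffU,
};

size_t mixed_enum_size(void)
{
	return sizeof(enum mixed_sketch);
}

size_t unsigned_enum_size(void)
{
	return sizeof(enum unsigned_sketch);
}
```

On such a compiler the mixed enum is expected to be wider than the all-unsigned one, which is the property that tripped the BUILD_BUG_ON size check once gcc-13 gave each constant the size of the enum type.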
2023-01-04ata: ahci: Fix PCS quirk application for suspendAdam Vodopjan1-9/+23
[ Upstream commit 37e14e4f3715428b809e4df9a9958baa64c77d51 ] Since kernel 5.3.4 my laptop (ICH8M controller) does not see Kingston SV300S37A60G SSD disk connected into a SATA connector on wake from suspend. The problem was introduced in c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond"): the quirk is not applied on wake from suspend as it originally was. It is worth mentioning that the commit contained another bug: the quirk is not applied at all to controllers which require it. The fix commit 09d6ac8dc51a ("libata/ahci: Fix PCS quirk application") landed in 5.3.8. So testing my patch anywhere between commits c312ef176399 and 09d6ac8dc51a is pointless. Not all disks trigger the problem. For example nothing bad happens with Western Digital WD5000LPCX HDD. Test hardware: - Acer 5920G with ICH8M SATA controller - sda: some SATA HDD connected into the DVD drive IDE port with a SATA-IDE caddy. It is a boot disk - sdb: Kingston SV300S37A60G SSD connected into the only SATA port Sample "dmesg --notime | grep -E '^(sd |ata)'" output on wake: sd 0:0:0:0: [sda] Starting disk sd 2:0:0:0: [sdb] Starting disk ata4: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out ata1.00: ACPI cmd ef/03:42:00:00:00:a0 (SET FEATURES) filtered out ata1: FORCE: cable set to 80c ata5: SATA link down (SStatus 0 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3.00: disabled sd 2:0:0:0: rejecting I/O to offline device ata3.00: detaching (SCSI 2:0:0:0) sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Synchronizing SCSI cache sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Stopping disk sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Commit c312ef176399 dropped ahci_pci_reset_controller() which internally calls ahci_reset_controller() and applies the PCS quirk if needed after that. It was called each time a reset was required instead of just ahci_reset_controller(). This patch puts the function back in place. Fixes: c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond") Signed-off-by: Adam Vodopjan <grozzly@protonmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
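The pattern this commit message describes, a wrapper that resets the controller and then reapplies the quirk on every reset path, can be sketched as a small user-space model. All names below (host_sketch, reset_controller_sketch(), pcs_quirk_sketch(), pci_reset_controller_sketch()) are hypothetical and the counters merely stand in for hardware side effects; the real functions operate on PCI and AHCI state.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-host driver state. */
struct host_sketch {
	bool needs_pcs_quirk;		/* controller generation requires the quirk */
	unsigned int reset_count;	/* how many resets were performed */
	unsigned int quirk_count;	/* how many times the quirk was applied */
};

/* Models the generic controller reset; always succeeds in this sketch. */
static int reset_controller_sketch(struct host_sketch *host)
{
	host->reset_count++;
	return 0;
}

/* Models the quirk: only acts on controllers that need it. */
static void pcs_quirk_sketch(struct host_sketch *host)
{
	if (host->needs_pcs_quirk)
		host->quirk_count++;
}

/*
 * Models the restored wrapper: every caller that needs a reset goes
 * through here, so the quirk follows each reset (probe and resume alike).
 */
int pci_reset_controller_sketch(struct host_sketch *host)
{
	int rc = reset_controller_sketch(host);

	if (rc)
		return rc;

	pcs_quirk_sketch(host);
	return 0;
}
```

Calling the wrapper once for probe and once for resume leaves reset_count and quirk_count equal on a host that needs the quirk, which is the invariant the fix is described as restoring.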
2022-12-31ata: libata: fix NCQ autosense logicNiklas Cassel1-3/+8
[ Upstream commit 7390896b3484d44cbdb8bc4859964314ac66d3c9 ] Currently, the logic that decides whether we should call ata_scsi_set_sense() (and set flag ATA_QCFLAG_SENSE_VALID to indicate that we have successfully added sense data to the struct ata_queued_cmd) looks like this: if (dev->class == ATA_DEV_ZAC && ((qc->result_tf.status & ATA_SENSE) || qc->result_tf.auxiliary)) The problem with this is that a drive can support the NCQ command error log without supporting NCQ autosense. On such a drive, if the failing command has sense data, the status field in the NCQ command error log will have the ATA_SENSE bit set. It is just that this sense data is not included in the NCQ command error log when NCQ autosense is not supported. Instead the sense data has to be fetched using the REQUEST SENSE DATA EXT command. Therefore, we should only add the sense data if the drive supports NCQ autosense AND the ATA_SENSE bit is set in the status field. Fix this, and at the same time, remove the duplicated ATA_DEV_ZAC check. The struct ata_taskfile supplied to ata_eh_read_log_10h() is zeroed with memset() before calling the function, so simply checking if qc->result_tf.auxiliary is set is sufficient to tell us that the log actually contained sense data. Fixes: d238ffd59d3c ("libata: do not attempt to retrieve sense code twice") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
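The corrected condition can be illustrated with a small user-space model. This is a sketch under assumptions rather than the libata code: tf_sketch, read_log_10h_sketch() and should_set_sense_sketch() are hypothetical names, SENSE_BIT_SKETCH assumes the sense flag is bit 1 of the status byte, and the log offsets 14 to 16 used for the sense bytes are likewise assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SENSE_BIT_SKETCH 0x02	/* assumed position of the ATA_SENSE status bit */

/* Hypothetical, cut-down stand-in for struct ata_taskfile. */
struct tf_sketch {
	uint8_t status;
	uint32_t auxiliary;	/* stays 0 unless sense data was copied from the log */
};

/*
 * Models the log-reading step: sense data is taken from the log only if
 * the drive supports NCQ autosense AND the sense bit is set in status.
 * The caller is expected to pass a zeroed tf with only status filled in.
 */
void read_log_10h_sketch(struct tf_sketch *tf, bool has_ncq_autosense,
			 const uint8_t *log)
{
	if (has_ncq_autosense && (tf->status & SENSE_BIT_SKETCH))
		tf->auxiliary = ((uint32_t)log[14] << 16) |
				((uint32_t)log[15] << 8) |
				(uint32_t)log[16];
}

/* Models the simplified check: a non-zero auxiliary means sense data exists. */
bool should_set_sense_sketch(const struct tf_sketch *tf)
{
	return tf->auxiliary != 0;
}
```

Because the taskfile starts out zeroed, a drive without NCQ autosense leaves auxiliary at 0 even when the sense bit is set in status, so the sketch reports no sense data and the caller would have to fetch it with REQUEST SENSE DATA EXT instead.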