summaryrefslogtreecommitdiff
path: root/drivers/nvme/host/zns.c
AgeCommit message (Collapse)AuthorFilesLines
2024-04-02nvme: split nvme_update_zone_infoChristoph Hellwig1-13/+20
nvme_update_zone_info does (admin queue) I/O to the device and can fail. We fail to abort the queue limits update if that happen, but really should avoid with the frozen I/O queue as much as possible anyway. Split the logic into a helper to query the information that can be called on an unfrozen queue and one to apply it to the queue limits. Fixes: 9b130d681443 ("nvme: use the atomic queue limits update API") Reported-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-04nvme: use the atomic queue limits update APIChristoph Hellwig1-8/+8
Changes the callchains that update queue_limits to build an on-stack queue_limits and update it atomically. Note that for now only the admin queue actually passes it to the queue allocation function. Doing the same for the gendisks used for the namespaces will require a little more work. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-04nvme: remove nvme_revalidate_zonesChristoph Hellwig1-10/+2
Handle setting the zone size / chunk_sectors and max_append_sectors limits together with the other ZNS limits, and just open code the call to blk_revalidate_zones in the current place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-22Merge tag 'nvme-6.8-2023-12-21' of git://git.infradead.org/nvme into ↵Jens Axboe1-16/+19
for-6.8/block Pull NVMe updates from Keith: "nvme updates for Linux 6.8 - nvme fabrics spec updates (Guixin, Max) - nvme target udpates (Guixin, Evan) - nvme attribute refactoring (Daniel) - nvme-fc numa fix (Keith)" * tag 'nvme-6.8-2023-12-21' of git://git.infradead.org/nvme: nvme-fc: set numa_node after nvme_init_ctrl nvme-fabrics: don't check discovery ioccsz/iorcsz nvmet: configfs: use ctrl->instance to track passthru subsystems nvme: repack struct nvme_ns_head nvme: add csi, ms and nuse to sysfs nvme: rename ns attribute group nvme: refactor ns info setup function nvme: refactor ns info helpers nvme: move ns id info to struct nvme_ns_head nvmet: remove cntlid_min and cntlid_max check in nvmet_alloc_ctrl nvmet: allow identical cntlid_min and cntlid_max settings nvme-fabrics: check ioccsz and iorcsz nvme: introduce nvme_check_ctrl_fabric_info helper
2023-12-20block: simplify disk_set_zonedChristoph Hellwig1-1/+1
Only use disk_set_zoned to actually enable zoned device support. For clearing it, call disk_clear_zoned, which is renamed from disk_clear_zone_settings and now directly clears the zoned flag as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-20block: remove support for the host aware zone modelChristoph Hellwig1-1/+1
When zones were first added the SCSI and ATA specs, two different models were supported (in addition to the drive managed one that is invisible to the host): - host managed where non-conventional zones there is strict requirement to write at the write pointer, or else an error is returned - host aware where a write point is maintained if writes always happen at it, otherwise it is left in an under-defined state and the sequential write preferred zones behave like conventional zones (probably very badly performing ones, though) Not surprisingly this lukewarm model didn't prove to be very useful and was finally removed from the ZBC and SBC specs (NVMe never implemented it). Due to to the easily disappearing write pointer host software could never rely on the write pointer to actually be useful for say recovery. Fortunately only a few HDD prototypes shipped using this model which never made it to mass production. Drop the support before it is too late. Note that any such host aware prototype HDD can still be used with Linux as we'll now treat it as a conventional HDD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-19nvme: refactor ns info setup functionDaniel Wagner1-7/+9
Use nvme_ns_head instead of nvme_ns where possible. This reduces the coupling between the different data structures. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-19nvme: refactor ns info helpersDaniel Wagner1-6/+6
Pass in the nvme_ns_head pointer directly. This reduces the necessity on the caller side have the nvme_ns data structure present. Thus we can refactor the caller side in the next step as well. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-19nvme: move ns id info to struct nvme_ns_headDaniel Wagner1-8/+9
Move the namesapce info to struct nvme_ns_head, because it's the same for all associated namespaces. Note: with multipathing enabled the PI information is shared between all paths. If a path is using a different PI configuration it will overwrite the previous settings. This is obviously not correct and such configuration will be rejected in future. For the time being we expect a correctly configured storage. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-06scsi: nvme: zns: Set zone limits before revalidating zonesDamien Le Moal1-5/+4
In nvme_revalidate_zones(), execute blk_queue_chunk_sectors() and blk_queue_max_zone_append_sectors() to respectively set a ZNS namespace zone size and maximum zone append sector limit before executing blk_revalidate_disk_zones(). This is to allow the block layer zone reavlidation to check these device characteristics prior to checking all zones of the device. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20230703024812.76778-3-dlemoal@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-06block: pass a gendisk to blk_queue_max_open_zones and blk_queue_max_active_zonesChristoph Hellwig1-2/+2
Switch to a gendisk based API in preparation for moving all zone related fields from the request_queue to the gendisk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-06block: pass a gendisk to blk_queue_set_zonedChristoph Hellwig1-1/+1
Prepare for storing the zone related field in struct gendisk instead of struct request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-06nvme: report write pointer for a full zone as zone start + zone lenNiklas Cassel1-1/+4
The write pointer in NVMe ZNS is invalid for a zone in zone state full. The same also holds true for ZAC/ZBC. The current behavior for NVMe is to simply propagate the wp reported by the drive, even for full zones. Since the wp is invalid for a full zone, the wp reported by the drive may be any value. The way that the sd_zbc driver handles a full zone is to always report the wp as zone start + zone len, regardless of what the drive reported. null_blk also follows this convention. Do the same for NVMe, so that a BLKREPORTZONE ioctl reports the write pointer for a full zone in a consistent way, regardless of the interface of the underlying zoned block device. blkzone report before patch: start: 0x000040000, len 0x040000, cap 0x03e000, wptr 0xfffffffffffbfff8 reset:0 non-seq:0, zcond:14(fu) [type: 2(SEQ_WRITE_REQUIRED)] blkzone report after patch: start: 0x000040000, len 0x040000, cap 0x03e000, wptr 0x040000 reset:0 non-seq:0, zcond:14(fu) [type: 2(SEQ_WRITE_REQUIRED)] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-19nvme: move command clear into the various setup helpersJens Axboe1-0/+2
We don't have to worry about doing extra memsets by moving it outside the protection of RQF_DONTPREP, as nvme doesn't do partial completions. This is in preparation for making the read/write fast path not do a full memset of the command. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-03nvme: split nvme_report_zonesChristoph Hellwig1-18/+2
Split multipath support out of nvme_report_zones into a separate helper and simplify the non-multipath version as a result. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2021-06-03nvme: move the CSI sanity check into nvme_ns_report_zonesChristoph Hellwig1-5/+4
Move the CSI check into nvme_ns_report_zones to clean up the code a little bit and prepare for further refactoring. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2021-04-15nvme: let namespace probing continue for unsupported featuresChristoph Hellwig1-2/+2
Instead of failing to scan the namespace entirely when unsupported features are detected, just mark the gendisk hidden but allow other access like the upcoming per-namespace character device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-03-11nvme: set max_zone_append_sectors nvme_revalidate_zonesChaitanya Kulkarni1-2/+7
The chunk_sectors value affects max_zone_append_sectors. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Tested-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme: cleanup zone information initializationDamien Le Moal1-8/+3
For a zoned namespace, in nvme_update_ns_info(), call nvme_update_zone_info() after executing nvme_update_disk_info() so that the namespace queue logical and physical block size limits are set. This allows setting the namespace queue max_zone_append_sectors limit in nvme_update_zone_info() instead of nvme_revalidate_zones(), simplifying this function. Also use blk_queue_set_zoned() to set the namespace zoned model. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@edc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01nvme: export zoned namespaces without Zone Append support read-onlyJavier González1-4/+9
Allow ZNS NVMe SSDs to present a read-only namespace when append is not supported, instead of rejecting the namespace directly. This allows (i) the namespace to be used in read-only mode, which is not a problem as the append command only affects the write path, and (ii) to use standard management tools such as nvme-cli to choose a different format or firmware slot that is compatible with the Linux zoned block device. Signed-off-by: Javier González <javier.gonz@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07nvme: remove the disk argument to nvme_update_zone_infoChristoph Hellwig1-3/+2
The queue can trivially be derived from the nvme_ns structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-10-07nvme: fix initialization of the zone bitmapsChristoph Hellwig1-0/+11
The removal of the ->revalidate_disk method broke the initialization of the zone bitmaps, as nvme_revalidate_disk now never gets called during initialization. Move the zone related code from nvme_revalidate_disk into a new helper in zns.c, and call it from nvme_alloc_ns in addition to nvme_validate_ns to ensure the zone bitmaps are initialized during probe. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-09-27nvme: fix error handling in nvme_ns_report_zonesChristoph Hellwig1-25/+16
nvme_submit_sync_cmd can return positive NVMe error codes in addition to the negative Linux error code, which are currently ignored. Fix this by removing __nvme_ns_report_zones and handling the errors from nvme_submit_sync_cmd in the caller instead of multiplexing the return value and the number of zones reported into a single return value. Fixes: 240e6ee272c0 ("nvme: support for zoned namespaces") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
2020-07-15block: add max_active_zones to blk-sysfsNiklas Cassel1-0/+1
Add a new max_active zones definition in the sysfs documentation. This definition will be common for all devices utilizing the zoned block device support in the kernel. Export max_active_zones according to this new definition for NVMe Zoned Namespace devices, ZAC ATA devices (which are treated as SCSI devices by the kernel), and ZBC SCSI devices. Add the new max_active_zones member to struct request_queue, rather than as a queue limit, since this property cannot be split across stacking drivers. For SCSI devices, even though max active zones is not part of the ZBC/ZAC spec, export max_active_zones as 0, signifying "no limit". Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-15block: add max_open_zones to blk-sysfsNiklas Cassel1-0/+1
Add a new max_open_zones definition in the sysfs documentation. This definition will be common for all devices utilizing the zoned block device support in the kernel. Export max open zones according to this new definition for NVMe Zoned Namespace devices, ZAC ATA devices (which are treated as SCSI devices by the kernel), and ZBC SCSI devices. Add the new max_open_zones member to struct request_queue, rather than as a queue limit, since this property cannot be split across stacking drivers. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08nvme: support for zoned namespacesKeith Busch1-0/+254
Add support for NVM Express Zoned Namespaces (ZNS) Command Set defined in NVM Express TP4053. Zoned namespaces are discovered based on their Command Set Identifier reported in the namespaces Namespace Identification Descriptor list. A successfully discovered Zoned Namespace will be registered with the block layer as a host managed zoned block device with Zone Append command support. A namespace that does not support append is not supported by the driver. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <keith.busch@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>