summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2018-11-14nvme: fix boot hang with only being able to get one IRQ vectorJens Axboe1-3/+10
NVMe always asks for io_queues + 1 worth of IRQ vectors, which means that even when we scale all the way down, we still ask for 2 vectors and get -ENOSPC in return if the system can't support more than 1. Getting just 1 vector is fine, it just means that we'll have 1 IO queue and 1 admin queue, with a shared vector between them. Check for this case and don't add our + 1 if it happens. Fixes: 3b6592f70ad7 ("nvme: utilize two queue maps, one for reads and one for writes") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-13ide: don't clear special on ide_queue_rq() entryJens Axboe2-5/+1
We can't use RQF_DONTPREP to see if we should clear ->special, as someone could have set that while inserting the request. Make sure we clear it in our ->initialize_rq_fn() helper instead. Fixes: 22ce0a7ccf23 ("ide: don't use req->special") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-12loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()Tetsuo Handa1-2/+0
Commit 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when replacing loop_index_mutex with loop_ctl_mutex. Fixes: 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") Reported-by: syzbot <syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10null_blk: remove unused nullb deviceJens Axboe1-2/+0
The compiler rightfully complains: drivers/block/null_blk_main.c: In function ‘null_complete_rq’: drivers/block/null_blk_main.c:647:16: warning: unused variable ‘nullb’ [-Wunused-variable] struct nullb *nullb = rq->q->queuedata; ^~~~~ Fixes: 49f6613632f9 ("nullb: remove leftover legacy request code") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10ide: don't use req->specialChristoph Hellwig11-26/+30
Just replace it with a field of the same name in struct ide_req. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10pd: replace ->special use with private data in the requestChristoph Hellwig1-5/+25
Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10aoe: replace ->special use with private data in the requestChristoph Hellwig4-23/+20
Makes the code a whole lot easier to read. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10skd_main: don't use req->specialChristoph Hellwig1-1/+7
Add a retries field to the internal request structure instead, which gets set to zero on the first submission. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10nullb: remove leftover legacy request codeChristoph Hellwig1-6/+3
null_softirq_done_fn is only used for the blk-mq path, so remove the other branch. Also rename the function to better match the method name. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10fnic: fix fnic_scsi_host_{start,end}_tagChristoph Hellwig1-2/+2
The way these functions abuse ->special to try to store the dummy request looks completely broken, given that it actually stores the original scsi command. Instead switch to ->host_scribble and store the actual dummy command. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10scsi: return blk_status_t from device handler ->prep_fnChristoph Hellwig5-37/+24
Remove the last use of the old BLKPREP_* values, which get converted to BLK_STS_* later anyway. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10scsi: return blk_status_t from scsi_init_io and ->init_commandChristoph Hellwig5-83/+75
Replace the old BLKPREP_* values with the BLK_STS_ ones that they are converted to later anyway. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10scsi: clean up error handling in scsi_init_ioChristoph Hellwig1-8/+7
There is no need to call scsi_mq_free_sgtables until we have actually allocated sgtables. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10scsi: push blk_status_t up into scsi_setup_{fs,scsi}_cmndChristoph Hellwig1-21/+24
This just moves the prep_to_mq calls up in preparation of further removal of BLKPREP_* usage. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10scsi: simplify scsi_prep_state_checkChristoph Hellwig1-54/+48
Return a blk_status_t directly, and make the code a little more compact by handling the fast path in the caller. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10ide: cleanup ->prep_rq calling conventionChristoph Hellwig3-17/+17
The return value is just used as a binary yes/no decision, so switch it to a bool instead of the old BLKPREP_* values returned as an int. Also clean up a few related comments. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xxx: use for_each_sgChristoph Hellwig1-3/+2
Use the proper helper instead of manually iterating the scatterlist, which is broken in the presence of chained S/G lists. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: don't use req->specialChristoph Hellwig2-4/+8
Instead create add to the icmd into struct mtip_cmd which can be unioned with the scatterlist used for the normal I/O path. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: remove mtip_get_int_commandChristoph Hellwig1-17/+7
Merging this function into the only callers makes the code flow easier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: remove mtip_init_cmd_headerChristoph Hellwig2-34/+15
There isn't much need for this helper - we can just calculate the offset for the command header once late in the submission path and fill out the ctba and ctbau fields there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: add missing endianess annotations on struct smart_attrChristoph Hellwig1-2/+2
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: remove __force_bit2intChristoph Hellwig2-35/+26
There is no good excuse not to use proper __le16/32 types. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: return a blk_status_t from mtip_send_trimChristoph Hellwig1-19/+11
This allows for better error propagation and simpler code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: merge mtip_submit_request into mtip_queue_rqChristoph Hellwig1-50/+28
Factor out a new is_stopped helper that matches the existing is_se_active helper, and merge the trivial amount of remaining code into the only caller. This also allows better error handling by returning a BLK_STS_* directly instead of explicitly calling blk_mq_end_request, and moving blk_mq_start_request closer to the actual issue to hardware. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09mtip32xx: move the blk_rq_map_sg call to mtip_hw_submit_ioChristoph Hellwig1-7/+4
We have all arguments at hand in mtip_hw_submit_io, so keep the rq to sg mapping close to the dma_map_sg call. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09sx8: use a per-host tag_setChristoph Hellwig1-248/+95
The current sx8 code spends a lot of effort dealing with the fact that tags are per-host, but there might be multiple queues. Now that the driver has been converted to blk-mq it can take care of the blk-mq tag_set concept that has been designed just for that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09sx8: cleanup queue and disk allocation / freeingChristoph Hellwig1-59/+48
Make the disk/queue alloc and free helpers per-port by moving the trivial loops into the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08blk-mq-tag: change busy_iter_fn to return whether to continue or notJens Axboe6-11/+18
We have this functionality in sbitmap, but we don't export it in blk-mq for users of the tags busy iteration. This can be useful for stopping the iteration, if the caller doesn't need to find more requests. Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Get rid of 'nested' acquisition of loop_ctl_mutexJan Kara1-6/+6
The nested acquisition of loop_ctl_mutex (->lo_ctl_mutex back then) has been introduced by commit f028f3b2f987e "loop: fix circular locking in loop_clr_fd()" to fix lockdep complains about bd_mutex being acquired after lo_ctl_mutex during partition rereading. Now that these are properly fixed, let's stop fooling lockdep. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Avoid circular locking dependency between loop_ctl_mutex and bd_mutexJan Kara1-11/+15
Code in loop_change_fd() drops reference to the old file (and also the new file in a failure case) under loop_ctl_mutex. Similarly to a situation in loop_set_fd() this can create a circular locking dependency if this was the last reference holding the file open. Delay dropping of the file reference until we have released loop_ctl_mutex. Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Fix deadlock when calling blkdev_reread_part()Jan Kara1-12/+16
Calling blkdev_reread_part() under loop_ctl_mutex causes lockdep to complain about circular lock dependency between bdev->bd_mutex and lo->lo_ctl_mutex. The problem is that on loop device open or close lo_open() and lo_release() get called with bdev->bd_mutex held and they need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is called with loop_ctl_mutex held, it will call blkdev_reread_part() which acquires bdev->bd_mutex. See syzbot report for details [1]. Move call to blkdev_reread_part() in __loop_clr_fd() from under loop_ctl_mutex to finish fixing of the lockdep warning and the possible deadlock. [1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588 Reported-by: syzbot <syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Move loop_reread_partitions() out of loop_ctl_mutexJan Kara1-5/+14
Calling loop_reread_partitions() under loop_ctl_mutex causes lockdep to complain about circular lock dependency between bdev->bd_mutex and lo->lo_ctl_mutex. The problem is that on loop device open or close lo_open() and lo_release() get called with bdev->bd_mutex held and they need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is called with loop_ctl_mutex held, it will call blkdev_reread_part() which acquires bdev->bd_mutex. See syzbot report for details [1]. Move all calls of loop_rescan_partitions() out of loop_ctl_mutex to avoid lockdep warning and fix deadlock possibility. [1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588 Reported-by: syzbot <syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Move special partition reread handling in loop_clr_fd()Jan Kara1-14/+19
The call of __blkdev_reread_part() from loop_reread_partition() happens only when we need to invalidate partitions from loop_release(). Thus move a detection for this into loop_clr_fd() and simplify loop_reread_partition(). This makes loop_reread_partition() safe to use without loop_ctl_mutex because we use only lo->lo_number and lo->lo_file_name in case of error for reporting purposes (thus possibly reporting outdate information is not a big deal) and we are safe from 'lo' going away under us by elevated lo->lo_refcnt. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push loop_ctl_mutex down to loop_change_fd()Jan Kara1-11/+11
Push loop_ctl_mutex down to loop_change_fd(). We will need this to be able to call loop_reread_partitions() without loop_ctl_mutex. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push loop_ctl_mutex down to loop_set_fd()Jan Kara1-12/+14
Push lo_ctl_mutex down to loop_set_fd(). We will need this to be able to call loop_reread_partitions() without lo_ctl_mutex. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push loop_ctl_mutex down to loop_set_status()Jan Kara1-26/+25
Push loop_ctl_mutex down to loop_set_status(). We will need this to be able to call loop_reread_partitions() without loop_ctl_mutex. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push loop_ctl_mutex down to loop_get_status()Jan Kara1-27/+10
Push loop_ctl_mutex down to loop_get_status() to avoid the unusual convention that the function gets called with loop_ctl_mutex held and releases it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push loop_ctl_mutex down into loop_clr_fd()Jan Kara1-20/+29
loop_clr_fd() has a weird locking convention that is expects loop_ctl_mutex held, releases it on success and keeps it on failure. Untangle the mess by moving locking of loop_ctl_mutex into loop_clr_fd(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Split setting of lo_state from loop_clr_fdJan Kara1-21/+31
Move setting of lo_state to Lo_rundown out into the callers. That will allow us to unlock loop_ctl_mutex while the loop device is protected from other changes by its special state. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Push lo_ctl_mutex down into individual ioctlsJan Kara1-25/+63
Push acquisition of lo_ctl_mutex down into individual ioctl handling branches. This is a preparatory step for pushing the lock down into individual ioctl handling functions so that they can release the lock as they need it. We also factor out some simple ioctl handlers that will not need any special handling to reduce unnecessary code duplication. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Get rid of loop_index_mutexJan Kara1-21/+20
Now that loop_ctl_mutex is global, just get rid of loop_index_mutex as there is no good reason to keep these two separate and it just complicates the locking. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08loop: Fold __loop_release into loop_releaseJan Kara1-9/+7
__loop_release() has a single call site. Fold it there. This is currently not a huge win but it will make following replacement of loop_index_mutex more obvious. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08block/loop: Use global lock for ioctl() operation.Tetsuo Handa2-30/+29
syzbot is reporting NULL pointer dereference [1] which is caused by race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other loop devices at loop_validate_file() without holding corresponding lo->lo_ctl_mutex locks. Since ioctl() request on loop devices is not frequent operation, we don't need fine grained locking. Let's use global lock in order to allow safe traversal at loop_validate_file(). Note that syzbot is also reporting circular locking dependency between bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling blkdev_reread_part() with lock held. This patch does not address it. [1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3 [2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+bf89c128e05dd6c62523@syzkaller.appspotmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08block/loop: Don't grab "struct file" for vfs_getattr() operation.Tetsuo Handa1-5/+5
vfs_getattr() needs "struct path" rather than "struct file". Let's use path_get()/path_put() rather than get_file()/fput(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08ms_block: remove unused pointer 'set'Colin Ian King1-1/+0
Pointer 'set' is declared but not used, remove it. Cleans up warning: warning: unused variable ‘set’ [-Wunused-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08sunvdc: fix compiler warningJens Axboe1-1/+0
Stephen reports: After merging the block tree, today's linux-next build (sparc64 defconfig) produced this warning: /home/sfr/next/next/drivers/block/sunvdc.c: In function 'init_queue': /home/sfr/next/next/drivers/block/sunvdc.c:788:6: warning: unused variable 'ret' [-Wunused-variable] int ret; ^~~ Kill the unused variable. Fixes: fa182a1fa97d ("sunvdc: convert to blk-mq") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07nvme: add separate poll queue mapJens Axboe1-17/+80
Adds support for defining a variable number of poll queues, currently configurable with the 'poll_queues' module parameter. Defaults to a single poll queue. And now we finally have poll support without triggering interrupts! Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07nvme: utilize two queue maps, one for reads and one for writesJens Axboe1-19/+181
NVMe does round-robin between queues by default, which means that sharing a queue map for both reads and writes can be problematic in terms of read servicing. It's much easier to flood the queue with writes and reduce the read servicing. Implement two queue maps, one for reads and one for writes. The write queue count is configurable through the 'write_queues' parameter. By default, we retain the previous behavior of having a single queue set, shared between reads and writes. Setting 'write_queues' to a non-zero value will create two queue sets, one for reads and one for writes, the latter using the configurable number of queues (hardware queue counts permitting). Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07blk-mq: abstract out queue mapJens Axboe6-7/+10
This is in preparation for allowing multiple sets of maps per queue, if so desired. Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07Merge branch 'irq/for-block' of ↵Jens Axboe1-0/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-4.21/block Pull in the irq affinity commits, that are staged through Thomas's tree. * 'irq/for-block' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/affinity: Add support for allocating interrupt sets genirq/affinity: Pass first vector to __irq_build_affinity_masks() genirq/affinity: Move two stage affinity spreading into a helper function genirq/affinity: Spread IRQs to all available NUMA nodes