summaryrefslogtreecommitdiff
path: root/block/fops.c
AgeCommit message (Collapse)AuthorFilesLines
2021-11-09Merge tag 'for-5.16/bdev-size-2021-11-09' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+2
Pull more bdev size updates from Jens Axboe: "Two followup changes for the bdev-size series from this merge window: - Add loff_t cast to bdev_nr_bytes() (Christoph) - Use bdev_nr_bytes() consistently for the block parts at least (me)" * tag 'for-5.16/bdev-size-2021-11-09' of git://git.kernel.dk/linux-block: block: use new bdev_nr_bytes() helper for blkdev_{read,write}_iter() block: add a loff_t cast to bdev_nr_bytes
2021-11-05block: use new bdev_nr_bytes() helper for blkdev_{read,write}_iter()Jens Axboe1-2/+2
We have new helpers for this, use them rather than the slower inode size reads. This makes the read/write path consistent with most of the rest of block as well. Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/a72767cd-3c6d-47f7-80f4-aa025a17b2cb@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-01Merge tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+2
Pull kiocb->ki_complete() cleanup from Jens Axboe: "This removes the res2 argument from kiocb->ki_complete(). Only the USB gadget code used it, everybody else passes 0. The USB guys checked the user gadget code they could find, and everybody just uses res as expected for the async interface" * tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block: fs: get rid of the res2 iocb->ki_complete argument usb: remove res2 argument from gadget code completions
2021-11-01Merge tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull bdev size cleanups from Jens Axboe: "Clean up the bdev size handling with new bdev_nr_bytes() helper" * tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-block: (34 commits) partitions/ibm: use bdev_nr_sectors instead of open coding it partitions/efi: use bdev_nr_bytes instead of open coding it block/ioctl: use bdev_nr_sectors and bdev_nr_bytes block: cache inode size in bdev udf: use sb_bdev_nr_blocks reiserfs: use sb_bdev_nr_blocks ntfs: use sb_bdev_nr_blocks jfs: use sb_bdev_nr_blocks ext4: use sb_bdev_nr_blocks block: add a sb_bdev_nr_blocks helper block: use bdev_nr_bytes instead of open coding it in blkdev_fallocate squashfs: use bdev_nr_bytes instead of open coding it reiserfs: use bdev_nr_bytes instead of open coding it pstore/blk: use bdev_nr_bytes instead of open coding it ntfs3: use bdev_nr_bytes instead of open coding it nilfs2: use bdev_nr_bytes instead of open coding it nfs/blocklayout: use bdev_nr_bytes instead of open coding it jfs: use bdev_nr_bytes instead of open coding it hfsplus: use bdev_nr_sectors instead of open coding it hfs: use bdev_nr_sectors instead of open coding it ...
2021-10-27block: add async version of bio_set_polledPavel Begunkov1-4/+3
If we know that a iocb is async we can optimise bio_set_polled() a bit, add a new helper bio_set_polled_async(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8fa137885164a5d05fadcff4c3521da8d5a83d00.1635337135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-27block: kill DIO_MULTI_BIOPavel Begunkov1-21/+12
Now __blkdev_direct_IO() serves only multi-bio I/O, thus remove not used anymore single bio refcounting optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/88eb488aae9ed4852a30f3a7132f296f56e43b80.1635337135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-27block: kill unused polling bits in __blkdev_direct_IO()Pavel Begunkov1-17/+3
With addition of __blkdev_direct_IO_async(), __blkdev_direct_IO() now serves only multio-bio I/O, which we don't poll. Now we can remove anything related to I/O polling from it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b8c597a6b7ee612df394853bfd24726aee5b898e.1635337135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-27block: avoid extra iter advance with async iocbPavel Begunkov1-5/+15
Nobody cares about iov iterators state if we return -EIOCBQUEUED, so as the we now have __blkdev_direct_IO_async(), which gets pages only once, we can skip expensive iov_iter_advance(). It's around 1-2% of all CPU spent. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a6158edfbfa2ae3bc24aed29a72f035df18fad2f.1635337135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-25fs: get rid of the res2 iocb->ki_complete argumentJens Axboe1-1/+1
The second argument was only used by the USB gadget code, yet everyone pays the overhead of passing a zero to be passed into aio, where it ends up being part of the aio res2 value. Now that everybody is passing in zero, kill off the extra argument. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-25block: add single bio async direct IO helperPavel Begunkov1-3/+84
As with __blkdev_direct_IO_simple(), we can implement direct IO more efficiently if there is only one bio. Add __blkdev_direct_IO_async() and blkdev_bio_end_io_async(). This patch brings me from 4.45-4.5 MIOPS with nullblk to 4.7+. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f0ae4109b7a6934adede490f84d188d53b97051b.1635006010.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21block: convert fops.c magic constants to SHIFT_SECTORPavel Begunkov1-8/+10
Don't use shifting by a magic number 9 but replace with a more descriptive SHIFT_SECTOR. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/068782b9f7e97569fb59a99529b23bb17ea4c5e2.1634755800.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21block: optimise boundary blkdev_read_iter's checksPavel Begunkov1-8/+11
Combine pos and len checks and mark unlikely. Also, don't reexpand if it's not truncated. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fff34e613aeaae1ad12977dc4592cb1a1f5d3190.1634755800.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19block: align blkdev_dio inlined bio to a cachelineJens Axboe1-1/+1
We get all sorts of unreliable and funky results since the bio is designed to align on a cacheline, which it does not when inlined like this. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: use bdev_nr_bytes instead of open coding it in blkdev_fallocateChristoph Hellwig1-1/+1
Use the proper helper to read the block device size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211018101130.1838532-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: add a struct io_comp_batch argument to fops->iopoll()Jens Axboe1-2/+2
struct io_comp_batch contains a list head and a completion handler, which will allow completions to more effciently completed batches of IO. For now, no functional changes in this patch, we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: use flags instead of bit fields for blkdev_dioJens Axboe1-14/+20
This generates a lot better code for me, and bumps performance from 7650K IOPS to 7750K IOPS. Looking at profiles for the run and running perf diff, it confirms that we're now sending a lot less time there: 6.38% -2.80% [kernel.vmlinux] [k] blkdev_direct_IO Taking it from the 2nd most cycle consumer to only the 9th most at 3.35% of the CPU time. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: cache bdev in struct file for raw bdev IOPavel Begunkov1-15/+12
bdev = &BDEV_I(file->f_mapping->host)->bdev Getting struct block_device from a file requires 2 memory dereferences as illustrated above, that takes a toll on performance, so cache it in yet unused file->private_data. That gives a noticeable peak performance improvement. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/8415f9fe12e544b9da89593dfbca8de2b52efe03.1634115360.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: switch polling to be bio basedChristoph Hellwig1-17/+8
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: replace the spin argument to blk_iopoll with a flags argumentChristoph Hellwig1-4/+4
Switch the boolean spin argument to blk_poll to passing a set of flags instead. This will allow to control polling behavior in a more fine grained way. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-10-hch@lst.de [axboe: adapt to changed io_uring iopoll] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: don't try to poll multi-bio I/Os in __blkdev_direct_IOChristoph Hellwig1-14/+7
If an iocb is split into multiple bios we can't poll for both. So don't even bother to try to poll in that case. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012111226.760968-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: merge block_ioctl into blkdev_ioctlChristoph Hellwig1-18/+1
Simplify the ioctl path and match the code structure on the compat side. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012104450.659013-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-24block: hold ->invalidate_lock in blkdev_fallocateMing Lei1-11/+10
When running ->fallocate(), blkdev_fallocate() should hold mapping->invalidate_lock to prevent page cache from being accessed, otherwise stale data may be read in page cache. Without this patch, blktests block/009 fails sometimes. With this patch, block/009 can pass always. Also as Jan pointed out, no pages can be created in the discarded area while you are holding the invalidate_lock, so remove the 2nd truncate_bdev_range(). Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210923023751.1441091-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-07block: split out operations on block special filesChristoph Hellwig1-0/+640
Add a new block/fops.c for all the file and address_space operations that provide the block special file support. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210907141303.1371844-2-hch@lst.de [axboe: correct trailing whitespace while at it] Signed-off-by: Jens Axboe <axboe@kernel.dk>