summaryrefslogtreecommitdiff
path: root/fs/btrfs/volumes.c
AgeCommit message (Collapse)AuthorFilesLines
2017-05-05btrfs: fix the gfp_mask for the reada_zones radix treeChris Mason1-1/+1
Commits cc8385b59e17 and 7ef70b4d9987a7 added preallocation for the reada radix trees and also switched them over to GFP_KERNEL for the default gfp mask. Since we're doing radix tree insertions under spinlocks, we need to make sure the mask doesn't allow sleeping. This fix keeps the radix preallocation but switches back to the original gfp_mask. Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2017-04-18btrfs: use q which is already obtained from bdev_get_queueAnand Jain1-4/+3
We have already assigned q from bdev_get_queue() so use it. And rearrange the code for better view. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: switch to div64_u64 if with a u64 divisorLiu Bo1-4/+4
This is fixing code pieces where we use div_u64 when passing a u64 divisor. Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: drop redundant parameters from btrfs_map_sblockDavid Sterba1-4/+2
All callers pass 0 for mirror_num and 1 for need_raid_map. Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: track exclusive filesystem operation in flagsDavid Sterba1-3/+3
There are several operations, usually started from ioctls, that cannot run concurrently. The status is tracked in mutually_exclusive_operation_running as an atomic_t. We can easily track the status as one of the per-filesystem flag bits with same synchronization guarantees. The conversion replaces: * atomic_xchg(..., 1) -> test_and_set_bit(FLAG, ...) * atomic_set(..., 0) -> clear_bit(FLAG, ...) Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: preallocate radix tree node for readaheadDavid Sterba1-1/+1
We can preallocate the node so insertion does not have to do that under the lock. The GFP flags for the per-device radix tree are initialized to GFP_NOFS & ~__GFP_DIRECT_RECLAIM but we can use GFP_KERNEL, same as an allocation above anyway, but also because readahead is optional and not on any critical writeout path. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: convert BUG_ON to WARN_ONLiu Bo1-2/+2
These two BUG_ON()s would never be true, ensured by callers' logic. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: helper for ops that requires full stripeLiu Bo1-8/+10
This adds a helper to show directly whether ops require full stripe. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: do not add extra mirror when dev_replace target dev is not availableLiu Bo1-3/+4
With this, we can avoid allocating memory for dev replace copies if the target dev is not available. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: handle operations for device replace separatelyLiu Bo1-81/+98
Since this part is mostly independent, this moves it to a separate function. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: introduce a function to get extra mirror from replaceLiu Bo1-72/+89
As the part of getting extra mirror in __btrfs_map_block is independent, this puts it into a separate function. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: separate DISCARD from __btrfs_map_blockLiu Bo1-114/+175
Since DISCARD is not as important as an operation like write, we don't copy it to target device during replace, and it makes __btrfs_map_block less complex. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18Btrfs: create a helper for getting chunk mapLiu Bo1-108/+55
We have similar code here and there, this merges them into a helper. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: convert extent_map.refs from atomic_t to refcount_tElena Reshetova1-1/+1
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: convert btrfs_bio.refs from atomic_t to refcount_tElena Reshetova1-4/+4
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18btrfs: fix a bogus warning when converting only data or metadataAdam Borowski1-3/+9
If your filesystem has, eg, data:raid0 metadata:raid1, and you run "btrfs balance -dconvert=raid1", the meta.target field will be uninitialized. That's otherwise ok, as it's unused except for this warning. Thus, let's use the existing set of raid levels for the comparison. As a side effect, non-convert balances will now nag about data>metadata. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-15Merge branch 'for-linus-4.11' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Dave Sterba collected a few more fixes for the last rc. These aren't marked for stable, but I'm putting them in with a batch were testing/sending by hand for this release" * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix potential use-after-free for cloned bio Btrfs: fix segmentation fault when doing dio read Btrfs: fix invalid dereference in btrfs_retry_endio btrfs: drop the nossd flag when remounting with -o ssd
2017-04-11Btrfs: fix potential use-after-free for cloned bioLiu Bo1-1/+1
KASAN reports that there is a use-after-free case of bio in btrfs_map_bio. If we need to submit IOs to several disks at a time, the original bio would get cloned and mapped to the destination disk, but we really should use the original bio instead of a cloned bio to do the sanity check because cloned bios are likely to be freed by its endio. Reported-by: Diego <diegocg@gmail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-03Merge branch 'for-linus-4.11' of ↵Linus Torvalds1-9/+12
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull more btrfs updates from Chris Mason: "Btrfs round two. These are mostly a continuation of Dave Sterba's collection of cleanups, but Filipe also has some bug fixes and performance improvements" * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (69 commits) btrfs: add dummy callback for readpage_io_failed and drop checks btrfs: drop checks for mandatory extent_io_ops callbacks btrfs: document existence of extent_io ops callbacks btrfs: let writepage_end_io_hook return void btrfs: do proper error handling in btrfs_insert_xattr_item btrfs: handle allocation error in update_dev_stat_item btrfs: remove BUG_ON from __tree_mod_log_insert btrfs: derive maximum output size in the compression implementation btrfs: use predefined limits for calculating maximum number of pages for compression btrfs: export compression buffer limits in a header btrfs: merge nr_pages input and output parameter in compress_pages btrfs: merge length input and output parameter in compress_pages btrfs: constify name of subvolume in creation helpers btrfs: constify buffers used by compression helpers btrfs: constify input buffer of btrfs_csum_data btrfs: constify device path passed to relevant helpers btrfs: make btrfs_inode_resume_unlocked_dio take btrfs_inode btrfs: make btrfs_inode_block_unlocked_dio take btrfs_inode btrfs: Make btrfs_add_nondir take btrfs_inode btrfs: Make btrfs_add_link take btrfs_inode ...
2017-02-28btrfs: handle allocation error in update_dev_stat_itemDavid Sterba1-1/+2
Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: constify device path passed to relevant helpersDavid Sterba1-8/+10
Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-26Merge branch 'for-linus-4.11' of ↵Linus Torvalds1-11/+7
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This has a series of fixes and cleanups that Dave Sterba has been collecting. There is a pretty big variety here, cleaning up internal APIs and fixing corner cases" * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (124 commits) Btrfs: use the correct type when creating cow dio extent Btrfs: fix deadlock between dedup on same file and starting writeback btrfs: use btrfs_debug instead of pr_debug in transaction abort btrfs: btrfs_truncate_free_space_cache always allocates path btrfs: free-space-cache, clean up unnecessary root arguments btrfs: convert btrfs_inc_block_group_ro to accept fs_info btrfs: flush_space always takes fs_info->fs_root btrfs: pass fs_info to (more) routines that are only called with extent_root btrfs: qgroup: Move half of the qgroup accounting time out of commit trans btrfs: remove unused parameter from adjust_slots_upwards btrfs: remove unused parameters from __btrfs_write_out_cache btrfs: remove unused parameter from cleanup_write_cache_enospc btrfs: remove unused parameter from __add_inode_ref btrfs: remove unused parameter from clone_copy_inline_extent btrfs: remove unused parameters from btrfs_cmp_data btrfs: remove unused parameter from __add_inline_refs btrfs: remove unused parameters from scrub_setup_wr_ctx btrfs: remove unused parameter from create_snapshot btrfs: remove unused parameter from init_first_rw_device btrfs: remove unused parameter from __btrfs_alloc_chunk ...
2017-02-17btrfs: remove unused parameter from init_first_rw_deviceDavid Sterba1-5/+3
The 'device' used to be added in that function, but now it's done by the caller. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-17btrfs: remove unused parameter from __btrfs_alloc_chunkDavid Sterba1-6/+4
We grab fs_info from other parameters. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-02block: Get rid of blk_get_backing_dev_info()Jan Kara1-1/+1
blk_get_backing_dev_info() is now a simple dereference. Remove that function and simplify some code around that. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-16Merge branch 'for-linus-4.10' of ↵Linus Torvalds1-432/+413
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "Jeff Mahoney and Dave Sterba have a really nice set of cleanups in here, and Christoph pitched in corrections/improvements to make btrfs use proper helpers for bio walking instead of doing it by hand. There are some key fixes as well, including some long standing bugs that took forever to track down in btrfs_drop_extents and during balance" * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (77 commits) btrfs: limit async_work allocation and worker func duration Revert "Btrfs: adjust len of writes if following a preallocated extent" Btrfs: don't WARN() in btrfs_transaction_abort() for IO errors btrfs: opencode chunk locking, remove helpers btrfs: remove root parameter from transaction commit/end routines btrfs: split btrfs_wait_marked_extents into normal and tree log functions btrfs: take an fs_info directly when the root is not used otherwise btrfs: simplify btrfs_wait_cache_io prototype btrfs: convert extent-tree tracepoints to use fs_info btrfs: root->fs_info cleanup, access fs_info->delayed_root directly btrfs: root->fs_info cleanup, add fs_info convenience variables btrfs: root->fs_info cleanup, update_block_group{,flags} btrfs: root->fs_info cleanup, lock/unlock_chunks btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size btrfs: pull node/sector/stripe sizes out of root and into fs_info btrfs: root->fs_info cleanup, io_ctl_init btrfs: root->fs_info cleanup, use fs_info->dev_root everywhere btrfs: struct reada_control.root -> reada_control.fs_info btrfs: struct btrfsic_state->root should be an fs_info btrfs: alloc_reserved_file_extent trace point should use extent_root ...
2016-12-06btrfs: opencode chunk locking, remove helpersDavid Sterba1-35/+35
The helpers are trivial and we don't use them consistently. Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: remove root parameter from transaction commit/end routinesJeff Mahoney1-17/+16
Now we only use the root parameter to print the root objectid in a tracepoint. We can use the root parameter from the transaction handle for that. It's also used to join the transaction with async commits, so we remove the comment that it's just for checking. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: take an fs_info directly when the root is not used otherwiseJeff Mahoney1-92/+67
There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: root->fs_info cleanup, add fs_info convenience variablesJeff Mahoney1-254/+257
In routines where someptr->fs_info is referenced multiple times, we introduce a convenience variable. This makes the code considerably more readable. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: root->fs_info cleanup, lock/unlock_chunksJeff Mahoney1-35/+35
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: pull node/sector/stripe sizes out of root and into fs_infoJeff Mahoney1-19/+22
We track the node sizes per-root, but they never vary from the values in the superblock. This patch messes with the 80-column style a bit, but subsequent patches to factor out root->fs_info into a convenience variable fix it up again. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: root->fs_info cleanup, use fs_info->dev_root everywhereJeff Mahoney1-21/+20
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: btrfs_init_new_device should use fs_info->dev_rootJeff Mahoney1-1/+2
btrfs_init_new_device only uses the root passed in via the ioctl to start the transaction. Nothing else that happens is related to whatever root the user used to initiate the ioctl. We can drop the root requirement and just use fs_info->dev_root instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: call functions that always use the same root with fs_info insteadJeff Mahoney1-21/+26
There are many functions that are always called with the same root argument. Rather than passing the same root every time, we can pass an fs_info pointer instead and have the function get the root pointer itself. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-12-06btrfs: call functions that overwrite their root parameter with fs_infoJeff Mahoney1-31/+25
There are 11 functions that accept a root parameter and immediately overwrite it. We can pass those an fs_info pointer instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30btrfs: remove constant parameter to memset_extent_buffer and rename itDavid Sterba1-1/+1
The only memset we do is to 0, so sink the parameter to the function and simplify all calls. Rename the function to reflect the behaviour. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30btrfs: use new helpers to set uuids in ebDavid Sterba1-2/+2
Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-29btrfs: don't abuse REQ_OP_* flags for btrfs_map_blockChristoph Hellwig1-28/+30
btrfs_map_block supports different types of mappings, which to a large extent resemble block layer operations. But they don't always do, and currently btrfs dangerously overlays it's own flag over the block layer flags. This is just asking for a conflict, so introduce a different map flags enum inside of btrfs instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-01block,fs: use REQ_* flags directlyChristoph Hellwig1-1/+1
Remove the WRITE_* and READ_SYNC wrappers, and just use the flags directly. Where applicable this also drops usage of the bio_set_op_attrs wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-01btrfs: use op_is_sync to check for synchronous requestsChristoph Hellwig1-1/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-10Revert "btrfs: let btrfs_delete_unused_bgs() to clean relocated bgs"Chris Mason1-10/+14
This reverts commit 5d8eb6fe517583f9c6d5b94faf2254a0207a45c9. When we remove devices, we free the device structures. Delaying btfs_remove_chunk() ends up hitting a use-after-free on them. Signed-off-by: Chris Mason <clm@fb.com>
2016-09-26btrfs: fix a possible umount deadlockAnand Jain1-6/+20
btrfs_show_devname() is using the device_list_mutex, sometimes a call to blkdev_put() leads vfs calling into this func. So call blkdev_put() outside of device_list_mutex, as of now. [ 983.284212] ====================================================== [ 983.290401] [ INFO: possible circular locking dependency detected ] [ 983.296677] 4.8.0-rc5-ceph-00023-g1b39cec2 #1 Not tainted [ 983.302081] ------------------------------------------------------- [ 983.308357] umount/21720 is trying to acquire lock: [ 983.313243] (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.321264] [ 983.321264] but task is already holding lock: [ 983.327101] (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs] [ 983.337839] [ 983.337839] which lock already depends on the new lock. [ 983.337839] [ 983.346024] [ 983.346024] the existing dependency chain (in reverse order) is: [ 983.353512] -> #4 (&fs_devs->device_list_mutex){+.+...}: [ 983.359096] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.365143] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.371521] [<ffffffffc02d8116>] btrfs_show_devname+0x36/0x1f0 [btrfs] [ 983.378710] [<ffffffff9129523e>] show_vfsmnt+0x4e/0x150 [ 983.384593] [<ffffffff9126ffc7>] m_show+0x17/0x20 [ 983.389957] [<ffffffff91276405>] seq_read+0x2b5/0x3b0 [ 983.395669] [<ffffffff9124c808>] __vfs_read+0x28/0x100 [ 983.401464] [<ffffffff9124eb3b>] vfs_read+0xab/0x150 [ 983.407080] [<ffffffff9124ec32>] SyS_read+0x52/0xb0 [ 983.412609] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.419617] -> #3 (namespace_sem){++++++}: [ 983.424024] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.430074] [<ffffffff918239e9>] down_write+0x49/0x80 [ 983.435785] [<ffffffff91272457>] lock_mount+0x67/0x1c0 [ 983.441582] [<ffffffff91272ab2>] do_add_mount+0x32/0xf0 [ 983.447458] [<ffffffff9127363a>] finish_automount+0x5a/0xc0 [ 983.453682] [<ffffffff91259513>] follow_managed+0x1b3/0x2a0 [ 983.459912] [<ffffffff9125b750>] lookup_fast+0x300/0x350 [ 983.465875] [<ffffffff9125d6e7>] path_openat+0x3a7/0xaa0 [ 983.471846] [<ffffffff9125ef75>] do_filp_open+0x85/0xe0 [ 983.477731] [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0 [ 983.483702] [<ffffffff9124c4de>] SyS_open+0x1e/0x20 [ 983.489240] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.496254] -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}: [ 983.501798] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.507855] [<ffffffff918239e9>] down_write+0x49/0x80 [ 983.513558] [<ffffffff91366237>] start_creating+0x87/0x100 [ 983.519703] [<ffffffff91366647>] debugfs_create_dir+0x17/0x100 [ 983.526195] [<ffffffff911df153>] bdi_register+0x93/0x210 [ 983.532165] [<ffffffff911df313>] bdi_register_owner+0x43/0x70 [ 983.538570] [<ffffffff914080fb>] device_add_disk+0x1fb/0x450 [ 983.544888] [<ffffffff91580226>] loop_add+0x1e6/0x290 [ 983.550596] [<ffffffff91fec358>] loop_init+0x10b/0x14f [ 983.556394] [<ffffffff91002207>] do_one_initcall+0xa7/0x180 [ 983.562618] [<ffffffff91f932e0>] kernel_init_freeable+0x1cc/0x266 [ 983.569370] [<ffffffff918174be>] kernel_init+0xe/0x100 [ 983.575166] [<ffffffff9182620f>] ret_from_fork+0x1f/0x40 [ 983.581131] -> #1 (loop_index_mutex){+.+.+.}: [ 983.585801] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.591858] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.598256] [<ffffffff9157ed3f>] lo_open+0x1f/0x60 [ 983.603704] [<ffffffff9128eec3>] __blkdev_get+0x123/0x400 [ 983.609757] [<ffffffff9128f4ea>] blkdev_get+0x34a/0x350 [ 983.615639] [<ffffffff9128f554>] blkdev_open+0x64/0x80 [ 983.621428] [<ffffffff9124aff6>] do_dentry_open+0x1c6/0x2d0 [ 983.627651] [<ffffffff9124c029>] vfs_open+0x69/0x80 [ 983.633181] [<ffffffff9125db74>] path_openat+0x834/0xaa0 [ 983.639152] [<ffffffff9125ef75>] do_filp_open+0x85/0xe0 [ 983.645035] [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0 [ 983.650999] [<ffffffff9124c4de>] SyS_open+0x1e/0x20 [ 983.656535] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.663541] -> #0 (&bdev->bd_mutex){+.+.+.}: [ 983.668107] [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0 [ 983.674510] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.680561] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.686967] [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.692761] [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs] [ 983.699699] [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs] [ 983.707178] [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs] [ 983.714380] [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs] [ 983.721061] [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs] [ 983.727908] [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100 [ 983.734744] [<ffffffff91250f56>] kill_anon_super+0x16/0x30 [ 983.740888] [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs] [ 983.747909] [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80 [ 983.754745] [<ffffffff912515fd>] deactivate_super+0x5d/0x70 [ 983.760977] [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80 [ 983.766773] [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20 [ 983.772738] [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0 [ 983.778708] [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4 [ 983.785373] [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0 [ 983.792212] [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1 [ 983.799225] [ 983.799225] other info that might help us debug this: [ 983.799225] [ 983.807291] Chain exists of: &bdev->bd_mutex --> namespace_sem --> &fs_devs->device_list_mutex [ 983.816521] Possible unsafe locking scenario: [ 983.816521] [ 983.822489] CPU0 CPU1 [ 983.827043] ---- ---- [ 983.831599] lock(&fs_devs->device_list_mutex); [ 983.836289] lock(namespace_sem); [ 983.842268] lock(&fs_devs->device_list_mutex); [ 983.849478] lock(&bdev->bd_mutex); [ 983.853127] [ 983.853127] *** DEADLOCK *** [ 983.853127] [ 983.859113] 3 locks held by umount/21720: [ 983.863145] #0: (&type->s_umount_key#35){++++..}, at: [<ffffffff912515f5>] deactivate_super+0x55/0x70 [ 983.872713] #1: (uuid_mutex){+.+.+.}, at: [<ffffffffc033d8d3>] btrfs_close_devices+0x23/0xa0 [btrfs] [ 983.882206] #2: (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs] [ 983.893422] [ 983.893422] stack backtrace: [ 983.897824] CPU: 6 PID: 21720 Comm: umount Not tainted 4.8.0-rc5-ceph-00023-g1b39cec2 #1 [ 983.905958] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015 [ 983.913492] 0000000000000000 ffff8c8a53c17a38 ffffffff91429521 ffffffff9260f4f0 [ 983.921018] ffffffff92642760 ffff8c8a53c17a88 ffffffff911b2b04 0000000000000050 [ 983.928542] ffffffff9237d620 ffff8c8a5294aee0 ffff8c8a5294aeb8 ffff8c8a5294aee0 [ 983.936072] Call Trace: [ 983.938545] [<ffffffff91429521>] dump_stack+0x85/0xc4 [ 983.943715] [<ffffffff911b2b04>] print_circular_bug+0x1fb/0x20c [ 983.949748] [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0 [ 983.955613] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.961123] [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150 [ 983.966550] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.972407] [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150 [ 983.977832] [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.983101] [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs] [ 983.989500] [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs] [ 983.996415] [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs] [ 984.003068] [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs] [ 984.009189] [<ffffffff9126cc5e>] ? evict_inodes+0x15e/0x170 [ 984.014881] [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs] [ 984.021176] [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100 [ 984.027476] [<ffffffff91250f56>] kill_anon_super+0x16/0x30 [ 984.033082] [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs] [ 984.039548] [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80 [ 984.045839] [<ffffffff912515fd>] deactivate_super+0x5d/0x70 [ 984.051525] [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80 [ 984.056774] [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20 [ 984.062201] [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0 [ 984.067625] [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4 [ 984.073747] [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0 [ 984.080038] [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1 Reported-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: convert pr_* to btrfs_* where possibleJeff Mahoney1-14/+20
For many printks, we want to know which file system issued the message. This patch converts most pr_* calls to use the btrfs_* versions instead. In some cases, this means adding plumbing to allow call sites access to an fs_info pointer. fs/btrfs/check-integrity.c is left alone for another day. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: convert printk(KERN_* to use pr_* callsJeff Mahoney1-10/+8
This patch converts printk(KERN_* style messages to use the pr_* versions. One side effect is that anything that was KERN_DEBUG is now automatically a dynamic debug message. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: unsplit printed stringsJeff Mahoney1-46/+51
CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: clean the old superblocks before freeing the deviceJeff Mahoney1-27/+11
btrfs_rm_device frees the block device but then re-opens it using the saved device name. A race exists between the close and the re-open that allows the block size to be changed. The result is getting stuck forever in the reclaim loop in __getblk_slow. This patch moves the superblock cleanup before closing the block device, which is also consistent with other callers. We also don't need a private copy of dev_name as the whole routine operates under the uuid_mutex. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26Btrfs: add a flags field to btrfs_fs_infoJosef Bacik1-1/+1
We have a lot of random ints in btrfs_fs_info that can be put into flags. This is mostly equivalent with the exception of how we deal with quota going on or off, now instead we set a flag when we are turning it on or off and deal with that appropriately, rather than just having a pending state that the current quota_enabled gets set to. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: let btrfs_delete_unused_bgs() to clean relocated bgsNaohiro Aota1-14/+10
Currently, btrfs_relocate_chunk() is removing relocated BG by itself. But the work can be done by btrfs_delete_unused_bgs() (and it's better since it trim the BG). Let's dedupe the code. While btrfs_delete_unused_bgs() is already hitting the relocated BG, it skip the BG since the BG has "ro" flag set (to keep balancing BG intact). On the other hand, btrfs cannot drop "ro" flag here to prevent additional writes. So this patch make use of "removed" flag. btrfs_delete_unused_bgs() now detect the flag to distinguish whether a read-only BG is relocating or not. Signed-off-by: Naohiro Aota <naohiro.aota@hgst.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-08-27Merge branch 'for-linus-4.8' of ↵Linus Torvalds1-8/+19
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "We've queued up a few different fixes in here. These range from enospc corners to fsync and quota fixes, and a few targeted at error handling for corrupt metadata/fuzzing" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix lockdep warning on deadlock against an inode's log mutex Btrfs: detect corruption when non-root leaf has zero item Btrfs: check btree node's nritems btrfs: don't create or leak aliased root while cleaning up orphans Btrfs: fix em leak in find_first_block_group btrfs: do not background blkdev_put() Btrfs: clarify do_chunk_alloc()'s return value btrfs: fix fsfreeze hang caused by delayed iputs deal btrfs: update btrfs_space_info's bytes_may_use timely btrfs: divide btrfs_update_reserved_bytes() into two functions btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() btrfs: qgroup: Fix qgroup incorrectness caused by log replay btrfs: relocation: Fix leaking qgroups numbers on data extents btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() btrfs: waiting on qgroup rescan should not always be interruptible btrfs: properly track when rescan worker is running btrfs: flush_space: treat return value of do_chunk_alloc properly Btrfs: add ASSERT for block group's memory leak btrfs: backref: Fix soft lockup in __merge_refs function Btrfs: fix memory leak of reloc_root