summaryrefslogtreecommitdiff
path: root/fs/gfs2/ops_fstype.c
AgeCommit message (Collapse)AuthorFilesLines
2024-05-30gfs2: Fix potential glock use-after-free on unmountAndreas Gruenbacher1-0/+1
[ Upstream commit d98779e687726d8f8860f1c54b5687eec5f63a73 ] When a DLM lockspace is released and there ares still locks in that lockspace, DLM will unlock those locks automatically. Commit fb6791d100d1b started exploiting this behavior to speed up filesystem unmount: gfs2 would simply free glocks it didn't want to unlock and then release the lockspace. This didn't take the bast callbacks for asynchronous lock contention notifications into account, which remain active until until a lock is unlocked or its lockspace is released. To prevent those callbacks from accessing deallocated objects, put the glocks that should not be unlocked on the sd_dead_glocks list, release the lockspace, and only then free those glocks. As an additional measure, ignore unexpected ast and bast callbacks if the receiving glock is dead. Fixes: fb6791d100d1b ("GFS2: skip dlm_unlock calls in unmount") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-08fs: super_set_uuid()Kent Overstreet1-1/+1
Some weird old filesytems have UUID-like things that we wish to expose as UUIDs, but are smaller; add a length field so that the new FS_IOC_(GET|SET)UUID ioctls can handle them in generic code. And add a helper super_set_uuid(), for setting nonstandard length uuids. Helper is now required for the new FS_IOC_GETUUID ioctl; if super_set_uuid() hasn't been called, the ioctl won't be supported. Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://lore.kernel.org/r/20240207025624.1019754-2-kent.overstreet@linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-10Merge tag 'gfs2-v6.7-rc1-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Add support for non-blocking lookup (MAY_NOT_BLOCK / LOOKUP_RCU) - Various minor fixes and cleanups * tag 'gfs2-v6.7-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix freeze consistency check in log_write_header gfs2: Refcounting fix in gfs2_thaw_super gfs2: Minor gfs2_{freeze,thaw}_super cleanup gfs2: Use wait_event_freezable_timeout() for freezable kthread gfs2: Add missing set_freezable() for freezable kthread gfs2: Remove use of error flag in journal reads gfs2: Lift withdraw check out of gfs2_ail1_empty gfs2: Rename gfs2_withdrawn to gfs2_withdrawing_or_withdrawn gfs2: Mark withdraws as unlikely gfs2: Minor gfs2_ail1_empty cleanup gfs2: use is_subdir() gfs2: d_obtain_alias(ERR_PTR(...)) will do the right thing gfs2: Use GL_NOBLOCK flag for non-blocking lookups gfs2: Add GL_NOBLOCK flag gfs2: rgrp: fix kernel-doc warnings gfs2: fix kernel BUG in gfs2_quota_cleanup gfs2: Fix inode_go_instantiate description gfs2: Fix kernel NULL pointer dereference in gfs2_rgrp_dump
2023-12-20gfs2: Rename gfs2_withdrawn to gfs2_withdrawing_or_withdrawnAndreas Gruenbacher1-1/+1
This function checks whether the filesystem has been been marked to be withdrawn eventually or has been withdrawn already. Rename this function to avoid confusing code like checking for gfs2_withdrawing() when gfs2_withdrawn() has already returned true. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-12-20gfs2: Mark withdraws as unlikelyAndreas Gruenbacher1-1/+1
Mark the gfs2_withdrawn(), gfs2_withdrawing(), and gfs2_withdraw_in_prog() inline functions as likely to return %false. This allows to get rid of likely() and unlikely() annotations at the call sites of those functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-21fs: Rename mapping private membersMatthew Wilcox (Oracle)1-1/+1
It is hard to find where mapping->private_lock, mapping->private_list and mapping->private_data are used, due to private_XXX being a relatively common name for variables and structure members in the kernel. To fit with other members of struct address_space, rename them all to have an i_ prefix. Tested with an allmodconfig build. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20231117215823.2821906-1-willy@infradead.org Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-07Merge tag 'gfs2-v6.6-rc2-fixes' of ↵Linus Torvalds1-16/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Don't update inode timestamps for direct writes (performance regression fix) - Skip no-op quota records instead of panicing - Fix a RCU race in gfs2_permission() - Various other smaller fixes and cleanups all over the place * tag 'gfs2-v6.6-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (24 commits) gfs2: don't withdraw if init_threads() got interrupted gfs2: remove dead code in add_to_queue gfs2: Fix slab-use-after-free in gfs2_qd_dealloc gfs2: Silence "suspicious RCU usage in gfs2_permission" warning gfs2: fs: derive f_fsid from s_uuid gfs2: No longer use 'extern' in function declarations gfs2: Rename gfs2_lookup_{ simple => meta } gfs2: Convert gfs2_internal_read to folios gfs2: Convert stuffed_readpage to folios gfs2: Minor gfs2_write_jdata_batch PAGE_SIZE cleanup gfs2: Get rid of gfs2_alloc_blocks generation parameter gfs2: Add metapath_dibh helper gfs2: Clean up quota.c:print_message gfs2: Clean up gfs2_alloc_parms initializers gfs2: Two quota=account mode fixes gfs2: Stop using GFS2_BASIC_BLOCK and GFS2_BASIC_BLOCK_SHIFT gfs2: setattr_chown: Add missing initialization gfs2: fix an oops in gfs2_permission gfs2: ignore negated quota changes gfs2: Don't update inode timestamps for direct writes ...
2023-11-06gfs2: don't withdraw if init_threads() got interruptedAndreas Gruenbacher1-3/+1
In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted, kthread_create() can return -EINTR. When that happens, we roll back what has already been done and abort the mount. Since commit 62dd0f98a0e5 ("gfs2: Flag a withdraw if init_threads() fails), we are calling gfs2_withdraw_delayed() in gfs2_fill_super(); first via gfs2_make_fs_rw(), then directly. But gfs2_withdraw_delayed() only marks the filesystem as withdrawing and relies on a caller further up the stack to do the actual withdraw, which doesn't exist in the gfs2_fill_super() case. Because the filesystem is marked as withdrawing / withdrawn, function gfs2_lm_unmount() doesn't release the dlm lockspace, so when we try to mount that filesystem again, we get: gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0" gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17 Since commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the deadlock this gfs2_withdraw_delayed() call was supposed to work around cannot occur anymore because freeze_go_callback() won't take the sb->s_umount semaphore unconditionally anymore, so we can get rid of the gfs2_withdraw_delayed() in gfs2_fill_super() entirely. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v6.5+
2023-11-06gfs2: Rename gfs2_lookup_{ simple => meta }Andreas Gruenbacher1-8/+8
Function gfs2_lookup_simple() is used for looking up inodes in the metadata directory tree, so rename it to gfs2_lookup_meta() to closer match its purpose. Clean the function up a little on the way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-10-23gfs2: Stop using GFS2_BASIC_BLOCK and GFS2_BASIC_BLOCK_SHIFTAndreas Gruenbacher1-5/+3
Header gfs2_ondisk.h defines GFS2_BASIC_BLOCK and GFS2_BASIC_BLOCK_SHIFT in a misguided attempt to abstract away the fact that sectors on block devices are 512 or (1 << 9) bytes in size. Stop using those definitions. I would be inclinded to remove those definitions altogether, but the gfs2 user-space tools are using them. In addition, instead of GFS2_SB(inode)->sd_sb.sb_bsize_shift, simply use inode->i_blkbits. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andrew Price <anprice@redhat.com>
2023-10-04kthread: add kthread_stop_putAndreas Gruenbacher1-6/+3
Add a kthread_stop_put() helper that stops a thread and puts its task struct. Use it to replace the various instances of kthread_stop() followed by put_task_struct(). Remove the kthread_stop_put() macro in usbip that is similar but doesn't return the result of kthread_stop(). [agruenba@redhat.com: fix kerneldoc comment] Link: https://lkml.kernel.org/r/20230911111730.2565537-1-agruenba@redhat.com [akpm@linux-foundation.org: document kthread_stop_put()'s argument] Link: https://lkml.kernel.org/r/20230907234048.2499820-1-agruenba@redhat.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-05gfs2: Introduce new quota=quiet mount optionBob Peterson1-0/+1
This patch adds a new mount option quota=quiet which is the same as quota=on but it suppresses gfs2 quota error messages. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Add device name to gfs2_logd and gfs2_quotadAndreas Gruenbacher1-2/+2
Add the device name to the names of the gfs2_logd and gfs2_quotad kernel threads to allow for easier identification. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix asynchronous thread destructionAndreas Gruenbacher1-10/+25
The kernel threads are currently stopped and destroyed synchronously by gfs2_make_fs_ro() and gfs2_put_super(), and asynchronously by signal_our_withdraw(), with no synchronization, so the synchronous and asynchronous contexts can race with each other. First, when creating the kernel threads, take an extra task struct reference so that the task struct won't go away immediately when they terminate. This allows those kthreads to terminate immediately when they're done rather than hanging around as zombies until they are reaped by kthread_stop(). When kthread_stop() is called on a terminated kthread, it will return immediately. Second, in signal_our_withdraw(), once the SDF_JOURNAL_LIVE flag has been cleared, wake up the logd and quotad wait queues instead of stopping the logd and quotad kthreads. The kthreads are then expected to terminate automatically within short time, but if they cannot, they will not block the withdraw. For example, if a user process and one of the kthread decide to withdraw at the same time, only one of them will perform the actual withdraw and the other will wait for it to be done. If the kthread ends up being the one to wait, the withdrawing user process won't be able to stop it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename SDF_DEACTIVATING to SDF_KILLAndreas Gruenbacher1-2/+2
Rename the SDF_DEACTIVATING flag to SDF_KILL to make it more obvious that this relates to the kill_sb filesystem operation. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename sd_{ glock => kill }_waitAndreas Gruenbacher1-1/+1
Rename sd_glock_wait to sd_kill_wait: we'll use it for other things related to "killing" a filesystem on unmount soon (kill_sb). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-04Merge tag 'gfs2-v6.4-rc5-fixes' of ↵Linus Torvalds1-12/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Move the freeze/thaw logic from glock callback context to process / worker thread context to prevent deadlocks - Fix a quota reference couting bug in do_qc() - Carry on deallocating inodes even when gfs2_rindex_update() fails - Retry filesystem-internal reads when they are interruped by a signal - Eliminate kmap_atomic() in favor of kmap_local_page() / memcpy_{from,to}_page() - Get rid of noop_direct_IO - And a few more minor fixes and cleanups * tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (23 commits) gfs2: Add quota_change type gfs2: Use memcpy_{from,to}_page where appropriate gfs2: Convert remaining kmap_atomic calls to kmap_local_page gfs2: Replace deprecated kmap_atomic with kmap_local_page gfs: Get rid of unnucessary locking in inode_go_dump gfs2: gfs2_freeze_lock_shared cleanup gfs2: Replace sd_freeze_state with SDF_FROZEN flag gfs2: Rework freeze / thaw logic gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} gfs2: Reconfiguring frozen filesystem already rejected gfs2: Rename gfs2_freeze_lock{ => _shared } gfs2: Rename the {freeze,thaw}_super callbacks gfs2: Rename remaining "transaction" glock references gfs2: retry interrupted internal reads gfs2: Fix possible data races in gfs2_show_options() gfs2: Fix duplicate should_fault_in_pages() call gfs2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method gfs2: Don't remember delete unless it's successful gfs2: Update rl_unlinked before releasing rgrp lock gfs2: Fix gfs2_qa_get imbalance in gfs2_quota_hold ...
2023-07-03gfs2: gfs2_freeze_lock_shared cleanupAndreas Gruenbacher1-1/+1
All the remaining users of gfs2_freeze_lock_shared() set freeze_gh to &sdp->sd_freeze_gh and flags to 0, so remove those two parameters. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03gfs2: Replace sd_freeze_state with SDF_FROZEN flagAndreas Gruenbacher1-1/+0
Replace sd_freeze_state with a new SDF_FROZEN flag. There no longer is a need for indicating that a freeze is in progress (SDF_STARTING_FREEZE); we are now protecting the critical sections with the sd_freeze_mutex. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03gfs2: Rework freeze / thaw logicAndreas Gruenbacher1-3/+2
So far, at mount time, gfs2 would take the freeze glock in shared mode and then immediately drop it again, turning it into a cached glock that can be reclaimed at any time. To freeze the filesystem cluster-wide, the node initiating the freeze would take the freeze glock in exclusive mode, which would cause the freeze glock's freeze_go_sync() callback to run on each node. There, gfs2 would freeze the filesystem and schedule gfs2_freeze_func() to run. gfs2_freeze_func() would re-acquire the freeze glock in shared mode, thaw the filesystem, and drop the freeze glock again. The initiating node would keep the freeze glock held in exclusive mode. To thaw the filesystem, the initiating node would drop the freeze glock again, which would allow gfs2_freeze_func() to resume on all nodes, leaving the filesystem in the thawed state. It turns out that in freeze_go_sync(), we cannot reliably and safely freeze the filesystem. This is primarily because the final unmount of a filesystem takes a write lock on the s_umount rw semaphore before calling into gfs2_put_super(), and freeze_go_sync() needs to call freeze_super() which also takes a write lock on the same semaphore, causing a deadlock. We could work around this by trying to take an active reference on the super block first, which would prevent unmount from running at the same time. But that can fail, and freeze_go_sync() isn't actually allowed to fail. To get around this, this patch changes the freeze glock locking scheme as follows: At mount time, each node takes the freeze glock in shared mode. To freeze a filesystem, the initiating node first freezes the filesystem locally and then drops and re-acquires the freeze glock in exclusive mode. All other nodes notice that there is contention on the freeze glock in their go_callback callbacks, and they schedule gfs2_freeze_func() to run. There, they freeze the filesystem locally and drop and re-acquire the freeze glock before re-thawing the filesystem. This is happening outside of the glock state engine, so there, we are allowed to fail. From a cluster point of view, taking and immediately dropping a glock is indistinguishable from taking the glock and only dropping it upon contention, so this new scheme is compatible with the old one. Thanks to Li Dong <lidong@vivo.com> for reporting a locking bug in gfs2_freeze_func() in a previous version of this commit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-06-15gfs2: Reconfiguring frozen filesystem already rejectedAndreas Gruenbacher1-7/+0
Reconfiguring a frozen filesystem is already rejected in reconfigure_super(), so there is no need to check for that condition again at the filesystem level. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-06-15gfs2: Rename gfs2_freeze_lock{ => _shared }Andreas Gruenbacher1-2/+2
Rename gfs2_freeze_lock to gfs2_freeze_lock_shared to make it a bit more obvious that this function establishes the "thawed" state of the freeze glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-06-15gfs2: Rename remaining "transaction" glock referencesAndreas Gruenbacher1-1/+1
The transaction glock was repurposed to serve as the new freeze glock years ago. Don't refer to it as the transaction glock anymore. Also, to be more precise, call it the "freeze glock" instead of the "freeze lock". Ditto for the journal glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-05-31gfs2: use __bio_add_page for adding single page to bioJohannes Thumshirn1-1/+1
The GFS2 superblock reading code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/087c67d4e4973f949d3519c1e4822784ce583c5a.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18gfs2: Use gfs2_holder_initialized for jindexBob Peterson1-6/+3
Before this patch function init_journal() used a local variable jindex to keep track of whether it needed to dequeue the jindex holder when errors were found. It also uselessly set the variable just before returning from the function. This patch simplifies the code by eliminatinng the local variable in favor of using function gfs2_holder_initialized. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-02-01gfs2: Evict inodes cooperativelyAndreas Gruenbacher1-0/+51
Add a gfs2_evict_inodes() helper that evicts inodes cooperatively across the cluster. This avoids running into timeouts during unmount unnecessarily. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-02-01gfs2: Flush delete work before shrinking inode cacheAndreas Gruenbacher1-0/+9
In gfs2_kill_sb(), flush the delete work queue after setting the SDF_DEACTIVATING flag. This ensures that no new inodes will be instantiated anymore, and the inode cache will be empty after the following kill_block_super() -> generic_shutdown_super() -> evict_inodes() call. With that, function gfs2_make_fs_ro() now calls gfs2_flush_delete_work() after the workqueue has been destroyed. Skip that by checking for the presence of the SDF_DEACTIVATING flag. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-02-01gfs2: Add SDF_DEACTIVATING super block flagBob Peterson1-0/+1
Add a new SDF_DEACTIVATING super block flag that is set when the filesystem has started to deactivate. This will be used in the next patch to stop and drain the delete work during unmount. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-02-01gfs2: Move delete workqueue into super blockAndreas Gruenbacher1-1/+9
Move the global delete workqueue into struct gfs2_sbd so that we can flush / drain it without interfering with other filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-10-09gfs2: Merge branch 'for-next.nopid' into for-nextAndreas Gruenbacher1-6/+8
Resolves a conflict in gfs2_inode_lookup() between the following commits: gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes gfs2: Mark the remaining process-independent glock holders as GL_NOPID Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-09-20gfs2: Check sb_bsize_shift after reading superblockAndrew Price1-1/+4
Fuzzers like to scribble over sb_bsize_shift but in reality it's very unlikely that this field would be corrupted on its own. Nevertheless it should be checked to avoid the possibility of messy mount errors due to bad calculations. It's always a fixed value based on the block size so we can just check that it's the expected value. Tested with: mkfs.gfs2 -O -p lock_nolock /dev/vdb for i in 0 -1 64 65 32 33; do gfs2_edit -p sb field sb_bsize_shift $i /dev/vdb mount /dev/vdb /mnt/test && umount /mnt/test done Before this patch we get a withdraw after [ 76.413681] gfs2: fsid=loop0.0: fatal: invalid metadata block [ 76.413681] bh = 19 (type: exp=5, found=4) [ 76.413681] function = gfs2_meta_buffer, file = fs/gfs2/meta_io.c, line = 492 and with UBSAN configured we also get complaints like [ 76.373395] UBSAN: shift-out-of-bounds in fs/gfs2/ops_fstype.c:295:19 [ 76.373815] shift exponent 4294967287 is too large for 64-bit type 'long unsigned int' After the patch, these complaints don't appear, mount fails immediately and we get an explanation in dmesg. Reported-by: syzbot+dcf33a7aae997956fe06@syzkaller.appspotmail.com Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-08-26gfs2: Switch from strlcpy to strscpyAndreas Gruenbacher1-5/+7
Switch from strlcpy to strscpy and make sure that @count is the size of the smaller of the source and destination buffers. This prevents reading beyond the end of the source buffer when the source string isn't null terminated. Found by a modified version of syzkaller. Suggested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-06-29gfs2: Revert 'Fix "truncate in progress" hang'Andreas Gruenbacher1-2/+0
Now that interrupted truncates are completed in the context of the process taking the glock, there is no need for the glock state engine to delegate that task to gfs2_quotad or for quotad to perform those truncates anymore. Get rid of the obsolete associated infrastructure. Reverts commit 813e0c46c9e2 ("GFS2: Fix "truncate in progress" hang"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2022-06-29gfs2: Mark the remaining process-independent glock holders as GL_NOPIDAndreas Gruenbacher1-6/+8
Add the GL_NOPID flag for the remaining glock holders which are not associated with the current process. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-02-02block: pass a block_device and opf to bio_allocChristoph Hellwig1-3/+1
Pass the block_device and operation that we plan to use this bio for to bio_alloc to optimize the assignment. NULL/0 can be passed, both for the passthrough case on a raw request_queue and to temporarily avoid refactoring some nasty code. Also move the gfp_mask argument after the nr_vecs argument for a much more logical calling convention matching what most of the kernel does. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-20gfs2: Mark journal inodes as "don't cache"Bob Peterson1-0/+1
Before this patch, journal inodes were considered regular inodes, which meant that instead of evicting them, function iput_final would just put them on the lru for later processing. If the file system withdrew for whatever reason, the withdraw would never be seen until the inode was evicted, which could be indefinitely. This patch marks all journal inodes as "don't cache" which means function iput_final will evict them immediately, allowing us to properly recover the journal on other cluster nodes. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Don't release and reacquire local statfs bhBob Peterson1-0/+9
Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to read in the local statefs buffer_head at mount time and keep it around until unmount time. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: init system threads before freeze lockBob Peterson1-0/+42
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem if the system needs to withdraw due to IO errors early in the mount sequence, for example, while initializing the system statfs inode: 1. An IO error causes the statfs glock to not sync properly after recovery, and leaves items on the ail list. 2. The leftover items on the ail list causes its do_xmote call to fail, which makes it want to withdraw. But since the glock code cannot withdraw (because the withdraw sequence uses glocks) it relies upon the logd daemon to initiate the withdraw. 3. The withdraw can never be performed by the logd daemon because all this takes place before the logd daemon is started. This patch moves function init_threads from super.c to ops_fstype.c and it changes gfs2_fill_super to start its threads before holding the freeze lock, and if there's an error, stop its threads after releasing it. This allows the logd to run unblocked by the freeze lock. Thus, the logd daemon can perform its withdraw sequence properly. Fixes: 96b1454f2e8e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-07-20gfs2: Fix memory leak of object lsi on error return pathColin Ian King1-0/+1
In the case where IS_ERR(lsi->si_sc_inode) is true the error exit path to free_local does not kfree the allocated object lsi leading to a memory leak. Fix this by kfree'ing lst before taking the error exit path. Addresses-Coverity: ("Resource leak") Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-28gfs2: Fix error handling in init_statfsAndreas Gruenbacher1-0/+1
On an error path, init_statfs calls iput(pn) after pn has already been put. Fix that by setting pn to NULL after the initial iput. Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Cc: stable@vger.kernel.org # v5.10+ Reported-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-04-09gfs2: Fix a number of kernel-doc warningsLee Jones1-5/+2
Building the kernel with W=1 results in a number of kernel-doc warnings like incorrect function names and parameter descriptions. Fix those, mostly by adding missing parameter descriptions, removing left-over descriptions, and demoting some less important kernel-doc comments into regular comments. Originally proposed by Lee Jones; improved and combined into a single patch by Andreas. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-04-03gfs2: Remove unused variable sb_formatAndreas Gruenbacher1-1/+0
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-03-07gfs2: make function gfs2_make_fs_ro() to void typeYang Li1-3/+1
It fixes the following warning detected by coccinelle: ./fs/gfs2/super.c:592:5-10: Unneeded variable: "error". Return "0" on line 628 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-02-23Merge branches 'rgrp-glock-sharing' and 'gfs2-revoke' from ↵Andreas Gruenbacher1-2/+7
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git Merge the resource group glock sharing feature and the revoke accounting rework. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-02-22gfs2: Per-revoke accounting in transactionsAndreas Gruenbacher1-0/+7
In the log, revokes are stored as a revoke descriptor (struct gfs2_log_descriptor), followed by zero or more additional revoke blocks (struct gfs2_meta_header). On filesystems with a blocksize of 4k, the revoke descriptor contains up to 503 revokes, and the metadata blocks contain up to 509 revokes each. We've so far been reserving space for revokes in transactions in block granularity, so a lot more space than necessary was being allocated and then released again. This patch switches to assigning revokes to transactions individually instead. Initially, space for the revoke descriptor is reserved and handed out to transactions. When more revokes than that are reserved, additional revoke blocks are added. When the log is flushed, the space for the additional revoke blocks is released, but we keep the space for the revoke descriptor block allocated. Transactions may still reserve more revokes than they will actually need in the end, but now we won't overshoot the target as much, and by only returning the space for excess revokes at log flush time, we further reduce the amount of contention between processes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-02-08gfs2: Add trusted xattr supportAndreas Gruenbacher1-1/+13
Add support for an additional filesystem version (sb_fs_format = 1802). When a filesystem with the new version is mounted, the filesystem supports "trusted.*" xattrs. In addition, version 1802 filesystems implement a form of forward compatibility for xattrs: when xattrs with an unknown prefix (ea_type) are found on a version 1802 filesystem, those attributes are not shown by listxattr, and they are not accessible by getxattr, setxattr, or removexattr. This mechanism might turn out to be what we need in the future, but if not, we can always bump the filesystem version and break compatibility instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Price <anprice@redhat.com>
2021-02-08gfs2: Enable rgrplvb for sb_fs_format 1802Andrew Price1-3/+10
Turn on rgrplvb by default for sb_fs_format > 1801. Mount options still have to override this so a new args field to differentiate between 'off' and 'not specified' is added, and the new default is applied only when it's not specified. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-02-03gfs2: Get rid of sd_reserving_logAndreas Gruenbacher1-2/+0
This counter and the associated wait queue are only used so that gfs2_make_fs_ro can efficiently wait for all pending log space allocations to fail after setting the filesystem to read-only. This comes at the cost of waking up that wait queue very frequently. Instead, when gfs2_log_reserve fails because the filesystem has become read-only, Wake up sd_log_waitq. In gfs2_make_fs_ro, set the file system read-only and then wait until all the log space has been released. Give up and report the problem after a while. With that, sd_reserving_log and sd_reserving_log_wait can be removed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-01-25gfs2: Fix invalid block size messageAndrew Price1-1/+1
Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2020-12-23gfs2: move freeze glock outside the make_fs_rw and _ro functionsBob Peterson1-14/+17
Before this patch, sister functions gfs2_make_fs_rw and gfs2_make_fs_ro locked (held) the freeze glock by calling gfs2_freeze_lock and gfs2_freeze_unlock. The problem is, not all the callers of gfs2_make_fs_ro should be doing this. The three callers of gfs2_make_fs_ro are: remount (gfs2_reconfigure), signal_our_withdraw, and unmount (gfs2_put_super). But when unmounting the file system we can get into the following circular lock dependency: deactivate_super down_write(&s->s_umount); <-------------------------------------- s_umount deactivate_locked_super gfs2_kill_sb kill_block_super generic_shutdown_super gfs2_put_super gfs2_make_fs_ro gfs2_glock_nq_init sd_freeze_gl freeze_go_sync if (freeze glock in SH) freeze_super (vfs) down_write(&sb->s_umount); <------- s_umount This patch moves the hold of the freeze glock outside the two sister rw/ro functions to their callers, but it doesn't request the glock from gfs2_put_super, thus eliminating the circular dependency. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>