path: root/fs/gfs2
Age  Commit message  Author  Files  Lines
2022-04-27gfs2: assign rgrp glock before compute_bitstructsBob Peterson1-4/+5
commit 428f651cb80b227af47fc302e4931791f2fb4741 upstream. Before this patch, function read_rindex_entry called compute_bitstructs before it allocated a glock for the rgrp. But if compute_bitstructs found a problem with the rgrp, it called gfs2_consist_rgrpd, and that called gfs2_dump_glock for rgd->rd_gl which had not yet been assigned. Call chain: read_rindex_entry -> compute_bitstructs -> gfs2_consist_rgrpd -> gfs2_dump_glock <--- rgd->rd_gl was not set. This patch changes read_rindex_entry so it assigns an rgrp glock before calling compute_bitstructs so gfs2_dump_glock does not reference an unassigned pointer. If an error is discovered, the glock must also be put, so a new goto and label were added. Reported-by: syzbot+c6fd14145e2f62ca0784@syzkaller.appspotmail.com Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
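The ordering described in this commit (assign the glock first, then validate, and put the glock on the error path) can be sketched in user space. This is only a model under stated assumptions: `struct glock`, `struct rgrp`, `glock_get`, `glock_put`, `validate` and `read_entry` are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the kernel objects. */
struct glock { int id; };
struct rgrp { struct glock *gl; int bad; };

static int live_glocks;      /* glocks allocated and not yet put */
static int dumps_with_null;  /* times the dump path saw gl == NULL */

static struct glock *glock_get(int id)
{
    struct glock *gl = malloc(sizeof(*gl));

    if (gl) {
        gl->id = id;
        live_glocks++;
    }
    return gl;
}

static void glock_put(struct glock *gl)
{
    if (gl) {
        free(gl);
        live_glocks--;
    }
}

/* Plays the role of compute_bitstructs: on a bad rgrp it "dumps" the glock. */
static int validate(struct rgrp *rgd)
{
    if (rgd->bad) {
        if (rgd->gl == NULL)
            dumps_with_null++;   /* the situation the patch avoids */
        return -1;
    }
    return 0;
}

/* Returns 0 on success; on failure no glock stays allocated. */
int read_entry(struct rgrp *rgd, int id)
{
    int error;

    assert(rgd != NULL);
    rgd->gl = glock_get(id);     /* assign the glock first */
    if (rgd->gl == NULL)
        return -1;

    error = validate(rgd);       /* may inspect rgd->gl */
    if (error)
        goto fail_glock;
    return 0;

fail_glock:
    glock_put(rgd->gl);          /* the extra label: undo the glock */
    rgd->gl = NULL;
    return error;
}
```

Calling `read_entry` on an entry with `bad` set returns -1, leaves `live_glocks` at 0, and never increments `dumps_with_null`.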
2022-04-08gfs2: Make sure FITRIM minlen is rounded up to fs block sizeAndrew Price1-1/+2
commit 27ca8273fda398638ca994a207323a85b6d81190 upstream. Per fstrim(8) we must round up the minlen argument to the fs block size. The current calculation doesn't take into account devices that have a discard granularity and requested minlen less than 1 fs block, so the value can get shifted away to zero in the translation to fs blocks. The zero minlen passed to gfs2_rgrp_send_discards() then allows sb_issue_discard() to be called with nr_sects == 0 which returns -EINVAL and results in gfs2_rgrp_send_discards() returning -EIO. Make sure minlen is never < 1 fs block by taking the max of the requested minlen and the fs block size before comparing to the device's discard granularity and shifting to fs blocks. Fixes: 076f0faa764ab ("GFS2: Fix FITRIM argument handling") Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
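A small user-space sketch of the rounding described above. The function and parameter names are hypothetical, and `bs_shift` is assumed to be log2 of the fs block size. The "old" variant models the calculation the message describes as losing sub-block values; the "new" variant clamps to one fs block first.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/* Model of the old behaviour: a sub-block minlen can shift down to 0. */
uint64_t minlen_blocks_old(uint64_t req_minlen, uint64_t granularity,
                           unsigned bs_shift)
{
    assert(bs_shift < 64);
    return max_u64(req_minlen, granularity) >> bs_shift;
}

/* Model of the fix: never less than one fs block before shifting. */
uint64_t minlen_blocks_new(uint64_t req_minlen, uint64_t granularity,
                           unsigned bs_shift)
{
    uint64_t bs;
    uint64_t minlen;

    assert(bs_shift < 64);
    bs = (uint64_t)1 << bs_shift;          /* fs block size in bytes */
    minlen = max_u64(req_minlen, bs);      /* round up to one fs block */
    minlen = max_u64(minlen, granularity); /* honour discard granularity */
    return minlen >> bs_shift;             /* bytes -> fs blocks */
}
```

With a 4096-byte block size (`bs_shift` of 12), a requested minlen of 512 and a granularity of 512, the old model yields 0 blocks and the new one yields 1.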
2022-04-08gfs2: gfs2_setattr_size error path fixAndreas Gruenbacher6-8/+9
commit 7336905a89f19173bf9301cd50a24421162f417c upstream. When gfs2_setattr_size() fails, it calls gfs2_rs_delete(ip, NULL) to get rid of any reservations the inode may have. Instead, it should pass in the inode's write count as the second parameter to allow gfs2_rs_delete() to figure out if the inode has any writers left. In a next step, there are two instances of gfs2_rs_delete(ip, NULL) left where we know that there can be no other users of the inode. Replace those with gfs2_rs_deltree(&ip->i_res) to avoid the unnecessary write count check. With that, gfs2_rs_delete() is only called with the inode's actual write count, so get rid of the second parameter. Fixes: a097dc7e24cb ("GFS2: Make rgrp reservations part of the gfs2_inode structure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16gfs2: Fix gfs2_release for non-writers regressionBob Peterson1-3/+4
commit d3add1a9519dcacd6e644ecac741c56cf18b67f5 upstream. When a file is opened for writing, the vfs code (do_dentry_open) calls get_write_access for the inode, thus incrementing the inode's write count. That writer normally then creates a multi-block reservation for the inode (i_res) that can be re-used by other writers, which speeds up writes for applications that stupidly loop on open/write/close. When the writes are all done, the multi-block reservation should be deleted when the file is closed by the last "writer." Commit 0ec9b9ea4f83 broke that concept when it moved the call to gfs2_rs_delete before the check for FMODE_WRITE. Non-writers have no business removing the multi-block reservations of writers. In fact, if someone opens and closes the file for RO while a writer has a multi-block reservation, the RO closer will delete the reservation midway through the write, and this results in: kernel BUG at fs/gfs2/rgrp.c:677! (or thereabouts) which is: BUG_ON(rs->rs_requested); from function gfs2_rs_deltree. This patch moves the check back inside the check for FMODE_WRITE. Fixes: 0ec9b9ea4f83 ("gfs2: Check for active reservation in gfs2_release") Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08gfs2: Fix length of holes reported at end-of-fileAndreas Gruenbacher1-1/+1
[ Upstream commit f3506eee81d1f700d9ee2d2f4a88fddb669ec032 ] Fix the length of holes reported at the end of a file: the length is relative to the beginning of the extent, not the seek position which is rounded down to the filesystem block size. This bug went unnoticed for some time, but is now caught by the following assertion in iomap_iter_done(): WARN_ON_ONCE(iter->iomap.offset + iter->iomap.length <= iter->pos) Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
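The arithmetic behind this fix can be illustrated with a small model. It assumes that the extent offset is the seek position rounded down to the block size and that the buggy length was computed as `size - pos`; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct extent {
    uint64_t offset;  /* start of the extent, block aligned */
    uint64_t length;  /* length measured from offset */
};

/* Build the hole extent at end-of-file; block_size must be a power of two. */
struct extent eof_hole(uint64_t pos, uint64_t size, uint64_t block_size,
                       bool fixed)
{
    struct extent e;

    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    assert(pos < size);
    e.offset = pos & ~(block_size - 1);              /* round pos down */
    e.length = fixed ? size - e.offset : size - pos; /* fixed vs. buggy */
    return e;
}

/* Mirrors the sense of the assertion quoted above: the extent must reach past pos. */
bool extent_covers(struct extent e, uint64_t pos)
{
    return e.offset + e.length > pos;
}
```

For a seek position of 4999 in a 5000-byte file with 4096-byte blocks, the buggy variant gives offset 4096 and length 1, so the extent ends at 4097 and does not cover the position. The fixed variant gives length 904 and ends exactly at 5000.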
2021-12-08gfs2: release iopen glock early in evictBob Peterson1-7/+7
[ Upstream commit 49462e2be119d38c5eb5759d0d1b712df3a41239 ] Before this patch, evict would clear the iopen glock's gl_object after releasing the inode glock. In the meantime, another process could reuse the same block and thus glocks for a new inode. It would lock the inode glock (exclusively), and then the iopen glock (shared). The shared locking mode doesn't provide any ordering against the evict, so by the time the iopen glock is reused, evict may not have gotten to setting gl_object to NULL. Fix that by releasing the iopen glock before the inode glock in gfs2_evict_inode. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18gfs2: Fix glock_hash_walk bugsAndreas Gruenbacher1-10/+12
[ Upstream commit 7427f3bb49d81525b7dd1d0f7c5f6bbc752e6f0e ] So far, glock_hash_walk took a reference on each glock it iterated over, and it was the examiner's responsibility to drop those references. Dropping the final reference to a glock can sleep and the examiners are called in a RCU critical section with spin locks held, so examiners that didn't need the extra reference had to drop it asynchronously via gfs2_glock_queue_put or similar. This wasn't done correctly in thaw_glock which did call gfs2_glock_put, and not at all in dump_glock_func. Change glock_hash_walk to not take glock references at all. That way, the examiners that don't need them won't have to bother with slow asynchronous puts, and the examiners that do need references can take them themselves. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18gfs2: Cancel remote delete work asynchronouslyAndreas Gruenbacher1-1/+1
[ Upstream commit 486408d690e130c3adacf816754b97558d715f46 ] In gfs2_inode_lookup and gfs2_create_inode, we're calling gfs2_cancel_delete_work which currently cancels any remote delete work (delete_work_func) synchronously. This means that if the work is currently running, it will wait for it to finish. We're doing this to prevent a previous instance of an inode from having any influence on the next instance. However, delete_work_func uses gfs2_inode_lookup internally, and we can end up in a deadlock when delete_work_func gets interrupted at the wrong time. For example, (1) An inode's iopen glock has delete work queued, but the inode itself has been evicted from the inode cache. (2) The delete work is preempted before reaching gfs2_inode_lookup. (3) Another process recreates the inode (gfs2_create_inode). It tries to cancel any outstanding delete work, which blocks waiting for the ongoing delete work to finish. (4) The delete work calls gfs2_inode_lookup, which blocks waiting for gfs2_create_inode to instantiate and unlock the new inode => deadlock. It turns out that when the delete work notices that its inode has been re-instantiated, it will do nothing. This means that it's safe to cancel the delete work asynchronously. This prevents the kind of deadlock described above. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-09Merge branch 'work.gfs2' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull gfs2 setattr updates from Al Viro: "Make it possible for filesystems to use a generic 'may_setattr()' and switch gfs2 to using it" * 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: gfs2: Switch to may_setattr in gfs2_setattr fs: Move notify_change permission checks into may_setattr
2021-09-02Merge tag 'ovl-update-5.15' of ↵Linus Torvalds2-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs update from Miklos Szeredi: - Copy up immutable/append/sync/noatime attributes (Amir Goldstein) - Improve performance by enabling RCU lookup. - Misc fixes and improvements The reason this touches so many files is that the ->get_acl() method now gets a "bool rcu" argument. The ->get_acl() API was updated based on comments from Al and Linus: Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguQxpd6Wgc0Jd3ks77zcsAv_bn0q17L3VNnnmPKu11t8A@mail.gmail.com/ * tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: enable RCU'd ->get_acl() vfs: add rcu argument to ->get_acl() callback ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup() ovl: use kvalloc in xattr copy-up ovl: update ctime when changing fileattr ovl: skip checking lower file's i_writecount on truncate ovl: relax lookup error on mismatch origin ftype ovl: do not set overlay.opaque for new directories ovl: add ovl_allow_offline_changes() helper ovl: disable decoding null uuid with redirect_dir ovl: consistent behavior for immutable/append-only inodes ovl: copy up sync/noatime fileattr flags ovl: pass ovl_fs to ovl_check_setxattr() fs: add generic helper for filling statx attribute flags
2021-08-31Merge tag 'iomap-5.15-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-3/+2
Pull iomap updates from Darrick Wong: "The most notable externally visible change for this cycle is the addition of support for reads to inline tail fragments of files, which was requested by the erofs developers; and a correction for a kernel memory corruption bug if the sysadmin tries to activate a swapfile with more pages than the swapfile header suggests. We also now report writeback completion errors to the file mapping correctly, instead of munging all errors into EIO. Internally, the bulk of the changes are Christoph's patchset to reduce the indirect function call count by a third to a half by converting iomap iteration from a loop pattern to a generator/consumer pattern. As an added bonus, fsdax no longer open-codes iomap apply loops. Summary: - Simplify the bio_end_page usage in the buffered IO code. - Support reading inline data at nonzero offsets for erofs. - Fix some typos and bad grammar. - Convert kmap_atomic usage in the inline data read path. - Add some extra inline data input checking. - Fix a memory corruption bug stemming from iomap_swapfile_activate trying to activate more pages than mm was expecting. - Pass errnos through the page writeback code so that writeback errors are reported correctly instead of being munged to EIO. - Replace iomap_apply with open-coded iterator loops to reduce the number of indirect calls by a third to a half. - Refactor the fsdax code to use iomap iterators instead of the open-coded iomap_apply code that it had before.
- Format file range iomap tracepoint data in hexadecimal and standardize the names used in the pretty-print string" * tag 'iomap-5.15-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (41 commits) iomap: standardize tracepoint formatting and storage mm/swap: consider max pages in iomap_swapfile_add_extent iomap: move loop control code to iter.c iomap: constify iomap_iter_srcmap fsdax: switch the fault handlers to use iomap_iter fsdax: factor out a dax_fault_actor() helper fsdax: factor out helpers to simplify the dax fault code iomap: rework unshare flag iomap: pass an iomap_iter to various buffered I/O helpers iomap: remove iomap_apply fsdax: switch dax_iomap_rw to use iomap_iter iomap: switch iomap_swapfile_activate to use iomap_iter iomap: switch iomap_seek_data to use iomap_iter iomap: switch iomap_seek_hole to use iomap_iter iomap: switch iomap_bmap to use iomap_iter iomap: switch iomap_fiemap to use iomap_iter iomap: switch __iomap_dio_rw to use iomap_iter iomap: switch iomap_page_mkwrite to use iomap_iter iomap: switch iomap_zero_range to use iomap_iter iomap: switch iomap_file_unshare to use iomap_iter ...
2021-08-31Merge tag 'gfs2-v5.14-rc2-fixes' of ↵Linus Torvalds13-141/+139
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Various withdraw related fixes (freeze glock recursion, thread initialization / destruction order, journal recovery, glock cleanup, withdraw under journal lock). - Some error message improvements. - Various minor cleanups. * tag 'gfs2-v5.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Remove redundant check from gfs2_glock_dq gfs2: Delay withdraw from atomic context gfs2: Don't call dlm after protocol is unmounted gfs2: don't stop reads while withdraw in progress gfs2: Mark journal inodes as "don't cache" gfs2: nit: gfs2_drop_inode shouldn't return bool gfs2: Eliminate vestigial HIF_FIRST gfs2: Make recovery error more readable gfs2: Don't release and reacquire local statfs bh gfs2: init system threads before freeze lock gfs2: tiny cleanup in gfs2_log_reserve gfs2: trivial clean up of gfs2_ail_error gfs2: be more verbose replaying invalid rgrp blocks gfs2: Fix glock recursion in freeze_go_xmote_bh gfs2: Fix memory leak of object lsi on error return path
2021-08-23fs: remove mandatory file locking supportJeff Layton1-3/+0
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it off in Fedora and RHEL8. Several other distros have followed suit. I've heard of one problem in all that time: Someone migrated from an older distro that supported "-o mand" to one that didn't, and the host had a fstab entry with "mand" in it which broke on reboot. They didn't actually _use_ mandatory locking so they just removed the mount option and moved on. This patch rips out mandatory locking support wholesale from the kernel, along with the Kconfig option and the Documentation file. It also changes the mount code to ignore the "mand" mount option instead of erroring out, and to throw a big, ugly warning. Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-20gfs2: Remove redundant check from gfs2_glock_dqBob Peterson1-6/+5
In function gfs2_glock_dq, it checks to see if this is the fast path. Before this patch, it checked both "find_first_holder(gl) == NULL" and list_empty(&gl->gl_holders), which is redundant. If gl_holders is empty then find_first_holder must return NULL. This patch removes the redundancy. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Delay withdraw from atomic contextBob Peterson1-1/+1
Before this patch, if function __gfs2_ail_flush detected an error syncing the ail list, it called gfs2_ail_error which called gfs2_withdraw. Since __gfs2_ail_flush deals with a specific glock, we shouldn't withdraw immediately because the withdraw code (signal_our_withdraw) uses glocks in its processing. This patch changes the call from gfs2_withdraw to gfs2_withdraw_delayed which defers the withdraw until a more appropriate context, such as the logd daemon, discovers the intent to withdraw. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Don't call dlm after protocol is unmountedBob Peterson1-0/+5
In the gfs2 withdraw sequence, the dlm protocol is unmounted with a call to lm_unmount. After a withdraw, users are allowed to unmount the withdrawn file system. But at that point we may still have glocks left over that we need to free via unmount's call to gfs2_gl_hash_clear. These glocks may have never been completed because of whatever problem caused the withdraw (IO errors or whatever). Before this patch, function gdlm_put_lock would still try to call into dlm to unlock these leftover glocks, which resulted in dlm returning -EINVAL because the lock space was abandoned. These glocks were never freed because there was no mechanism after that to free them. This patch adds a check to gdlm_put_lock to see if the locking protocol was inactive (DFL_UNMOUNT flag) and if so, free the glock and not make the invalid call into dlm. I could have combined this "if" with the one that follows, related to leftover glock LVBs, but I felt the code was more readable with its own if clause. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: don't stop reads while withdraw in progressBob Peterson2-4/+8
When gfs2 withdraws a file system, it calls signal_our_withdraw which triggers another node to replay the withdrawing node's journal. Then it waits until it knows the journal has been replayed. Part of this wait is to repeatedly call check_journal_clean which calls gfs2_jdesc_check, which checks to see if the journal is sane. As part of its sanity checks it needs to re-read its journal's metadata. But with today's code, any attempt to re-read the metadata results in -EIO because of a check for the file system withdraw in function gfs2_meta_wait. This patch adds an additional check for SDF_WITHDRAW_IN_PROG, to tell if the read is done while the withdraw is in progress. In that case we allow the metadata read to not be rejected. Therefore the metadata check is done properly, so the withdraw sequence can finish normally. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Mark journal inodes as "don't cache"Bob Peterson2-0/+2
Before this patch, journal inodes were considered regular inodes, which meant that instead of evicting them, function iput_final would just put them on the lru for later processing. If the file system withdrew for whatever reason, the withdraw would never be seen until the inode was evicted, which could be indefinitely. This patch marks all journal inodes as "don't cache" which means function iput_final will evict them immediately, allowing us to properly recover the journal on other cluster nodes. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: nit: gfs2_drop_inode shouldn't return boolBob Peterson1-1/+1
Today, gfs2_drop_inode can return "false" for an int value. I'm sure this was just an oversight. Change to int value. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Eliminate vestigial HIF_FIRSTBob Peterson2-3/+0
Holder flag HIF_FIRST is no longer used or needed, so remove it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Make recovery error more readableBob Peterson1-1/+1
Before this patch, withdraws could cause an error that looked like: Journal recovery skipped for 0 until next mount. This patch changes it to a more readable: Journal recovery skipped for jid 0 until next mount. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Don't release and reacquire local statfs bhBob Peterson5-41/+25
Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to read in the local statfs buffer_head at mount time and keep it around until unmount time. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: init system threads before freeze lockBob Peterson2-55/+48
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem if the system needs to withdraw due to IO errors early in the mount sequence, for example, while initializing the system statfs inode: 1. An IO error causes the statfs glock to not sync properly after recovery, and leaves items on the ail list. 2. The leftover items on the ail list causes its do_xmote call to fail, which makes it want to withdraw. But since the glock code cannot withdraw (because the withdraw sequence uses glocks) it relies upon the logd daemon to initiate the withdraw. 3. The withdraw can never be performed by the logd daemon because all this takes place before the logd daemon is started. This patch moves function init_threads from super.c to ops_fstype.c and it changes gfs2_fill_super to start its threads before holding the freeze lock, and if there's an error, stop its threads after releasing it. This allows the logd to run unblocked by the freeze lock. Thus, the logd daemon can perform its withdraw sequence properly. Fixes: 96b1454f2e8e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19gfs2: tiny cleanup in gfs2_log_reserveBob Peterson1-1/+1
Function gfs2_log_reserve was setting revoke_blks to 0. There's no need because it calculates it shortly thereafter. This patch removes the unnecessary set. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19gfs2: trivial clean up of gfs2_ail_errorBob Peterson1-4/+6
This patch does not change function. It adds variable sdp to clean up function gfs2_ail_error and make it more readable. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19gfs2: be more verbose replaying invalid rgrp blocksBob Peterson1-15/+29
This patch adds some crucial information when journal replay detects a replay of an obsolete rgrp block. For example, it wasn't printing the journal id or the generation number played. This just supplements what is logged in this unusual case. The function that actually complains about the replaying of an obsolete rgrp block has been split off to avoid long lines and sparse warnings. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-18vfs: add rcu argument to ->get_acl() callbackMiklos Szeredi2-2/+5
Add a rcu argument to the ->get_acl() callback to allow get_cached_acl_rcu() to call the ->get_acl() method in the next patch. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17iomap: remove the iomap arguments to ->page_{prepare,done}Christoph Hellwig1-3/+2
These aren't actually used by the only instance implementing the methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-13gfs2: Switch to may_setattr in gfs2_setattrAndreas Gruenbacher1-2/+2
The permission check in gfs2_setattr is an old and outdated version of may_setattr(). Switch to the updated version. Fixes fstest generic/079. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-08-04gfs2: Fix glock recursion in freeze_go_xmote_bhBob Peterson1-10/+7
We must not call gfs2_consist (which does a file system withdraw) from the freeze glock's freeze_go_xmote_bh function because the withdraw will try to use the freeze glock, thus causing a glock recursion error. This patch changes freeze_go_xmote_bh to call function gfs2_assert_withdraw_delayed instead of gfs2_consist to avoid recursion. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-07-20gfs2: Fix memory leak of object lsi on error return pathColin Ian King1-0/+1
In the case where IS_ERR(lsi->si_sc_inode) is true the error exit path to free_local does not kfree the allocated object lsi leading to a memory leak. Fix this by kfree'ing lsi before taking the error exit path. Addresses-Coverity: ("Resource leak") Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-30Merge tag 'gfs2-v5.13-fixes' of ↵Linus Torvalds7-66/+85
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: "Various minor gfs2 cleanups and fixes" * tag 'gfs2-v5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Clean up gfs2_unstuff_dinode gfs2: Unstuff before locking page in gfs2_page_mkwrite gfs2: Clean up the error handling in gfs2_page_mkwrite gfs2: Fix error handling in init_statfs gfs2: Fix underflow in gfs2_page_mkwrite gfs2: Use list_move_tail instead of list_del/list_add_tail gfs2: Fix do_gfs2_set_flags description
2021-06-29iomap: use __set_page_dirty_nobuffersMatthew Wilcox (Oracle)1-1/+1
The only difference between iomap_set_page_dirty() and __set_page_dirty_nobuffers() is that the latter includes a debugging check that a !Uptodate page has private data. Link: https://lkml.kernel.org/r/20210615162342.1669332-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29mm: require ->set_page_dirty to be explicitly wired upChristoph Hellwig1-0/+2
Remove the CONFIG_BLOCK default to __set_page_dirty_buffers and just wire that method up for the missing instances. [hch@lst.de: ecryptfs: add a ->set_page_dirty cludge] Link: https://lkml.kernel.org/r/20210624125250.536369-1-hch@lst.de Link: https://lkml.kernel.org/r/20210614061512.3966143-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Tyler Hicks <code@tyhicks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29gfs2: Clean up gfs2_unstuff_dinodeAndreas Gruenbacher5-36/+36
Split __gfs2_unstuff_inode off from gfs2_unstuff_dinode and clean up the code a little. All remaining callers now pass NULL as the page argument of gfs2_unstuff_dinode, so remove that argument. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-29gfs2: Unstuff before locking page in gfs2_page_mkwriteAndreas Gruenbacher1-10/+12
In gfs2_page_mkwrite, unstuff inodes before locking the page. That way, we won't have to pass in the locked page to gfs2_unstuff_inode, and gfs2_unstuff_inode can look up and lock the page itself. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-29gfs2: Clean up the error handling in gfs2_page_mkwriteAndreas Gruenbacher1-23/+40
We're setting an error number so that block_page_mkwrite_return translates it into the corresponding VM_FAULT_* code in several places, but this is getting confusing, so set the VM_FAULT_* codes directly instead. (No change in functionality.) Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-28gfs2: Fix error handling in init_statfsAndreas Gruenbacher1-0/+1
On an error path, init_statfs calls iput(pn) after pn has already been put. Fix that by setting pn to NULL after the initial iput. Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Cc: stable@vger.kernel.org # v5.10+ Reported-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
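The idiom used by this fix (clear the pointer after the first put so that a shared error exit cannot put it again) can be modelled as follows. `struct obj`, `obj_put` and `init_model` are hypothetical; `obj_put` is written to ignore NULL, the same convention the fix relies on for iput.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object. */
struct obj { int refs; };

/* Drops one reference; a NULL pointer is a no-op. */
void obj_put(struct obj *o)
{
    if (o == NULL)
        return;
    assert(o->refs > 0);  /* a double put would trip this */
    o->refs--;
}

/* Model of a setup routine whose exit path also puts pn. */
int init_model(struct obj *pn, int fail_later)
{
    int error = 0;

    obj_put(pn);  /* initial put: pn is no longer needed */
    pn = NULL;    /* the fix: make the later put harmless */

    if (fail_later) {
        error = -1;
        goto out;
    }
out:
    obj_put(pn);  /* shared exit path, now a no-op */
    return error;
}
```

Without the `pn = NULL` line, the failing path would drop a second reference and hit the assertion in `obj_put`.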
2021-06-28gfs2: Fix underflow in gfs2_page_mkwriteAndreas Gruenbacher1-2/+2
On filesystems with a block size smaller than PAGE_SIZE and non-empty files smaller than PAGE_SIZE, gfs2_page_mkwrite could end up allocating excess blocks beyond the end of the file, similar to fallocate. This doesn't make sense; fix it. Reported-by: Bob Peterson <rpeterso@redhat.com> Fixes: 184b4e60853d ("gfs2: Fix end-of-file handling in gfs2_page_mkwrite") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-28gfs2: Use list_move_tail instead of list_del/list_add_tailBaokun Li1-2/+1
Using list_move_tail() instead of list_del() + list_add_tail(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
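To show why the two forms are equivalent, here is a minimal user-space circular doubly linked list. The `_node` suffixed names are hypothetical and only imitate the shape of the kernel's list helpers; this is a sketch, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *prev;
    struct list_node *next;
};

/* An empty list is a head that points at itself. */
void list_init_node(struct list_node *head)
{
    head->prev = head;
    head->next = head;
}

/* Unlink n from whatever list it is on. */
void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert n just before head, i.e. at the tail. */
void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* One call that does the unlink and the tail insert together. */
void list_move_tail_node(struct list_node *n, struct list_node *head)
{
    assert(n != head);
    list_del_node(n);
    list_add_tail_node(n, head);
}
```

Moving a node from one list to the tail of another with `list_move_tail_node` leaves both lists consistent, exactly as the two separate calls would.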
2021-06-28gfs2: Fix do_gfs2_set_flags descriptionAndreas Gruenbacher1-1/+1
Commit 88b631cbfbeb ("gfs2: convert to fileattr") changed the argument list without updating the description. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-06-02Revert "gfs2: Fix mmap locking for write faults"Andreas Gruenbacher1-3/+1
This reverts commit b7f55d928e75557295c1ac280c291b738905b6fb. As explained by Linus in [*], write faults on a mmap region are reads from a filesystem point of view, so taking the inode glock exclusively on write faults is incorrect. Instead, when a page is marked writable, the .page_mkwrite vm operation will be called, which is where the exclusive lock taking needs to happen. I got this wrong because of a broken test case that made me believe .page_mkwrite isn't getting called when it actually is. [*] https://lore.kernel.org/lkml/CAHk-=wj8EWr_D65i4oRSj2FTbrc6RdNydNNCGxeabRnwtoU=3Q@mail.gmail.com/ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-31gfs2: Fix use-after-free in gfs2_glock_shrink_scanHillf Danton1-1/+1
The GLF_LRU flag is checked under lru_lock in gfs2_glock_remove_from_lru() to remove the glock from the lru list in __gfs2_glock_put(). On the shrink scan path, the same flag is cleared under lru_lock but because of cond_resched_lock(&lru_lock) in gfs2_dispose_glock_lru(), progress on the put side can be made without deleting the glock from the lru list. Keep GLF_LRU across the race window opened by cond_resched_lock(&lru_lock) to ensure correct behavior on both sides - clear GLF_LRU after list_del under lru_lock. Reported-by: syzbot <syzbot+34ba7ddbf3021981a228@syzkaller.appspotmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-21gfs2: Fix mmap locking for write faultsAndreas Gruenbacher1-1/+3
When a write fault occurs, we need to take the inode glock of the underlying inode in exclusive mode. Otherwise, there's no guarantee that the dirty page will be written back to disk. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-20gfs2: Clean up revokes on normal withdrawsBob Peterson5-4/+12
Before this patch, the system ail lists were cleaned up if the logd process withdrew, but on other withdraws, they were not cleaned up. This included the cleaning up of the revokes as well. This patch reorganizes things a bit so that all withdraws (not just logd) clean up the ail lists, including any pending revokes. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-20gfs2: fix a deadlock on withdraw-during-mountBob Peterson1-3/+21
Before this patch, gfs2 would deadlock because of the following sequence during mount:

mount
  gfs2_fill_super
    gfs2_make_fs_rw <--- Detects IO error with glock
      kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish

logd
  Detects IO error and the need to withdraw
  calls gfs2_withdraw
    gfs2_make_fs_ro
      kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish

gfs2_quotad
  gfs2_statfs_sync
    gfs2_glock_wait <---- Blocked waiting for statfs glock to be granted

glock_work_func
  do_xmote <--- Detects IO error, can't release glock: blocked on withdraw
    glops->go_inval
    glock_blocked_by_withdraw
      requeue glock work & exit <--- work requeued, blocked by withdraw

This patch makes a special exception for the statfs system inode glock, which allows the statfs glock UNLOCK to proceed normally. That allows the quotad daemon to exit during the withdraw, which allows the logd daemon to exit during the withdraw, which allows the mount to exit. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-20gfs2: fix scheduling while atomic bug in glocksBob Peterson1-0/+2
Before this patch, in the unlikely event that gfs2_glock_dq encountered a withdraw, it would do a wait_on_bit to wait for its journal to be recovered, but it never released the glock's spin_lock, which caused a scheduling-while-atomic error. This patch unlocks the lockref spin_lock before waiting for recovery. Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Cc: stable@vger.kernel.org # v5.7+ Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-05-20gfs2: Fix I_NEW check in gfs2_dinode_inBob Peterson1-1/+1
Patch 4a378d8a0d96 added a new check for I_NEW inodes, but unfortunately it used the wrong variable, i_flags. This caused GFS2 to withdraw when gfs2_lookup_by_inum needed to refresh an I_NEW inode. This patch switches to use the correct variable, i_state. Fixes: 4a378d8a0d96 ("gfs2: be careful with inode refresh") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
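The class of bug fixed here (testing a state bit against the wrong field) is easy to reproduce in a toy model. The struct, the field layout and the bit value below are hypothetical; only the field names i_flags and i_state come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_I_NEW (1u << 3)  /* illustrative bit value, an assumption */

/* Hypothetical inode with two unrelated bit fields. */
struct flag_inode {
    unsigned i_flags;  /* per-inode attribute flags */
    unsigned i_state;  /* lifecycle state bits, where "new" lives */
};

/* Buggy check: looks for the state bit in the attribute flags. */
bool is_new_wrong(const struct flag_inode *inode)
{
    assert(inode != NULL);
    return (inode->i_flags & MODEL_I_NEW) != 0;
}

/* Corrected check: tests the state field. */
bool is_new(const struct flag_inode *inode)
{
    assert(inode != NULL);
    return (inode->i_state & MODEL_I_NEW) != 0;
}
```

Because both fields are plain integers, the wrong check compiles cleanly and simply answers a different question, which is why such mistakes tend to surface only at run time.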
2021-05-20gfs2: Prevent direct-I/O write fallback errors from getting lostAndreas Gruenbacher1-1/+4
When a direct I/O write fails entirely and falls back to buffered I/O and the buffered I/O fails, the write failed with return value 0 instead of the error number reported by the buffered I/O. Fix that. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
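The return-value rule described here can be written as a small pure function. This is a sketch under assumptions: `write_result` and its parameters are hypothetical, and the real code path has more steps (locking, syncing) than this model shows.

```c
#include <assert.h>
#include <errno.h>

/*
 * direct:    bytes written by the direct I/O attempt, or a negative errno
 * remaining: bytes still to be written after the direct attempt
 * buffered:  result of the buffered fallback (bytes or negative errno)
 */
long write_result(long direct, long remaining, long buffered)
{
    long ret = direct;

    assert(remaining >= 0);
    if (ret < 0 || remaining == 0)
        return ret;            /* hard error, or nothing left to do */

    if (buffered <= 0) {
        if (ret == 0)
            ret = buffered;    /* nothing written: surface the error */
        return ret;            /* otherwise report the partial write */
    }
    return ret + buffered;     /* direct part plus buffered part */
}
```

When the direct attempt writes nothing and the buffered fallback fails with -EIO, the function returns -EIO rather than 0. A partial direct write followed by a failed fallback still reports the bytes that were written.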
2021-05-05mm: introduce and use mapping_empty()Matthew Wilcox (Oracle)1-2/+1
Patch series "Remove nrexceptional tracking", v2. We actually use nrexceptional for very little these days. It's a minor pain to keep in sync with nrpages, but the pain becomes much bigger with the THP patches because we don't know how many indices a shadow entry occupies. It's easier to just remove it than keep it accurate. Also, we save 8 bytes per inode which is nothing to sneeze at; on my laptop, it would improve shmem_inode_cache from 22 to 23 objects per 16kB, and inode_cache from 26 to 27 objects. Combined, that saves a megabyte of memory from a combined usage of 25MB for both caches. Unfortunately, ext4 doesn't cross a magic boundary, so it doesn't save any memory for ext4. This patch (of 4): Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty. Link: https://lkml.kernel.org/r/20201026151849.24232-1-willy@infradead.org Link: https://lkml.kernel.org/r/20201026151849.24232-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>