path: root/fs/xfs/scrub/trace.h
Age | Commit message | Author | Files | Lines
11 days | xfs: fix file_path handling in tracepoints | Darrick J. Wong | 1 | -6/+4
Since file_path() takes the output buffer as one of its arguments, we might as well have it format directly into the tracepoint's char array instead of wasting stack space. Fixes: 3934e8ebb7cc6 ("xfs: create a big array data structure") Fixes: 5076a6040ca16 ("xfs: support in-memory buffer cache targets") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403290419.HPcyvqZu-lkp@intel.com/ Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-05-23 | tracing/treewide: Remove second parameter of __assign_str() | Steven Rostedt (Google) | 1 | -5/+5
With the rework of how __string() handles dynamic strings, where it saves off the source string in a field in the helper structure [1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again.
This means that with:
    __string(field, mystring)
which used to be assigned with __assign_str(field, mystring), the second parameter is no longer needed and is unused. With this, __assign_str() will now only get a single parameter.
There are over 700 users of __assign_str(), and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script:
    git grep -l __assign_str | while read a ; do
        sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
        mv /tmp/test-file $a;
    done
I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch.
Note, the same updates will need to be done for:
    __assign_str_len()
    __assign_rel_str()
    __assign_rel_str_len()
I tested this with both an allmodconfig and an allyesconfig (build only for both).
[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
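To see what the sed expression in the message above does to a single line, here is a rough Python translation of the same regular expression. The helper name `drop_second_arg` is made up for this illustration; only the pattern itself comes from the commit message.

```python
import re

# Python spelling of the sed expression quoted in the commit message:
#   s/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/
# Group 1 captures "__assign_str(" plus the first argument; everything from
# the comma up to (not including) the ';' is replaced by a closing paren.
ASSIGN_STR = re.compile(r"(__assign_str\([^,]*[^ ,]) *,[^;]*")


def drop_second_arg(line: str) -> str:
    """Rewrite the first __assign_str(field, src) on a line (hypothetical helper)."""
    # sed without the 'g' flag only touches the first match, hence count=1.
    return ASSIGN_STR.sub(r"\1)", line, count=1)


before = "\t\t__assign_str(name, mystring);"
after = drop_second_arg(before)
print(after)
```

Lines without a two-argument `__assign_str()` call pass through unchanged.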
2024-04-24 | xfs: invalidate dentries for a file before moving it to the orphanage | Darrick J. Wong | 1 | -2/+0
Invalidate the cached dentries that point to the file that we're moving to lost+found before we actually move it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: exchange-range for repairs is no longer dynamic | Darrick J. Wong | 1 | -1/+0
The atomic file exchange-range functionality is now a permanent filesystem feature instead of a dynamic log-incompat feature. It cannot be turned on at runtime, so we no longer need the XCHK_FSGATES flags and whatnot that supported it. Remove the flag and the enable function, and move the xfs_has_exchange_range checks to the start of the repair functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: introduce vectored scrub mode | Darrick J. Wong | 1 | -1/+78
Introduce a variant on XFS_SCRUB_METADATA that allows for a vectored mode. The caller specifies the principal metadata object that they want to scrub (allocation group, inode, etc.) once, followed by an array of scrub types they want called on that object. The kernel runs the scrub operations and writes the output flags and errno code to the corresponding array element.
A new pseudo scrub type BARRIER is introduced to force the kernel to return to userspace if any corruptions have been found when scrubbing the previous scrub types in the array. This enables userspace to schedule, for example, the sequence:
1. data fork
2. barrier
3. directory
If the data fork scrub is clean, then the kernel will perform the directory scrub. If not, the barrier in 2 will exit back to userspace.
The alternative would have been an interface where userspace passes a pointer to an empty buffer, and the kernel formats that with xfs_scrub_vecs that tell userspace what it scrubbed and what the outcome was. With that the kernel would have to communicate that the buffer needed to have been at least X size, even though for our cases XFS_SCRUB_TYPE_NR + 2 would always be enough.
Compared to that, this design keeps all the dependency policy and ordering logic in userspace where it already resides instead of duplicating it in the kernel. The downside of that is that it needs the barrier logic.
When running fstests in "rebuild all metadata after each test" mode, I observed a 10% reduction in runtime due to fewer transitions across the system call boundary.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
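The barrier semantics described above can be modelled in a few lines of userspace Python. This is only a sketch: `ScrubVec`, `run_vectored`, and the string scrub-type names are invented for illustration, and the per-entry flags and errno are collapsed into a single boolean. It is not the kernel interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

BARRIER = "barrier"  # stand-in for the pseudo scrub type (illustrative name)


@dataclass
class ScrubVec:
    scrub_type: str
    corrupt: Optional[bool] = None  # None means this entry was never run


def run_vectored(vecs: List[ScrubVec],
                 scrubbers: Dict[str, Callable[[], bool]]) -> int:
    """Run entries in order; return the index at which processing stopped."""
    seen_corruption = False
    for i, vec in enumerate(vecs):
        if vec.scrub_type == BARRIER:
            if seen_corruption:
                return i  # bail out to the caller, later entries untouched
            continue
        # Each scrubber returns True if it found corruption.
        vec.corrupt = scrubbers[vec.scrub_type]()
        seen_corruption = seen_corruption or vec.corrupt
    return len(vecs)


# The sequence from the commit message: data fork, barrier, directory.
vecs = [ScrubVec("data fork"), ScrubVec(BARRIER), ScrubVec("directory")]
stopped_at = run_vectored(vecs, {"data fork": lambda: True,
                                 "directory": lambda: False})
```

With a corrupt data fork the call stops at the barrier and the directory entry is never run, so the ordering policy stays with the caller.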
2024-04-24 | xfs: fix corruptions in the directory tree | Darrick J. Wong | 1 | -3/+20
Repair corruptions in the directory tree itself. Cycles are broken by removing an incoming parent->child link. Multiply-owned directories are fixed by pruning the extra parent->child links. Disconnected subtrees are reconnected to the lost and found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: invalidate dirloop scrub path data when concurrent updates happen | Darrick J. Wong | 1 | -0/+65
Add a dirent update hook so that we can detect directory tree updates that affect any of the paths found by this scrubber and force it to rescan. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: teach online scrub to find directory tree structure problems | Darrick J. Wong | 1 | -1/+189
Create a new scrubber that detects corruptions within the directory tree structure itself. It can detect directories with multiple parents; loops within the directory tree; and directory loops not accessible from the root. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
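As a rough picture of the three problem classes named above, the following Python sketch runs over a toy in-memory graph of parent->child directory links. The function name and the dictionary representation are assumptions made for this example; the real scrubber works on ondisk directories and parent pointers, not on a structure like this.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def find_tree_problems(links: Dict[str, List[str]],
                       root: str) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (multi_parent, looped, unreachable) directory sets."""
    parents = defaultdict(set)
    dirs = set(links)
    for parent, children in links.items():
        for child in children:
            parents[child].add(parent)
            dirs.add(child)

    # Directories with more than one incoming parent->child link.
    multi_parent = {d for d, p in parents.items() if len(p) > 1}

    # Everything reachable by walking down from the root.
    reachable: Set[str] = set()
    stack = [root]
    while stack:
        d = stack.pop()
        if d not in reachable:
            reachable.add(d)
            stack.extend(links.get(d, []))
    unreachable = dirs - reachable

    def on_cycle(start: str) -> bool:
        # Walk upward through parents; arriving back at start means a loop.
        seen: Set[str] = set()
        todo = list(parents.get(start, ()))
        while todo:
            d = todo.pop()
            if d == start:
                return True
            if d not in seen:
                seen.add(d)
                todo.extend(parents.get(d, ()))
        return False

    looped = {d for d in dirs if on_cycle(d)}
    return multi_parent, looped, unreachable


links = {"/": ["a"], "a": ["b"], "c": ["b"], "x": ["y"], "y": ["x"]}
multi_parent, looped, unreachable = find_tree_problems(links, "/")
```

Here `b` has two parents, `x` and `y` form a loop, and that loop is also unreachable from the root.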
2024-04-24 | xfs: actually rebuild the parent pointer xattrs | Darrick J. Wong | 1 | -0/+2
Once we've assembled all the parent pointers for a file, we need to commit the new dataset atomically to that file. Parent pointer records are embedded in the xattr structure, which means that we must write a new extended attribute structure, again, atomically. Therefore, we must copy the non-parent-pointer attributes from the file being repaired into the temporary file's extended attributes and then call the atomic extent swap mechanism to exchange the blocks. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: implement live updates for parent pointer repairs | Darrick J. Wong | 1 | -0/+2
While we're scanning the filesystem for dirents that we can turn into parent pointers, we cannot hold the IOLOCK or ILOCK of the file being repaired. Therefore, we need to set up a dirent hook so that we can keep the temporary file's parent pointers up to date with the rest of the filesystem. Hence we add the ability to *remove* pptrs from the temporary file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: repair directory parent pointers by scanning for dirents | Darrick J. Wong | 1 | -0/+36
If parent pointers are enabled on the filesystem, we can repair the entire dataset by walking the directories of the filesystem looking for dirents that we can turn into parent pointers. Once we have a full incore dataset, we'll figure out what to do with it, but that's for a subsequent patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: replay unlocked parent pointer updates that accrue during xattr repair | Darrick J. Wong | 1 | -0/+73
There are a few places where the extended attribute repair code drops the ILOCK to apply stashed xattrs to the temporary file. Although setxattr and removexattr are still locked out because we retain our hold on the IOLOCK, this doesn't prevent renames from updating parent pointers, because the VFS doesn't take i_rwsem on children that are being moved. Therefore, set up a dirent hook to capture parent pointer updates for this file, and replay(?) the updates. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-24 | xfs: implement live updates for directory repairs | Darrick J. Wong | 1 | -0/+2
While we're scanning the filesystem for parent pointers that we can turn into dirents, we cannot hold the IOLOCK or ILOCK of the directory being repaired. Therefore, we need to set up a dirent hook so that we can keep the temporary directory up to date with the rest of the filesystem. Hence we add the ability to *remove* entries from the temporary dir. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-23 | xfs: salvage parent pointers when rebuilding xattr structures | Darrick J. Wong | 1 | -0/+38
When we're salvaging extended attributes, make sure we validate the ones that claim to be parent pointers before adding them to the salvage pile. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-23 | xfs: walk directory parent pointers to determine backref count | Darrick J. Wong | 1 | -0/+28
If the filesystem has parent pointers enabled, walk the parent pointers of subdirectories to determine the true backref count. In theory each subdir should have a single parent reachable via dotdot, but in the case of (corrupt) subdirs with multiple parents, we need to keep the link counts high enough that the directory loop detector will be able to correct the multiple parents problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-23 | xfs: deferred scrub of parent pointers | Darrick J. Wong | 1 | -0/+3
If the trylock-based dirent check fails, retain those parent pointers and check them at the end. This may involve dropping the locks on the file being scanned, so yay. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-23 | xfs: deferred scrub of dirents | Darrick J. Wong | 1 | -0/+34
If the trylock-based parent pointer check fails, retain those dirents and check them at the end. This may involve dropping the locks on the file being scanned, so yay. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: repair AGI unlinked inode bucket lists | Darrick J. Wong | 1 | -0/+255
Teach the AGI repair code to rebuild the unlinked buckets and lists. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: online repair of symbolic links | Darrick J. Wong | 1 | -0/+46
If a symbolic link target looks bad, try to sift through the rubble to find as much of the target buffer as we can, and stage a new target (short or remote format as needed) in a temporary file and use the atomic extent swapping mechanism to commit the results. In the worst case, we replace the target with an overly long filename that cannot possibly resolve. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: ensure dentry consistency when the orphanage adopts a file | Darrick J. Wong | 1 | -0/+42
When the orphanage adopts a file, that file becomes a child of the orphanage. The dentry cache may have entries for the orphanage directory and the name we've chosen, so (1) make sure we abort if the dcache has a positive entry because something's not right; and (2) invalidate and purge negative dentries if the adoption goes through. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: move files to orphanage instead of letting nlinks drop to zero | Darrick J. Wong | 1 | -0/+26
If we encounter an inode with a nonzero link count but zero observed links, move it to the orphanage. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: move orphan files to the orphanage | Darrick J. Wong | 1 | -0/+28
When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: ask the dentry cache if it knows the parent of a directory | Darrick J. Wong | 1 | -0/+1
It's possible that the dentry cache can tell us the parent of a directory. Therefore, when repairing directory dot dot entries, query the dcache as a last resort before scanning the entire filesystem. A reviewer asks: "How high is the chance that we actually have a valid dcache entry for a file in a corrupted directory?" There's a decent chance of this actually working. Say you have a 1000-block directory foo, and block 980 gets corrupted. Let's further suppose that block 0 has a correct entry for ".." and "bar". If someone accesses /mnt/foo/bar, that will cause the dcache to create a dentry from /mnt to /mnt/foo whose d_parent points back to /mnt. If you then want to rebuild the directory, XFS can obtain the parent from the dcache without needing to wander into parent pointers or scan the filesystem to find /mnt's connection to foo. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: online repair of parent pointers | Darrick J. Wong | 1 | -0/+1
Teach the online repair code to fix parent pointers for directories. For now, this means correcting the dotdot entry of an existing directory that is otherwise consistent. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: scan the filesystem to repair a directory dotdot entry | Darrick J. Wong | 1 | -0/+1
Teach the online directory repair code to scan the filesystem so that we can set the dotdot entry when we're rebuilding a directory. This involves dropping ILOCK on the directory that we're repairing, which means that the VFS can sneak in and tell us to update dotdot at any time. Deal with these races by using a dirent hook to absorb dotdot updates, and be careful not to check the scan results until after we've retaken the ILOCK. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: online repair of directories | Darrick J. Wong | 1 | -0/+112
If a directory looks like it's in bad shape, try to sift through the rubble to find whatever directory entries we can, scan the directory tree for the parent (if needed), stage the new directory contents in a temporary file and use the atomic extent swapping mechanism to commit the results in bulk. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: scrub should set preen if attr leaf has holes | Darrick J. Wong | 1 | -0/+1
If an attr block indicates that it could use compaction, set the preen flag to have the attr fork rebuilt, since the attr fork rebuilder can take care of that for us. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: repair extended attributes | Darrick J. Wong | 1 | -0/+83
If the extended attributes look bad, try to sift through the rubble to find whatever keys/values we can, stage a new attribute structure in a temporary file and use the atomic extent swapping mechanism to commit the results in bulk. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: enable discarding of folios backing an xfile | Darrick J. Wong | 1 | -0/+1
Create a new xfile function to discard the page cache that's backing part of an xfile. The next patch will use this to drop parts of an xfile that aren't needed anymore. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: teach the tempfile to set up atomic file content exchanges | Darrick J. Wong | 1 | -0/+1
Create some new routines to exchange the contents of a temporary file created to stage a repair with another ondisk file. This will be used by the realtime summary repair function to commit atomically the new rtsummary data, which will be staged in the tempfile. The rest of XFS coordinates access to the realtime metadata inodes solely through the ILOCK. For repair to hold its exclusive access to the realtime summary file, it has to allocate a single large transaction and roll it repeatedly throughout the repair while holding the ILOCK. In turn, this means that for now there's only a partial file mapping exchange implementation for the temporary file because we can only work within an existing transaction. For now, the only tempswap functions needed here are to estimate the resource requirements of the exchange, reserve more space/quota to an existing transaction, and kick off the actual exchange. The rest will be added in a later patch in preparation for repairing xattrs and directories. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: support preallocating and copying content into temporary files | Darrick J. Wong | 1 | -0/+39
Create the routines we need to preallocate space in a temporary ondisk file and then copy the contents of an xfile into the tempfile. The upcoming rtsummary repair feature will construct the contents of a realtime summary file in memory, after which it will want to copy all that into the ondisk temporary file before atomically committing the new rtsummary contents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: add the ability to reap entire inode forks | Darrick J. Wong | 1 | -0/+63
In preparation for supporting repair of indexed file-based metadata (such as realtime bitmaps, directories, and extended attribute data), add a function to reap the old blocks after a metadata repair finishes. IOWs, this is an elaborate bunmapi call that deals with crosslinked blocks by unmapping them without freeing them, and also scans for incore buffers to invalidate. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: create temporary files and directories for online repair | Darrick J. Wong | 1 | -0/+33
Teach the online repair code how to create temporary files or directories. These temporary files can be used to stage reconstructed information until we're ready to perform an atomic extent swap to commit the new metadata. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-04-16 | xfs: fix an AGI lock acquisition ordering problem in xrep_dinode_findmode | Darrick J. Wong | 1 | -2/+8
While reviewing the next patch which fixes an ABBA deadlock between the AGI and a directory ILOCK, someone asked a question about why we're holding the AGI in the first place. The reason for that is to quiesce the inode structures for that AG while we do a repair. I then realized that xrep_dinode_findmode invokes xchk_iscan_iter, which walks the inobts (and hence the AGIs) to find all the inodes. This itself is also an ABBA vector, since the damaged inode could be in AG 5, which we hold while we scan AG 0 for directories. 5 -> 0 is not allowed. To address this, modify the iscan to allow trylock of the AGI buffer using the flags argument to xfs_ialloc_read_agi that the previous patch added. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
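The general idea of using a trylock to sidestep an ordering violation can be illustrated with ordinary userspace locks. The sketch below assumes an "ascending AG number" ordering rule (suggested by "5 -> 0 is not allowed") and a hypothetical helper `lock_agi`; it says nothing about how the kernel actually wires the trylock flag into the iscan.

```python
import threading
from typing import List, Optional


def lock_agi(agi_locks: List[threading.Lock], target: int,
             held: Optional[int] = None) -> bool:
    """Try to take the lock for AG `target` while possibly holding AG `held`.

    Assumed rule for this sketch: blocking is only safe when moving to a
    higher-numbered AG. Going backwards uses a trylock, and the caller must
    cope with failure instead of sleeping (which could deadlock ABBA-style).
    """
    lock = agi_locks[target]
    if held is None or target > held:
        lock.acquire()
        return True
    return lock.acquire(blocking=False)


agi_locks = [threading.Lock() for _ in range(6)]
agi_locks[5].acquire()                       # repair holds the AGI of AG 5

got_uncontended = lock_agi(agi_locks, 0, held=5)   # nobody holds AG 0
if got_uncontended:
    agi_locks[0].release()

agi_locks[0].acquire()                       # simulate another holder of AG 0
got_contended = lock_agi(agi_locks, 0, held=5)     # trylock fails, no sleeping
agi_locks[0].release()
agi_locks[5].release()
```

The point of the sketch is the failure path: a contended backwards acquisition returns immediately rather than blocking while AG 5 is held.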
2024-02-22 | xfs: hook live rmap operations during a repair operation | Darrick J. Wong | 1 | -0/+47
Hook the regular rmap code when an rmapbt repair operation is running so that we can unlock the AGF buffer to scan the filesystem and keep the in-memory btree up to date during the scan. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: repair the rmapbt | Darrick J. Wong | 1 | -1/+32
Rebuild the reverse mapping btree from all primary metadata. This first patch establishes the bare mechanics of finding records and putting together a new ondisk tree; more complex pieces are needed to make it work properly. Link: Documentation/filesystems/xfs-online-fsck-design.rst Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: remove xfs_btnum_t | Christoph Hellwig | 1 | -8/+0
The last checks for bc_btnum can be replaced with helpers that check the btree ops. This allows adding new btrees to XFS without having to update a global enum. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> [djwong: complete the ops predicates] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-02-22 | xfs: add a name field to struct xfs_btree_ops | Christoph Hellwig | 1 | -20/+20
The btnum in struct xfs_btree_ops is often used for printing a symbolic name for the btree. Add a name field to the ops structure and use that directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-02-22 | xfs: repair summary counters | Darrick J. Wong | 1 | -4/+17
Use the same summary counter calculation infrastructure to generate new values for the in-core summary counters. The difference between the scrubber and the repairer is that the repairer will freeze the fs during setup, which means that the values should match exactly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: update health status if we get a clean bill of health | Darrick J. Wong | 1 | -1/+3
If scrub finds that everything is ok with the filesystem, we need a way to tell the health tracking that it can let go of indirect health flags, since indirect flags only mean that at some point in the past we lost some context. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: teach repair to fix file nlinks | Darrick J. Wong | 1 | -0/+3
Fix the file link counts since we just computed the correct ones. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: track directory entry updates during live nlinks fsck | Darrick J. Wong | 1 | -0/+33
Create the necessary hooks in the directory operations (create/link/unlink/rename) code so that our live nlink scrub code can stay up to date with link count updates in the rest of the filesystem. This will be the means to keep our shadow link count information up to date while the scan runs in real time. In online fsck part 2, we'll use these same hooks to handle repairs to directories and parent pointer information. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: teach scrub to check file nlinks | Darrick J. Wong | 1 | -1/+146
Create the necessary scrub code to walk the filesystem's directory tree so that we can compute file link counts. Similar to quotacheck, we create an incore shadow array of link count information and then we walk the filesystem a second time to compare the link counts. We need live updates to keep the information up to date during the lengthy scan, so this scrubber remains disabled until the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
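The two-pass idea (build shadow link counts from the directory tree, then compare against what each inode records) can be sketched on toy data. The tuple layout for dirents, the `recorded` map, and the function name are assumptions for this example; it also ignores dot/dotdot backreferences and the live-update hooks the real scrubber depends on.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Dirent = Tuple[int, str, int]  # (parent inode, name, child inode): toy layout


def check_nlinks(dirents: Iterable[Dirent],
                 recorded: Dict[int, int]) -> Dict[int, Tuple[int, int]]:
    """Return {inode: (recorded nlink, observed links)} for every mismatch."""
    # Pass 1: walk all directory entries and build the shadow link counts.
    observed = Counter(child for _parent, _name, child in dirents)
    # Pass 2: compare each inode's recorded count with the shadow count.
    return {ino: (nlink, observed.get(ino, 0))
            for ino, nlink in recorded.items()
            if observed.get(ino, 0) != nlink}


dirents = [(128, "a", 200), (128, "b", 200), (128, "c", 201)]
recorded = {200: 2, 201: 3, 202: 1}
mismatches = check_nlinks(dirents, recorded)
```

Inode 202 in this toy data has a nonzero recorded count but no observed links, which is the situation a later commit in this log handles by moving the file to the orphanage.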
2024-02-22 | xfs: repair dquots based on live quotacheck results | Darrick J. Wong | 1 | -0/+1
Use the shadow quota counters that live quotacheck creates to reset the incore dquot counters. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: track quota updates during live quotacheck | Darrick J. Wong | 1 | -0/+1
Create a shadow dqtrx system in the quotacheck code that hooks the regular dquot counter update code. This will be the means to keep our copy of the dquot counters up to date while the scan runs in real time. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: implement live quotacheck inode scan | Darrick J. Wong | 1 | -1/+27
Create a new trio of scrub functions to check quota counters. While the dquots themselves are filesystem metadata and should be checked early, the dquot counter values are computed from other metadata and are therefore summary counters. We don't plug these into the scrub dispatch just yet, because we still need to be able to watch quota updates while doing our scan. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: repair file modes by scanning for a dirent pointing to us | Darrick J. Wong | 1 | -0/+49
Repair might encounter an inode with a totally garbage i_mode. To fix this problem, we have to figure out if the file was a regular file, a directory, or a special file. One way to figure this out is to check if there are any directories with entries pointing down to the busted file. This patch recovers the file mode by scanning every directory entry on the filesystem to see if there are any that point to the busted file. If the ftype of all such dirents is consistent, the mode is recovered from the ftype. If no dirents are found, the file becomes a regular file. In all cases, ACLs are canceled and the file is made accessible only by root. A previous patch attempted to guess the mode by reading the beginning of the file data. This was rejected by Christoph on the grounds that we cannot trust user-controlled data blocks. Users do not have direct control over the ondisk contents of directory entries, so this method should be much safer. If all the dirents have the same ftype, then we can translate that back into an S_IFMT flag and fix the file. If not, reset the mode to S_IFREG. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
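The decision rule in this message (consistent ftype wins, anything else falls back to a regular file) is small enough to sketch directly. The ftype labels and the helper name below are placeholders chosen for this example, not the ondisk ftype values; only the `stat` module constants are real.

```python
import stat
from typing import Iterable

# Placeholder ftype labels mapped to S_IFMT file type bits.
FTYPE_TO_MODE = {
    "regular": stat.S_IFREG,
    "directory": stat.S_IFDIR,
    "symlink": stat.S_IFLNK,
    "chardev": stat.S_IFCHR,
    "blockdev": stat.S_IFBLK,
    "fifo": stat.S_IFIFO,
    "socket": stat.S_IFSOCK,
}


def recover_file_type(ftypes: Iterable[str]) -> int:
    """Pick S_IFMT bits from the ftypes of all dirents pointing at a file."""
    distinct = set(ftypes)
    if len(distinct) == 1:
        # Every dirent agrees: translate the ftype back into a mode.
        return FTYPE_TO_MODE.get(distinct.pop(), stat.S_IFREG)
    # No dirents found, or they disagree: fall back to a regular file.
    return stat.S_IFREG


agreed = recover_file_type(["directory", "directory"])
none_found = recover_file_type([])
conflicting = recover_file_type(["directory", "symlink"])
```

Permission bits are deliberately left out of the sketch; the message only says the file is made accessible to root alone.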
2024-02-22 | xfs: iscan batching should handle unallocated inodes too | Darrick J. Wong | 1 | -4/+17
The inode scanner tries to reduce contention on the AGI header buffer lock by grabbing references to consecutive allocated inodes. Batching stops as soon as we encounter an unallocated inode. This is unfortunate because in the worst case performance collapses to the old "one at a time" behavior if every other inode is free. This is correct behavior, but we could do better. Unallocated inodes by definition have nothing to scan, which means the iscan can ignore them as long as someone ensures that the scan data will reflect another thread allocating the inode and adding interesting metadata to that inode. That mechanism is, of course, the live update hooks. Therefore, extend the batching mechanism to track unallocated inodes adjacent to the scan cursor. The _want_live_update predicate can tell the caller's live update hook to incorporate all live updates to what the scanner thinks is an unallocated inode if (after dropping the AGI) some other thread allocates one of those inodes and begins using it. Note that we cannot just copy the ir_free bitmap into the scan cursor because the batching stops if iget says the inode is in an intermediate state (e.g. on the inactivation list) and cannot be igrabbed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-22 | xfs: cache a bunch of inodes for repair scans | Darrick J. Wong | 1 | -0/+23
After observing xfs_scrub taking forever to rebuild parent pointers on a pptrs enabled filesystem, I decided to profile what the system was doing. It turns out that when there are a lot of threads trying to scan the filesystem, most of our time is spent contending on AGI buffer locks. Given that we're walking the inobt records anyway, we can often tell ahead of time when there's a bunch of (up to 64) consecutive inodes that we could grab all at once. Do this to amortize the cost of taking the AGI lock across as many inodes as we possibly can. On the author's system this seems to improve parallel throughput from barely one and a half cores to slightly sublinear scaling. The obvious antipattern here of course is where the freemask has every other bit set (e.g. all 0xA's). Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
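A small model shows why the alternating-bit freemask is the worst case for this kind of batching. The sketch assumes a 64-bit mask in which a set bit means "inode is free" and a hypothetical helper `batch_size` that counts how many consecutive allocated inodes could be grabbed from a starting position. It does not model igrab failures or anything else the real scanner handles.

```python
INODES_PER_CHUNK = 64  # the message mentions batches of up to 64 inodes


def batch_size(freemask: int, start: int) -> int:
    """Count consecutive allocated inodes beginning at bit `start`."""
    count = 0
    for bit in range(start, INODES_PER_CHUNK):
        if (freemask >> bit) & 1:   # set bit: inode is free, batch ends here
            break
        count += 1
    return count


fully_allocated = batch_size(0x0, 0)
every_other_free = batch_size(0xAAAAAAAAAAAAAAAA, 0)
```

With nothing free the whole chunk forms one batch, while the 0xA pattern degrades to one inode per lock acquisition.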
2024-02-22 | xfs: stagger the starting AG of scrub iscans to reduce contention | Darrick J. Wong | 1 | -2/+5
Online directory and parent repairs on parent-pointer equipped filesystems have shown that starting a large number of parallel iscans causes a lot of AGI buffer contention. Try to reduce this by making iscans wrap around the end of the filesystem, and using a rotor to stagger where each scanner begins. Surprisingly, this boosts CPU utilization (on the author's test machines) from effectively single-threaded to 160%. Not great, but see the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
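The rotor-plus-wraparound scheme can be pictured as follows. This is a sketch with made-up names (`IscanRotor`, `scan_order`) and a plain counter as the rotor; the kernel's actual rotor and its concurrency handling are not shown.

```python
import itertools
from typing import List


class IscanRotor:
    """Hand each new scanner a different starting AG (illustrative only)."""

    def __init__(self, agcount: int) -> None:
        self.agcount = agcount
        self._rotor = itertools.count()

    def scan_order(self) -> List[int]:
        # Each call advances the rotor, so successive scanners start one AG
        # further along and wrap around the end of the filesystem.
        start = next(self._rotor) % self.agcount
        return [(start + i) % self.agcount for i in range(self.agcount)]


rotor = IscanRotor(agcount=4)
first = rotor.scan_order()
second = rotor.scan_order()
```

Every scanner still visits all AGs exactly once; only the starting point differs, so parallel scanners are less likely to queue on the same AGI at the same moment.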