summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2024-01-01bcachefs: Add a rebalance, data_update tracepointsKent Overstreet3-11/+52
Add a tracepoint for rebalance, printing out - the target option - the compression option - the key being rebalanced Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Print durability in member_to_text()Kent Overstreet1-0/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Improve sysfs compression_statsKent Overstreet1-51/+65
Break it out by compression type, and include average extent size. Also, format into a nice table. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill dev_usage->buckets_ecKent Overstreet7-21/+2
This counter is redundant; it's simply the sum of BCH_DATA_stripe and BCH_DATA_parity buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch2_dev_usage_to_text()Kent Overstreet3-26/+32
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: New bucket sector count helpersKent Overstreet5-48/+53
This introduces bch2_bucket_sectors() and bch2_bucket_sectors_dirty(), prep work for separately accounting stripe sectors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: BCH_IOCTL_DEV_USAGE_V2Kent Overstreet2-18/+83
BCH_IOCTL_DEV_USAGE mistakenly put the per-data-type array in struct bch_ioctl_dev_usage; since ioctl numbers encode the size of the arg, that means adding new data types breaks the ioctl. This adds a new version that includes the number of data types as a parameter: the old version is fixed at 10 so as to not break when adding new types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Simplify check_bucket_ref()Kent Overstreet1-8/+4
We only need the sector count being modified. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: six locks: Simplify optimistic spinningKent Overstreet3-92/+42
osq lock maintainers don't want it to be used outside of kernel/locking/ - but, we can do better. Since we have lock handoff signalled via waitlist entries, there's no reason for optimistic spinning to have to look at the lock at all - aside from checking lock-owner; we can just spin looking at our waitlist entry. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: BCH_DATA_OP_drop_extra_replicasKent Overstreet2-7/+46
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Convert bch2_move_btree() to bbposKent Overstreet1-25/+19
Minor cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: x-macro-ify bch_data_ops enumKent Overstreet3-18/+30
This will let us add an enum -> string table for a to_text() fn. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: clean up one inconsistent indentingYang Li1-1/+1
fs/bcachefs/journal_io.c:1843 bch2_journal_write_pick_flush() warn: inconsistent indenting Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7585 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: add a quieter bch2_read_superDaniel Hill2-3/+25
If we're looking for a bcachefs supers iteratively we don't want to see this error. This function replaces KERN_ERR with KERN_INFO for when we don't find a bcachefs superblock but preserves other errors. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc()Kent Overstreet3-29/+27
bch2_update_cached_sectors_list() is closer to how the new disk space accounting works, called from trans_mark(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1Kent Overstreet10-56/+56
Prep work for introducing bch_replicas_entry_v2 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill btree_iter->journal_posKent Overstreet4-17/+22
For BTREE_ITER_WITH_JOURNAL, we memoize lookups in the journal keys, to avoid the binary search overhead. Previously we stashed the pos of the last key returned from the journal, in order to force the lookup to be redone when rewinding. Now bch2_journal_keys_peek_upto() handles rewinding itself when necessary - so we can slim down btree_iter. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill memset() in bch2_btree_iter_init()Kent Overstreet3-10/+19
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Add a tracepoint for journal entry closeKent Overstreet2-0/+21
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Don't flush journal after replayKent Overstreet1-3/+1
The flush_all_pins() after journal replay was unecessary, and trying to completely flush the journal while RW is not a great idea - it's not guaranteed to terminate if other threads keep adding things to the jorunal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Don't rejournal keys in key cache flushKent Overstreet1-6/+11
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix userspace bch2_prt_datetime()Kent Overstreet1-0/+1
ctime_r() outputs a newline, which we don't want. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill BTREE_ITER_ALL_LEVELSKent Overstreet3-140/+24
As discussed in the previous patch, BTREE_ITER_ALL_LEVELS appears to be racy with concurrent interior node updates - and perhaps it is fixable, but it's tricky and unnecessary. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: backpointers fsck no longer uses BTREE_ITER_ALL_LEVELSKent Overstreet1-64/+63
It appears that BTREE_ITER_ALL_LEVELS is racy with concurrent interior node btree updates; unfortunate but not terribly surprising it's a difficult problem - that was the original reason for gc_lock. BTREE_ITER_ALL_LEVELS will probably be deleted in a subsequent patch, this changes backpointers fsck to instead walk keys at one level of the btree at a time. This fixes the tiering_drop_alloc test, which stopped working with the patch to not flush the journal after journal replay. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Improve btree_path_dowgrade tracepointKent Overstreet2-6/+21
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename BTREE_INSERT flagsKent Overstreet26-155/+158
BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_str_hash_flags_tKent Overstreet4-15/+23
Create a separate enum for str_hash flags - instead of abusing the btree_insert_flags enum - and create a __bitwise typedef for sparse typechecking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill dead BTREE_INSERT flagsKent Overstreet3-20/+4
BTREE_INSERT_NOWAIT and BTREE_INSERT_GC_LOCK_HELD are no longer used, and can be deleted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix redundant variable initializationKent Overstreet1-3/+1
path->level was being read, but never used. Reported-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Avoiding dropping/retaking write locks in ↵Kent Overstreet1-9/+7
bch2_btree_write_buffer_flush_one() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Make journal replay more efficientKent Overstreet2-29/+62
Journal replay now first attempts to replay keys in sorted order, similar to how the btree write buffer flush path works. Any keys that can not be replayed due to journal deadlock are then left for later and replayed in journal order, unpinning journal entries as we go. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Go rw before journal replayKent Overstreet1-4/+12
This gets us slightly nicer log messages. Also, this slightly clarifies synchronization of c->journal_keys; after we go RW it's in use by multiple threads (so that the btree iterator code can overlay keys from the journal); so it has to be prepped before that point. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill BTREE_UPDATE_PREJOURNALKent Overstreet5-36/+11
With the previous patch that reworks BTREE_INSERT_JOURNAL_REPLAY, we can now switch the btree write buffer to use it for flushing. This has the advantage that transaction commits don't need to take a journal reservation at all. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: BTREE_INSERT_JOURNAL_REPLAY now "don't init trans->journal_res"Kent Overstreet2-4/+17
This slightly changes how trans->journal_res works, in preparation for changing the btree write buffer flush path to use it. Now, BTREE_INSERT_JOURNAL_REPLAY means "don't take a journal reservation; trans->journal_res.seq already refers to the journal sequence number to pin". Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Clear k->needs_whitout earlier in commit pathKent Overstreet1-2/+2
The upcoming btree write buffer rework is going to use the journal itself as the first stage of the write buffer; this is a cleanup to make sure k->needs_whiteout is initialized before keys hit the journal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: track_event_change()Kent Overstreet8-92/+140
This introduces a new helper for connecting time_stats to state changes, i.e. when taking journal reservations is blocked for some reason. We use this to track separately the different reasons the journal might be blocked - i.e. space in the journal full, or the journal pin fifo full. Also do some cleanup and improvements on the time stats code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Journal pins must always have a flush_fnKent Overstreet4-20/+40
flush_fn is how we identify journal pins in debugfs - this is a debugging aid. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Add an assertion in bch2_journal_pin_set()Kent Overstreet2-23/+51
Previously, bch2_journal_pin_set() would silently ignore a request to pin a journal sequence number that was no longer dirty, because it was used internally by bch2_journal_pin_copy() which could race with the src pin being flushed. Split these apart so that we can properly assert that @seq is a currently dirty journal sequence number - this is almost always a bug. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Include average write size in sysfs journal_debugKent Overstreet3-9/+16
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix warning when building in userspaceKent Overstreet1-2/+1
bch_err() doesn't reference the fs arg in userspace Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Print old version when scanning for old metadataKent Overstreet1-2/+6
Also: we should be using bch2_fs_read_write_early() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix locking when checking freespace btreeKent Overstreet1-25/+35
On transaction restart, we weren't re-validating the hole we saw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Check for unlinked inodes not on deleted listKent Overstreet2-3/+27
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: kill INODE_LOCK, use lock_two_nondirectories()Kent Overstreet2-9/+6
In an ideal world, we'd have a common helper that could be used for sorting a list of inodes into the correct lock order, and then the same lock ordering could be used for any type of inode lock, not just i_rwsem. But the lock ordering rules for i_rwsem are a bit complicated, so - abandon that dream for now and do it the more standard way. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Improved backpointer messages in fsckKent Overstreet1-28/+22
When we have a key to print, we should print it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Add extra verbose logging for ro pathKent Overstreet1-2/+12
Also log time waiting for c->writes references to be dropped; this will help in debugging why unmounts are taking longer than they should. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Flush fsck errors before running twiceKent Overstreet1-0/+2
It's confusing if we run fsck a second time (in debug mode, to verify the second run is clean), but errors are still ratelimited from the first run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: make RO snapshots actually ROKent Overstreet6-14/+63
Add checks to all the VFS paths for "are we in a RO snapshot?". Note - we don't check this when setting inode options via our xattr interface, since those generally only affect data placement, not contents of data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
2024-01-01bcachefs: bch_sb_field_downgradeKent Overstreet10-9/+255
Add a new superblock section that contains a list of { minor version, recovery passes, errors_to_fix } that is - a list of recovery passes that must be run when downgrading past a given version, and a list of errors to silently fix. The upcoming disk accounting rewrite is not going to be fully compatible: we're going to have to regenerate accounting both when upgrading to the new version, and also from downgrading from the new version, since the new method of doing disk space accounting is a completely different architecture based on deltas, and synchronizing them for every jounal entry write to maintain compatibility is going to be too expensive and impractical. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_sb.recovery_passes_requiredKent Overstreet9-30/+172
Add two new superblock fields. Since the main section of the superblock is now fully, we have to add a new variable length section for them - bch_sb_field_ext. - recovery_passes_requried: recovery passes that must be run on the next mount - errors_silent: errors that will be silently fixed These are to improve upgrading and dwongrading: these fields won't be cleared until after recovery successfully completes, so there won't be any issues with crashing partway through an upgrade or a downgrade. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>