summaryrefslogtreecommitdiff
path: root/fs/bcachefs/btree_gc.c
AgeCommit message (Collapse)AuthorFilesLines
2023-10-23bcachefs: Improve bch2_bkey_make_mut()Kent Overstreet1-2/+2
bch2_bkey_make_mut() now takes the bkey_s_c by reference and points it at the new, mutable key. This helps in some fsck paths that may have multiple repair operations on the same key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: New error message helpersKent Overstreet1-15/+12
Add two new helpers for printing error messages with __func__ and bch2_err_str(): - bch_err_fn - bch_err_msg Also kill the old error strings in the recovery path, which were causing us to incorrectly report memory allocation failures - they're not needed anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: bch2_bkey_make_mut() now calls bch2_trans_update()Kent Overstreet1-5/+3
It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time. This patch folds the bch2_trans_update() call into bch2_bkey_make_mut(), eliminating a bit of boilerplate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Always run topology error when CONFIG_BCACHEFS_DEBUG=yKent Overstreet1-3/+4
Improved test coverage. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Make reconstruct_alloc quieterKent Overstreet1-35/+37
We shouldn't be printing out fsck errors for expected errors - this helps make test logs more readable, and makes it easier to see what the actual failure was. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Private error codes: ENOMEMKent Overstreet1-8/+8
This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Improve error message for stripe block sector counts wrongKent Overstreet1-13/+16
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: More stripe create cleanup/fixesKent Overstreet1-2/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Mark stripe buckets with correct data typeKent Overstreet1-3/+12
Currently, we don't use bucket data type for tracking whether buckets are part of a stripe; parity buckets are BCH_DATA_parity, but data buckets in a stripe are BCH_DATA_user. There's a separate counter, buckets_ec, outside the BCH_DATA_TYPES system for tracking number of buckets on a device that are part of a stripe. The trouble with this approach is that it's too coarse grained, and we need better information on fragmentation for debugging copygc. With this patch, data buckets in a stripe are now tracked as BCH_DATA_stripe buckets. This doesn't yet differentiate between erasure coded and non-erasure coded data in a stripe bucket, nor do we yet track empty data buckets in stripes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: bch2_mark_key() now takes btree_id & levelKent Overstreet1-3/+3
btree & level are passed to trans_mark - for backpointers - bch2_mark_key() should take them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Fix ec repair code checkKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Better inlining for bch2_alloc_to_v4_mutKent Overstreet1-21/+22
This separates out the slowpath into a separate function, and inlines bch2_alloc_v4_mut into bch2_trans_start_alloc_update(), the main place it's called. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Convert EROFS errors to private error codesKent Overstreet1-2/+2
More error code improvements - this gets us more useful error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: New btree helpersKent Overstreet1-7/+2
This introduces some new conveniences, to help cut down on boilerplate: - bch2_trans_kmalloc_nomemzero() - performance optimiation - bch2_bkey_make_mut() - bch2_bkey_get_mut() - bch2_bkey_get_mut_typed() - bch2_bkey_alloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Fix btree_gc when multiple passes requiredKent Overstreet1-4/+21
We weren't resetting filesystem & device usage when restarting gc, which was spotted when free bucket counters overflowed - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Suppress -EROFS messages when shutting downKent Overstreet1-4/+4
This isn't actually an error condition, this just indicates a normal shutdown - no reason for these to be in the log. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: New bpos_cmp(), bkey_cmp() replacementsKent Overstreet1-17/+17
This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: More style fixesKent Overstreet1-2/+2
Fixes for various checkpatch errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Assorted checkpatch fixesKent Overstreet1-1/+1
checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Use btree_type_has_ptrs() more consistentlyKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: New locking functionsKent Overstreet1-16/+24
In the future, with the new deadlock cycle detector, we won't be using bare six_lock_* anymore: lock wait entries will all be embedded in btree_trans, and we will need a btree_trans context whenever locking a btree node. This patch plumbs a btree_trans to the few places that need it, and adds two new locking functions - btree_node_lock_nopath, which may fail returning a transaction restart, and - btree_node_lock_nopath_nofail, to be used in places where we know we cannot deadlock (i.e. because we're holding no other locks). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Add persistent counters for all tracepointsKent Overstreet1-2/+2
Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Convert fsck errors to errcode.hKent Overstreet1-9/+9
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Use bch2_err_str() in error messagesKent Overstreet1-17/+18
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Convert bch2_gc_done() for_each_btree_key2()Kent Overstreet1-115/+117
This converts bch2_gc_stripes_done() and bch2_gc_reflink_done() to the new for_each_btree_key_commit() macro. The new for_each_btree_key2() and for_each_btree_key_commit() macros handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: for_each_btree_key2()Kent Overstreet1-74/+38
This introduces two new macros for iterating through the btree, with transaction restart handling - for_each_btree_key2() - for_each_btree_key_commit() Every iteration is now in an implicit transaction, and - as with lockrestart_do() and commit_do() - returning -EINTR will cause the transaction to be restarted, at the same key. This patch converts a bunch of code that was open coding this to these new macros, saving a substantial amount of code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Rename __bch2_trans_do() -> commit_do()Kent Overstreet1-5/+5
Better/more descriptive naming, and prep for adding nested_lockrestart_do() and nested_commit_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Silence some fsck errors when reconstructing alloc infoKent Overstreet1-18/+19
There's no need to print fsck errors for errors that are expected, and the user has already opted to repair. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Put some repair messages behind opts->verboseKent Overstreet1-6/+8
These messages log the updates we're doing in bch2_check_fix_ptrs(), which is useful when debugging but not usually needed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Always descend to leaf nodes it btree_gcKent Overstreet1-8/+2
If a btree node is unreadable, it's the topology repair that fixes that and it's kicked off by btree_gc, so btree_gc needs to touch every node and very that they can be read. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Fix assertion in topology repairKent Overstreet1-0/+2
If we were at the end of the node, when breaking out of the loop we'd pop the assertion on line 446 when cur wasn't NULL. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Printbuf reworkKent Overstreet1-2/+2
This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Tracepoint improvementsKent Overstreet1-5/+2
Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Shutdown path improvementsKent Overstreet1-3/+1
We're seeing occasional firings of the assertion in the key cache shutdown code that nr_dirty == 0, which means we must sometimes be doing transaction commits after we've gone read only. Cleanups & changes: - BCH_FS_ALLOC_CLEAN renamed to BCH_FS_CLEAN_SHUTDOWN - new helper bch2_btree_interior_updates_flush(), which returns true if it had to wait - bch2_btree_flush_writes() now also returns true if there were btree writes in flight - __bch2_fs_read_only now checks if btree writes were in flight in the shutdown loop: btree write completion does a transaction update, to update the pointer in the parent node - assert that !BCH_FS_CLEAN_SHUTDOWN in __bch2_trans_commit Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Ensure buckets have io_time[READ] setKent Overstreet1-0/+7
It's an error if a bucket is in state BCH_DATA_cached but not on the LRU btree - i.e io_time[READ] == 0 - so, make sure it's set before adding it. Also, make some of the LRU code a bit clearer and more direct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Fold bucket_state in to BCH_DATA_TYPES()Kent Overstreet1-3/+29
Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be conuted when checking current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Topology repair fixesKent Overstreet1-5/+7
- We were failing to start topology repair, because we hadn't set the superblock flag indicating it needed to run - set_node_min() forget to update the btree node's key - bch2_gc_alloc_reset() didn't reset data type, leading to inserting an invalid key that was empty but had nonzero data type Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Kill struct bucket_markKent Overstreet1-46/+37
This switches struct bucket to using a lock, instead of cmpxchg. And now that the protected members no longer need to fit into a u64, we can expand the sector counts to 32 bits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Kill main in-memory bucket arrayKent Overstreet1-9/+43
All code using the in-memory bucket array, excluding GC, has now been converted to use the alloc btree directly - so we can finally delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Kill allocator threads & freelistsKent Overstreet1-9/+1
Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: KEY_TYPE_alloc_v4Kent Overstreet1-28/+48
This introduces a new alloc key which doesn't use varints. Soon we'll be adding backpointers and storing them in alloc keys, which means our pack/unpack workflow for alloc keys won't really work - we'll need to be mutating alloc keys in place. Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that converts older types of alloc keys to v4 if needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Journal seq now incremented at entry open, not closeKent Overstreet1-1/+1
This patch changes journal_entry_open() to initialize the new journal entry, not __journal_entry_close(). This also means that journal_cur_seq() refers to the sequence number of the last journal entry when we don't have an open journal entry, not the next one. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Heap allocate printbufsKent Overstreet1-64/+117
This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Normal update/commit path now works before going RWKent Overstreet1-44/+27
This improves __bch2_trans_commit - early in the recovery process, when we're running btree_gc and before we want to go RW, it now uses bch2_journal_key_insert() to add the update to the list of updates for journal replay to do, instead of btree_gc having to use separate interfaces depending on whether we're running at bringup or, later, runtime. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Stale ptr cleanup is now done by gc_gensKent Overstreet1-45/+10
Before we had dedicated gc code for bucket->oldest_gen this was btree_gc's responsibility, but now that we have that we can rip it out, simplifying the already overcomplicated btree_gc. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Fix reflink repair codeKent Overstreet1-2/+10
The reflink repair code was incorrectly inserting a nonzero deleted key via journal replay - this is due to bch2_journal_key_insert() being somewhat hacky, and so this fix is also hacky for now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: bch2_gc_gens() no longer uses bucket arrayKent Overstreet1-32/+76
Like the previous patches, this converts bch2_gc_gens() to use the alloc btree directly, and private arrays of generation numbers for its own recalculation of oldest_gen. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: btree_gc no longer uses main in-memory bucket arrayKent Overstreet1-88/+166
This changes the btree_gc code to only use the second bucket array, the one dedicated to GC. On completion, it compares what's in its in memory bucket array to the allocation information in the btree and writes it directly, instead of updating the main in-memory bucket array and writing that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Improve path for when btree_gc needs another passKent Overstreet1-58/+92
btree_gc sometimes needs another pass when it corrects bucket generation numbers or data types - when it finds multiple pointers of different data types to the same bucket, it may want to keep the second one it found. When this happens, we now clear out bucket sector counts _without_ resetting the bucket generation/data types that we already found, instead of resetting them to what we have in the alloc btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23bcachefs: Fix bch2_check_fix_ptrs()Kent Overstreet1-22/+40
The repair for for btree_ptrs was saying one thing and doing another - fortunately, that code can just be deleted. Also, when we update a btree node pointer, we also have to update node in memery, if it exists in the btree node cache - this fixes bch2_check_fix_ptrs() to do that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>