summaryrefslogtreecommitdiff
path: root/fs/bcachefs/data_update.c
AgeCommit message (Collapse)AuthorFilesLines
2024-07-10bcachefs: Warn on attempting a move with no replicasKent Overstreet1-0/+10
Instead of popping an assert in bch2_write(), WARN and print out some debugging info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-10bcachefs: bch2_data_update_to_text()Kent Overstreet1-0/+34
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-12bcachefs: Fix rcu_read_lock() leak in drop_extra_replicasKent Overstreet1-2/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: bch2_dev_have_ref()Kent Overstreet1-3/+3
bch2_dev_bkey_exists() is going away; bch2_dev_have_ref() documents that we're looking up a device without checking if it's present because we have a reference to it already. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: kill bch2_dev_bkey_exists() in data_update_init()Kent Overstreet1-2/+10
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: extent_ptr_durability() -> bch2_dev_rcu()Kent Overstreet1-0/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: PTR_BUCKET_POS() now takes bch_devKent Overstreet1-10/+12
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: New helpers for device refcountsKent Overstreet1-3/+3
This will be used in the next patch for adding some new debug mode asserts. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: bch2_bkey_drop_ptrs() declares loop iterKent Overstreet1-1/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: bch2_trans_unlock() must always be followed by relock() or begin()Kent Overstreet1-0/+2
We're about to add new asserts for btree_trans locking consistency, and part of that requires that aren't using the btree_trans while it's unlocked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: member helper cleanupsKent Overstreet1-3/+3
Some renaming for better consistency bch2_member_exists -> bch2_member_alive bch2_dev_exists -> bch2_member_exists bch2_dev_exsits2 -> bch2_dev_exists bch_dev_locked -> bch2_dev_locked bch_dev_bkey_exists -> bch2_dev_bkey_exists new helper - bch2_dev_safe Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet1-6/+6
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-05bcachefs: Fix rebalance from durability=0 deviceKent Overstreet1-4/+13
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-02bcachefs: fix nocow lock deadlockKent Overstreet1-2/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-01bcachefs: Add checks for invalid snapshot IDsKent Overstreet1-0/+9
Previously, we assumed that keys were consistent with the snapshots btree - but that's not correct as fsck may not have been run or may not be complete. This adds checks and error handling when using the in-memory snapshots table (that mirrors the snapshots btree). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: opts->compression can now also be applied in the backgroundKent Overstreet1-4/+2
The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bkey_for_each_ptr() now declares loop iterKent Overstreet1-4/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Make sure allocation failure errors are loggedKent Overstreet1-0/+2
The previous patch fixed a bug in allocation path error handling, and it would've been noticed sooner had it been logged properly. Generally speaking, errors that shouldn't happen in normal operation and are being returned up the stack should be logged: the write path was already logging IO errors, but non IO errors were missed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: remove redundant condition from data_update_index_updateDaniel Hill1-1/+1
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: count_event()Kent Overstreet1-1/+1
Small helper for event counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Add a rebalance, data_update tracepointsKent Overstreet1-0/+14
Add a tracepoint for rebalance, printing out - the target option - the compression option - the key being rebalanced Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename BTREE_INSERT flagsKent Overstreet1-3/+3
BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-27bcachefs: Fix promotesKent Overstreet1-1/+2
The recent work to fix data moves w.r.t. durability broke promotes, because the caused us to bail out when the extent minus pointers being dropped still has enough pointers to satisfy the current number of replicas. Disable this check when we're adding cached replicas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-12bcachefs: Fix nocow locks deadlockKent Overstreet1-1/+2
On trylock failure we were waiting for outstanding reads to complete - but nocow locks need to be held until the whole move is finished. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-05bcachefs: Fix bch2_extent_drop_ptrs() callKent Overstreet1-2/+2
Also, make bch2_extent_drop_ptrs() safer, so it works with extents and non-extents iterators. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-26bcachefs: Data update path won't accidentaly grow replicasKent Overstreet1-10/+82
Previously, there was a bug where if an extent had greater durability than required (because we needed to move a durability=1 pointer and ended up putting it on a durability 2 device), we would submit a write for replicas=2 - the durability of the pointer being rewritten - instead of the number of replicas required to bring it back up to the data_replicas option. This, plus the allocation path sometimes allocating on a greater durability device than requested, meant that extents could continue having more and more replicas added as they were being rewritten. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14bcachefs: Check for nonce offset inconsistency in data_update pathKent Overstreet1-0/+28
We've rarely been seeing a nonce offset inconsistency that doesn't show up in tests: this adds some extra verification code to the data update path that prints out more relevant info when it occurs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-02bcachefs: Don't downgrade locks on transaction restartKent Overstreet1-11/+1
We should only be downgrading locks on success - otherwise, our transaction restarts won't be getting the correct locks and we'll livelock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-02bcachefs: rebalance_workKent Overstreet1-5/+6
This adds a new btree, rebalance_work, to eliminate scanning required for finding extents that need work done on them in the background - i.e. for the background_target and background_compression options. rebalance_work is a bitset btree, where a KEY_TYPE_set corresponds to an extent in the extents or reflink btree at the same pos. A new extent field is added, bch_extent_rebalance, which indicates that this extent has work that needs to be done in the background - and which options to use. This allows per-inode options to be propagated to indirect extents - at least in some circumstances. In this patch, changing IO options on a file will not propagate the new options to indirect extents pointed to by that file. Updating (setting/clearing) the rebalance_work btree is done by the extent trigger, which looks at the bch_extent_rebalance field. Scanning is still requrired after changing IO path options - either just for a given inode, or for the whole filesystem. We indicate that scanning is required by adding a KEY_TYPE_cookie key to the rebalance_work btree: the cookie counter is so that we can detect that scanning is still required when an option has been flipped mid-way through an existing scan. Future possible work: - Propagate options to indirect extents when being changed - Add other IO path options - nr_replicas, ec, to rebalance_work so they can be applied in the background when they change - Add a counter, for bcachefs fs usage output, showing the pending amount of rebalance work: we'll probably want to do this after the disk space accounting rewrite (moving it to a new btree) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: move: move_stats refactoringKent Overstreet1-1/+1
data_progress_list is gone - it was redundant with moving_context_list The upcoming rebalance rewrite is going to have it using two different move_stats objects with the same moving_context, depending on whether it's scanning or using the rebalance_work btree - this patch plumbs stats around a bit differently so that will work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: move: convert to bbposKent Overstreet1-3/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: moving_context now owns a btree_transKent Overstreet1-1/+1
btree_trans and moving_context are used together, and having the moving_context owns the transaction object reduces some plumbing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Heap allocate btree_transKent Overstreet1-1/+1
We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Fix W=12 build errorsKent Overstreet1-4/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Break up io.cKent Overstreet1-1/+1
More reorganization, this splits up io.c into - io_read.c - io_misc.c - fallocate, fpunch, truncate - io_write.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Don't open code closure_nr_remaining()Kent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Compression levelsKent Overstreet1-3/+1
This allows including a compression level when specifying a compression type, e.g. compression=zstd:15 Values from 1 through 15 indicate compression levels, 0 or unspecified indicates the default. For LZ4, values 3-15 specify that the HC algorithm should be used. Note that for compatibility, extents themselves only include the compression type, not the compression level. This means that specifying the same compression algorithm but different compression levels for the compression and background_compression options will have no effect. XXX: perhaps we could add a warning for this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Kill BTREE_INSERT_USE_RESERVEKent Overstreet1-2/+1
Now that we have journal watermarks and alloc watermarks unified, BTREE_INSERT_USE_RESERVE is redundant and can be deleted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Rename enum alloc_reserve -> bch_watermarkKent Overstreet1-2/+2
This is prep work for consolidating with JOURNAL_WATERMARK. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: bch2_extent_ptr_desired_durability()Kent Overstreet1-1/+1
This adds a new helper for getting a pointer's durability irrespective of the device state, and uses it in the the data update path. This fixes a bug where we do a data update but request 0 replicas to be allocated, because the replica being rewritten is on a device marked as failed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Fix corruption with writeable snapshotsKent Overstreet1-88/+6
When partially overwriting an extent in an older snapshot, the existing extent has to be split. If the existing extent was overwritten in a different (sibling) snapshot, we have to ensure that the split won't be visible in the sibling snapshot. data_update.c already has code for this, bch2_insert_snapshot_writeouts() - we just need to move it into btree_update_leaf.c and change bch2_trans_update_extent() to use it as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Fix move_extent_fail counterKent Overstreet1-1/+1
fail counters need to be events, not numbers of sectors - or the calculations the tests use for determining if we've had too many slowpath events don't work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: bch2_bkey_get_iter() helpersKent Overstreet1-4/+3
Introduce new helpers for a common pattern: bch2_trans_iter_init(); bch2_btree_iter_peek_slot(); - bch2_bkey_get_iter_type() returns -ENOENT if it doesn't find a key of the correct type - bch2_bkey_get_val_typed() copies the val out of the btree to a (typically stack allocated) variable; it handles the case where the value in the btree is smaller than the current version of the type, zeroing out the remainder. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Improve move path tracepointsKent Overstreet1-1/+12
Move path tracepoints now include the key being moved. Also, add new tracepoints for the start of move_extent, and evacuate_bucket. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Improve trace_move_extent_fail()Kent Overstreet1-4/+75
This greatly expands the move_extent_fail tracepoint - now it includes all the information we have available, including exactly why the extent wasn't updated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Data update path no longer leaves cached replicasKent Overstreet1-0/+11
It turns out that it's currently impossible to invalidate buckets containing only cached data if they're part of a stripe. The normal bucket invalidate path can't do it because we have to be able to incerement the bucket's gen, which isn't correct becasue it's still a member of the stripe - and the bucket invalidate path makes the bucket availabel for reuse right away, which also isn't correct for buckets in stripes. What would work is invalidating cached data by following backpointers, except that cached replicas don't currently get backpointers - because they would be awkward for the existing bucket invalidate path to delete and they haven't been needed elsewhere. So for the time being, to prevent running out of space in stripes, switch the data update path to not leave cached replicas; we may revisit this in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: New erasure coding shutdown pathKent Overstreet1-0/+1
This implements a new shutdown path for erasure coding, which is needed for the upcoming BCH_WRITE_WAIT_FOR_EC write path. The process is: - Cancel new stripes being built up - Close out/cancel open buckets on write points or the partial list that are for stripes - Shutdown rebalance/copygc - Then wait for in flight new stripes to finish With BCH_WRITE_WAIT_FOR_EC, move ops will be waiting on stripes to fill up before they complete; the new ec shutdown path is needed for shutting down copygc/rebalance without deadlocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Rework __bch2_data_update_index_update()Kent Overstreet1-52/+52
This makes some improvements to the logic for adding/removing replicas, as part of the larger erasure coding improvements. We now directly consider number of replicas desired for the given inode, and extent/pointer durability: this ensures that the extent ends up with the desired number of replicas when we're replacing multiple pointers with one that has higher durability (e.g. erasure coded). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: Extent helper improvementsKent Overstreet1-3/+3
- __bch2_bkey_drop_ptr() -> bch2_bkey_drop_ptr_noerror(), now available outside extents. - Split bch2_bkey_has_device() and bch2_bkey_has_device_c(), const and non const versions - bch2_extent_has_ptr() now returns the pointer it found Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23bcachefs: moving_context->stats is allowed to be NULLKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>