path: root/fs/bcachefs/opts.h
Age | Commit message | Author | Files | Lines
2024-06-20 | bcachefs: Fix safe errors by default | Kent Overstreet | 1 | -1/+1
i.e. the start of automatic self healing: if errors=continue or errors=fix_safe, we now automatically fix simple errors without user intervention. New error action option: fix_safe. This replaces the existing errors=ro option, which gets a new slot, i.e. existing errors=ro users now get errors=fix_safe. This is currently only enabled for a limited set of errors - initially just disk accounting; these are errors we would never not want to fix, and for which we don't want to require user intervention (i.e. to make sure a bug report gets filed). Errors will still be counted in the superblock, so we (developers) will still know they've been occurring if a bug report gets filed (as bug reports typically include the errors superblock section). Eventually we'll be enabling this for a much wider set of errors, after we've done thorough error injection testing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 | bcachefs: Kill opts.buckets_nouse | Kent Overstreet | 1 | -5/+0
Now explicitly allocate and free the buckets_nouse bitmap - this is going to be used for online fsck. To go RW when we haven't checked allocations, we'll do a much slimmed down version that just initializes the buckets_nouse bitmaps. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 | bcachefs: iter/update/trigger/str_hash flag cleanup | Kent Overstreet | 1 | -1/+1
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-14 | bcachefs: Standardize helpers for printing enum strs with bounds checks | Kent Overstreet | 1 | -4/+6
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-03 | bcachefs: Repair pass for scanning for btree nodes | Kent Overstreet | 1 | -2/+2
If a btree root or interior btree node goes bad, we're going to lose a lot of data, unless we can recover the nodes that it pointed to by scanning. Fortunately btree node headers are fully self describing, and additionally the magic number is xored with the filesystem UUID, so we can do so safely. This implements the scanning - next patch will rework topology repair to make use of the found nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
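A rough sketch of why XORing the magic number with the filesystem UUID makes scanning safe: a scanned block only counts as a candidate node if its magic matches the value derived from this filesystem's UUID. The constant EXAMPLE_BSET_MAGIC, the struct layout, and the helper names are hypothetical and are not the actual bcachefs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical base magic; the real bcachefs constant is not reproduced here. */
#define EXAMPLE_BSET_MAGIC 0x1234567890abcdefULL

/* Hypothetical, heavily simplified stand-in for an on-disk btree node header. */
struct example_node_header {
	uint64_t magic;
};

/* Expected magic for one filesystem: base magic XORed with part of its UUID. */
static uint64_t example_expected_magic(uint64_t fs_uuid_lo)
{
	return EXAMPLE_BSET_MAGIC ^ fs_uuid_lo;
}

/* A scanned block is only a candidate node if its magic matches this filesystem. */
static bool example_header_belongs_to_fs(const struct example_node_header *h,
					 uint64_t fs_uuid_lo)
{
	assert(h);
	return h->magic == example_expected_magic(fs_uuid_lo);
}
```

Under these assumptions, a node header left behind by a different filesystem on the same device fails the comparison, so stale nodes are not picked up by the scan.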
2024-04-01 | bcachefs: Improve -o norecovery; opts.recovery_pass_limit | Kent Overstreet | 1 | -1/+6
This adds opts.recovery_pass_limit, and redoes -o norecovery to make use of it; this fixes some issues with -o norecovery so it can be safely used for data recovery. Norecovery means "don't do journal replay"; it's an important data recovery tool when we're getting stuck in journal replay. When using it this way we need to make sure we don't free journal keys after startup, so we continue to overlay them: thus it needs to imply retain_recovery_info, as well as nochanges. recovery_pass_limit is an explicit option for telling recovery to exit after a specific recovery pass; this is a much cleaner way of implementing -o norecovery, as well as being a useful debug feature in its own right. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
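The idea of a recovery pass limit can be sketched as a loop that runs passes in order and exits after the requested one. The pass names, their order, and the function below are illustrative assumptions only; they do not reproduce the real bcachefs pass list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened list of recovery passes, in the order they run. */
enum example_recovery_pass {
	EXAMPLE_PASS_read_metadata,
	EXAMPLE_PASS_journal_replay,
	EXAMPLE_PASS_later_checks,
	EXAMPLE_PASS_NR,
};

/*
 * Run passes in order and stop once the limit pass has run (inclusive).
 * run_pass may be NULL, in which case passes are only counted.
 * Returns the number of passes that were run.
 */
static size_t example_run_recovery(enum example_recovery_pass limit,
				   void (*run_pass)(enum example_recovery_pass))
{
	size_t nr_run = 0;

	assert(limit < EXAMPLE_PASS_NR);

	for (int pass = 0; pass <= (int) limit; pass++) {
		if (run_pass)
			run_pass((enum example_recovery_pass) pass);
		nr_run++;
	}
	return nr_run;
}
```

In this sketch, a "don't do journal replay" mode would amount to setting the limit to the pass just before EXAMPLE_PASS_journal_replay.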
2024-03-14 | bcachefs: Pin btree cache in ram for random access in fsck | Kent Overstreet | 1 | -0/+5
Various phases of fsck involve checking references from one btree to another: this means doing a sequential scan of one btree, and then mostly random access into the second. This is particularly painful for checking extents <-> backpointers; we can prefetch btree node access on the sequential scan, but not on the random access portion, and this is particularly painful on spinning rust, where we'd like to keep the pipeline fairly full of btree node reads so that the elevator can reduce seeking. This patch implements prefetching and pinning of the portion of the btree that we'll be doing random access to. We already calculate how much of the random access btree will fit in memory so it's a fairly straightforward change. This will put more pressure on system memory usage, so we introduce a new option, fsck_memory_usage_percent, which is the percentage of total system ram that fsck is allowed to pin. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
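As a minimal sketch of the arithmetic behind fsck_memory_usage_percent, assuming the budget is simply a percentage of total RAM: the helper names and the node size parameter are hypothetical, and the real calculation in bcachefs may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of btree cache fsck may pin, given total RAM and the percentage option. */
static uint64_t example_fsck_pin_budget(uint64_t total_ram_bytes, unsigned percent)
{
	assert(percent <= 100);
	/* Divide first to avoid overflow; this rounds down slightly. */
	return total_ram_bytes / 100 * percent;
}

/* How many btree nodes of a given size fit within that budget. */
static uint64_t example_nodes_that_fit(uint64_t budget_bytes, uint64_t node_bytes)
{
	assert(node_bytes > 0);
	return budget_bytes / node_bytes;
}
```

The number of nodes that fit bounds how large a range of the randomly accessed btree can be prefetched and pinned per sequential scan.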
2024-03-10 | bcachefs: no_splitbrain_check option | Kent Overstreet | 1 | -0/+5
This adds an option to disable kicking out devices when splitbrain is detected - it seems there are some issues with splitbrain detection and we're kicking out devices erroneously. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 | bcachefs: opts->compression can now also be applied in the background | Kent Overstreet | 1 | -0/+5
The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 | bcachefs: bch2_prt_compression_type() | Kent Overstreet | 1 | -1/+1
bounds checking helper, since compression types are extensible Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 | bcachefs: helpers for printing data types | Kent Overstreet | 1 | -1/+1
We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
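A bounds-checked name lookup built from an x-macro list might look roughly like the following. The list contents and all EXAMPLE_/example_ names are assumptions for illustration; they are not the actual bcachefs data type list or helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical x-macro list; the real data type list in bcachefs differs. */
#define EXAMPLE_DATA_TYPES()	\
	x(free)			\
	x(sb)			\
	x(journal)		\
	x(btree)		\
	x(user)

enum example_data_type {
#define x(name) EXAMPLE_DATA_##name,
	EXAMPLE_DATA_TYPES()
#undef x
	EXAMPLE_DATA_NR,
};

static const char * const example_data_type_strs[] = {
#define x(name) #name,
	EXAMPLE_DATA_TYPES()
#undef x
};

/* Return a printable name, falling back for values this build doesn't know. */
static const char *example_data_type_str(unsigned type)
{
	/* The string table and the enum are generated from the same list. */
	assert(sizeof(example_data_type_strs) / sizeof(example_data_type_strs[0])
	       == EXAMPLE_DATA_NR);

	return type < EXAMPLE_DATA_NR
		? example_data_type_strs[type]
		: "(unknown)";
}
```

A value written by a newer version then prints as "(unknown)" instead of indexing past the end of the string table.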
2024-01-06 | bcachefs: Add an option to control btree node prefetching | Kent Overstreet | 1 | -1/+7
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-06 | bcachefs: factor out thread_with_file, thread_with_stdio | Kent Overstreet | 1 | -2/+2
thread_with_stdio now knows how to handle input - fsck can now prompt to fix errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-06 | bcachefs: Fix nochanges/read_only interaction | Kent Overstreet | 1 | -1/+1
nochanges means "we cannot issue writes at all"; it's possible to go into a pseudo read-write mode where we pin dirty metadata in memory, which is used for fsck in dry run mode and doing journal replay on a read only mount, but we do not want to allow an actual read-write mount in nochanges mode. But we do always want to allow early read-write, during recovery - this patch clarifies that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 | bcachefs: btree write buffer now slurps keys from journal | Kent Overstreet | 1 | -5/+0
Previously, the transaction commit path would have to add keys to the btree write buffer as a separate operation, requiring additional global synchronization. This patch introduces a new journal entry type, which indicates that the keys need to be copied into the btree write buffer prior to being written out. We switch the journal entry type back to JSET_ENTRY_btree_keys prior to write, so this is not an on disk format change. Flushing the btree write buffer may require pulling keys out of journal entries yet to be written, and quiescing outstanding journal reservations; we previously added journal->buf_lock for synchronization with the journal write path. We also can't put strict bounds on the number of keys in the journal destined for the write buffer, which means we might overflow the size of the preallocated buffer and have to reallocate - this introduces a potentially fatal memory allocation failure. This is something we'll have to watch for; if it becomes an issue in practice we can do additional mitigation. The transaction commit path no longer has to explicitly check if the write buffer is full and wait on flushing; this is another performance optimization. Instead, when the btree write buffer is close to full we change the journal watermark, so that only reservations for journal reclaim are allowed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 | bcachefs: Add ability to redirect log output | Kent Overstreet | 1 | -0/+5
Upcoming patches are going to add two new ioctls for running fsck in the kernel, but pretending that we're running our normal userspace fsck. This patch adds some plumbing for redirecting our normal log messages away from the dmesg log to a thread_with_file file descriptor - via a struct log_output, which will be consumed by the fsck f_op's read method. The new ioctls will allow for running fsck in the kernel against an offline filesystem (without mounting it), and an online filesystem. For an offline filesystem we need a way to pass in a pointer to the log_output, which is done via a new hidden opts.h option. For online fsck, we can set c->output directly, but only want to redirect log messages from the thread running fsck - hence the new c->output_filter method. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-02 | bcachefs: Add IO error counts to bch_member | Kent Overstreet | 1 | -1/+0
We now track IO errors per device since filesystem creation. IO error counts can be viewed in sysfs, or with the 'bcachefs show-super' command. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 | bcachefs: Guard against unknown compression options | Kent Overstreet | 1 | -0/+1
Since compression options now include compression level, proper validation is a bit more involved. This adds bch2_compression_opt_valid(), and plumbs it around appropriately. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 | bcachefs: bch2_btree_id_str() | Kent Overstreet | 1 | -1/+1
Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Add iops fields to bch_member | Hunter Shaffer | 1 | -0/+1
Signed-off-by: Hunter Shaffer <huntershaffer182456@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Fix W=12 build errors | Kent Overstreet | 1 | -1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Compression levels | Kent Overstreet | 1 | -2/+2
This allows including a compression level when specifying a compression type, e.g.

  compression=zstd:15

Values from 1 through 15 indicate compression levels, 0 or unspecified indicates the default. For LZ4, values 3-15 specify that the HC algorithm should be used. Note that for compatibility, extents themselves only include the compression type, not the compression level. This means that specifying the same compression algorithm but different compression levels for the compression and background_compression options will have no effect. XXX: perhaps we could add a warning for this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
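Parsing an option value of the form type[:level] can be sketched as below. The struct, the function name, and the decision to reject levels above 15 are assumptions made for illustration; this is not the bcachefs parser, and it does not check that the type name is a known algorithm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing "type[:level]"; not the bcachefs representation. */
struct example_compression_opt {
	char	 type[16];
	unsigned level;		/* 0 means "use the default level" */
};

/* Parse e.g. "zstd:15" or "lz4". Returns 0 on success, -1 on invalid input. */
static int example_parse_compression(const char *s,
				     struct example_compression_opt *out)
{
	const char *colon;
	size_t type_len;

	assert(s && out);

	colon = strchr(s, ':');
	type_len = colon ? (size_t) (colon - s) : strlen(s);
	if (!type_len || type_len >= sizeof(out->type))
		return -1;

	memcpy(out->type, s, type_len);
	out->type[type_len] = '\0';
	out->level = 0;

	if (colon) {
		char *end;
		unsigned long level = strtoul(colon + 1, &end, 10);

		/* Reject empty, non-numeric, or out-of-range levels. */
		if (end == colon + 1 || *end || level > 15)
			return -1;
		out->level = (unsigned) level;
	}
	return 0;
}
```

With this shape, "zstd" and "zstd:0" both yield level 0, matching the "0 or unspecified indicates the default" rule.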
2023-10-23 | bcachefs: fix_errors option is now a proper enum | Kent Overstreet | 1 | -2/+15
Before, it was parsed as a bool but internally it was really an enum: this lets us pass in all the possible values. But we special case the option parsing: no supplied value is parsed as FSCK_FIX_yes, to match the previous behaviour. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
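The special case, where a bare option with no value is treated as "yes", could be sketched like this. The enum below is a hypothetical mirror (only the "yes" value is named in the message above), and the function is not the actual bcachefs option parser.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical set of values; only "yes" is named in the commit message. */
enum example_fsck_fix {
	EXAMPLE_FSCK_FIX_exit,
	EXAMPLE_FSCK_FIX_yes,
	EXAMPLE_FSCK_FIX_no,
	EXAMPLE_FSCK_FIX_ask,
};

/*
 * Parse the value of "fix_errors[=value]". A NULL or empty value means the
 * option was given bare, which is treated as "yes" to match the old bool.
 * Returns 0 on success, -1 for an unrecognized value.
 */
static int example_parse_fix_errors(const char *val, enum example_fsck_fix *out)
{
	/* Order must match enum example_fsck_fix. */
	static const char * const names[] = { "exit", "yes", "no", "ask" };

	assert(out);

	if (!val || !*val) {
		*out = EXAMPLE_FSCK_FIX_yes;
		return 0;
	}

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (!strcmp(val, names[i])) {
			*out = (enum example_fsck_fix) i;
			return 0;
		}
	return -1;
}
```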
2023-10-23 | bcachefs: bch_opt_fn | Kent Overstreet | 1 | -2/+9
Minor refactoring to get rid of some unneeded token pasting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: version_upgrade is now an enum | Kent Overstreet | 1 | -2/+3
The version_upgrade parameter is now an enum, not a bool, and it's persistent in the superblock:

 - compatible (default): upgrade to the latest compatible version
 - incompatible: upgrade to latest incompatible version
 - none

Currently all upgrades are incompatible upgrades, but the next release will introduce major:minor versions.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: bch2_version_to_text() | Kent Overstreet | 1 | -1/+0
Add a new helper for printing out metadata versions in a standard format. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Verbose on by default when CONFIG_BCACHEFS_DEBUG=y | Kent Overstreet | 1 | -1/+7
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Add option for completely disabling nocow | Kent Overstreet | 1 | -0/+6
This adds an option for completely disabling nocow mode, including the locking in the data move path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Add max nr of IOs in flight to the move path | Kent Overstreet | 1 | -1/+6
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Nocow support | Kent Overstreet | 1 | -0/+7
This adds support for nocow mode, where we do writes in-place when possible. Patch components:

 - New boolean filesystem and inode option, nocow: note that when nocow is enabled, data checksumming and compression are implicitly disabled

 - To prevent in-place writes from racing with data moves (data_update.c) or bucket reuse (i.e. a bucket being reused and re-allocated while a nocow write is in flight), we have a new locking mechanism. Buckets can be locked for either data update or data move, using a fixed size hash table of two_state_shared locks. We don't have any chaining, meaning updates and moves to different buckets that hash to the same lock will wait unnecessarily - we'll want to watch for this becoming an issue.

 - The allocator path also needs to check for in-place writes in flight to a given bucket before giving it out: thus we add another counter to bucket_alloc_state so we can track this.

 - Fsync now may need to issue cache flushes to block devices instead of flushing the journal. We add a device bitmask to bch_inode_info, ei_devs_need_flush, which tracks devices that need to have flushes issued - note that this will lead to unnecessary flushes when other codepaths have already issued flushes, so we may want to replace this with a sequence number.

 - New nocow write path: look up extents, and if they're writable write to them - otherwise fall back to the normal COW write path.

XXX: switch to sequence numbers instead of bitmask for devs needing journal flush

XXX: ei_quota_lock being a mutex means bch2_nocow_write_done() needs to run in process context - see if we can improve this

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Btree write buffer | Kent Overstreet | 1 | -0/+5
This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array. This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree. This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance. Changes:

 - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer.

 - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path.

 - Transaction commit path changes: The update to the btree write buffer both mutates global state, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry.

 - A new persistent option, for specifying the number of entries in the write buffer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
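The flat fixed size array and the "flush and retry when full" behaviour described in this commit can be sketched as follows. Everything here is a simplified assumption: the names, the fixed size of 4, and the flush that merely counts keys stand in for the real implementation, which also has to revert usage counters on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_WB_SIZE 4	/* hypothetical; the real size is a persistent option */

struct example_wb_key { uint64_t pos, val; };

struct example_write_buffer {
	struct example_wb_key keys[EXAMPLE_WB_SIZE];
	size_t nr;
	size_t nr_flushed;	/* stand-in for keys applied to the btree */
};

/* Append without reading the btree; fails if the flat array is full. */
static bool example_wb_insert(struct example_write_buffer *wb,
			      struct example_wb_key k)
{
	assert(wb);
	if (wb->nr == EXAMPLE_WB_SIZE)
		return false;
	wb->keys[wb->nr++] = k;
	return true;
}

/* Stand-in for flushing buffered keys to the btree. */
static void example_wb_flush(struct example_write_buffer *wb)
{
	wb->nr_flushed += wb->nr;
	wb->nr = 0;
}

/* Commit path sketch: if there is no room, flush the buffer and retry. */
static void example_commit(struct example_write_buffer *wb,
			   struct example_wb_key k)
{
	if (!example_wb_insert(wb, k)) {
		example_wb_flush(wb);
		bool ok = example_wb_insert(wb, k);
		assert(ok);
		(void) ok;
	}
}
```

Because an insert never has to look up an existing key, the common case is a single array append; the cost is deferred to the flush.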
2023-10-23 | bcachefs: bch2_inode_opts_get() | Kent Overstreet | 1 | -5/+0
This improves io_opts() and makes it a non-inline function - it's big enough that it probably shouldn't be. Also, bch_io_opts no longer needs fields for whether options are defined, so we can slim it down a bit. We'd like to stop passing around the full bch_io_opts, but that'll be tricky because of bch2_rebalance_add_key(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Improve bch2_inode_opts_to_opts() | Kent Overstreet | 1 | -1/+0
It turns out the *_defined entries of bch_io_opts are only used in one place - in the xattr get path - and there we immediately convert to a bch_opts struct, which also has the *_defined entries. This patch changes bch2_inode_opts_to_opts() to go directly from bch_inode_unpacked to bch_opts, which is a minor simplification and will also let us slim down struct bch_io_opts in another patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Add an O_DIRECT option (for userspace) | Kent Overstreet | 1 | -0/+5
Sometimes we see IO errors due to O_DIRECT alignment issues - having an option to use buffered IO will be helpful. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Make verbose option settable at runtime | Kent Overstreet | 1 | -1/+1
-o verbose is very useful, and we're starting to use it more for runtime debug statements - making it possible to enable at runtime is a no brainer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Make IO in flight by copygc/rebalance configurable | Kent Overstreet | 1 | -0/+5
This adds a new option, move_bytes_in_flight, for configuring the amount of IO in flight by copygc/rebalance - users with many devices in their filesystem will want to increase this. In the future we should be smarter about this, but this is an easy improvement. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Rename group to label for remaining strings. | Daniel Hill | 1 | -4/+4
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Make bch_option compatible with Rust ffi | Brett Holman | 1 | -11/+3
Rust FFI lacks support for unnamed structs and unions. The space saved in bch_option is not enough to be significant. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Kill old rebuild_replicas option | Kent Overstreet | 1 | -5/+0
This option was useful when the replicas mechanism was new and still being debugged, but hasn't been used in ages - let's delete it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: New discard implementation | Kent Overstreet | 1 | -1/+1
In the old allocator code, buckets would be discarded just prior to being used - this made sense in bcache where we were discarding buckets just after invalidating the cached data they contain, but in a filesystem where we typically have more free space we want to be discarding buckets when they become empty. This patch implements the new behaviour - it checks the need_discard btree for buckets awaiting discards, and then clears the appropriate bit in the alloc btree, which moves the buckets to the freespace btree. Additionally, discards are now enabled by default. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: Make minimum journal_flush_delay nonzero | Kent Overstreet | 1 | -1/+1
We're seeing a very strange bug where journal_flush_delay sometimes gets set to 0 in the superblock. Together with the preceding patch, this should help us track it down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Better superblock opt validation | Kent Overstreet | 1 | -2/+3
This moves validation of superblock options to bch2_sb_validate(), so they'll be checked in the write path as well. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: x-macro metadata version enum | Kent Overstreet | 1 | -0/+1
Now we've got strings for metadata versions - this changes bch2_sb_to_text() and our mount log message to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Convert bch2_sb_to_text to master option list | Kent Overstreet | 1 | -30/+32
Options no longer have to be manually added to bch2_sb_to_text() - it now uses the master list of options in opts.h. Also, improve some of the formatting by converting it to tabstops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-23 | bcachefs: opts.read_journal_only | Kent Overstreet | 1 | -0/+5
Add an option that tells recovery to only read the journal, to be used by the list_journal command. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Only allocate buckets_nouse when requested | Kent Overstreet | 1 | -0/+5
It's only needed by the migrate tool - this patch adds an option to enable allocating it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: bch2_journal_entry_to_text() | Kent Overstreet | 1 | -0/+2
This adds a _to_text() pretty printer for journal entries - including every subtype - which will shortly be used by the 'bcachefs list_journal' subcommand. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: BCH_JSET_ENTRY_log | Kent Overstreet | 1 | -0/+5
Add a journal entry type for logging messages, and add an option to use it to log the transaction name - this makes for a very handy debugging tool, as with it we can use the 'bcachefs list_journal' command to see not only what updates were done, but what was doing them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Kill non-lru cache replacement policies | Kent Overstreet | 1 | -1/+0
Prep work for persistent LRUs and getting rid of the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-23 | bcachefs: Turn encoded_extent_max into a regular option | Kent Overstreet | 1 | -0/+6
It'll now be handled at format time and in sysfs like other options - it still can only be set at format time, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>