summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-01-31kconfig: initialize sym->curr.tri to 'no' for all symbol types againMasahiro Yamada1-1/+3
Geert Uytterhoeven reported that commit 4e244c10eab3 ("kconfig: remove unneeded symbol_empty variable") changed the default value of CONFIG_LOG_CPU_MAX_BUF_SHIFT from 12 to 0. As it turned out, this is an undefined behavior because sym_calc_value() stopped setting the sym->curr.tri field for 'int', 'hex', and 'string' symbols. This commit restores the original behavior, where 'int', 'hex', 'string' symbols are interpreted as false if used in boolean contexts. CONFIG_LOG_CPU_MAX_BUF_SHIFT will default to 12 again, irrespective of CONFIG_BASE_SMALL. Presumably, this is not the intended behavior, as already reported [1], but this is another issue that should be addressed by a separate patch. [1]: https://lore.kernel.org/all/f6856be8-54b7-0fa0-1d17-39632bf29ada@oracle.com/ Fixes: 4e244c10eab3 ("kconfig: remove unneeded symbol_empty variable") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Closes: https://lore.kernel.org/all/CAMuHMdWm6u1wX7efZQf=2XUAHascps76YQac6rdnQGhc8nop_Q@mail.gmail.com/ Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-31kbuild: rpm-pkg: simplify installkernel %postJose Ignacio Tornos Martinez1-11/+11
The new installkernel application that is now included in systemd-udev package allows installation although destination files are already present in the boot directory of the kernel package, but is failing with the implemented workaround for the old installkernel application from grubby package. For the new installkernel application, as Davide says: <<The %post currently does a shuffling dance before calling installkernel. This isn't actually necessary afaict, and the current implementation ends up triggering downstream issues such as https://github.com/systemd/systemd/issues/29568 This commit simplifies the logic to remove the shuffling. For reference, the original logic was added in commit 3c9c7a14b627("rpm-pkg: add %post section to create initramfs and grub hooks").>> But we need to keep the old behavior as well, because the old installkernel application from grubby package, does not allow this simplification and we need to be backward compatible to avoid issues with the different packages. Mimic Fedora shipping process and store vmlinuz, config amd System.map in the module directory instead of the boot directory. In this way, we will avoid the commented problem for all the cases, because the new destination files are not going to exist in the boot directory of the kernel package. Replace installkernel tool with kernel-install tool, because the latter is more complete. Besides, after installkernel tool execution, check to complete if the correct package files vmlinuz, System.map and config files are present in /boot directory, and if necessary, copy manually for install operation. In this way, take into account if files were not previously copied from /usr/lib/kernel/install.d/* scripts and if the suitable files for the requested package are present (it could be others if the rpm files were replace with a new pacakge with the same release and a different build). Tested with Fedora 38, Fedora 39, RHEL 9, Oracle Linux 9.3, openSUSE Tumbleweed and openMandrive ROME, using dnf/zypper and rpm tools. cc: stable@vger.kernel.org Co-Developed-by: Davide Cavalca <dcavalca@meta.com> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-31kbuild: Replace tabs with spaces when followed by conditionalsDmitry Goncharov4-15/+15
This is needed for the future (post make-4.4.1) versions of gnu make. Starting from https://git.savannah.gnu.org/cgit/make.git/commit/?id=07fcee35f058a876447c8a021f9eb1943f902534 gnu make won't allow conditionals to follow recipe prefix. For example there is a tab followed by ifeq on line 324 in the root Makefile. With the new make this conditional causes the following $ make cpu.o /home/dgoncharov/src/linux-kbuild/Makefile:2063: *** missing 'endif'. Stop. make: *** [Makefile:240: __sub-make] Error 2 This patch replaces tabs followed by conditionals with 8 spaces. See https://savannah.gnu.org/bugs/?64185 and https://savannah.gnu.org/bugs/?64259 for details. Signed-off-by: Dmitry Goncharov <dgoncharov@users.sf.net> Reported-by: Martin Dorey <martin.dorey@hitachivantara.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-31modpost: avoid using the alias attributeMasahiro Yamada2-16/+2
Aiden Leong reported modpost fails to build on macOS since commit 16a473f60edc ("modpost: inform compilers that fatal() never returns"): scripts/mod/modpost.c:93:21: error: aliases are not supported on darwin Nathan's research indicates that Darwin seems to support weak aliases at least [1]. Although the situation might be improved in future Clang versions, we can achieve a similar outcome without relying on it. This commit makes fatal() a macro of error() + exit(1) in modpost.h, as compilers recognize that exit() never returns. [1]: https://github.com/llvm/llvm-project/issues/71001 Fixes: 16a473f60edc ("modpost: inform compilers that fatal() never returns") Reported-by: Aiden Leong <aiden.leong@aibsd.com> Closes: https://lore.kernel.org/all/d9ac2960-6644-4a87-b5e4-4bfb6e0364a8@aibsd.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-31kbuild: fix W= flags in the help messageMasahiro Yamada1-1/+1
W=c and W=e are supported. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-01-27modpost: Add '.ltext' and '.ltext.*' to TEXT_SECTIONSNathan Chancellor1-1/+2
After the linked LLVM change, building ARCH=um defconfig results in a segmentation fault in modpost. Prior to commit a23e7584ecf3 ("modpost: unify 'sym' and 'to' in default_mismatch_handler()"), there was a warning: WARNING: modpost: vmlinux.o(__ex_table+0x88): Section mismatch in reference to the .ltext:(unknown) WARNING: modpost: The relocation at __ex_table+0x88 references section ".ltext" which is not in the list of authorized sections. If you're adding a new section and/or if this reference is valid, add ".ltext" to the list of authorized sections to jump to on fault. This can be achieved by adding ".ltext" to OTHER_TEXT_SECTIONS in scripts/mod/modpost.c. The linked LLVM change moves global objects to the '.ltext' (and '.ltext.*' with '-ffunction-sections') sections with '-mcmodel=large', which ARCH=um uses. These sections should be handled just as '.text' and '.text.*' are, so add them to TEXT_SECTIONS. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1981 Link: https://github.com/llvm/llvm-project/commit/4bf8a688956a759b7b6b8d94f42d25c13c7af130 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-27um: Fix adding '-no-pie' for clangNathan Chancellor1-1/+3
The kernel builds with -fno-PIE, so commit 883354afbc10 ("um: link vmlinux with -no-pie") added the compiler linker flag '-no-pie' via cc-option because '-no-pie' was only supported in GCC 6.1.0 and newer. While this works for GCC, this does not work for clang because cc-option uses '-c', which stops the pipeline right before linking, so '-no-pie' is unconsumed and clang warns, causing cc-option to fail just as it would if the option was entirely unsupported: $ clang -Werror -no-pie -c -o /dev/null -x c /dev/null clang-16: error: argument unused during compilation: '-no-pie' [-Werror,-Wunused-command-line-argument] A recent version of clang exposes this because it generates a relocation under '-mcmodel=large' that is not supported in PIE mode: /usr/sbin/ld: init/main.o: relocation R_X86_64_32 against symbol `saved_command_line' can not be used when making a PIE object; recompile with -fPIE /usr/sbin/ld: failed to set dynamic section sizes: bad value clang: error: linker command failed with exit code 1 (use -v to see invocation) Remove the cc-option check altogether. It is wasteful to invoke the compiler to check for '-no-pie' because only one supported compiler version does not support it, GCC 5.x (as it is supported with the minimum version of clang and GCC 6.1.0+). Use a combination of the gcc-min-version macro and CONFIG_CC_IS_CLANG to unconditionally add '-no-pie' with CONFIG_LD_SCRIPT_DYN=y, so that it is enabled with all compilers that support this. Furthermore, using gcc-min-version can help turn this back into LINK-$(CONFIG_LD_SCRIPT_DYN) += -no-pie when the minimum version of GCC is bumped past 6.1.0. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1982 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-27kbuild: defconf: use SRCARCH to find merged configsZhang Bingwu1-4/+4
For some ARCH values, SRCARCH, which should be used for finding arch/ subdirectory, is different from ARCH. Signed-off-by: Zhang Bingwu <xtexchooser@duck.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-22Linux 6.8-rc1v6.8-rc1Linus Torvalds1-2/+2
2024-01-22Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefsLinus Torvalds78-1426/+1629
Pull more bcachefs updates from Kent Overstreet: "Some fixes, Some refactoring, some minor features: - Assorted prep work for disk space accounting rewrite - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this makes our trigger context more explicit - A few fixes to avoid excessive transaction restarts on multithreaded workloads: fstests (in addition to ktest tests) are now checking slowpath counters, and that's shaking out a few bugs - Assorted tracepoint improvements - Starting to break up bcachefs_format.h and move on disk types so they're with the code they belong to; this will make room to start documenting the on disk format better. - A few minor fixes" * tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits) bcachefs: Improve inode_to_text() bcachefs: logged_ops_format.h bcachefs: reflink_format.h bcachefs; extents_format.h bcachefs: ec_format.h bcachefs: subvolume_format.h bcachefs: snapshot_format.h bcachefs: alloc_background_format.h bcachefs: xattr_format.h bcachefs: dirent_format.h bcachefs: inode_format.h bcachefs; quota_format.h bcachefs: sb-counters_format.h bcachefs: counters.c -> sb-counters.c bcachefs: comment bch_subvolume bcachefs: bch_snapshot::btime bcachefs: add missing __GFP_NOWARN bcachefs: opts->compression can now also be applied in the background bcachefs: Prep work for variable size btree node buffers bcachefs: grab s_umount only if snapshotting ...
2024-01-21Merge tag 'timers-core-2024-01-21' of ↵Linus Torvalds7-12/+41
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for time and clocksources: - A fix for the idle and iowait time accounting vs CPU hotplug. The time is reset on CPU hotplug which makes the accumulated systemwide time jump backwards. - Assorted fixes and improvements for clocksource/event drivers" * tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug clocksource/drivers/ep93xx: Fix error handling during probe clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings clocksource/timer-riscv: Add riscv_clock_shutdown callback dt-bindings: timer: Add StarFive JH8100 clint dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
2024-01-21Merge tag 'powerpc-6.8-2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Aneesh Kumar: - Increase default stack size to 32KB for Book3S Thanks to Michael Ellerman. * tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Increase default stack size to 32KB
2024-01-21bcachefs: Improve inode_to_text()Kent Overstreet1-7/+18
Add line breaks - inode_to_text() is now much easier to read. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: logged_ops_format.hKent Overstreet2-27/+31
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: reflink_format.hKent Overstreet3-47/+48
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs; extents_format.hKent Overstreet2-279/+284
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: ec_format.hKent Overstreet2-16/+20
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: subvolume_format.hKent Overstreet2-32/+36
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: snapshot_format.hKent Overstreet2-33/+37
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: alloc_background_format.hKent Overstreet2-93/+94
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: xattr_format.hKent Overstreet2-15/+20
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: dirent_format.hKent Overstreet2-39/+43
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: inode_format.hKent Overstreet2-164/+167
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs; quota_format.hKent Overstreet2-42/+48
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: sb-counters_format.hKent Overstreet2-95/+100
bcachefs_format.h has gotten too big; let's do some organizing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: counters.c -> sb-counters.cKent Overstreet5-8/+7
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: comment bch_subvolumeKent Overstreet1-0/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bch_snapshot::btimeKent Overstreet2-0/+3
Add a field to bch_snapshot for creation time; this will be important when we start exposing the snapshot tree to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: add missing __GFP_NOWARNKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: opts->compression can now also be applied in the backgroundKent Overstreet11-23/+24
The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Prep work for variable size btree node buffersKent Overstreet18-97/+87
bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analagous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: grab s_umount only if snapshottingSu Yue1-6/+5
When I was testing mongodb over bcachefs with compression, there is a lockdep warning when snapshotting mongodb data volume. $ cat test.sh prog=bcachefs $prog subvolume create /mnt/data $prog subvolume create /mnt/data/snapshots while true;do $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s) sleep 1s done $ cat /etc/mongodb.conf systemLog: destination: file logAppend: true path: /mnt/data/mongod.log storage: dbPath: /mnt/data/ lockdep reports: [ 3437.452330] ====================================================== [ 3437.452750] WARNING: possible circular locking dependency detected [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E [ 3437.453562] ------------------------------------------------------ [ 3437.453981] bcachefs/35533 is trying to acquire lock: [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190 [ 3437.454875] but task is already holding lock: [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.456009] which lock already depends on the new lock. [ 3437.456553] the existing dependency chain (in reverse order) is: [ 3437.457054] -> #3 (&type->s_umount_key#48){.+.+}-{3:3}: [ 3437.457507] down_read+0x3e/0x170 [ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.458206] __x64_sys_ioctl+0x93/0xd0 [ 3437.458498] do_syscall_64+0x42/0xf0 [ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.459155] -> #2 (&c->snapshot_create_lock){++++}-{3:3}: [ 3437.459615] down_read+0x3e/0x170 [ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs] [ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs] [ 3437.460686] notify_change+0x1f1/0x4a0 [ 3437.461283] do_truncate+0x7f/0xd0 [ 3437.461555] path_openat+0xa57/0xce0 [ 3437.461836] do_filp_open+0xb4/0x160 [ 3437.462116] do_sys_openat2+0x91/0xc0 [ 3437.462402] __x64_sys_openat+0x53/0xa0 [ 3437.462701] do_syscall_64+0x42/0xf0 [ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.463359] -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}: [ 3437.463843] down_write+0x3b/0xc0 [ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs] [ 3437.464493] vfs_write+0x21b/0x4c0 [ 3437.464653] ksys_write+0x69/0xf0 [ 3437.464839] do_syscall_64+0x42/0xf0 [ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.465231] -> #0 (sb_writers#10){.+.+}-{0:0}: [ 3437.465471] __lock_acquire+0x1455/0x21b0 [ 3437.465656] lock_acquire+0xc6/0x2b0 [ 3437.465822] mnt_want_write+0x46/0x1a0 [ 3437.465996] filename_create+0x62/0x190 [ 3437.466175] user_path_create+0x2d/0x50 [ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.466617] __x64_sys_ioctl+0x93/0xd0 [ 3437.466791] do_syscall_64+0x42/0xf0 [ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.467180] other info that might help us debug this: [ 3437.469670] 2 locks held by bcachefs/35533: other info that might help us debug this: [ 3437.467507] Chain exists of: sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48 [ 3437.467979] Possible unsafe locking scenario: [ 3437.468223] CPU0 CPU1 [ 3437.468405] ---- ---- [ 3437.468585] rlock(&type->s_umount_key#48); [ 3437.468758] lock(&c->snapshot_create_lock); [ 3437.469030] lock(&type->s_umount_key#48); [ 3437.469291] rlock(sb_writers#10); [ 3437.469434] *** DEADLOCK *** [ 3437.469670] 2 locks held by bcachefs/35533: [ 3437.469838] #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs] [ 3437.470294] #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.470744] stack backtrace: [ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G E 6.7.0-rc7-custom+ #85 [ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 3437.471694] Call Trace: [ 3437.471795] <TASK> [ 3437.471884] dump_stack_lvl+0x57/0x90 [ 3437.472035] check_noncircular+0x132/0x150 [ 3437.472202] __lock_acquire+0x1455/0x21b0 [ 3437.472369] lock_acquire+0xc6/0x2b0 [ 3437.472518] ? filename_create+0x62/0x190 [ 3437.472683] ? lock_is_held_type+0x97/0x110 [ 3437.472856] mnt_want_write+0x46/0x1a0 [ 3437.473025] ? filename_create+0x62/0x190 [ 3437.473204] filename_create+0x62/0x190 [ 3437.473380] user_path_create+0x2d/0x50 [ 3437.473555] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.473819] ? lock_acquire+0xc6/0x2b0 [ 3437.474002] ? __fget_files+0x2a/0x190 [ 3437.474195] ? __fget_files+0xbc/0x190 [ 3437.474380] ? lock_release+0xc5/0x270 [ 3437.474567] ? __x64_sys_ioctl+0x93/0xd0 [ 3437.474764] ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs] [ 3437.475090] __x64_sys_ioctl+0x93/0xd0 [ 3437.475277] do_syscall_64+0x42/0xf0 [ 3437.475454] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.475691] RIP: 0033:0x7f2743c313af ====================================================== In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally and unlock it at the end of the function. There is a comment "why do we need this lock?" about the lock coming from commit 42d237320e98 ("bcachefs: Snapshot creation, deletion") The reason is that __bch2_ioctl_subvolume_create() calls sync_inodes_sb() which enforce locked s_umount to writeback all dirty nodes before doing snapshot works. Fix it by read locking s_umount for snapshotting only and unlocking s_umount after sync_inodes_sb(). Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exitSu Yue1-1/+1
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut. It should be freed by kvfree not kfree. Or umount will triger: [ 406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008 [ 406.830676 ] #PF: supervisor read access in kernel mode [ 406.831643 ] #PF: error_code(0x0000) - not-present page [ 406.832487 ] PGD 0 P4D 0 [ 406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI [ 406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G OE 6.7.0-rc7-custom+ #90 [ 406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 406.835796 ] RIP: 0010:kfree+0x62/0x140 [ 406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6 [ 406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286 [ 406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4 [ 406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000 [ 406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001 [ 406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80 [ 406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000 [ 406.840451 ] FS: 00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000 [ 406.840851 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0 [ 406.841464 ] Call Trace: [ 406.841583 ] <TASK> [ 406.841682 ] ? __die+0x1f/0x70 [ 406.841828 ] ? page_fault_oops+0x159/0x470 [ 406.842014 ] ? fixup_exception+0x22/0x310 [ 406.842198 ] ? exc_page_fault+0x1ed/0x200 [ 406.842382 ] ? asm_exc_page_fault+0x22/0x30 [ 406.842574 ] ? bch2_fs_release+0x54/0x280 [bcachefs] [ 406.842842 ] ? kfree+0x62/0x140 [ 406.842988 ] ? kfree+0x104/0x140 [ 406.843138 ] bch2_fs_release+0x54/0x280 [bcachefs] [ 406.843390 ] kobject_put+0xb7/0x170 [ 406.843552 ] deactivate_locked_super+0x2f/0xa0 [ 406.843756 ] cleanup_mnt+0xba/0x150 [ 406.843917 ] task_work_run+0x59/0xa0 [ 406.844083 ] exit_to_user_mode_prepare+0x197/0x1a0 [ 406.844302 ] syscall_exit_to_user_mode+0x16/0x40 [ 406.844510 ] do_syscall_64+0x4e/0xf0 [ 406.844675 ] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 406.844907 ] RIP: 0033:0x7f0a2664e4fb Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bios must be 512 byte alginedKent Overstreet1-0/+4
Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: remove redundant variable tmpColin Ian King1-3/+1
The variable tmp is being assigned a value but it isn't being read afterwards. The assignment is redundant and so tmp can be removed. Cleans up clang scan build warning: warning: Although the value stored to 'ret' is used in the enclosing expression, the value is never actually read from 'ret' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Improve trace_trans_restart_relockKent Overstreet5-24/+44
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Fix excess transaction restarts in __bchfs_fallocate()Kent Overstreet4-16/+35
drop_locks_do() should not be used in a fastpath without first trying the do in nonblocking mode - the unlock and relock will cause excessive transaction restarts and potentially livelocking with other threads that are contending for the same locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: extents_to_bp_stateKent Overstreet1-48/+41
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bkey_and_val_eq()Kent Overstreet1-3/+8
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Better journal tracepointsKent Overstreet2-60/+79
Factor out bch2_journal_bufs_to_text(), and use it in the journal_entry_full() tracepoint; when we can't get a journal reservation we need to know the outstanding journal entry sizes to know if the problem is due to excessive flushing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Print size of superblock with space allocatedKent Overstreet1-1/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Avoid flushing the journal in the discard pathKent Overstreet1-19/+41
When issuing discards, we may need to flush the journal if there's too many buckets that can't be discarded until a journal flush. But the heuristic was bad; we should be comparing the number of buckets that need to flushes against the number of free buckets, not the number of buckets we saw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Improve move_extent tracepointKent Overstreet5-7/+48
Also print out the data_opts, so that we can see what specifically is being done to an extent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Add missing bch2_moving_ctxt_flush_all()Kent Overstreet1-0/+1
This fixes a bug with rebalance IOs getting stuck with reads completed, but writes never being issued. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Re-add move_extent_write tracepointKent Overstreet2-23/+20
It appears this was accidentally deleted at some point - also, do a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amountKent Overstreet1-2/+2
Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code that uses it to be woken up for other reasons, and fixes a bug where rebalance wouldn't wake up when a scan was requested. This raises the possibility of spurious wakeups, but callers should always be able to handle that reasonably well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Add .val_to_text() for KEY_TYPE_cookieKent Overstreet1-0/+9
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Don't pass memcmp() as a pointerKent Overstreet1-2/+8
Some (buggy!) compilers have issues with this. Fixes: https://github.com/koverstreet/bcachefs/issues/625 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21Merge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefsLinus Torvalds1-6/+6
Pull header fix from Kent Overstreet: "Just one small fixup for the RT build" * tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs: spinlock: Fix failing build for PREEMPT_RT
2024-01-21bcachefs: Reduce would_deadlock restartsKent Overstreet1-0/+7
We don't have to take locks in any particular ordering - we'll make forward progress just fine - but if we try to stick to an ordering, it can help to avoid excessive would_deadlock transaction restarts. This tweaks the reflink path to take extents btree locks in the right order. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>