summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-05-08scripts/gdb: fix detection of current CPU in KGDBFlorian Rommel1-5/+1
Directly read the current CPU number from the kgdb_active variable. Before, the active CPU was obtained through the current task, which required searching the task list for the pid of GDB's selected thread. Obtaining the pid was buggy: GDB may use selected_thread().ptid[1] (LWPID) instead of .ptid[2] (TID) to store the threads pid; see https://sourceware.org/gdb/current/onlinedocs/gdb.html/Threads-In-Python.html As a result, the detection could return the wrong CPU number, leading to incorrect results for $lx_per_cpu and $lx_current. As a side effect, the patch significantly speeds up $lx_per_cpu and $lx_current in KGDB by avoiding the task-list iteration. Link: https://lkml.kernel.org/r/20240425153501.749966-5-mail@florommel.de Signed-off-by: Florian Rommel <mail@florommel.de> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08scripts/gdb: make get_thread_info accept pointersFlorian Rommel1-1/+1
get_thread_info ($lx_thread_info) only accepted a dereferenced task parameter. Passing a pointer to a task_struct (like $lx_per_cpu does with KGDB) threw an exception. With this patch, both (dereferenced values and pointers) are accepted. Before (on x86, KGDB): >>> p $lx_per_cpu(cpu_info) Traceback (most recent call last): File "./scripts/gdb/linux/cpus.py", line 158, in invoke return per_cpu(var_ptr, cpu) ^^^^^^^^^^^^^^^^^^^^^ File "./scripts/gdb/linux/cpus.py", line 42, in per_cpu cpu = get_current_cpu() ^^^^^^^^^^^^^^^^^ File "./scripts/gdb/linux/cpus.py", line 33, in get_current_cpu return tasks.get_thread_info(tasks.get_task_by_pid(tid))['cpu'] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "./scripts/gdb/linux/tasks.py", line 88, in get_thread_info if task.type.fields()[0].type == thread_info_type.get_type(): ~~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range Link: https://lkml.kernel.org/r/20240425153501.749966-4-mail@florommel.de Signed-off-by: Florian Rommel <mail@florommel.de> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08scripts/gdb: fix parameter handling in $lx_per_cpuFlorian Rommel1-3/+2
Before, the script tried to get the address by constructing a pointer to the parameter (by name). However, since GDB now passes the parameter as a GdbValue, we cannot get its name. Instead, we retrieve the address through GdbValue's address attribute. Before: >>> p $lx_per_cpu(cpu_info) Traceback (most recent call last): File "./scripts/gdb/linux/cpus.py", line 152, in invoke var_ptr = gdb.parse_and_eval("&" + var_name.string()) ^^^^^^^^^^^^^^^^^ gdb.error: Trying to read string with inappropriate type `struct cpuinfo_x86'. Link: https://lkml.kernel.org/r/20240425153501.749966-3-mail@florommel.de Signed-off-by: Florian Rommel <mail@florommel.de> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08scripts/gdb: fix failing KGDB detection during probeFlorian Rommel1-1/+1
Patch series "scripts/gdb: Fixes for $lx_current and $lx_per_cpu". This series fixes several bugs in the GDB scripts related to the $lx_current and $lx_per_cpu functions. The changes were tested with GDB 10, 11, 12, 13, and 14. Patch 1 fixes false-negative results when probing for KGDB Patch 2 fixes the $lx_per_cpu function, which is currently non-functional in QEMU-GDB and KGDB. Patch 3 fixes an additional bug in $lx_per_cpu that occurs with KGDB. Patch 4 fixes the incorrect detection of the current CPU number in KGDB, which silently breaks $lx_per_cpu and $lx_current. This patch (of 4): The KGDB probe function sometimes failed to detect KGDB for SMP machines as it assumed that task 2 (kthreadd) is running on CPU 0, which is not necessarily the case. Now, the detection is agnostic to kthreadd's CPU. Link: https://lkml.kernel.org/r/20240425153501.749966-1-mail@florommel.de Link: https://lkml.kernel.org/r/20240425153501.749966-2-mail@florommel.de Signed-off-by: Florian Rommel <mail@florommel.de> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08kfifo: don't use "proxy" headersAndy Shevchenko3-7/+13
Update header inclusions to follow IWYU (Include What You Use) principle. Link: https://lkml.kernel.org/r/20240423192529.3249134-4-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Alain Volmat <alain.volmat@foss.st.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Rob Herring <robh@kernel.org> Cc: Samuel Holland <samuel@sholland.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Sean Young <sean@mess.org> Cc: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08media: stih-cec: add missing io.hAndy Shevchenko1-0/+1
In the driver the io.h is implied by others. This is not good as it prevents from cleanups done in other headers. Add missing include. Link: https://lkml.kernel.org/r/20240423192529.3249134-3-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Alain Volmat <alain.volmat@foss.st.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Rob Herring <robh@kernel.org> Cc: Samuel Holland <samuel@sholland.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Sean Young <sean@mess.org> Cc: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08media: rc: add missing io.hAndy Shevchenko4-0/+4
Patch series "kfifo: Clean up kfifo.h", v2. To reduce dependency hell a degree, clean up kfifo.h (mainly getting rid of kernel.h in the global header). This patch (of 3): In many remote control drivers the io.h is implied by others. This is not good as it prevents from cleanups done in other headers. Add missing include. Link: https://lkml.kernel.org/r/20240423192529.3249134-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20240423192529.3249134-2-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Alain Volmat <alain.volmat@foss.st.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Rob Herring <robh@kernel.org> Cc: Samuel Holland <samuel@sholland.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Sean Young <sean@mess.org> Cc: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08tools lib rbtree: pick some improvements from the kernel rbtree codeArnaldo Carvalho de Melo2-3/+3
The tools/lib/rbtree.c code came from the kernel. Remove the EXPORT_SYMBOL() that make sense only there. Unfortunately it is not being checked with tools/perf/check_headers.sh. Will try to remedy this. Until then pick the improvements from: b0687c1119b4e8c8 ("lib/rbtree: use '+' instead of '|' for setting color.") That I noticed by doing: diff -u tools/lib/rbtree.c lib/rbtree.c diff -u tools/include/linux/rbtree_augmented.h include/linux/rbtree_augmented.h There is one other cases, but lets pick it in separate patches. Link: https://lkml.kernel.org/r/ZigZzeFoukzRKG1Q@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Noah Goldstein <goldstein.w.n@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08ocfs2: remove redundant assignment to variable statusColin Ian King1-1/+0
Variable status is being assigned and error code that is never read, it is being assigned inside of a do-while loop. The assignment is redundant and can be removed. Cleans up clang scan build warning: fs/ocfs2/dlm/dlmdomain.c:1530:2: warning: Value stored to 'status' is never read [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20240423223018.1573213-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08nilfs2: convert to use the new mount APIEric Sandeen4-229/+174
Convert nilfs2 to use the new mount API. [sandeen@redhat.com: v2] Link: https://lkml.kernel.org/r/33d078a7-9072-4d8e-a3a9-dec23d4191da@redhat.com Link: https://lkml.kernel.org/r/20240425190526.10905-1-konishi.ryusuke@gmail.com [konishi.ryusuke: fixed missing SB_RDONLY flag repair in nilfs_reconfigure] Link: https://lkml.kernel.org/r/33d078a7-9072-4d8e-a3a9-dec23d4191da@redhat.com Link: https://lkml.kernel.org/r/20240424182716.6024-1-konishi.ryusuke@gmail.com Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08kexec: fix the unexpected kexec_dprintk() macroBaoquan He1-4/+2
Jiri reported that the current kexec_dprintk() always prints out debugging message whenever kexec/kdmmp loading is triggered. That is not wanted. The debugging message is supposed to be printed out when 'kexec -s -d' is specified for kexec/kdump loading. After investigating, the reason is the current kexec_dprintk() takes printk(KERN_INFO) or printk(KERN_DEBUG) depending on whether '-d' is specified. However, distros usually have defaulg log level like below: [~]# cat /proc/sys/kernel/printk 7 4 1 7 So, even though '-d' is not specified, printk(KERN_DEBUG) also always prints out. I thought printk(KERN_DEBUG) is equal to pr_debug(), it's not. Fix it by changing to use pr_info() instead which are expected to work. Link: https://lkml.kernel.org/r/20240409042238.1240462-1-bhe@redhat.com Fixes: cbc2fe9d9cb2 ("kexec_file: add kexec_file flag to control debug printing") Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/4c775fca-5def-4a2d-8437-7130b02722a2@kernel.org Reviewed-by: Dave Young <dyoung@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08crash: add prefix for crash dumping messagesBaoquan He2-2/+4
Add pr_fmt() to kernel/crash_core.c to add the module name to debugging message printed as prefix. And also add prefix 'crashkernel:' to two lines of message printing code in kernel/crash_reserve.c. In kernel/crash_reserve.c, almost all debugging messages have 'crashkernel:' prefix or there's keyword crashkernel at the beginning or in the middle, adding pr_fmt() makes it redundant. Link: https://lkml.kernel.org/r/20240418035843.1562887-1-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08cpumask: delete unused reset_cpu_possible_mask()Alexey Dobriyan1-5/+0
Link: https://lkml.kernel.org/r/20240417201123.2961-1-adobriyan@gmail.com Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-29mux: remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET1-2/+2
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Link: https://lkml.kernel.org/r/f82e013abe4c71f1c7d06819f96472f298acdcf3.1713089554.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Peter Rosin <peda@axentia.se> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-29pps: remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET1-3/+3
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Link: https://lkml.kernel.org/r/9f681747d446b874952a892491387d79ffe565a9.1713089394.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Rodolfo Giometti <giometti@enneenne.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-29intel_th: remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET1-3/+3
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Link: https://lkml.kernel.org/r/2aca50a9d061faecfd4ded80b5874cd3be9b855d.1713086613.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26nilfs2: add kernel-doc comments to nilfs_remove_all_gcinodes()Yang Li1-0/+1
This commit adds kernel-doc style comments with complete parameter descriptions for the function nilfs_remove_all_gcinodes. Link: https://lkml.kernel.org/r/20240410075629.3441-4-konishi.ryusuke@gmail.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26nilfs2: add kernel-doc comments to nilfs_btree_convert_and_insert()Yang Li1-7/+16
This commit adds kernel-doc style comments with complete parameter descriptions for the function nilfs_btree_convert_and_insert. Link: https://lkml.kernel.org/r/20240410075629.3441-3-konishi.ryusuke@gmail.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26nilfs2: add kernel-doc comments to nilfs_do_roll_forward()Yang Li1-0/+1
Patch series "nilfs2: fix missing kernel-doc comments". This commit adds kernel-doc style comments with complete parameter descriptions for the function nilfs_do_roll_forward. Link: https://lkml.kernel.org/r/20240410075629.3441-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240410075629.3441-2-konishi.ryusuke@gmail.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26blktrace: convert strncpy() to strscpy_pad()Arnd Bergmann1-2/+1
gcc-9 warns about a possibly non-terminated string copy: kernel/trace/blktrace.c: In function 'do_blk_trace_setup': kernel/trace/blktrace.c:527:2: error: 'strncpy' specified bound 32 equals destination size [-Werror=stringop-truncation] Newer versions are fine here because they see the following explicit nul-termination. Using strscpy_pad() avoids the warning and simplifies the code a little. The padding helps give a clean buffer to userspace. Link: https://lkml.kernel.org/r/20240409140059.3806717-5-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Justin Stitt <justinstitt@google.com> Cc: Alexey Starikovskiy <astarikovskiy@suse.de> Cc: Bob Moore <robert.moore@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Len Brown <lenb@kernel.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26block/partitions/ldm: convert strncpy() to strscpy()Arnd Bergmann1-4/+2
The strncpy() here can cause a non-terminated string, which older gcc versions such as gcc-9 warn about: In function 'ldm_parse_tocblock', inlined from 'ldm_validate_tocblocks' at block/partitions/ldm.c:386:7, inlined from 'ldm_partition' at block/partitions/ldm.c:1457:7: block/partitions/ldm.c:134:2: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation] 134 | strncpy (toc->bitmap1_name, data + 0x24, sizeof (toc->bitmap1_name)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ block/partitions/ldm.c:145:2: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation] 145 | strncpy (toc->bitmap2_name, data + 0x46, sizeof (toc->bitmap2_name)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ New versions notice that the code is correct after all because of the following termination, but replacing the strncpy() with strscpy_pad() or strcpy() avoids the warning and simplifies the code at the same time. Use the padding version here to keep the existing behavior, in case the code relies on not including uninitialized data. Link: https://lkml.kernel.org/r/20240409140059.3806717-4-arnd@kernel.org Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Starikovskiy <astarikovskiy@suse.de> Cc: Bob Moore <robert.moore@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Len Brown <lenb@kernel.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26test_hexdump: avoid string truncation warningArnd Bergmann1-1/+1
gcc can warn when a string is too long to fit into the strncpy() destination buffer, as it is here depending on the function arguments: inlined from 'test_hexdump_prepare_test.constprop' at /home/arnd/arm-soc/lib/test_hexdump.c:116:3: include/linux/fortify-string.h:108:33: error: '__builtin_strncpy' output truncated copying between 0 and 32 bytes from a string of length 32 [-Werror=stringop-truncation] 108 | #define __underlying_strncpy __builtin_strncpy | ^ include/linux/fortify-string.h:187:16: note: in expansion of macro '__underlying_strncpy' 187 | return __underlying_strncpy(p, q, size); | ^~~~~~~~~~~~~~~~~~~~ The intention here is to copy exactly 'l' bytes without any padding or NUL-termination, so the most logical change is to use memcpy(), just as a previous change adapted the other output from strncpy() to memcpy(). Link: https://lkml.kernel.org/r/20240409140059.3806717-2-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Justin Stitt <justinstitt@google.com> Cc: Alexey Starikovskiy <astarikovskiy@suse.de> Cc: Bob Moore <robert.moore@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Len Brown <lenb@kernel.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: use coarse time for new created filesSu Yue1-1/+1
The default atime related mount option is '-o realtime' which means file atime should be updated if atime <= ctime or atime <= mtime. atime should be updated in the following scenario, but it is not: ========================================================== $ rm /mnt/testfile; $ echo test > /mnt/testfile $ stat -c "%X %Y %Z" /mnt/testfile 1711881646 1711881646 1711881646 $ sleep 5 $ cat /mnt/testfile > /dev/null $ stat -c "%X %Y %Z" /mnt/testfile 1711881646 1711881646 1711881646 ========================================================== And the reason the atime in the test is not updated is that ocfs2 calls ktime_get_real_ts64() in __ocfs2_mknod_locked during file creation. Then inode_set_ctime_current() is called in inode_set_ctime_current() calls ktime_get_coarse_real_ts64() to get current time. ktime_get_real_ts64() is more accurate than ktime_get_coarse_real_ts64(). In my test box, I saw ctime set by ktime_get_coarse_real_ts64() is less than ktime_get_real_ts64() even ctime is set later. The ctime of the new inode is smaller than atime. The call trace is like: ocfs2_create ocfs2_mknod __ocfs2_mknod_locked .... ktime_get_real_ts64 <------- set atime,ctime,mtime, more accurate ocfs2_populate_inode ... ocfs2_init_acl ocfs2_acl_set_mode inode_set_ctime_current current_time ktime_get_coarse_real_ts64 <-------less accurate ocfs2_file_read_iter ocfs2_inode_lock_atime ocfs2_should_update_atime atime <= ctime ? <-------- false, ctime < atime due to accuracy So here call ktime_get_coarse_real_ts64 to set inode time coarser while creating new files. It may lower the accuracy of file times. But it's not a big deal since we already use coarse time in other places like ocfs2_update_inode_atime and inode_set_ctime_current. Link: https://lkml.kernel.org/r/20240408082041.20925-5-glass.su@suse.com Fixes: c62c38f6b91b ("ocfs2: replace CURRENT_TIME macro") Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: update inode fsync transaction id in ocfs2_unlink and ocfs2_linkSu Yue1-0/+2
transaction id should be updated in ocfs2_unlink and ocfs2_link. Otherwise, inode link will be wrong after journal replay even fsync was called before power failure: ======================================================================= $ touch testdir/bar $ ln testdir/bar testdir/bar_link $ fsync testdir/bar $ stat -c %h $SCRATCH_MNT/testdir/bar 1 $ stat -c %h $SCRATCH_MNT/testdir/bar 1 ======================================================================= Link: https://lkml.kernel.org/r/20240408082041.20925-4-glass.su@suse.com Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: fix races between hole punching and AIO+DIOSu Yue1-0/+2
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block", fstests/generic/300 become from always failed to sometimes failed: ======================================================================== [ 473.293420 ] run fstests generic/300 [ 475.296983 ] JBD2: Ignoring recovery information on journal [ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode. [ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found [ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted. [ 494.292018 ] OCFS2: File system is now read-only. [ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30 [ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3 fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072 ========================================================================= In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten extents to a list. extents are also inserted into extent tree in ocfs2_write_begin_nolock. Then another thread call fallocate to puch a hole at one of the unwritten extent. The extent at cpos was removed by ocfs2_remove_extent(). At end io worker thread, ocfs2_search_extent_list found there is no such extent at the cpos. T1 T2 T3 inode lock ... insert extents ... inode unlock ocfs2_fallocate __ocfs2_change_file_space inode lock lock ip_alloc_sem ocfs2_remove_inode_range inode ocfs2_remove_btree_range ocfs2_remove_extent ^---remove the extent at cpos 78723 ... unlock ip_alloc_sem inode unlock ocfs2_dio_end_io ocfs2_dio_end_io_write lock ip_alloc_sem ocfs2_mark_extent_written ocfs2_change_extent_flag ocfs2_search_extent_list ^---failed to find extent ... unlock ip_alloc_sem In most filesystems, fallocate is not compatible with racing with AIO+DIO, so fix it by adding to wait for all dio before fallocate/punch_hole like ext4. Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com Fixes: b25801038da5 ("ocfs2: Support xfs style space reservation ioctls") Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: return real error code in ocfs2_dio_wr_get_blockSu Yue1-2/+0
Patch series "ocfs2 bugs fixes exposed by fstests", v3. The patchset is to fix some wrong behavior of ocfs2 exposed by fstests. Patch 1 makes userspace happy when some error happens when doing direct io. Before the patch, DIO always return -EIO in case of error. After the patch, it returns real error code such like -ENOSPC, EDQUOT... Patch 2 fixes an error case when doing AIO+DIO and hole punching at same file position in parallel. generic/300 Patch 3 fixes inode link count mismatch after power failure. Without the patch, inode link would be wrong even fync was called on the file. tests/generic/040,041,104,107,336 patch 4 fixes wrong atime with mount option realtime. Without the patch, atime of new created file won't be updated in right time. tests/generic/192 For stable kernels, I added fixes to patch 2,3,4. The patch 1 is not recommended to be backported since ocfs2_dio_wr_get_block calls too many functions. It's diffcult to check every git history of ocfs2 for every LTS kernel. This patch (of 4): ocfs2_dio_wr_get_block always returns -EIO in case of errors. However, some programs expect right exit codes while doing dio. For example, tools like fio treat -ENOSPC as expected code while doing stress jobs. And quota tools expect -EDQUOT when disk quota exceeds. -EIO is too strong return code in the dio path. The caller of ocfs2_dio_wr_get_block is __blockdev_direct_IO which is widely used and it handles error codes well. I have checked functions called by ocfs2_dio_wr_get_block and their return codes look good and clear. So I think it's safe to let ocfs2_dio_wr_get_block return real error code. Link: https://lkml.kernel.org/r/20240408082041.20925-1-glass.su@suse.com Link: https://lkml.kernel.org/r/20240408082041.20925-2-glass.su@suse.com Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26vmcore: replace strncpy with strscpy_padJustin Stitt1-3/+2
strncpy() is in the process of being replaced as it is deprecated [1]. We should move towards safer and less ambiguous string interfaces. Looking at vmcoredd_header's definition: | struct vmcoredd_header { | __u32 n_namesz; /* Name size */ | __u32 n_descsz; /* Content size */ | __u32 n_type; /* NT_VMCOREDD */ | __u8 name[8]; /* LINUX\0\0\0 */ | __u8 dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Device dump's name */ | }; .. we see that @name wants to be NUL-padded. We're copying data->dump_name which is defined as: | char dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Unique name of the dump */ .. which shares the same size as vdd_hdr->dump_name. Let's make sure we NUL-pad this as well. Use strscpy_pad() which NUL-terminates and NUL-pads its destination buffers. Specifically, use the new 2-argument version of strscpy_pad introduced in Commit e6584c3964f2f ("string: Allow 2-argument strscpy()"). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://github.com/KSPP/linux/issues/90 Link: https://lkml.kernel.org/r/20240401-strncpy-fs-proc-vmcore-c-v2-1-dd0a73f42635@google.com Signed-off-by: Justin Stitt <justinstitt@google.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26devres: don't use "proxy" headersAndy Shevchenko1-3/+6
Update header inclusions to follow IWYU (Include What You Use) principle. Link: https://lkml.kernel.org/r/20240403104820.557487-3-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26devres: switch to use dev_err_probe() for unificationAndy Shevchenko1-8/+9
Patch series "devres: A couple of cleanups". A couple of ad-hoc cleanups. No functional changes intended. This patch (of 2): The devm_*() APIs are supposed to be called during the ->probe() stage. Many drivers (especially new ones) have switched to use dev_err_probe() for error messaging for the sake of unification. Let's do the same in the devres APIs. Link: https://lkml.kernel.org/r/20240403104820.557487-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20240403104820.557487-2-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26kgdb: add HAS_IOPORT dependencyNiklas Schnelle1-0/+1
In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at compile time. We thus need to add HAS_IOPORT as dependency for those drivers using them. Link: https://lkml.kernel.org/r/20240403132547.762429-2-schnelle@linux.ibm.com Co-developed-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26Squashfs: remove deprecated strncpy by not copying the stringPhillip Lougher1-10/+4
Squashfs copied the passed string (name) into a temporary buffer to ensure it was NUL-terminated. This however is completely unnecessary as the string is already NUL-terminated. So remove the deprecated strncpy() by completely removing the string copy. The background behind this unnecessary string copy is that it dates back to the days when Squashfs was an out of kernel patch. The code deliberately did not assume the string was NUL-terminated in case in future this changed (due to kernel changes). This would mean the out of tree patches would be broken but still compile OK. Link: https://lkml.kernel.org/r/20240403183352.391308-1-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ipc: remove the now superfluous sentinel element from ctl_table arrayJoel Granados2-2/+0
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove the sentinels from ipc_sysctls and mq_sysctls Link: https://lkml.kernel.org/r/20240328-jag-sysctl_remset_misc-v1-5-47c1463b3af2@samsung.com Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26initrd: remove the now superfluous sentinel element from ctl_table arrayJoel Granados1-1/+0
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove sentinel from kern_do_mounts_initrd_table. Link: https://lkml.kernel.org/r/20240328-jag-sysctl_remset_misc-v1-4-47c1463b3af2@samsung.com Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26kcov: avoid clang out-of-range warningArnd Bergmann1-1/+2
The area_size is never larger than the maximum on 64-bit architectutes: kernel/kcov.c:634:29: error: result of comparison of constant 1152921504606846975 with expression of type '__u32' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (remote_arg->area_size > LONG_MAX / sizeof(unsigned long)) ~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The compiler can correctly optimize the check away and the code appears correct to me, so just add a cast to avoid the warning. Link: https://lkml.kernel.org/r/20240328143051.1069575-5-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Justin Stitt <justinstitt@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Bill Wendling <morbo@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26Documentation: kdump: clean up the outdated descriptionBaoquan He1-4/+4
After commit 443cbaf9e2fd ("crash: split vmcoreinfo exporting code out from crash_core.c"), Kconfig item CRASH_CORE has gone away in kernel. Items VMCORE_INFO and CRASH_RESERVE are used instead. So clean up the outdated description about CRASH_CORE and update it accordingly. Link: https://lkml.kernel.org/r/20240329132825.1102459-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: fix sparse warningsHeming Zhao5-15/+16
1. fs/ocfs2/localalloc.c:1224:41: warning: incorrect type in argument 1 (different base types) fs/ocfs2/localalloc.c:1224:41: expected unsigned long long val1 fs/ocfs2/localalloc.c:1224:41: got restricted __le32 [usertype] la_bm_off 2. fs/ocfs2/export.c:258:32: warning: cast to restricted __le32 fs/ocfs2/export.c:259:33: warning: cast to restricted __le32 fs/ocfs2/export.c:260:32: warning: cast to restricted __le32 fs/ocfs2/export.c:272:32: warning: cast to restricted __le32 fs/ocfs2/export.c:273:33: warning: cast to restricted __le32 fs/ocfs2/export.c:274:32: warning: cast to restricted __le32 3. fs/ocfs2/inode.c:1623:13: warning: context imbalance in 'ocfs2_inode_cache_lock' - wrong count at exit fs/ocfs2/inode.c:1630:13: warning: context imbalance in 'ocfs2_inode_cache_unlock' - unexpected unlock 4. fs/ocfs2/refcounttree.c:633:27: warning: incorrect type in assignment (different base types) fs/ocfs2/refcounttree.c:633:27: expected restricted __le32 [usertype] rf_generation fs/ocfs2/refcounttree.c:633:27: got unsigned int 5. fs/ocfs2/dlm/dlmdomain.c:1316:20: warning: context imbalance in 'dlm_query_nodeinfo_handler' - different lock contexts for basic block Link: https://lkml.kernel.org/r/20240328125203.20892-5-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: speed up chain-list searchingHeming Zhao1-1/+1
Add short-circuit code to speed up searching Link: https://lkml.kernel.org/r/20240328125203.20892-4-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: adjust enabling place for la windowHeming Zhao2-12/+8
Patch series "improve write IO performance when fragmentation is high", v6. This patch (of 4): After introducing gd->bg_contig_free_bits, the code path 'ocfs2_cluster_group_search() => ocfs2_local_alloc_seen_free_bits()' becomes death when all the gd->bg_contig_free_bits are set to the correct value. This patch relocates ocfs2_local_alloc_seen_free_bits() to a more appropriate location. (The new place being ocfs2_block_group_set_bits().) In ocfs2_local_alloc_seen_free_bits(), the scope of the spin-lock has been adjusted to reduce meaningless lock races. e.g: when userspace creates & deletes 1 cluster_size files in parallel, acquiring the spin-lock in ocfs2_local_alloc_seen_free_bits() is totally pointless and impedes IO performance. Link: https://lkml.kernel.org/r/20240328125203.20892-3-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: improve write IO performance when fragmentation is highHeming Zhao5-11/+108
The group_search function ocfs2_cluster_group_search() should bypass groups with insufficient space to avoid unnecessary searches. This patch is particularly useful when ocfs2 is handling huge number small files, and volume fragmentation is very high. In this case, ocfs2 is busy with looking up available la window from //global_bitmap. This patch introduces a new member in the Group Description (gd) struct called 'bg_contig_free_bits', representing the max contigous free bits in this gd. When ocfs2 allocates a new la window from //global_bitmap, 'bg_contig_free_bits' helps expedite the search process. Let's image below path. 1. la state (->local_alloc_state) is set THROTTLED or DISABLED. 2. when user delete a large file and trigger ocfs2_local_alloc_seen_free_bits set osb->local_alloc_state unconditionally. 3. a write IOs thread run and trigger the worst performance path ``` ocfs2_reserve_clusters_with_limit ocfs2_reserve_local_alloc_bits ocfs2_local_alloc_slide_window //[1] + ocfs2_local_alloc_reserve_for_window //[2] + ocfs2_local_alloc_new_window //[3] ocfs2_recalc_la_window ``` [1]: will be called when la window bits used up. [2]: under la state is ENABLED, and this func only check global_bitmap free bits, it will succeed in general. [3]: will use the default la window size to search clusters then fail. ocfs2_recalc_la_window attempts other la window sizes. the timing complexity is O(n^4), resulting in a significant time cost for scanning global bitmap. This leads to a dramatic slowdown in write I/Os (e.g., user space 'dd'). i.e. an ocfs2 partition size: 1.45TB, cluster size: 4KB, la window default size: 106MB. The partition is fragmentation by creating & deleting huge mount of small files. before this patch, the timing of [3] should be (the number got from real world): - la window size change order (size: MB): 106, 53, 26.5, 13, 6.5, 3.25, 1.6, 0.8 only 0.8MB succeed, 0.8MB also triggers la window to disable. ocfs2_local_alloc_new_window retries 8 times, first 7 times totally runs in worst case. - group chain number: 242 ocfs2_claim_suballoc_bits calls for-loop 242 times - each chain has 49 block group ocfs2_search_chain calls while-loop 49 times - each bg has 32256 blocks ocfs2_block_group_find_clear_bits calls while-loop for 32256 bits. for ocfs2_find_next_zero_bit uses ffz() to find zero bit, let's use (32256/64) (this is not worst value) for timing calucation. the loop times: 7*242*49*(32256/64) = 41835024 (~42 million times) In the worst case, user space writes 1MB data will trigger 42M scanning times. under this patch, the timing is '7*242*49 = 83006', reduced by three orders of magnitude. Link: https://lkml.kernel.org/r/20240328125203.20892-2-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26regset: use kvzalloc() for regset_get_alloc()Douglas Anderson2-4/+4
While browsing through ChromeOS crash reports, I found one with an allocation failure that looked like this: chrome: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=urgent,mems_allowed=0 CPU: 7 PID: 3295 Comm: chrome Not tainted 5.15.133-20574-g8044615ac35c #1 (HASH:1162 1) Hardware name: Google Lazor (rev3 - 8) with KB Backlight (DT) Call trace: ... warn_alloc+0x104/0x174 __alloc_pages+0x5f0/0x6e4 kmalloc_order+0x44/0x98 kmalloc_order_trace+0x34/0x124 __kmalloc+0x228/0x36c __regset_get+0x68/0xcc regset_get_alloc+0x1c/0x28 elf_core_dump+0x3d8/0xd8c do_coredump+0xeb8/0x1378 get_signal+0x14c/0x804 ... An order 7 allocation is (1 << 7) contiguous pages, or 512K. It's not a surprise that this allocation failed on a system that's been running for a while. More digging showed that it was fairly easy to see the order 7 allocation by just sending a SIGQUIT to chrome (or other processes) to generate a core dump. The actual amount being allocated was 279,584 bytes and it was for "core_note_type" NT_ARM_SVE. There was quite a bit of discussion [1] on the mailing lists in response to my v1 patch attempting to switch to vmalloc. The overall conclusion was that we could likely reduce the 279,584 byte allocation by quite a bit and Mark Brown has sent a patch to that effect [2]. However even with the 279,584 byte allocation gone there are still 65,552 byte allocations. These are just barely more than the 65,536 bytes and thus would require an order 5 allocation. An order 5 allocation is still something to avoid unless necessary and nothing needs the memory here to be contiguous. Change the allocation to kvzalloc() which should still be efficient for small allocations but doesn't force the memory subsystem to work hard (and maybe fail) at getting a large contiguous chunk. [1] https://lore.kernel.org/r/20240201171159.1.Id9ad163b60d21c9e56c2d686b0cc9083a8ba7924@changeid [2] https://lore.kernel.org/r/20240203-arm64-sve-ptrace-regset-size-v1-1-2c3ba1386b9e@kernel.org Link: https://lkml.kernel.org/r/20240205092626.v2.1.Id9ad163b60d21c9e56c2d686b0cc9083a8ba7924@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Jan Kara <jack@suse.cz> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Brown <broonie@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26NUMA: early use of cpu_to_node() returns 0 instead of the correct node idHuang Shijie1-0/+14
During the kernel booting, the generic cpu_to_node() is called too early in arm64, powerpc and riscv when CONFIG_NUMA is enabled. There are at least four places in the common code where the generic cpu_to_node() is called before it is initialized: 1.) early_trace_init() in kernel/trace/trace.c 2.) sched_init() in kernel/sched/core.c 3.) init_sched_fair_class() in kernel/sched/fair.c 4.) workqueue_init_early() in kernel/workqueue.c This will harm performance since there is an increase in off node accesses. In order to fix the bug, the patch introduces early_numa_node_init() which is called after smp_prepare_boot_cpu() in start_kernel. early_numa_node_init will initialize the "numa_node" as soon as the early_cpu_to_node() is ready, before the cpu_to_node() is called at the first time. Link: https://lkml.kernel.org/r/20240126064451.5465-1-shijie@os.amperecomputing.com Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [RISC-V] Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Kelley (LINUX) <mikelley@microsoft.com> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26fs: add kernel-doc comments to fat_parse_long()Yang Li1-0/+12
Add kernel-doc style comments with complete parameter descriptions for fat_parse_long(). Link: https://lkml.kernel.org/r/20240322073724.102332-1-yang.lee@linux.alibaba.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26x86: call instrumentation hooks from copy_mc.cAlexander Potapenko1-4/+17
Memory accesses in copy_mc_to_kernel() and copy_mc_to_user() are performed by assembly routines and are invisible to KASAN, KCSAN, and KMSAN. Add hooks from instrumentation.h to tell the tools these functions have memcpy/copy_from_user semantics. The call to copy_mc_fragile() in copy_mc_fragile_handle_tail() is left intact, because the latter is only called from the assembly implementation of copy_mc_fragile(), so the memory accesses in it are covered by the instrumentation in copy_mc_to_kernel() and copy_mc_to_user(). Link: https://lore.kernel.org/all/3b7dbd88-0861-4638-b2d2-911c97a4cadf@I-love.SAKURA.ne.jp/ Link: https://lkml.kernel.org/r/20240320101851.2589698-3-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26instrumented.h: add instrument_memcpy_before, instrument_memcpy_afterAlexander Potapenko1-0/+35
Bug detection tools based on compiler instrumentation may miss memory accesses in custom memcpy implementations (such as copy_mc_to_kernel). Provide instrumentation hooks that tell KASAN, KCSAN, and KMSAN about such accesses. Link: https://lore.kernel.org/all/3b7dbd88-0861-4638-b2d2-911c97a4cadf@I-love.SAKURA.ne.jp/ Link: https://lkml.kernel.org/r/20240320101851.2589698-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26mm: kmsan: implement kmsan_memmove()Alexander Potapenko2-0/+26
Provide a hook that can be used by custom memcpy implementations to tell KMSAN that the metadata needs to be copied. Without that, false positive reports are possible in the cases where KMSAN fails to intercept memory initialization. Link: https://lore.kernel.org/all/3b7dbd88-0861-4638-b2d2-911c97a4cadf@I-love.SAKURA.ne.jp/ Link: https://lkml.kernel.org/r/20240320101851.2589698-1-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26bootconfig: do not put quotes on cmdline items unless necessaryRasmus Villemoes1-3/+9
When trying to migrate to using bootconfig to embed the kernel's and PID1's command line with the kernel image itself, and so allowing changing that without modifying the bootloader, I noticed that /proc/cmdline changed from e.g. console=ttymxc0,115200n8 cma=128M quiet -- --log-level=notice to console="ttymxc0,115200n8" cma="128M" quiet -- --log-level="notice" The kernel parameters are parsed just fine, and the quotes are indeed stripped from the actual argv[] given to PID1. However, the quoting doesn't really serve any purpose and looks excessive, and might confuse some (naive) userspace tool trying to parse /proc/cmdline. So do not quote the value unless it contains whitespace. Link: https://lkml.kernel.org/r/20240320101952.62135-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26lib/build_OID_registry: don't mention the full path of the script in outputUwe Kleine-König1-1/+4
This change strips the full path of the script generating lib/oid_registry_data.c to just lib/build_OID_registry. The motivation for this change is Yocto emitting a build warning File /usr/src/debug/linux-lxatac/6.7-r0/lib/oid_registry_data.c in package linux-lxatac-src contains reference to TMPDIR [buildpaths] So this change brings us one step closer to make the build result reproducible independent of the build path. Link: https://lkml.kernel.org/r/20240313211957.884561-2-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: update inode ctime in ocfs2_fileattr_setSu Yue1-0/+1
inode ctime should be updated if ocfs2_fileattr_set is called. Link: https://lkml.kernel.org/r/20240318115609.3194-1-l@damenly.org Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26ocfs2: correctly use ocfs2_find_next_zero_bit()Joseph Qi3-18/+9
If no bits are zero, ocfs2_find_next_zero_bit() will return max size, so check the return value with -1 is meaningless. Correct this usage and cleanup the code. Link: https://lkml.kernel.org/r/20240314021713.240796-1-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-14Linux 6.9-rc4Linus Torvalds1-1/+1