summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2023-09-19kunit: Fix wild-memory-access bug in kunit_free_suite_set()Jinjie Ruan1-1/+2
[ Upstream commit 2810c1e99867a811e631dd24e63e6c1e3b78a59d ] Inject fault while probing kunit-example-test.ko, if kstrdup() fails in mod_sysfs_setup() in load_module(), the mod->state will switch from MODULE_STATE_COMING to MODULE_STATE_GOING instead of from MODULE_STATE_LIVE to MODULE_STATE_GOING, so only kunit_module_exit() will be called without kunit_module_init(), and the mod->kunit_suites is no set correctly and the free in kunit_free_suite_set() will cause below wild-memory-access bug. The mod->state state machine when load_module() succeeds: MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_LIVE ^ | | | delete_module +---------------- MODULE_STATE_GOING <---------+ The mod->state state machine when load_module() fails at mod_sysfs_setup(): MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_GOING ^ | | | +-----------------------------------------------+ Call kunit_module_init() at MODULE_STATE_COMING state to fix the issue because MODULE_STATE_LIVE is transformed from it. Unable to handle kernel paging request at virtual address ffffff341e942a88 KASAN: maybe wild-memory-access in range [0x0003f9a0f4a15440-0x0003f9a0f4a15447] Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000441ea000 [ffffff341e942a88] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Modules linked in: kunit_example_test(-) cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: kunit_example_test] CPU: 3 PID: 2035 Comm: modprobe Tainted: G W N 6.5.0-next-20230828+ #136 Hardware name: linux,dummy-virt (DT) pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kfree+0x2c/0x70 lr : kunit_free_suite_set+0xcc/0x13c sp : ffff8000829b75b0 x29: ffff8000829b75b0 x28: ffff8000829b7b90 x27: 0000000000000000 x26: dfff800000000000 x25: ffffcd07c82a7280 x24: ffffcd07a50ab300 x23: ffffcd07a50ab2e8 x22: 1ffff00010536ec0 x21: dfff800000000000 x20: ffffcd07a50ab2f0 x19: ffffcd07a50ab2f0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffcd07c24b6764 x14: ffffcd07c24b63c0 x13: ffffcd07c4cebb94 x12: ffff700010536ec7 x11: 1ffff00010536ec6 x10: ffff700010536ec6 x9 : dfff800000000000 x8 : 00008fffefac913a x7 : 0000000041b58ab3 x6 : 0000000000000000 x5 : 1ffff00010536ec5 x4 : ffff8000829b7628 x3 : dfff800000000000 x2 : ffffff341e942a80 x1 : ffffcd07a50aa000 x0 : fffffc0000000000 Call trace: kfree+0x2c/0x70 kunit_free_suite_set+0xcc/0x13c kunit_module_notify+0xd8/0x360 blocking_notifier_call_chain+0xc4/0x128 load_module+0x382c/0x44a4 init_module_from_file+0xd4/0x128 idempotent_init_module+0x2c8/0x524 __arm64_sys_finit_module+0xac/0x100 invoke_syscall+0x6c/0x258 el0_svc_common.constprop.0+0x160/0x22c do_el0_svc+0x44/0x5c el0_svc+0x38/0x78 el0t_64_sync_handler+0x13c/0x158 el0t_64_sync+0x190/0x194 Code: aa0003e1 b25657e0 d34cfc42 8b021802 (f9400440) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception SMP: stopping secondary CPUs Kernel Offset: 0x4d0742200000 from 0xffff800080000000 PHYS_OFFSET: 0xffffee43c0000000 CPU features: 0x88000203,3c020000,1000421b Memory Limit: none Rebooting in 1 seconds.. Fixes: 3d6e44623841 ("kunit: unify module and builtin suite definitions") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19lib: test_scanf: Add explicit type cast to result initialization in ↵Nathan Chancellor1-1/+1
test_number_prefix() commit 92382d744176f230101d54f5c017bccd62770f01 upstream. A recent change in clang allows it to consider more expressions as compile time constants, which causes it to point out an implicit conversion in the scanf tests: lib/test_scanf.c:661:2: warning: implicit conversion from 'int' to 'unsigned char' changes value from -168 to 88 [-Wconstant-conversion] 661 | test_number_prefix(unsigned char, "0xA7", "%2hhx%hhx", 0, 0xa7, 2, check_uchar); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/test_scanf.c:609:29: note: expanded from macro 'test_number_prefix' 609 | T result[2] = {~expect[0], ~expect[1]}; \ | ~ ^~~~~~~~~~ 1 warning generated. The result of the bitwise negation is the type of the operand after going through the integer promotion rules, so this truncation is expected but harmless, as the initial values in the result array get overwritten by _test() anyways. Add an explicit cast to the expected type in test_number_prefix() to silence the warning. There is no functional change, as all the tests still pass with GCC 13.1.0 and clang 18.0.0. Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linuxq/issues/1899 Link: https://github.com/llvm/llvm-project/commit/610ec954e1f81c0e8fcadedcd25afe643f5a094e Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20230807-test_scanf-wconstant-conversion-v2-1-839ca39083e1@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19idr: fix param name in idr_alloc_cyclic() docAriel Marcovitch1-1/+1
[ Upstream commit 2a15de80dd0f7e04a823291aa9eb49c5294f56af ] The relevant parameter is 'start' and not 'nextid' Fixes: 460488c58ca8 ("idr: Remove idr_alloc_ext") Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19lib/test_meminit: allocate pages up to order MAX_ORDERAndrew Donnellan1-1/+1
commit efb78fa86e95832b78ca0ba60f3706788a818938 upstream. test_pages() tests the page allocator by calling alloc_pages() with different orders up to order 10. However, different architectures and platforms support different maximum contiguous allocation sizes. The default maximum allocation order (MAX_ORDER) is 10, but architectures can use CONFIG_ARCH_FORCE_MAX_ORDER to override this. On platforms where this is less than 10, test_meminit() will blow up with a WARN(). This is expected, so let's not do that. Replace the hardcoded "10" with the MAX_ORDER macro so that we test allocations up to the expected platform limit. Link: https://lkml.kernel.org/r/20230714015238.47931-1-ajd@linux.ibm.com Fixes: 5015a300a522 ("lib: introduce test_meminit module") Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Xiaoke Wang <xkernel.wang@foxmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13XArray: Do not return sibling entries from xa_load()Matthew Wilcox (Oracle)1-1/+1
commit cbc02854331edc6dc22d8b77b6e22e38ebc7dd51 upstream. It is possible for xa_load() to observe a sibling entry pointing to another sibling entry. An example: Thread A: Thread B: xa_store_range(xa, entry, 188, 191, gfp); xa_load(xa, 191); entry = xa_entry(xa, node, 63); [entry is a sibling of 188] xa_store_range(xa, entry, 184, 191, gfp); if (xa_is_sibling(entry)) offset = xa_to_sibling(entry); entry = xa_entry(xas->xa, node, offset); [entry is now a sibling of 184] It is sufficient to go around this loop until we hit a non-sibling entry. Sibling entries always point earlier in the node, so we are guaranteed to terminate this search. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30maple_tree: disable mas_wr_append() when other readers are possibleLiam R. Howlett1-0/+3
[ Upstream commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b ] The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution. During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value Alternatively: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action. Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-30radix tree: remove unused variableArnd Bergmann1-1/+0
commit d59070d1076ec5114edb67c87658aeb1d691d381 upstream. Recent versions of clang warn about an unused variable, though older versions saw the 'slot++' as a use and did not warn: radix-tree.c:1136:50: error: parameter 'slot' set but not used [-Werror,-Wunused-but-set-parameter] It's clearly not needed any more, so just remove it. Link: https://lkml.kernel.org/r/20230811131023.2226509-1-arnd@kernel.org Fixes: 3a08cd52c37c7 ("radix tree: Remove multiorder support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Rong Tao <rongtao@cestc.cn> Cc: Tom Rix <trix@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernelsHelge Deller1-26/+6
commit 382d4cd1847517ffcb1800fd462b625db7b2ebea upstream. The gcc compiler translates on some architectures the 64-bit __builtin_clzll() function to a call to the libgcc function __clzdi2(), which should take a 64-bit parameter on 32- and 64-bit platforms. But in the current kernel code, the built-in __clzdi2() function is defined to operate (wrongly) on 32-bit parameters if BITS_PER_LONG == 32, thus the return values on 32-bit kernels are in the range from [0..31] instead of the expected [0..63] range. This patch fixes the in-kernel functions __clzdi2() and __ctzdi2() to take a 64-bit parameter on 32-bit kernels as well, thus it makes the functions identical for 32- and 64-bit kernels. This bug went unnoticed since kernel 3.11 for over 10 years, and here are some possible reasons for that: a) Some architectures have assembly instructions to count the bits and which are used instead of calling __clzdi2(), e.g. on x86 the bsr instruction and on ppc cntlz is used. On such architectures the wrong __clzdi2() implementation isn't used and as such the bug has no effect and won't be noticed. b) Some architectures link to libgcc.a, and the in-kernel weak functions get replaced by the correct 64-bit variants from libgcc.a. c) __builtin_clzll() and __clzdi2() doesn't seem to be used in many places in the kernel, and most likely only in uncritical functions, e.g. when printing hex values via seq_put_hex_ll(). The wrong return value will still print the correct number, but just in a wrong formatting (e.g. with too many leading zeroes). d) 32-bit kernels aren't used that much any longer, so they are less tested. A trivial testcase to verify if the currently running 32-bit kernel is affected by the bug is to look at the output of /proc/self/maps: Here the kernel uses a correct implementation of __clzdi2(): root@debian:~# cat /proc/self/maps 00010000-00019000 r-xp 00000000 08:05 787324 /usr/bin/cat 00019000-0001a000 rwxp 00009000 08:05 787324 /usr/bin/cat 0001a000-0003b000 rwxp 00000000 00:00 0 [heap] f7551000-f770d000 r-xp 00000000 08:05 794765 /usr/lib/hppa-linux-gnu/libc.so.6 ... and this kernel uses the broken implementation of __clzdi2(): root@debian:~# cat /proc/self/maps 0000000010000-0000000019000 r-xp 00000000 000000008:000000005 787324 /usr/bin/cat 0000000019000-000000001a000 rwxp 000000009000 000000008:000000005 787324 /usr/bin/cat 000000001a000-000000003b000 rwxp 00000000 00:00 0 [heap] 00000000f73d1000-00000000f758d000 r-xp 00000000 000000008:000000005 794765 /usr/lib/hppa-linux-gnu/libc.so.6 ... Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 4df87bb7b6a22 ("lib: add weak clz/ctz functions") Cc: Chanho Min <chanho.min@lge.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11debugobjects: Recheck debug_objects_enabled before reportingTetsuo Handa1-0/+9
commit 8b64d420fe2450f82848178506d3e3a0bd195539 upstream. syzbot is reporting false a positive ODEBUG message immediately after ODEBUG was disabled due to OOM. [ 1062.309646][T22911] ODEBUG: Out of memory. ODEBUG disabled [ 1062.886755][ T5171] ------------[ cut here ]------------ [ 1062.892770][ T5171] ODEBUG: assert_init not available (active state 0) object: ffffc900056afb20 object type: timer_list hint: process_timeout+0x0/0x40 CPU 0 [ T5171] CPU 1 [T22911] -------------- -------------- debug_object_assert_init() { if (!debug_objects_enabled) return; db = get_bucket(addr); lookup_object_or_alloc() { debug_objects_enabled = 0; return NULL; } debug_objects_oom() { pr_warn("Out of memory. ODEBUG disabled\n"); // all buckets get emptied here, and } lookup_object_or_alloc(addr, db, descr, false, true) { // this bucket is already empty. return ERR_PTR(-ENOENT); } // Emits false positive warning. debug_print_object(&o, "assert_init"); } Recheck debug_object_enabled in debug_print_object() to avoid that. Reported-by: syzbot <syzbot+7937ba6a50bdd00fffdf@syzkaller.appspotmail.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/492fe2ae-5141-d548-ebd5-62f5fe2e57f7@I-love.SAKURA.ne.jp Closes: https://syzkaller.appspot.com/bug?extid=7937ba6a50bdd00fffdf Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11lib/bitmap: workaround const_eval test build failureYury Norov2-4/+10
[ Upstream commit 2356d198d2b4ddec24efea98271cb3be230bc787 ] When building with Clang, and when KASAN and GCOV_PROFILE_ALL are both enabled, the test fails to build [1]: >> lib/test_bitmap.c:920:2: error: call to '__compiletime_assert_239' declared with 'error' attribute: BUILD_BUG_ON failed: !__builtin_constant_p(res) BUILD_BUG_ON(!__builtin_constant_p(res)); ^ include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^ include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/compiler_types.h:352:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:185:1: note: expanded from here __compiletime_assert_239 Originally it was attributed to s390, which now looks seemingly wrong. The issue is not related to bitmap code itself, but it breaks build for a given configuration. Disabling the const_eval test under that config may potentially hide other bugs. Instead, workaround it by disabling GCOV for the test_bitmap unless the compiler will get fixed. [1] https://github.com/ClangBuiltLinux/linux/issues/1874 Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307171254.yFcH97ej-lkp@intel.com/ Fixes: dc34d5036692 ("lib: test_bitmap: add compile-time optimization/evaluations assertions") Co-developed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03test_firmware: return ENOMEM instead of ENOSPC on failed memory allocationMirsad Goran Todorovac1-6/+6
commit 7dae593cd226a0bca61201cf85ceb9335cf63682 upstream. In a couple of situations like name = kstrndup(buf, count, GFP_KERNEL); if (!name) return -ENOSPC; the error is not actually "No space left on device", but "Out of memory". It is semantically correct to return -ENOMEM in all failed kstrndup() and kzalloc() cases in this driver, as it is not a problem with disk space, but with kernel memory allocator failing allocation. The semantically correct should be: name = kstrndup(buf, count, GFP_KERNEL); if (!name) return -ENOMEM; Cc: Dan Carpenter <error27@gmail.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Kees Cook <keescook@chromium.org> Cc: "Luis R. Rodriguez" <mcgrof@ruslug.rutgers.edu> Cc: Scott Branden <sbranden@broadcom.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Brian Norris <briannorris@chromium.org> Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") Fixes: 0a8adf584759c ("test: add firmware_class loader test") Fixes: 548193cba2a7d ("test_firmware: add support for firmware_request_platform") Fixes: eb910947c82f9 ("test: firmware_class: add asynchronous request trigger") Fixes: 061132d2b9c95 ("test_firmware: add test custom fallback trigger") Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf") Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Message-ID: <20230606070808.9300-1-mirsad.todorovac@alu.unizg.hr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03maple_tree: fix 32 bit mas_next testingLiam R. Howlett1-1/+4
[ Upstream commit 7a93c71a6714ca1a9c03d70432dac104b0cfb815 ] The test setup of mas_next is dependent on node entry size to create a 2 level tree, but the tests did not account for this in the expected value when shifting beyond the scope of the tree. Fix this by setting up the test to succeed depending on the node entries which is dependent on the 32/64 bit setup. Link: https://lkml.kernel.org/r/20230712173916.168805-1-Liam.Howlett@oracle.com Fixes: 120b116208a0 ("maple_tree: reorganize testing to restore module testing") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/ Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03maple_tree: add __init and __exit to test moduleLiam R. Howlett1-78/+80
[ Upstream commit eaf9790d3bc6e157a2134c01c7d707a5a712fab1 ] The test functions are not needed after the module is removed, so mark them as such. Add __exit to the module removal function. Some other variables have been marked as const static as well. Link: https://lkml.kernel.org/r/20230518145544.1722059-20-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 7a93c71a6714 ("maple_tree: fix 32 bit mas_next testing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03test_maple_tree: test modifications while iteratingLiam R. Howlett1-0/+72
[ Upstream commit 5159d64b335401fa83f18c27e2267f1eafc41bd3 ] Add a testcase to ensure the iterator detects bad states on modifications and does what the user expects Link: https://lkml.kernel.org/r/20230120162650.984577-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 7a93c71a6714 ("maple_tree: fix 32 bit mas_next testing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27maple_tree: set the node limit when creating a new root nodePeng Zhang1-1/+2
commit 3c769fd88b9742954763a968e84de09f7ad78cfe upstream. Set the node limit of the root node so that the last pivot of all nodes is the node limit (if the node is not full). This patch also fixes a bug in mas_rev_awalk(). Effectively, always setting a maximum makes mas_logical_pivot() behave as mas_safe_pivot(). Without this fix, it is possible that very small tasks would fail to find the correct gap. Although this has not been observed with real tasks, it has been reported to happen in m68k nommu running the maple tree tests. Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230711035444.526-2-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19lib/bitmap: drop optimization of bitmap_{from,to}_arr64Yury Norov1-1/+1
[ Upstream commit c1d2ba10f594046831d14b03f194e8d05e78abad ] bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE architectures when it's wired to bitmap_copy_clear_tail(). bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to the next word boundary. But on 32-bit machines when copying bits from bitmap to array of 64-bit words, it's expected that the unused part of a recipient array must be cleared up to 64-bit boundary, so the last 4 bytes may stay untouched when nbits % 64 <= 32. While the copying part of the optimization works correct, that clear-tail trick makes corresponding tests reasonably fail: test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001) Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE arches. Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/lkml/20230225184702.GA3587246@roeck-us.net/ Fixes: 0a97953fd221 ("lib: add bitmap_{from,to}_arr64") Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19lib/ts_bm: reset initial match offset for every block of textJeremy Sowden1-1/+3
[ Upstream commit 6f67fbf8192da80c4db01a1800c7fceaca9cf1f9 ] The `shift` variable which indicates the offset in the string at which to start matching the pattern is initialized to `bm->patlen - 1`, but it is not reset when a new block is retrieved. This means the implemen- tation may start looking at later and later positions in each successive block and miss occurrences of the pattern at the beginning. E.g., consider a HTTP packet held in a non-linear skb, where the HTTP request line occurs in the second block: [... 52 bytes of packet headers ...] GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n and the pattern is "GET /bmtest". Once the first block comprising the packet headers has been examined, `shift` will be pointing to somewhere near the end of the block, and so when the second block is examined the request line at the beginning will be missed. Reinitialize the variable for each new block. Fixes: 8082e4ed0a61 ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2") Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390 Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-01maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()Peng Zhang1-5/+6
commit cd00dd2585c4158e81fdfac0bbcc0446afbad26d upstream. Check the write offset end bounds before using it as the offset into the pivot array. This avoids a possible out-of-bounds access on the pivot array if the write extends to the last slot in the node, in which case the node maximum should be used as the end pivot. akpm: this doesn't affect any current callers, but new users of mapletree may encounter this problem if backported into earlier kernels, so let's fix it in -stable kernels in case of this. Link: https://lkml.kernel.org/r/20230506024752.2550-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-21test_firmware: prevent race conditions by a correct implementation of lockingMirsad Goran Todorovac1-17/+35
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ] Dan Carpenter spotted a race condition in a couple of situations like these in the test_firmware driver: static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) { u8 val; int ret; ret = kstrtou8(buf, 10, &val); if (ret) return ret; mutex_lock(&test_fw_mutex); *(u8 *)cfg = val; mutex_unlock(&test_fw_mutex); /* Always return full write size even if we didn't consume all */ return size; } static ssize_t config_num_requests_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { int rc; mutex_lock(&test_fw_mutex); if (test_fw_config->reqs) { pr_err("Must call release_all_firmware prior to changing config\n"); rc = -EINVAL; mutex_unlock(&test_fw_mutex); goto out; } mutex_unlock(&test_fw_mutex); rc = test_dev_config_update_u8(buf, count, &test_fw_config->num_requests); out: return rc; } static ssize_t config_read_fw_idx_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { return test_dev_config_update_u8(buf, count, &test_fw_config->read_fw_idx); } The function test_dev_config_update_u8() is called from both the locked and the unlocked context, function config_num_requests_store() and config_read_fw_idx_store() which can both be called asynchronously as they are driver's methods, while test_dev_config_update_u8() and siblings change their argument pointed to by u8 *cfg or similar pointer. To avoid deadlock on test_fw_mutex, the lock is dropped before calling test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8() itself, but alas this creates a race condition. Having two locks wouldn't assure a race-proof mutual exclusion. This situation is best avoided by the introduction of a new, unlocked function __test_dev_config_update_u8() which can be called from the locked context and reducing test_dev_config_update_u8() to: static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) { int ret; mutex_lock(&test_fw_mutex); ret = __test_dev_config_update_u8(buf, size, cfg); mutex_unlock(&test_fw_mutex); return ret; } doing the locking and calling the unlocked primitive, which enables both locked and unlocked versions without duplication of code. The similar approach was applied to all functions called from the locked and the unlocked context, which safely mitigates both deadlocks and race conditions in the driver. __test_dev_config_update_bool(), __test_dev_config_update_u8() and __test_dev_config_update_size_t() unlocked versions of the functions were introduced to be called from the locked contexts as a workaround without releasing the main driver's lock and thereof causing a race condition. The test_dev_config_update_bool(), test_dev_config_update_u8() and test_dev_config_update_size_t() locked versions of the functions are being called from driver methods without the unnecessary multiplying of the locking and unlocking code for each method, and complicating the code with saving of the return value across lock. Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf") Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tianfei Zhang <tianfei.zhang@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org # v5.4 Suggested-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21test_firmware: Use kstrtobool() instead of strtobool()Christophe JAILLET1-1/+2
[ Upstream commit f7d85515bd21902b218370a1a6301f76e4e636ff ] strtobool() is the same as kstrtobool(). However, the latter is more used within the kernel. In order to remove strtobool() and slightly simplify kstrtox.h, switch to the other function name. While at it, include the corresponding header file (<linux/kstrtox.h>) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/34f04735d20e0138695dd4070651bd860a36b81c.1673688120.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-14lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release()Ben Hutchings1-1/+1
[ Upstream commit 7c5d4801ecf0564c860033d89726b99723c55146 ] irq_cpu_rmap_release() calls cpu_rmap_put(), which may free the rmap. So we need to clear the pointer to our glue structure in rmap before doing that, not after. Fixes: 4e0473f1060a ("lib: cpu_rmap: Avoid use after free on rmap->obj array entries") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/ZHo0vwquhOy3FaXc@decadent.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09test_firmware: fix the memory leak of the allocated firmware bufferMirsad Goran Todorovac1-1/+18
commit 48e156023059e57a8fc68b498439832f7600ffff upstream. The following kernel memory leak was noticed after running tools/testing/selftests/firmware/fw_run_tests.sh: [root@pc-mtodorov firmware]# cat /sys/kernel/debug/kmemleak . . . unreferenced object 0xffff955389bc3400 (size 1024): comm "test_firmware-0", pid 5451, jiffies 4294944822 (age 65.652s) hex dump (first 32 bytes): 47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00 GH4567.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0 [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240 [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0 [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180 [<ffffffff95fd813b>] kthread+0x10b/0x140 [<ffffffff95e033e9>] ret_from_fork+0x29/0x50 unreferenced object 0xffff9553c334b400 (size 1024): comm "test_firmware-1", pid 5452, jiffies 4294944822 (age 65.652s) hex dump (first 32 bytes): 47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00 GH4567.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0 [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240 [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0 [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180 [<ffffffff95fd813b>] kthread+0x10b/0x140 [<ffffffff95e033e9>] ret_from_fork+0x29/0x50 unreferenced object 0xffff9553c334f000 (size 1024): comm "test_firmware-2", pid 5453, jiffies 4294944822 (age 65.652s) hex dump (first 32 bytes): 47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00 GH4567.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0 [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240 [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0 [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180 [<ffffffff95fd813b>] kthread+0x10b/0x140 [<ffffffff95e033e9>] ret_from_fork+0x29/0x50 unreferenced object 0xffff9553c3348400 (size 1024): comm "test_firmware-3", pid 5454, jiffies 4294944822 (age 65.652s) hex dump (first 32 bytes): 47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00 GH4567.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0 [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240 [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0 [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180 [<ffffffff95fd813b>] kthread+0x10b/0x140 [<ffffffff95e033e9>] ret_from_fork+0x29/0x50 [root@pc-mtodorov firmware]# Note that the size 1024 corresponds to the size of the test firmware buffer. The actual number of the buffers leaked is around 70-110, depending on the test run. The cause of the leak is the following: request_partial_firmware_into_buf() and request_firmware_into_buf() provided firmware buffer isn't released on release_firmware(), we have allocated it and we are responsible for deallocating it manually. This is introduced in a number of context where previously only release_firmware() was called, which was insufficient. Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dan Carpenter <error27@gmail.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Tianfei zhang <tianfei.zhang@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Zhengchao Shao <shaozhengchao@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: Scott Branden <sbranden@broadcom.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org # v5.4 Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Link: https://lore.kernel.org/r/20230509084746.48259-3-mirsad.todorovac@alu.unizg.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-09test_firmware: fix a memory leak with reqs bufferMirsad Goran Todorovac1-0/+10
commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream. Dan Carpenter spotted that test_fw_config->reqs will be leaked if trigger_batched_requests_store() is called two or more times. The same appears with trigger_batched_requests_async_store(). This bug wasn't trigger by the tests, but observed by Dan's visual inspection of the code. The recommended workaround was to return -EBUSY if test_fw_config->reqs is already allocated. Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf") Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Tianfei Zhang <tianfei.zhang@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org # v5.4 Suggested-by: Dan Carpenter <error27@gmail.com> Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg.hr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30debugobjects: Don't wake up kswapd from fill_pool()Tetsuo Handa1-1/+1
commit eb799279fb1f9c63c520fe8c1c41cb9154252db6 upstream. syzbot is reporting a lockdep warning in fill_pool() because the allocation from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) and therefore tries to wake up kswapd, which acquires kswapd_wait::lock. Since fill_pool() might be called with arbitrary locks held, fill_pool() should not assume that acquiring kswapd_wait::lock is safe. Use __GFP_HIGH instead and remove __GFP_NORETRY as it is pointless for !__GFP_DIRECT_RECLAIM allocation. Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects") Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-24maple_tree: make maple state reusable after mas_empty_area()Peng Zhang1-9/+3
commit 0257d9908d38c0b1669af4bb1bc4dbca1f273fe6 upstream. Make mas->min and mas->max point to a node range instead of a leaf entry range. This allows mas to still be usable after mas_empty_area() returns. Users would get unexpected results from other operations on the maple state after calling the affected function. For example, x86 MAP_32BIT mmap() acts as if there is no suitable gap when there should be one. Link: https://lkml.kernel.org/r/20230505145829.74574-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reported-by: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Reported-by: Tad <support@spotco.us> Reported-by: Michael Keyes <mgkeyes@vigovproductions.net> Link: https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/ Link: https://lore.kernel.org/linux-mm/e6108286ac025c268964a7ead3aab9899f9bc6e9.camel@spotco.us/ Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-24lib: cpu_rmap: Avoid use after free on rmap->obj array entriesEli Cohen1-1/+4
[ Upstream commit 4e0473f1060aa49621d40a113afde24818101d37 ] When calling irq_set_affinity_notifier() with NULL at the notify argument, it will cause freeing of the glue pointer in the corresponding array entry but will leave the pointer in the array. A subsequent call to free_irq_cpu_rmap() will try to free this entry again leading to possible use after free. Fix that by setting NULL to the array entry and checking that we have non-zero at the array entry when iterating over the array in free_irq_cpu_rmap(). The current code does not suffer from this since there are no cases where irq_set_affinity_notifier(irq, NULL) (note the NULL passed for the notify arg) is called, followed by a call to free_irq_cpu_rmap() so we don't hit and issue. Subsequent patches in this series excersize this flow, hence the required fix. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24linux/dim: Do nothing if no time delta between samplesRoy Novich3-4/+7
[ Upstream commit 162bd18eb55adf464a0fa2b4144b8d61c75ff7c2 ] Add return value for dim_calc_stats. This is an indication for the caller if curr_stats was assigned by the function. Avoid using curr_stats uninitialized over {rdma/net}_dim, when no time delta between samples. Coverity reported this potential use of an uninitialized variable. Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11debugobject: Ensure pool refill (again)Thomas Gleixner1-6/+15
commit 0af462f19e635ad522f28981238334620881badc upstream. The recent fix to ensure atomicity of lookup and allocation inadvertently broke the pool refill mechanism. Prior to that change debug_objects_activate() and debug_objecs_assert_init() invoked debug_objecs_init() to set up the tracking object for statically initialized objects. That's not longer the case and debug_objecs_init() is now the only place which does pool refills. Depending on the number of statically initialized objects this can be enough to actually deplete the pool, which was observed by Ido via a debugobjects OOM warning. Restore the old behaviour by adding explicit refill opportunities to debug_objects_activate() and debug_objecs_assert_init(). Fixes: 63a759694eed ("debugobject: Prevent init race with static objects") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11debugobject: Prevent init race with static objectsThomas Gleixner1-59/+66
[ Upstream commit 63a759694eed61025713b3e14dd827c8548daadc ] Statically initialized objects are usually not initialized via the init() function of the subsystem. They are special cased and the subsystem provides a function to validate whether an object which is not yet tracked by debugobjects is statically initialized. This means the object is started to be tracked on first use, e.g. activation. This works perfectly fine, unless there are two concurrent operations on that object. Schspa decoded the problem: T0 T1 debug_object_assert_init(addr) lock_hash_bucket() obj = lookup_object(addr); if (!obj) { unlock_hash_bucket(); - > preemption lock_subsytem_object(addr); activate_object(addr) lock_hash_bucket(); obj = lookup_object(addr); if (!obj) { unlock_hash_bucket(); if (is_static_object(addr)) init_and_track(addr); lock_hash_bucket(); obj = lookup_object(addr); obj->state = ACTIVATED; unlock_hash_bucket(); subsys function modifies content of addr, so static object detection does not longer work. unlock_subsytem_object(addr); if (is_static_object(addr)) <- Fails debugobject emits a warning and invokes the fixup function which reinitializes the already active object in the worst case. This race exists forever, but was never observed until mod_timer() got a debug_object_assert_init() added which is outside of the timer base lock held section right at the beginning of the function to cover the lockless early exit points too. Rework the code so that the lookup, the static object check and the tracking object association happens atomically under the hash bucket lock. This prevents the issue completely as all callers are serialized on the hash bucket lock and therefore cannot observe inconsistent state. Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects") Reported-by: syzbot+5093ba19745994288b53@syzkaller.appspotmail.com Debugged-by: Schspa Shi <schspa@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://syzkaller.appspot.com/bug?id=22c8a5938eab640d1c6bcc0e3dc7be519d878462 Link: https://lore.kernel.org/lkml/20230303161906.831686-1-schspa@gmail.com Link: https://lore.kernel.org/r/87zg7dzgao.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11kunit: fix bug in the order of lines in debugfs logsRae Moar2-9/+26
[ Upstream commit f9a301c3317daa921375da0aec82462ddf019928 ] Fix bug in debugfs logs that causes an incorrect order of lines in the debugfs log. Currently, the test counts lines that show the number of tests passed, failed, and skipped, as well as any suite diagnostic lines, appear prior to the individual results, which is a bug. Ensure the order of printing for the debugfs log is correct. Additionally, add a KTAP header to so the debugfs logs can be valid KTAP. This is an example of a log prior to these fixes: KTAP version 1 # Subtest: kunit_status 1..2 # kunit_status: pass:2 fail:0 skip:0 total:2 # Totals: pass:2 fail:0 skip:0 total:2 ok 1 kunit_status_set_failure_test ok 2 kunit_status_mark_skipped_test ok 1 kunit_status Note the two lines with stats are out of order. This is the same debugfs log after the fixes (in combination with the third patch to remove the extra line): KTAP version 1 1..1 KTAP version 1 # Subtest: kunit_status 1..2 ok 1 kunit_status_set_failure_test ok 2 kunit_status_mark_skipped_test # kunit_status: pass:2 fail:0 skip:0 total:2 # Totals: pass:2 fail:0 skip:0 total:2 ok 1 kunit_status Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11kunit: improve KTAP compliance of KUnit test outputRae Moar3-7/+10
[ Upstream commit 6c738b52316c58ae8a87abf0907f87a7b5e7a109 ] Change KUnit test output to better comply with KTAP v1 specifications found here: https://kernel.org/doc/html/latest/dev-tools/ktap.html. 1) Use "KTAP version 1" instead of "TAP version 14" as test output header 2) Remove '-' between test number and test name on test result lines 2) Add KTAP version lines to each subtest header as well Note that the new KUnit output still includes the “# Subtest” line now located after the KTAP version line. This does not completely match the KTAP v1 spec but since it is classified as a diagnostic line, it is not expected to be disruptive or break any existing parsers. This “# Subtest” line comes from the TAP 14 spec (https://testanything.org/tap-version-14-specification.html) and it is used to define the test name before the results. Original output: TAP version 14 1..1 # Subtest: kunit-test-suite 1..3 ok 1 - kunit_test_1 ok 2 - kunit_test_2 ok 3 - kunit_test_3 # kunit-test-suite: pass:3 fail:0 skip:0 total:3 # Totals: pass:3 fail:0 skip:0 total:3 ok 1 - kunit-test-suite New output: KTAP version 1 1..1 KTAP version 1 # Subtest: kunit-test-suite 1..3 ok 1 kunit_test_1 ok 2 kunit_test_2 ok 3 kunit_test_3 # kunit-test-suite: pass:3 fail:0 skip:0 total:3 # Totals: pass:3 fail:0 skip:0 total:3 ok 1 kunit-test-suite Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Stable-dep-of: f9a301c3317d ("kunit: fix bug in the order of lines in debugfs logs") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-26maple_tree: fix a potential memory leak, OOB access, or other unpredictable bugPeng Zhang1-12/+7
commit 1f5f12ece722aacea1769fb644f27790ede339dc upstream. In mas_alloc_nodes(), "node->node_count = 0" means to initialize the node_count field of the new node, but the node may not be a new node. It may be a node that existed before and node_count has a value, setting it to 0 will cause a memory leak. At this time, mas->alloc->total will be greater than the actual number of nodes in the linked list, which may cause many other errors. For example, out-of-bounds access in mas_pop_node(), and mas_pop_node() may return addresses that should not be used. Fix it by initializing node_count only for new nodes. Also, by the way, an if-else statement was removed to simplify the code. Link: https://lkml.kernel.org/r/20230411041005.26205-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-26maple_tree: fix mas_empty_area() searchLiam R. Howlett1-9/+11
commit 06e8fd999334bcd76b4d72d7b9206d4aea89764e upstream. The internal function of mas_awalk() was incorrectly skipping the last entry in a node, which could potentially be NULL. This is only a problem for the left-most node in the tree - otherwise that NULL would not exist. Fix mas_awalk() by using the metadata to obtain the end of the node for the loop and the logical pivot as apposed to the raw pivot value. Link: https://lkml.kernel.org/r/20230414145728.4067069-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-26maple_tree: make maple state reusable after mas_empty_area_rev()Liam R. Howlett1-14/+13
commit fad8e4291da5e3243e086622df63cb952db444d8 upstream. Stop using maple state min/max for the range by passing through pointers for those values. This will allow the maple state to be reused without resetting. Also add some logic to fail out early on searching with invalid arguments. Link: https://lkml.kernel.org/r/20230414145728.4067069-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-20maple_tree: fix write memory barrier of nodes once dead for RCU modeLiam R. Howlett1-2/+5
[ Upstream commit c13af03de46ba27674dd9fb31a17c0d480081139 ] During the development of the maple tree, the strategy of freeing multiple nodes changed and, in the process, the pivots were reused to store pointers to dead nodes. To ensure the readers see accurate pivots, the writers need to mark the nodes as dead and call smp_wmb() to ensure any readers can identify the node as dead before using the pivot values. There were two places where the old method of marking the node as dead without smp_wmb() were being used, which resulted in RCU readers seeing the wrong pivot value before seeing the node was dead. Fix this race condition by using mte_set_node_dead() which has the smp_wmb() call to ensure the race is closed. Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed are marked as dead to ensure there are no other call paths besides the two updated paths. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13maple_tree: add RCU lock checking to rcu callback functionsLiam R. Howlett1-92/+96
commit 790e1fa86b340c2bd4a327e01c161f7a1ad885f6 upstream. Dereferencing RCU objects within the RCU callback without the RCU check has caused lockdep to complain. Fix the RCU dereferencing by using the RCU callback lock to ensure the operation is safe. Also stop creating a new lock to use for dereferencing during destruction of the tree or subtree. Instead, pass through a pointer to the tree that has the lock that is held for RCU dereferencing checking. It also does not make sense to use the maple state in the freeing scenario as the tree walk is a special case where the tree no longer has the normal encodings and parent pointers. Link: https://lkml.kernel.org/r/20230227173632.3292573-8-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Reported-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: add smp_rmb() to dead node detectionLiam R. Howlett1-2/+6
commit 0a2b18d948838e16912b3b627b504ab062b7d02a upstream. Add an smp_rmb() before reading the parent pointer to ensure that anything read from the node prior to the parent pointer hasn't been reordered ahead of this check. The is necessary for RCU mode. Link: https://lkml.kernel.org/r/20230227173632.3292573-7-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: remove extra smp_wmb() from mas_dead_leaves()Liam R. Howlett1-1/+0
commit 8372f4d83f96f35915106093cde4565836587123 upstream. The call to mte_set_dead_node() before the smp_wmb() already calls smp_wmb() so this is not needed. This is an optimization for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-5-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix freeing of nodes in rcu modeLiam R. Howlett1-11/+62
commit 2e5b4921f8efc9e845f4f04741797d16f36847eb upstream. The walk to destroy the nodes was not always setting the node type and would result in a destroy method potentially using the values as nodes. Avoid this by setting the correct node types. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-4-surenb@google.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: detect dead nodes in mas_start()Liam R. Howlett1-0/+4
commit a7b92d59c885018cb7bb88539892278e4fd64b29 upstream. When initially starting a search, the root node may already be in the process of being replaced in RCU mode. Detect and restart the walk if this is the case. This is necessary for RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-3-surenb@google.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: refine ma_state init from mas_start()Liam R. Howlett1-3/+3
commit 46b345848261009477552d654cb2f65000c30e4d upstream. If mas->node is an MAS_START, there are three cases, and they all assign different values to mas->node and mas->offset. So there is no need to set them to a default value before updating. Update them directly to make them easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-7-vernon2gm@gmail.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: be more cautious about dead nodesLiam R. Howlett1-9/+43
commit 39d0bd86c499ecd6abae42a9b7112056c5560691 upstream. ma_pivots() and ma_data_end() may be called with a dead node. Ensure to that the node isn't dead before using the returned values. This is necessary for RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-1-surenb@google.com Link: https://lkml.kernel.org/r/20230227173632.3292573-2-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: <Stable@vger.kernel.org> Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix mas_prev() and mas_find() state handlingLiam R. Howlett1-1/+5
commit 17dc622c7b0f94e49bed030726df4db12ecaa6b5 upstream. When mas_prev() does not find anything, set the state to MAS_NONE. Handle the MAS_NONE in mas_find() like a MAS_START. Link: https://lkml.kernel.org/r/20230120162650.984577-7-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: <syzbot+502859d610c661e56545@syzkaller.appspotmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix handle of invalidated state in mas_wr_store_setup()Liam R. Howlett1-0/+3
commit 1202700c3f8cc5f7e4646c3cf05ee6f7c8bc6ccf upstream. If an invalidated maple state is encountered during write, reset the maple state to MAS_START. This will result in a re-walk of the tree to the correct location for the write. Link: https://lore.kernel.org/all/20230107020126.1627-1-sj@kernel.org/ Link: https://lkml.kernel.org/r/20230120162650.984577-6-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: SeongJae Park <sj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: reduce user error potentialLiam R. Howlett1-0/+10
commit 50e81c82ad947045c7ed26ddc9acb17276b653b6 upstream. When iterating, a user may operate on the tree and cause the maple state to be altered and left in an unintuitive state. Detect this scenario and correct it by setting to the limit and invalidating the state. Link: https://lkml.kernel.org/r/20230120162650.984577-4-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix potential rcu issueLiam R. Howlett1-1/+1
commit 65be6f058b0eba98dc6c6f197ea9f62c9b6a519f upstream. Ensure the node isn't dead after reading the node end. Link: https://lkml.kernel.org/r/20230120162650.984577-3-Liam.Howlett@oracle.com Cc: <Stable@vger.kernel.org> Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()Liam R. Howlett1-37/+43
commit 541e06b772c1aaffb3b6a245ccface36d7107af2 upstream. Preallocations are common in the VMA code to avoid allocating under certain locking conditions. The preallocations must also cover the worst-case scenario. Removing the GFP_ZERO flag from the kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time spent zeroing memory that may not be used. Only zero out the necessary area to keep track of the allocations in the maple state. Zero the entire node prior to using it in the tree. This required internal changes to node counting on allocation, so the test code is also updated. This restores some micro-benchmark performance: up to +9% in mmtests mmap1 by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636 Link: https://lkml.kernel.org/r/20230105160427.2988454-1-Liam.Howlett@oracle.com Cc: stable@vger.kernel.org Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Reported-by: Jirka Hladky <jhladky@redhat.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix a potential concurrency bug in RCU modePeng Zhang1-2/+1
commit c45ea315a602d45569b08b93e9ab30f6a63a38aa upstream. There is a concurrency bug that may cause the wrong value to be loaded when a CPU is modifying the maple tree. CPU1: mtree_insert_range() mas_insert() mas_store_root() ... mas_root_expand() ... rcu_assign_pointer(mas->tree->ma_root, mte_mk_root(mas->node)); ma_set_meta(node, maple_leaf_64, 0, slot); <---IP CPU2: mtree_load() mtree_lookup_walk() ma_data_end(); When CPU1 is about to execute the instruction pointed to by IP, the ma_data_end() executed by CPU2 may return the wrong end position, which will cause the value loaded by mtree_load() to be wrong. An example of triggering the bug: Add mdelay(100) between rcu_assign_pointer() and ma_set_meta() in mas_root_expand(). static DEFINE_MTREE(tree); int work(void *p) { unsigned long val; for (int i = 0 ; i< 30; ++i) { val = (unsigned long)mtree_load(&tree, 8); mdelay(5); pr_info("%lu",val); } return 0; } mt_init_flags(&tree, MT_FLAGS_USE_RCU); mtree_insert(&tree, 0, (void*)12345, GFP_KERNEL); run_thread(work) mtree_insert(&tree, 1, (void*)56789, GFP_KERNEL); In RCU mode, mtree_load() should always return the value before or after the data structure is modified, and in this example mtree_load(&tree, 8) may return 56789 which is not expected, it should always return NULL. Fix it by put ma_set_meta() before rcu_assign_pointer(). Link: https://lkml.kernel.org/r/20230314124203.91572-4-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13maple_tree: fix get wrong data_end in mtree_lookup_walk()Peng Zhang1-10/+5
commit ec07967d7523adb3670f9dfee0232e3bc868f3de upstream. if (likely(offset > end)) max = pivots[offset]; The above code should be changed to if (likely(offset < end)), which is correct. This affects the correctness of ma_data_end(). Now it seems that the final result will not be wrong, but it is best to change it. This patch does not change the code as above, because it simplifies the code by the way. Link: https://lkml.kernel.org/r/20230314124203.91572-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20230314124203.91572-2-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-06zstd: Fix definition of assert()Jonathan Neuschäfer1-1/+1
[ Upstream commit 6906598f1ce93761716d780b6e3f171e13f0f4ce ] assert(x) should emit a warning if x is false. WARN_ON(x) emits a warning if x is true. Thus, assert(x) should be defined as WARN_ON(!x) rather than WARN_ON(x). Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>