summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
7 daysLinux 4.19.315v4.19.315linux-4.19.yGreg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20240523130325.727602650@linuxfoundation.org Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysdocs: kernel_include.py: Cope with docutils 0.21Akira Yokosawa1-1/+0
commit d43ddd5c91802a46354fa4c4381416ef760676e2 upstream. Running "make htmldocs" on a newly installed Sphinx 7.3.7 ends up in a build error: Sphinx parallel build error: AttributeError: module 'docutils.nodes' has no attribute 'reprunicode' docutils 0.21 has removed nodes.reprunicode, quote from release note [1]: * Removed objects: docutils.nodes.reprunicode, docutils.nodes.ensure_str() Python 2 compatibility hacks Sphinx 7.3.0 supports docutils 0.21 [2]: kernel_include.py, whose origin is misc.py of docutils, uses reprunicode. Upstream docutils removed the offending line from the corresponding file (docutils/docutils/parsers/rst/directives/misc.py) in January 2022. Quoting the changelog [3]: Deprecate `nodes.reprunicode` and `nodes.ensure_str()`. Drop uses of the deprecated constructs (not required with Python 3). Do the same for kernel_include.py. Tested against: - Sphinx 2.4.5 (docutils 0.17.1) - Sphinx 3.4.3 (docutils 0.17.1) - Sphinx 5.3.0 (docutils 0.18.1) - Sphinx 6.2.1 (docutils 0.19) - Sphinx 7.2.6 (docutils 0.20.1) - Sphinx 7.3.7 (docutils 0.21.2) Link: http://www.docutils.org/RELEASE-NOTES.html#release-0-21-2024-04-09 [1] Link: https://www.sphinx-doc.org/en/master/changes.html#release-7-3-0-released-apr-16-2024 [2] Link: https://github.com/docutils/docutils/commit/c8471ce47a24 [3] Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/faf5fa45-2a9d-4573-9d2e-3930bdc1ed65@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysserial: kgdboc: Fix NMI-safety problems from keyboard reset codeDaniel Thompson1-1/+29
commit b2aba15ad6f908d1a620fd97f6af5620c3639742 upstream. Currently, when kdb is compiled with keyboard support, then we will use schedule_work() to provoke reset of the keyboard status. Unfortunately schedule_work() gets called from the kgdboc post-debug-exception handler. That risks deadlock since schedule_work() is not NMI-safe and, even on platforms where the NMI is not directly used for debugging, the debug trap can have NMI-like behaviour depending on where breakpoints are placed. Fix this by using the irq work system, which is NMI-safe, to defer the call to schedule_work() to a point when it is safe to call. Reported-by: Liuye <liu.yeC@h3c.com> Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/ Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240424-kgdboc_fix_schedule_work-v2-1-50f5a490aec5@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Remove unnecessary var_ref destroy in track_data_destroy()Tom Zanussi1-1/+0
commit ff9d31d0d46672e201fc9ff59c42f1eef5f00c77 upstream. Commit 656fe2ba85e8 (tracing: Use hist trigger's var_ref array to destroy var_refs) centralized the destruction of all the var_refs in one place so that other code didn't have to do it. The track_data_destroy() added later ignored that and also destroyed the track_data var_ref, causing a double-free error flagged by KASAN. ================================================================== BUG: KASAN: use-after-free in destroy_hist_field+0x30/0x70 Read of size 8 at addr ffff888086df2210 by task bash/1694 CPU: 6 PID: 1694 Comm: bash Not tainted 5.1.0-rc1-test+ #15 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 Call Trace: dump_stack+0x71/0xa0 ? destroy_hist_field+0x30/0x70 print_address_description.cold.3+0x9/0x1fb ? destroy_hist_field+0x30/0x70 ? destroy_hist_field+0x30/0x70 kasan_report.cold.4+0x1a/0x33 ? __kasan_slab_free+0x100/0x150 ? destroy_hist_field+0x30/0x70 destroy_hist_field+0x30/0x70 track_data_destroy+0x55/0xe0 destroy_hist_data+0x1f0/0x350 hist_unreg_all+0x203/0x220 event_trigger_open+0xbb/0x130 do_dentry_open+0x296/0x700 ? stacktrace_count_trigger+0x30/0x30 ? generic_permission+0x56/0x200 ? __x64_sys_fchdir+0xd0/0xd0 ? inode_permission+0x55/0x200 ? security_inode_permission+0x18/0x60 path_openat+0x633/0x22b0 ? path_lookupat.isra.50+0x420/0x420 ? __kasan_kmalloc.constprop.12+0xc1/0xd0 ? kmem_cache_alloc+0xe5/0x260 ? getname_flags+0x6c/0x2a0 ? do_sys_open+0x149/0x2b0 ? do_syscall_64+0x73/0x1b0 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 ? _raw_write_lock_bh+0xe0/0xe0 ? __kernel_text_address+0xe/0x30 ? unwind_get_return_address+0x2f/0x50 ? __list_add_valid+0x2d/0x70 ? deactivate_slab.isra.62+0x1f4/0x5a0 ? getname_flags+0x6c/0x2a0 ? set_track+0x76/0x120 do_filp_open+0x11a/0x1a0 ? may_open_dev+0x50/0x50 ? _raw_spin_lock+0x7a/0xd0 ? _raw_write_lock_bh+0xe0/0xe0 ? __alloc_fd+0x10f/0x200 do_sys_open+0x1db/0x2b0 ? filp_open+0x50/0x50 do_syscall_64+0x73/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fa7b24a4ca2 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 85 7a 0d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fffbafb3af0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 000055d3648ade30 RCX: 00007fa7b24a4ca2 RDX: 0000000000000241 RSI: 000055d364a55240 RDI: 00000000ffffff9c RBP: 00007fffbafb3bf0 R08: 0000000000000020 R09: 0000000000000002 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 000055d364a55240 ================================================================== So remove the track_data_destroy() destroy_hist_field() call for that var_ref. Link: http://lkml.kernel.org/r/1deffec420f6a16d11dd8647318d34a66d1989a9.camel@linux.intel.com Fixes: 466f4528fbc69 ("tracing: Generalize hist trigger onmax and save action") Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Generalize hist trigger onmax and save actionTom Zanussi1-76/+160
commit 466f4528fbc692ea56deca278fa6aeb79e6e8b21 upstream. The action refactor code allowed actions and handlers to be separated, but the existing onmax handler and save action code is still not flexible enough to handle arbitrary coupling. This change generalizes them and in the process makes additional handlers and actions easier to implement. The onmax action can be broken up and thought of as two separate components - a variable to be tracked (the parameter given to the onmax($var_to_track) function) and an invisible variable created to save the ongoing result of doing something with that variable, such as saving the max value of that variable so far seen. Separating it out like this and renaming it appropriately allows us to use the same code for similar tracking functions such as onchange($var_to_track), which would just track the last value seen rather than the max seen so far, which is useful in some situations. Additionally, because different handlers and actions may want to save and access data differently e.g. save and retrieve tracking values as local variables vs something more global, save_val() and get_val() interface functions are introduced and max-specific implementations are used instead. The same goes for the code that checks whether a maximum has been hit - a generic check_val() interface and max-checking implementation is used instead, which allows future patches to make use of he same code using their own implemetations of similar functionality. Link: http://lkml.kernel.org/r/980ea73dd8e3f36db3d646f99652f8fed42b77d4.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Split up onmatch action dataTom Zanussi2-44/+63
commit c3e49506a0f426a850675e39419879214060ca8b upstream. Currently, the onmatch action data binds the onmatch action to data related to synthetic event generation. Since we want to allow the onmatch handler to potentially invoke a different action, and because we expect other handlers to generate synthetic events, we need to separate the data related to these two functions. Also rename the onmatch data to something more descriptive, and create and use common action data destroy function. Link: http://lkml.kernel.org/r/b9abbf9aae69fe3920cdc8ddbcaad544dd258d78.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Refactor hist trigger action codeTom Zanussi1-169/+238
commit 7d18a10c316783357fb1b2b649cfcf97c70a7bee upstream. The hist trigger action code currently implements two essentially hard-coded pairs of 'actions' - onmax(), which tracks a variable and saves some event fields when a max is hit, and onmatch(), which is hard-coded to generate a synthetic event. These hardcoded pairs (track max/save fields and detect match/generate synthetic event) should really be decoupled into separate components that can then be arbitrarily combined. The first component of each pair (track max/detect match) is called a 'handler' in the new code, while the second component (save fields/generate synthetic event) is called an 'action' in this scheme. This change refactors the action code to reflect this split by adding two handlers, HANDLER_ONMATCH and HANDLER_ONMAX, along with two actions, ACTION_SAVE and ACTION_TRACE. The new code combines them to produce the existing ONMATCH/TRACE and ONMAX/SAVE functionality, but doesn't implement the other combinations now possible. Future patches will expand these to further useful cases, such as ONMAX/TRACE, as well as add additional handlers and actions such as ONCHANGE and SNAPSHOT. Also, add abbreviated documentation for handlers and actions to README. Link: http://lkml.kernel.org/r/98bfdd48c1b4ff29fc5766442f99f5bc3c34b76b.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Have the historgram use the result of str_has_prefix() for len of ↵Steven Rostedt (VMware)1-4/+5
prefix commit 036876fa56204ae0fa59045bd6bbb2691a060633 upstream. As str_has_prefix() returns the length on match, we can use that for the updating of the string pointer instead of recalculating the prefix size. Cc: Tom Zanussi <zanussi@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Use str_has_prefix() instead of using fixed sizesSteven Rostedt (VMware)5-5/+5
commit b6b2735514bcd70ad1556a33892a636b20ece671 upstream. There are several instances of strncmp(str, "const", 123), where 123 is the strlen of the const string to check if "const" is the prefix of str. But this can be error prone. Use str_has_prefix() instead. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Use str_has_prefix() helper for histogram codeSteven Rostedt (VMware)1-16/+16
commit 754481e6954cbef53f8bc4412ad48dde611e21d3 upstream. The tracing histogram code contains a lot of instances of the construct: strncmp(str, "const", sizeof("const") - 1) This can be prone to bugs due to typos or bad cut and paste. Use the str_has_prefix() helper macro instead that removes the need for having two copies of the constant string. Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysstring.h: Add str_has_prefix() helper functionSteven Rostedt (VMware)1-0/+20
commit 72921427d46bf9731a1ab7864adc64c43dfae29f upstream. A discussion came up in the trace triggers thread about converting a bunch of: strncmp(str, "const", sizeof("const") - 1) use cases into a helper macro. It started with: strncmp(str, const, sizeof(const) - 1) But then Joe Perches mentioned that if a const is not used, the sizeof() will be the size of a pointer, which can be bad. And that gcc will optimize strlen("const") into "sizeof("const") - 1". Thinking about this more, a quick grep in the kernel tree found several (thousands!) of cases that use this construct. A quick grep also revealed that there's probably several bugs in that use case. Some are that people forgot the "- 1" (which I found) and others could be that the constant for the sizeof is different than the constant (although, I haven't found any of those, but I also didn't look hard). I figured the best thing to do is to create a helper macro and place it into include/linux/string.h. And go around and fix all the open coded versions of it later. Note, gcc appears to optimize this when we make it into an always_inline static function, which removes a lot of issues that a macro produces. Link: http://lkml.kernel.org/r/e3e754f2bd18e56eaa8baf79bee619316ebf4cfc.1545161087.git.tom.zanussi@linux.intel.com Link: http://lkml.kernel.org/r/20181219211615.2298e781@gandalf.local.home Link: http://lkml.kernel.org/r/CAHk-=wg_sR-UEC1ggmkZpypOUYanL5CMX4R7ceuaV4QMf5jBtg@mail.gmail.com Cc: Tom Zanussi <zanussi@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Suggestions-by: Linus Torvalds <torvalds@linux-foundation.org> Suggestions-by: Joe Perches <joe@perches.com> Suggestions-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Consolidate trace_add/remove_event_call back to the nolock functionsSteven Rostedt (VMware)3-31/+7
commit 7e1413edd6194a9807aa5f3ac0378b9b4b9da879 upstream. The trace_add/remove_event_call_nolock() functions were added to allow the tace_add/remove_event_call() code be called when the event_mutex lock was already taken. Now that all callers are done within the event_mutex, there's no reason to have two different interfaces. Remove the current wrapper trace_add/remove_event_call()s and rename the _nolock versions back to the original names. Link: http://lkml.kernel.org/r/154140866955.17322.2081425494660638846.stgit@devbox Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Remove unneeded synth_event_mutexMasami Hiramatsu1-23/+7
commit 0e2b81f7b52a1c1a8c46986f9ca01eb7b3c421f8 upstream. Rmove unneeded synth_event_mutex. This mutex protects the reference count in synth_event, however, those operational points are already protected by event_mutex. 1. In __create_synth_event() and create_or_delete_synth_event(), those synth_event_mutex clearly obtained right after event_mutex. 2. event_hist_trigger_func() is trigger_hist_cmd.func() which is called by trigger_process_regex(), which is a part of event_trigger_regex_write() and this function takes event_mutex. 3. hist_unreg_all() is trigger_hist_cmd.unreg_all() which is called by event_trigger_regex_open() and it takes event_mutex. 4. onmatch_destroy() and onmatch_create() have long call tree, but both are finally invoked from event_trigger_regex_write() and event_trace_del_tracer(), former takes event_mutex, and latter ensures called under event_mutex locked. Finally, I ensured there is no resource conflict. For safety, I added lockdep_assert_held(&event_mutex) for each function. Link: http://lkml.kernel.org/r/154140864134.17322.4796059721306031894.stgit@devbox Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Use dyn_event framework for synthetic eventsMasami Hiramatsu3-98/+176
commit 7bbab38d07f3185fddf6fce126e2239010efdfce upstream. Use dyn_event framework for synthetic events. This shows synthetic events on "tracing/dynamic_events" file in addition to tracing/synthetic_events interface. User can also define new events via tracing/dynamic_events with "s:" prefix. So, the new syntax is below; s:[synthetic/]EVENT_NAME TYPE ARG; [TYPE ARG;]... To remove events via tracing/dynamic_events, you can use "-:" prefix as same as other events. Link: http://lkml.kernel.org/r/154140861301.17322.15454611233735614508.stgit@devbox Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Add unified dynamic event frameworkMasami Hiramatsu5-0/+337
commit 5448d44c38557fc15d1c53b608a9c9f0e1ca8f86 upstream. Add unified dynamic event framework for ftrace kprobes, uprobes and synthetic events. Those dynamic events can be co-exist on same file because those syntax doesn't overlap. This introduces a framework part which provides a unified tracefs interface and operations. Link: http://lkml.kernel.org/r/154140852824.17322.12250362185969352095.stgit@devbox Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daystracing: Simplify creation and deletion of synthetic eventsMasami Hiramatsu1-35/+18
commit faacb361f271be4baf2d807e2eeaba87e059225f upstream. Since the event_mutex and synth_event_mutex ordering issue is gone, we can skip existing event check when adding or deleting events, and some redundant code in error path. This changes release_all_synth_events() to abort the process when it hits any error and returns the error code. It succeeds only if it has no error. Link: http://lkml.kernel.org/r/154140847194.17322.17960275728005067803.stgit@devbox Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysbtrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()Dominique Martinet1-0/+1
commit 9af503d91298c3f2945e73703f0e00995be08c30 upstream. The previous patch that replaced BUG_ON by error handling forgot to unlock the mutex in the error path. Link: https://lore.kernel.org/all/Zh%2fHpAGFqa7YAFuM@duo.ucw.cz Reported-by: Pavel Machek <pavel@denx.de> Fixes: 7411055db5ce ("btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()") CC: stable@vger.kernel.org Reviewed-by: Pavel Machek <pavel@denx.de> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysdm: limit the number of targets and parameter size areaMikulas Patocka3-3/+11
commit bd504bcfec41a503b32054da5472904b404341a4 upstream. The kvmalloc function fails with a warning if the size is larger than INT_MAX. The warning was triggered by a syscall testing robot. In order to avoid the warning, this commit limits the number of targets to 1048576 and the size of the parameter area to 1073741824. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> [srish: Apply to stable branch linux-4.19.y] Signed-off-by: Srish Srinivasan <srish.srinivasan@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysRevert "selftests: mm: fix map_hugetlb failure on 64K page size systems"Harshit Mogalapalli1-7/+0
This reverts commit abdbd5f3e8c504d864fdc032dd5a4eb481cb12bf which is commit 91b80cc5b39f00399e8e2d17527cad2c7fa535e2 upstream. map_hugetlb.c:18:10: fatal error: vm_util.h: No such file or directory 18 | #include "vm_util.h" | ^~~~~~~~~~~ compilation terminated. vm_util.h is not present in 4.19.y, as commit:642bc52aed9c ("selftests: vm: bring common functions to a new file") is not present in stable kernels <=6.1.y Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17Linux 4.19.314v4.19.314Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20240514100948.010148088@linuxfoundation.org Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().Kuniyuki Iwashima2-1/+4
commit 1971d13ffa84a551d29a81fdf5b5ec5be166ac83 upstream. syzbot reported a lockdep splat regarding unix_gc_lock and unix_state_lock(). One is called from recvmsg() for a connected socket, and another is called from GC for TCP_LISTEN socket. So, the splat is false-positive. Let's add a dedicated lock class for the latter to suppress the splat. Note that this change is not necessary for net-next.git as the issue is only applied to the old GC impl. [0]: WARNING: possible circular locking dependency detected 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Not tainted ----------------------------------------------------- kworker/u8:1/11 is trying to acquire lock: ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 but task is already holding lock: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (unix_gc_lock){+.+.}-{2:2}: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] unix_notinflight+0x13d/0x390 net/unix/garbage.c:140 unix_detach_fds net/unix/af_unix.c:1819 [inline] unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876 skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188 skb_release_all net/core/skbuff.c:1200 [inline] __kfree_skb net/core/skbuff.c:1216 [inline] kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252 kfree_skb include/linux/skbuff.h:1262 [inline] manage_oob net/unix/af_unix.c:2672 [inline] unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749 unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981 do_splice_read fs/splice.c:985 [inline] splice_file_to_pipe+0x299/0x500 fs/splice.c:1295 do_splice+0xf2d/0x1880 fs/splice.c:1379 __do_splice fs/splice.c:1436 [inline] __do_sys_splice fs/splice.c:1652 [inline] __se_sys_splice+0x331/0x4a0 fs/splice.c:1634 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&u->lock){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(unix_gc_lock); lock(&u->lock); lock(unix_gc_lock); lock(&u->lock); *** DEADLOCK *** 3 locks held by kworker/u8:1/11: #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 stack backtrace: CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: events_unbound __unix_gc Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()") Reported-and-tested-by: syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17net: fix out-of-bounds access in ops_initThadeu Lima de Souza Cascardo1-3/+10
commit a26ff37e624d12e28077e5b24d2b264f62764ad6 upstream. net_alloc_generic is called by net_alloc, which is called without any locking. It reads max_gen_ptrs, which is changed under pernet_ops_rwsem. It is read twice, first to allocate an array, then to set s.len, which is later used to limit the bounds of the array access. It is possible that the array is allocated and another thread is registering a new pernet ops, increments max_gen_ptrs, which is then used to set s.len with a larger than allocated length for the variable array. Fix it by reading max_gen_ptrs only once in net_alloc_generic. If max_gen_ptrs is later incremented, it will be caught in net_assign_generic. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Fixes: 073862ba5d24 ("netns: fix net_alloc_generic()") Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240502132006.3430840-1-cascardo@igalia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17drm/vmwgfx: Fix invalid reads in fence signaled eventsZack Rusin1-1/+1
commit a37ef7613c00f2d72c8fc08bd83fb6cc76926c8c upstream. Correctly set the length of the drm_event to the size of the structure that's actually used. The length of the drm_event was set to the parent structure instead of to the drm_vmw_event_fence which is supposed to be read. drm_read uses the length parameter to copy the event to the user space thus resuling in oob reads. Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Fixes: 8b7de6aa8468 ("vmwgfx: Rework fence event action") Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23566 Cc: David Airlie <airlied@gmail.com> CC: Daniel Vetter <daniel@ffwll.ch> Cc: Zack Rusin <zack.rusin@broadcom.com> Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # v3.4+ Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425192748.1761522-1-zack.rusin@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17dyndbg: fix old BUG_ON in >control parserJim Cromie1-1/+5
commit 00e7d3bea2ce7dac7bee1cf501fb071fd0ea8f6c upstream. Fix a BUG_ON from 2009. Even if it looks "unreachable" (I didn't really look), lets make sure by removing it, doing pr_err and return -EINVAL instead. Cc: stable <stable@kernel.org> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20240429193145.66543-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17tipc: fix UAF in error pathPaolo Abeni1-1/+5
commit 080cbb890286cd794f1ee788bbc5463e2deb7c2b upstream. Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported a UAF in the tipc_buf_append() error path: BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183 Read of size 8 at addr ffff88804d2a7c80 by task poc/8034 CPU: 1 PID: 8034 Comm: poc Not tainted 6.8.2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 Call Trace: <IRQ> __dump_stack linux/lib/dump_stack.c:88 dump_stack_lvl+0xd9/0x1b0 linux/lib/dump_stack.c:106 print_address_description linux/mm/kasan/report.c:377 print_report+0xc4/0x620 linux/mm/kasan/report.c:488 kasan_report+0xda/0x110 linux/mm/kasan/report.c:601 kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183 skb_release_data+0x5af/0x880 linux/net/core/skbuff.c:1026 skb_release_all linux/net/core/skbuff.c:1094 __kfree_skb linux/net/core/skbuff.c:1108 kfree_skb_reason+0x12d/0x210 linux/net/core/skbuff.c:1144 kfree_skb linux/./include/linux/skbuff.h:1244 tipc_buf_append+0x425/0xb50 linux/net/tipc/msg.c:186 tipc_link_input+0x224/0x7c0 linux/net/tipc/link.c:1324 tipc_link_rcv+0x76e/0x2d70 linux/net/tipc/link.c:1824 tipc_rcv+0x45f/0x10f0 linux/net/tipc/node.c:2159 tipc_udp_recv+0x73b/0x8f0 linux/net/tipc/udp_media.c:390 udp_queue_rcv_one_skb+0xad2/0x1850 linux/net/ipv4/udp.c:2108 udp_queue_rcv_skb+0x131/0xb00 linux/net/ipv4/udp.c:2186 udp_unicast_rcv_skb+0x165/0x3b0 linux/net/ipv4/udp.c:2346 __udp4_lib_rcv+0x2594/0x3400 linux/net/ipv4/udp.c:2422 ip_protocol_deliver_rcu+0x30c/0x4e0 linux/net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2e4/0x520 linux/net/ipv4/ip_input.c:233 NF_HOOK linux/./include/linux/netfilter.h:314 NF_HOOK linux/./include/linux/netfilter.h:308 ip_local_deliver+0x18e/0x1f0 linux/net/ipv4/ip_input.c:254 dst_input linux/./include/net/dst.h:461 ip_rcv_finish linux/net/ipv4/ip_input.c:449 NF_HOOK linux/./include/linux/netfilter.h:314 NF_HOOK linux/./include/linux/netfilter.h:308 ip_rcv+0x2c5/0x5d0 linux/net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0x199/0x1e0 linux/net/core/dev.c:5534 __netif_receive_skb+0x1f/0x1c0 linux/net/core/dev.c:5648 process_backlog+0x101/0x6b0 linux/net/core/dev.c:5976 __napi_poll.constprop.0+0xba/0x550 linux/net/core/dev.c:6576 napi_poll linux/net/core/dev.c:6645 net_rx_action+0x95a/0xe90 linux/net/core/dev.c:6781 __do_softirq+0x21f/0x8e7 linux/kernel/softirq.c:553 do_softirq linux/kernel/softirq.c:454 do_softirq+0xb2/0xf0 linux/kernel/softirq.c:441 </IRQ> <TASK> __local_bh_enable_ip+0x100/0x120 linux/kernel/softirq.c:381 local_bh_enable linux/./include/linux/bottom_half.h:33 rcu_read_unlock_bh linux/./include/linux/rcupdate.h:851 __dev_queue_xmit+0x871/0x3ee0 linux/net/core/dev.c:4378 dev_queue_xmit linux/./include/linux/netdevice.h:3169 neigh_hh_output linux/./include/net/neighbour.h:526 neigh_output linux/./include/net/neighbour.h:540 ip_finish_output2+0x169f/0x2550 linux/net/ipv4/ip_output.c:235 __ip_finish_output linux/net/ipv4/ip_output.c:313 __ip_finish_output+0x49e/0x950 linux/net/ipv4/ip_output.c:295 ip_finish_output+0x31/0x310 linux/net/ipv4/ip_output.c:323 NF_HOOK_COND linux/./include/linux/netfilter.h:303 ip_output+0x13b/0x2a0 linux/net/ipv4/ip_output.c:433 dst_output linux/./include/net/dst.h:451 ip_local_out linux/net/ipv4/ip_output.c:129 ip_send_skb+0x3e5/0x560 linux/net/ipv4/ip_output.c:1492 udp_send_skb+0x73f/0x1530 linux/net/ipv4/udp.c:963 udp_sendmsg+0x1a36/0x2b40 linux/net/ipv4/udp.c:1250 inet_sendmsg+0x105/0x140 linux/net/ipv4/af_inet.c:850 sock_sendmsg_nosec linux/net/socket.c:730 __sock_sendmsg linux/net/socket.c:745 __sys_sendto+0x42c/0x4e0 linux/net/socket.c:2191 __do_sys_sendto linux/net/socket.c:2203 __se_sys_sendto linux/net/socket.c:2199 __x64_sys_sendto+0xe0/0x1c0 linux/net/socket.c:2199 do_syscall_x64 linux/arch/x86/entry/common.c:52 do_syscall_64+0xd8/0x270 linux/arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6f/0x77 linux/arch/x86/entry/entry_64.S:120 RIP: 0033:0x7f3434974f29 Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007fff9154f2b8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3434974f29 RDX: 00000000000032c8 RSI: 00007fff9154f300 RDI: 0000000000000003 RBP: 00007fff915532e0 R08: 00007fff91553360 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000212 R12: 000055ed86d261d0 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> In the critical scenario, either the relevant skb is freed or its ownership is transferred into a frag_lists. In both cases, the cleanup code must not free it again: we need to clear the skb reference earlier. Fixes: 1149557d64c9 ("tipc: eliminate unnecessary linearization of incoming buffers") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23852 Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/752f1ccf762223d109845365d07f55414058e5a3.1714484273.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17usb: gadget: f_fs: Fix a race condition when processing setup packets.Chris Wulff1-1/+1
commit 0aea736ddb877b93f6d2dd8cf439840d6b4970a9 upstream. If the USB driver passes a pointer into the TRB buffer for creq, this buffer can be overwritten with the status response as soon as the event is queued. This can make the final check return USB_GADGET_DELAYED_STATUS when it shouldn't. Instead use the stored wLength. Fixes: 4d644abf2569 ("usb: gadget: f_fs: Only return delayed status when len is 0") Cc: stable <stable@kernel.org> Signed-off-by: Chris Wulff <chris.wulff@biamp.com> Link: https://lore.kernel.org/r/CO1PR17MB5419BD664264A558B2395E28E1112@CO1PR17MB5419.namprd17.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17usb: gadget: composite: fix OS descriptors w_value logicPeter Korsgaard1-3/+3
commit ec6ce7075ef879b91a8710829016005dc8170f17 upstream. The OS descriptors logic had the high/low byte of w_value inverted, causing the extended properties to not be accessible for interface != 0. >From the Microsoft documentation: https://learn.microsoft.com/en-us/windows-hardware/drivers/usbcon/microsoft-os-1-0-descriptors-specification OS_Desc_CompatID.doc (w_index = 0x4): - wValue: High Byte = InterfaceNumber. InterfaceNumber is set to the number of the interface or function that is associated with the descriptor, typically 0x00. Because a device can have only one extended compat ID descriptor, it should ignore InterfaceNumber, regardless of the value, and simply return the descriptor. Low Byte = 0. PageNumber is used to retrieve descriptors that are larger than 64 KB. The header section is 16 bytes, so PageNumber is set to 0 for this request. We currently do not support >64KB compat ID descriptors, so verify that the low byte is 0. OS_Desc_Ext_Prop.doc (w_index = 0x5): - wValue: High byte = InterfaceNumber. The high byte of wValue is set to the number of the interface or function that is associated with the descriptor. Low byte = PageNumber. The low byte of wValue is used to retrieve descriptors that are larger than 64 KB. The header section is 10 bytes, so PageNumber is set to 0 for this request. We also don't support >64KB extended properties, so verify that the low byte is 0 and use the high byte for the interface number. Fixes: 37a3a533429e ("usb: gadget: OS Feature Descriptors support") Cc: stable <stable@kernel.org> Signed-off-by: Peter Korsgaard <peter@korsgaard.com> Link: https://lore.kernel.org/r/20240404100635.3215340-1-peter@korsgaard.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17firewire: nosy: ensure user_length is taken into account when fetching ↵Thanassis Avgerinos1-2/+4
packet contents commit 38762a0763c10c24a4915feee722d7aa6e73eb98 upstream. Ensure that packet_buffer_get respects the user_length provided. If the length of the head packet exceeds the user_length, packet_buffer_get will now return 0 to signify to the user that no data were read and a larger buffer size is required. Helps prevent user space overflows. Signed-off-by: Thanassis Avgerinos <thanassis.avgerinos@gmail.com> Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17af_unix: Fix garbage collector racing against connect()Michal Luczaj1-1/+17
[ Upstream commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51 ] Garbage collector does not take into account the risk of embryo getting enqueued during the garbage collection. If such embryo has a peer that carries SCM_RIGHTS, two consecutive passes of scan_children() may see a different set of children. Leading to an incorrectly elevated inflight count, and then a dangling pointer within the gc_inflight_list. sockets are AF_UNIX/SOCK_STREAM S is an unconnected socket L is a listening in-flight socket bound to addr, not in fdtable V's fd will be passed via sendmsg(), gets inflight count bumped connect(S, addr) sendmsg(S, [V]); close(V) __unix_gc() ---------------- ------------------------- ----------- NS = unix_create1() skb1 = sock_wmalloc(NS) L = unix_find_other(addr) unix_state_lock(L) unix_peer(S) = NS // V count=1 inflight=0 NS = unix_peer(S) skb2 = sock_alloc() skb_queue_tail(NS, skb2[V]) // V became in-flight // V count=2 inflight=1 close(V) // V count=1 inflight=1 // GC candidate condition met for u in gc_inflight_list: if (total_refs == inflight_refs) add u to gc_candidates // gc_candidates={L, V} for u in gc_candidates: scan_children(u, dec_inflight) // embryo (skb1) was not // reachable from L yet, so V's // inflight remains unchanged __skb_queue_tail(L, skb1) unix_state_unlock(L) for u in gc_candidates: if (u.inflight) scan_children(u, inc_inflight_move_tail) // V count=1 inflight=2 (!) If there is a GC-candidate listening socket, lock/unlock its state. This makes GC wait until the end of any ongoing connect() to that socket. After flipping the lock, a possibly SCM-laden embryo is already enqueued. And if there is another embryo coming, it can not possibly carry SCM_RIGHTS. At this point, unix_inflight() can not happen because unix_gc_lock is already taken. Inflight graph remains unaffected. Fixes: 1fd05ba5a2f2 ("[AF_UNIX]: Rewrite garbage collector, fixes race.") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17af_unix: Do not use atomic ops for unix_sk(sk)->inflight.Kuniyuki Iwashima4-15/+16
[ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ] When touching unix_sk(sk)->inflight, we are always under spin_lock(&unix_gc_lock). Let's convert unix_sk(sk)->inflight to the normal unsigned long. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()Eric Dumazet1-1/+5
[ Upstream commit d101291b2681e5ab938554e3e323f7a7ee33e3aa ] syzbot is able to trigger the following crash [1], caused by unsafe ip6_dst_idev() use. Indeed ip6_dst_idev() can return NULL, and must always be checked. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline] RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267 Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700 RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760 RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000 R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00 FS: 00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317 fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108 ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline] ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649 ip6_route_output include/net/ip6_route.h:93 [inline] ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120 ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250 sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326 sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455 sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662 sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099 __sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197 sctp_connect net/sctp/socket.c:4819 [inline] sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834 __sys_connect_file net/socket.c:2048 [inline] __sys_connect+0x2df/0x310 net/socket.c:2065 __do_sys_connect net/socket.c:2075 [inline] __se_sys_connect net/socket.c:2072 [inline] __x64_sys_connect+0x7a/0x90 net/socket.c:2072 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 5e5f3f0f8013 ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240507163145.835254-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17net: bridge: fix corrupted ethernet header on multicast-to-unicastFelix Fietkau1-2/+7
[ Upstream commit 86b29d830ad69eecff25b22dc96c14c6573718e6 ] The change from skb_copy to pskb_copy unfortunately changed the data copying to omit the ethernet header, since it was pulled before reaching this point. Fix this by calling __skb_push/pull around pskb_copy. Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO") Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17phonet: fix rtm_phonet_notify() skb allocationEric Dumazet1-1/+1
[ Upstream commit d8cac8568618dcb8a51af3db1103e8d4cc4aeea7 ] fill_route() stores three components in the skb: - struct rtmsg - RTA_DST (u8) - RTA_OIF (u32) Therefore, rtm_phonet_notify() should use NLMSG_ALIGN(sizeof(struct rtmsg)) + nla_total_size(1) + nla_total_size(4) Fixes: f062f41d0657 ("Phonet: routing table Netlink interface") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Rémi Denis-Courmont <courmisch@gmail.com> Link: https://lore.kernel.org/r/20240502161700.1804476-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17rtnetlink: Correct nested IFLA_VF_VLAN_LIST attribute validationRoded Zats1-1/+1
[ Upstream commit 1aec77b2bb2ed1db0f5efc61c4c1ca3813307489 ] Each attribute inside a nested IFLA_VF_VLAN_LIST is assumed to be a struct ifla_vf_vlan_info so the size of such attribute needs to be at least of sizeof(struct ifla_vf_vlan_info) which is 14 bytes. The current size validation in do_setvfinfo is against NLA_HDRLEN (4 bytes) which is less than sizeof(struct ifla_vf_vlan_info) so this validation is not enough and a too small attribute might be cast to a struct ifla_vf_vlan_info, this might result in an out of bands read access when accessing the saved (casted) entry in ivvl. Fixes: 79aab093a0b5 ("net: Update API for VF vlan protocol 802.1ad support") Signed-off-by: Roded Zats <rzats@paloaltonetworks.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240502155751.75705-1-rzats@paloaltonetworks.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeoutDuoming Zhou1-0/+3
[ Upstream commit adf0398cee86643b8eacde95f17d073d022f782c ] There is a race condition between l2cap_chan_timeout() and l2cap_chan_del(). When we use l2cap_chan_del() to delete the channel, the chan->conn will be set to null. But the conn could be dereferenced again in the mutex_lock() of l2cap_chan_timeout(). As a result the null pointer dereference bug will happen. The KASAN report triggered by POC is shown below: [ 472.074580] ================================================================== [ 472.075284] BUG: KASAN: null-ptr-deref in mutex_lock+0x68/0xc0 [ 472.075308] Write of size 8 at addr 0000000000000158 by task kworker/0:0/7 [ 472.075308] [ 472.075308] CPU: 0 PID: 7 Comm: kworker/0:0 Not tainted 6.9.0-rc5-00356-g78c0094a146b #36 [ 472.075308] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu4 [ 472.075308] Workqueue: events l2cap_chan_timeout [ 472.075308] Call Trace: [ 472.075308] <TASK> [ 472.075308] dump_stack_lvl+0x137/0x1a0 [ 472.075308] print_report+0x101/0x250 [ 472.075308] ? __virt_addr_valid+0x77/0x160 [ 472.075308] ? mutex_lock+0x68/0xc0 [ 472.075308] kasan_report+0x139/0x170 [ 472.075308] ? mutex_lock+0x68/0xc0 [ 472.075308] kasan_check_range+0x2c3/0x2e0 [ 472.075308] mutex_lock+0x68/0xc0 [ 472.075308] l2cap_chan_timeout+0x181/0x300 [ 472.075308] process_one_work+0x5d2/0xe00 [ 472.075308] worker_thread+0xe1d/0x1660 [ 472.075308] ? pr_cont_work+0x5e0/0x5e0 [ 472.075308] kthread+0x2b7/0x350 [ 472.075308] ? pr_cont_work+0x5e0/0x5e0 [ 472.075308] ? kthread_blkcg+0xd0/0xd0 [ 472.075308] ret_from_fork+0x4d/0x80 [ 472.075308] ? kthread_blkcg+0xd0/0xd0 [ 472.075308] ret_from_fork_asm+0x11/0x20 [ 472.075308] </TASK> [ 472.075308] ================================================================== [ 472.094860] Disabling lock debugging due to kernel taint [ 472.096136] BUG: kernel NULL pointer dereference, address: 0000000000000158 [ 472.096136] #PF: supervisor write access in kernel mode [ 472.096136] #PF: error_code(0x0002) - not-present page [ 472.096136] PGD 0 P4D 0 [ 472.096136] Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI [ 472.096136] CPU: 0 PID: 7 Comm: kworker/0:0 Tainted: G B 6.9.0-rc5-00356-g78c0094a146b #36 [ 472.096136] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu4 [ 472.096136] Workqueue: events l2cap_chan_timeout [ 472.096136] RIP: 0010:mutex_lock+0x88/0xc0 [ 472.096136] Code: be 08 00 00 00 e8 f8 23 1f fd 4c 89 f7 be 08 00 00 00 e8 eb 23 1f fd 42 80 3c 23 00 74 08 48 88 [ 472.096136] RSP: 0018:ffff88800744fc78 EFLAGS: 00000246 [ 472.096136] RAX: 0000000000000000 RBX: 1ffff11000e89f8f RCX: ffffffff8457c865 [ 472.096136] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88800744fc78 [ 472.096136] RBP: 0000000000000158 R08: ffff88800744fc7f R09: 1ffff11000e89f8f [ 472.096136] R10: dffffc0000000000 R11: ffffed1000e89f90 R12: dffffc0000000000 [ 472.096136] R13: 0000000000000158 R14: ffff88800744fc78 R15: ffff888007405a00 [ 472.096136] FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000 [ 472.096136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 472.096136] CR2: 0000000000000158 CR3: 000000000da32000 CR4: 00000000000006f0 [ 472.096136] Call Trace: [ 472.096136] <TASK> [ 472.096136] ? __die_body+0x8d/0xe0 [ 472.096136] ? page_fault_oops+0x6b8/0x9a0 [ 472.096136] ? kernelmode_fixup_or_oops+0x20c/0x2a0 [ 472.096136] ? do_user_addr_fault+0x1027/0x1340 [ 472.096136] ? _printk+0x7a/0xa0 [ 472.096136] ? mutex_lock+0x68/0xc0 [ 472.096136] ? add_taint+0x42/0xd0 [ 472.096136] ? exc_page_fault+0x6a/0x1b0 [ 472.096136] ? asm_exc_page_fault+0x26/0x30 [ 472.096136] ? mutex_lock+0x75/0xc0 [ 472.096136] ? mutex_lock+0x88/0xc0 [ 472.096136] ? mutex_lock+0x75/0xc0 [ 472.096136] l2cap_chan_timeout+0x181/0x300 [ 472.096136] process_one_work+0x5d2/0xe00 [ 472.096136] worker_thread+0xe1d/0x1660 [ 472.096136] ? pr_cont_work+0x5e0/0x5e0 [ 472.096136] kthread+0x2b7/0x350 [ 472.096136] ? pr_cont_work+0x5e0/0x5e0 [ 472.096136] ? kthread_blkcg+0xd0/0xd0 [ 472.096136] ret_from_fork+0x4d/0x80 [ 472.096136] ? kthread_blkcg+0xd0/0xd0 [ 472.096136] ret_from_fork_asm+0x11/0x20 [ 472.096136] </TASK> [ 472.096136] Modules linked in: [ 472.096136] CR2: 0000000000000158 [ 472.096136] ---[ end trace 0000000000000000 ]--- [ 472.096136] RIP: 0010:mutex_lock+0x88/0xc0 [ 472.096136] Code: be 08 00 00 00 e8 f8 23 1f fd 4c 89 f7 be 08 00 00 00 e8 eb 23 1f fd 42 80 3c 23 00 74 08 48 88 [ 472.096136] RSP: 0018:ffff88800744fc78 EFLAGS: 00000246 [ 472.096136] RAX: 0000000000000000 RBX: 1ffff11000e89f8f RCX: ffffffff8457c865 [ 472.096136] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88800744fc78 [ 472.096136] RBP: 0000000000000158 R08: ffff88800744fc7f R09: 1ffff11000e89f8f [ 472.132932] R10: dffffc0000000000 R11: ffffed1000e89f90 R12: dffffc0000000000 [ 472.132932] R13: 0000000000000158 R14: ffff88800744fc78 R15: ffff888007405a00 [ 472.132932] FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000 [ 472.132932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 472.132932] CR2: 0000000000000158 CR3: 000000000da32000 CR4: 00000000000006f0 [ 472.132932] Kernel panic - not syncing: Fatal exception [ 472.132932] Kernel Offset: disabled [ 472.132932] ---[ end Kernel panic - not syncing: Fatal exception ]--- Add a check to judge whether the conn is null in l2cap_chan_timeout() in order to mitigate the bug. Fixes: 3df91ea20e74 ("Bluetooth: Revert to mutexes from RCU list") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17Bluetooth: Fix use-after-free bugs caused by sco_sock_timeoutDuoming Zhou1-0/+4
[ Upstream commit 483bc08181827fc475643272ffb69c533007e546 ] When the sco connection is established and then, the sco socket is releasing, timeout_work will be scheduled to judge whether the sco disconnection is timeout. The sock will be deallocated later, but it is dereferenced again in sco_sock_timeout. As a result, the use-after-free bugs will happen. The root cause is shown below: Cleanup Thread | Worker Thread sco_sock_release | sco_sock_close | __sco_sock_close | sco_sock_set_timer | schedule_delayed_work | sco_sock_kill | (wait a time) sock_put(sk) //FREE | sco_sock_timeout | sock_hold(sk) //USE The KASAN report triggered by POC is shown below: [ 95.890016] ================================================================== [ 95.890496] BUG: KASAN: slab-use-after-free in sco_sock_timeout+0x5e/0x1c0 [ 95.890755] Write of size 4 at addr ffff88800c388080 by task kworker/0:0/7 ... [ 95.890755] Workqueue: events sco_sock_timeout [ 95.890755] Call Trace: [ 95.890755] <TASK> [ 95.890755] dump_stack_lvl+0x45/0x110 [ 95.890755] print_address_description+0x78/0x390 [ 95.890755] print_report+0x11b/0x250 [ 95.890755] ? __virt_addr_valid+0xbe/0xf0 [ 95.890755] ? sco_sock_timeout+0x5e/0x1c0 [ 95.890755] kasan_report+0x139/0x170 [ 95.890755] ? update_load_avg+0xe5/0x9f0 [ 95.890755] ? sco_sock_timeout+0x5e/0x1c0 [ 95.890755] kasan_check_range+0x2c3/0x2e0 [ 95.890755] sco_sock_timeout+0x5e/0x1c0 [ 95.890755] process_one_work+0x561/0xc50 [ 95.890755] worker_thread+0xab2/0x13c0 [ 95.890755] ? pr_cont_work+0x490/0x490 [ 95.890755] kthread+0x279/0x300 [ 95.890755] ? pr_cont_work+0x490/0x490 [ 95.890755] ? kthread_blkcg+0xa0/0xa0 [ 95.890755] ret_from_fork+0x34/0x60 [ 95.890755] ? kthread_blkcg+0xa0/0xa0 [ 95.890755] ret_from_fork_asm+0x11/0x20 [ 95.890755] </TASK> [ 95.890755] [ 95.890755] Allocated by task 506: [ 95.890755] kasan_save_track+0x3f/0x70 [ 95.890755] __kasan_kmalloc+0x86/0x90 [ 95.890755] __kmalloc+0x17f/0x360 [ 95.890755] sk_prot_alloc+0xe1/0x1a0 [ 95.890755] sk_alloc+0x31/0x4e0 [ 95.890755] bt_sock_alloc+0x2b/0x2a0 [ 95.890755] sco_sock_create+0xad/0x320 [ 95.890755] bt_sock_create+0x145/0x320 [ 95.890755] __sock_create+0x2e1/0x650 [ 95.890755] __sys_socket+0xd0/0x280 [ 95.890755] __x64_sys_socket+0x75/0x80 [ 95.890755] do_syscall_64+0xc4/0x1b0 [ 95.890755] entry_SYSCALL_64_after_hwframe+0x67/0x6f [ 95.890755] [ 95.890755] Freed by task 506: [ 95.890755] kasan_save_track+0x3f/0x70 [ 95.890755] kasan_save_free_info+0x40/0x50 [ 95.890755] poison_slab_object+0x118/0x180 [ 95.890755] __kasan_slab_free+0x12/0x30 [ 95.890755] kfree+0xb2/0x240 [ 95.890755] __sk_destruct+0x317/0x410 [ 95.890755] sco_sock_release+0x232/0x280 [ 95.890755] sock_close+0xb2/0x210 [ 95.890755] __fput+0x37f/0x770 [ 95.890755] task_work_run+0x1ae/0x210 [ 95.890755] get_signal+0xe17/0xf70 [ 95.890755] arch_do_signal_or_restart+0x3f/0x520 [ 95.890755] syscall_exit_to_user_mode+0x55/0x120 [ 95.890755] do_syscall_64+0xd1/0x1b0 [ 95.890755] entry_SYSCALL_64_after_hwframe+0x67/0x6f [ 95.890755] [ 95.890755] The buggy address belongs to the object at ffff88800c388000 [ 95.890755] which belongs to the cache kmalloc-1k of size 1024 [ 95.890755] The buggy address is located 128 bytes inside of [ 95.890755] freed 1024-byte region [ffff88800c388000, ffff88800c388400) [ 95.890755] [ 95.890755] The buggy address belongs to the physical page: [ 95.890755] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800c38a800 pfn:0xc388 [ 95.890755] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 95.890755] anon flags: 0x100000000000840(slab|head|node=0|zone=1) [ 95.890755] page_type: 0xffffffff() [ 95.890755] raw: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001 [ 95.890755] raw: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000 [ 95.890755] head: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001 [ 95.890755] head: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000 [ 95.890755] head: 0100000000000003 ffffea000030e201 ffffea000030e248 00000000ffffffff [ 95.890755] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000 [ 95.890755] page dumped because: kasan: bad access detected [ 95.890755] [ 95.890755] Memory state around the buggy address: [ 95.890755] ffff88800c387f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 95.890755] ffff88800c388000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 95.890755] >ffff88800c388080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 95.890755] ^ [ 95.890755] ffff88800c388100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 95.890755] ffff88800c388180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 95.890755] ================================================================== Fix this problem by adding a check protected by sco_conn_lock to judget whether the conn->hcon is null. Because the conn->hcon will be set to null, when the sock is releasing. Fixes: ba316be1b6a0 ("Bluetooth: schedule SCO timeouts with delayed_work") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().Kuniyuki Iwashima1-1/+7
[ Upstream commit f2db7230f73a80dbb179deab78f88a7947f0ab7e ] Anderson Nascimento reported a use-after-free splat in tcp_twsk_unique() with nice analysis. Since commit ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance"), inet_twsk_hashdance() sets TIME-WAIT socket's sk_refcnt after putting it into ehash and releasing the bucket lock. Thus, there is a small race window where other threads could try to reuse the port during connect() and call sock_hold() in tcp_twsk_unique() for the TIME-WAIT socket with zero refcnt. If that happens, the refcnt taken by tcp_twsk_unique() is overwritten and sock_put() will cause underflow, triggering a real use-after-free somewhere else. To avoid the use-after-free, we need to use refcount_inc_not_zero() in tcp_twsk_unique() and give up on reusing the port if it returns false. [0]: refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1039313 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110 CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1 Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023 RIP: 0010:refcount_warn_saturate+0xe5/0x110 Code: 42 8e ff 0f 0b c3 cc cc cc cc 80 3d aa 13 ea 01 00 0f 85 5e ff ff ff 48 c7 c7 f8 8e b7 82 c6 05 96 13 ea 01 01 e8 7b 42 8e ff <0f> 0b c3 cc cc cc cc 48 c7 c7 50 8f b7 82 c6 05 7a 13 ea 01 01 e8 RSP: 0018:ffffc90006b43b60 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff888009bb3ef0 RCX: 0000000000000027 RDX: ffff88807be218c8 RSI: 0000000000000001 RDI: ffff88807be218c0 RBP: 0000000000069d70 R08: 0000000000000000 R09: ffffc90006b439f0 R10: ffffc90006b439e8 R11: 0000000000000003 R12: ffff8880029ede84 R13: 0000000000004e20 R14: ffffffff84356dc0 R15: ffff888009bb3ef0 FS: 00007f62c10926c0(0000) GS:ffff88807be00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020ccb000 CR3: 000000004628c005 CR4: 0000000000f70ef0 PKRU: 55555554 Call Trace: <TASK> ? refcount_warn_saturate+0xe5/0x110 ? __warn+0x81/0x130 ? refcount_warn_saturate+0xe5/0x110 ? report_bug+0x171/0x1a0 ? refcount_warn_saturate+0xe5/0x110 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? refcount_warn_saturate+0xe5/0x110 tcp_twsk_unique+0x186/0x190 __inet_check_established+0x176/0x2d0 __inet_hash_connect+0x74/0x7d0 ? __pfx___inet_check_established+0x10/0x10 tcp_v4_connect+0x278/0x530 __inet_stream_connect+0x10f/0x3d0 inet_stream_connect+0x3a/0x60 __sys_connect+0xa8/0xd0 __x64_sys_connect+0x18/0x20 do_syscall_64+0x83/0x170 entry_SYSCALL_64_after_hwframe+0x78/0x80 RIP: 0033:0x7f62c11a885d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007f62c1091e58 EFLAGS: 00000296 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020ccb004 RCX: 00007f62c11a885d RDX: 0000000000000010 RSI: 0000000020ccb000 RDI: 0000000000000003 RBP: 00007f62c1091e90 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000296 R12: 00007f62c10926c0 R13: ffffffffffffff88 R14: 0000000000000000 R15: 00007ffe237885b0 </TASK> Fixes: ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") Reported-by: Anderson Nascimento <anderson@allelesecurity.com> Closes: https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/ Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240501213145.62261-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV socketsEric Dumazet3-3/+7
[ Upstream commit 94062790aedb505bdda209b10bea47b294d6394f ] TCP_SYN_RECV state is really special, it is only used by cross-syn connections, mostly used by fuzzers. In the following crash [1], syzbot managed to trigger a divide by zero in tcp_rcv_space_adjust() A socket makes the following state transitions, without ever calling tcp_init_transfer(), meaning tcp_init_buffer_space() is also not called. TCP_CLOSE connect() TCP_SYN_SENT TCP_SYN_RECV shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) TCP_FIN_WAIT1 To fix this issue, change tcp_shutdown() to not perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, which makes no sense anyway. When tcp_rcv_state_process() later changes socket state from TCP_SYN_RECV to TCP_ESTABLISH, then look at sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, and send a FIN packet from a sane socket state. This means tcp_send_fin() can now be called from BH context, and must use GFP_ATOMIC allocations. [1] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 Call Trace: <TASK> tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x109/0x280 net/socket.c:1068 ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x474/0xae0 net/socket.c:2939 __sys_recvmmsg net/socket.c:3018 [inline] __do_sys_recvmmsg net/socket.c:3041 [inline] __se_sys_recvmmsg net/socket.c:3034 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7faeb6363db9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20240501125448.896529-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17tcp: remove redundant check on tskbColin Ian King1-5/+2
[ Upstream commit d1edc085559744fbda7a55e97eeae8bd6135a11b ] The non-null check on tskb is always false because it is in an else path of a check on tskb and hence tskb is null in this code block. This is check is therefore redundant and can be removed as well as the label coalesc. if (tsbk) { ... } else { ... if (unlikely(!skb)) { if (tskb) /* can never be true, redundant code */ goto coalesc; return; } } Addresses-Coverity: ("Logically dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17net:usb:qmi_wwan: support Rolling modulesVanillan Wang1-0/+1
[ Upstream commit d362046021ea122309da8c8e0b6850c792ca97b5 ] Update the qmi_wwan driver support for the Rolling LTE modules. - VID:PID 33f8:0104, RW101-GL for laptop debug M.2 cards(with RMNET interface for /Linux/Chrome OS) 0x0104: RMNET, diag, at, pipe Here are the outputs of usb-devices: T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=33f8 ProdID=0104 Rev=05.04 S: Manufacturer=Rolling Wireless S.a.r.l. S: Product=Rolling Module S: SerialNumber=ba2eb033 C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=usbfs E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms Signed-off-by: Vanillan Wang <vanillanwang@163.com> Link: https://lore.kernel.org/r/20240416120713.24777-1-vanillanwang@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17fs/9p: drop inodes immediately on non-.L tooJoakim Sindholt1-0/+1
[ Upstream commit 7fd524b9bd1be210fe79035800f4bd78a41b349f ] Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17gpio: crystalcove: Use -ENOTSUPP consistentlyAndy Shevchenko1-1/+1
[ Upstream commit ace0ebe5c98d66889f19e0f30e2518d0c58d0e04 ] The GPIO library expects the drivers to return -ENOTSUPP in some cases and not using analogue POSIX code. Make the driver to follow this. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17gpio: wcove: Use -ENOTSUPP consistentlyAndy Shevchenko1-1/+1
[ Upstream commit 0c3b532ad3fbf82884a2e7e83e37c7dcdd4d1d99 ] The GPIO library expects the drivers to return -ENOTSUPP in some cases and not using analogue POSIX code. Make the driver to follow this. Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-179p: explicitly deny setlease attemptsJeff Layton1-0/+2
[ Upstream commit 7a84602297d36617dbdadeba55a2567031e5165b ] 9p is a remote network protocol, and it doesn't support asynchronous notifications from the server. Ensure that we don't hand out any leases since we can't guarantee they'll be broken when a file's contents change. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17fs/9p: translate O_TRUNC into OTRUNCJoakim Sindholt1-0/+3
[ Upstream commit 87de39e70503e04ddb58965520b15eb9efa7eef3 ] This one hits both 9P2000 and .u as it appears v9fs has never translated the O_TRUNC flag. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17fs/9p: only translate RWX permissions for plain 9P2000Joakim Sindholt1-1/+1
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ] Garbage in plain 9P2000's perm bits is allowed through, which causes it to be able to set (among others) the suid bit. This was presumably not the intent since the unix extended bits are handled explicitly and conditionally on .u. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17selftests: timers: Fix valid-adjtimex signed left-shift undefined behaviorJohn Stultz1-37/+36
[ Upstream commit 076361362122a6d8a4c45f172ced5576b2d4a50d ] The struct adjtimex freq field takes a signed value who's units are in shifted (<<16) parts-per-million. Unfortunately for negative adjustments, the straightforward use of: freq = ppm << 16 trips undefined behavior warnings with clang: valid-adjtimex.c:66:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value] -499<<16, ~~~~^ valid-adjtimex.c:67:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value] -450<<16, ~~~~^ .. Fix it by using a multiply by (1 << 16) instead of shifting negative values in the valid-adjtimex test case. Align the values for better readability. Reported-by: Lee Jones <joneslee@google.com> Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240409202222.2830476-1-jstultz@google.com Link: https://lore.kernel.org/lkml/0c6d4f0d-2064-4444-986b-1d1ed782135f@collabora.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17scsi: target: Fix SELinux error when systemd-modules loads the target moduleMaurizio Lombardi1-0/+12
[ Upstream commit 97a54ef596c3fd24ec2b227ba8aaf2cf5415e779 ] If the systemd-modules service loads the target module, the credentials of that userspace process will be used to validate the access to the target db directory. SELinux will prevent it, reporting an error like the following: kernel: audit: type=1400 audit(1676301082.205:4): avc: denied { read } for pid=1020 comm="systemd-modules" name="target" dev="dm-3" ino=4657583 scontext=system_u:system_r:systemd_modules_load_t:s0 tcontext=system_u:object_r:targetd_etc_rw_t:s0 tclass=dir permissive=0 Fix the error by using the kernel credentials to access the db directory Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20240215143944.847184-2-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17btrfs: always clear PERTRANS metadata during commitBoris Burkov1-1/+1
[ Upstream commit 6e68de0bb0ed59e0554a0c15ede7308c47351e2d ] It is possible to clear a root's IN_TRANS tag from the radix tree, but not clear its PERTRANS, if there is some error in between. Eliminate that possibility by moving the free up to where we clear the tag. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17btrfs: make btrfs_clear_delalloc_extent() free delalloc reserveBoris Burkov1-1/+1
[ Upstream commit 3c6f0c5ecc8910d4ffb0dfe85609ebc0c91c8f34 ] Currently, this call site in btrfs_clear_delalloc_extent() only converts the reservation. We are marking it not delalloc, so I don't think it makes sense to keep the rsv around. This is a path where we are not sure to join a transaction, so it leads to incorrect free-ing during umount. Helps with the pass rate of generic/269 and generic/475. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>