summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2023-03-12Merge tag 'x86_urgent_for_v6.3_rc2' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "A single erratum fix for AMD machines: - Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No impact to anything as those machines will fallback to XSAVEC which is equivalent there" * tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Disable XSAVES on AMD family 0x17
2023-03-11Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-2/+2
Pull misc fixes from Al Viro: "pick_file() speculation fix + fix for alpha mis(merge,cherry-pick) The fs/file.c one is a genuine missing speculation barrier in pick_file() (reachable e.g. via close(2)). The alpha one is strictly speaking not a bug fix, but only because confusion between preempt_enable() and preempt_disable() is harmless on architecture without CONFIG_PREEMPT. Looks like alpha.git picked the wrong version of patch - that braino used to be there in early versions, but it had been fixed quite a while ago..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: prevent out-of-bounds array speculation when closing a file descriptor alpha: fix lazy-FPU mis(merged/applied/whatnot)
2023-03-10Merge tag 'riscv-for-linus-6.3-rc2' of ↵Linus Torvalds8-8/+52
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - RISC-V architecture-specific ELF attributes have been disabled in the kernel builds - A fix for a locking failure while during errata patching that manifests on SiFive-based systems - A fix for a KASAN failure during stack unwinding - A fix for some lockdep failures during text patching * tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Don't check text_mutex during stop_machine riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode RISC-V: fix taking the text_mutex twice during sifive errata patching RISC-V: Stop emitting attributes
2023-03-10RISC-V: Don't check text_mutex during stop_machineConor Dooley4-6/+39
We're currently using stop_machine() to update ftrace & kprobes, which means that the thread that takes text_mutex during may not be the same as the thread that eventually patches the code. This isn't actually a race because the lock is still held (preventing any other concurrent accesses) and there is only one thread running during stop_machine(), but it does trigger a lockdep failure. This patch just elides the lockdep check during stop_machine. Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-10riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack modeAlexandre Ghiti1-1/+1
When CONFIG_FRAME_POINTER is unset, the stack unwinding function walk_stackframe randomly reads the stack and then, when KASAN is enabled, it can lead to the following backtrace: [ 0.000000] ================================================================== [ 0.000000] BUG: KASAN: stack-out-of-bounds in walk_stackframe+0xa6/0x11a [ 0.000000] Read of size 8 at addr ffffffff81807c40 by task swapper/0 [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-12919-g24203e6db61f #43 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] Call Trace: [ 0.000000] [<ffffffff80007ba8>] walk_stackframe+0x0/0x11a [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80c49c80>] dump_stack_lvl+0x22/0x36 [ 0.000000] [<ffffffff80c3783e>] print_report+0x198/0x4a8 [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8015f68a>] kasan_report+0x9a/0xc8 [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8006e99c>] desc_make_final+0x80/0x84 [ 0.000000] [<ffffffff8009a04e>] stack_trace_save+0x88/0xa6 [ 0.000000] [<ffffffff80099fc2>] filter_irq_stacks+0x72/0x76 [ 0.000000] [<ffffffff8006b95e>] devkmsg_read+0x32a/0x32e [ 0.000000] [<ffffffff8015ec16>] kasan_save_stack+0x28/0x52 [ 0.000000] [<ffffffff8006e998>] desc_make_final+0x7c/0x84 [ 0.000000] [<ffffffff8009a04a>] stack_trace_save+0x84/0xa6 [ 0.000000] [<ffffffff8015ec52>] kasan_set_track+0x12/0x20 [ 0.000000] [<ffffffff8015f22e>] __kasan_slab_alloc+0x58/0x5e [ 0.000000] [<ffffffff8015e7ea>] __kmem_cache_create+0x21e/0x39a [ 0.000000] [<ffffffff80e133ac>] create_boot_cache+0x70/0x9c [ 0.000000] [<ffffffff80e17ab2>] kmem_cache_init+0x6c/0x11e [ 0.000000] [<ffffffff80e00fd6>] mm_init+0xd8/0xfe [ 0.000000] [<ffffffff80e011d8>] start_kernel+0x190/0x3ca [ 0.000000] [ 0.000000] The buggy address belongs to stack of task swapper/0 [ 0.000000] and is located at offset 0 in frame: [ 0.000000] stack_trace_save+0x0/0xa6 [ 0.000000] [ 0.000000] This frame has 1 object: [ 0.000000] [32, 56) 'c' [ 0.000000] [ 0.000000] The buggy address belongs to the physical page: [ 0.000000] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x81a07 [ 0.000000] flags: 0x1000(reserved|zone=0) [ 0.000000] raw: 0000000000001000 ff600003f1e3d150 ff600003f1e3d150 0000000000000000 [ 0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff [ 0.000000] page dumped because: kasan: bad access detected [ 0.000000] [ 0.000000] Memory state around the buggy address: [ 0.000000] ffffffff81807b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] >ffffffff81807c00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f3 [ 0.000000] ^ [ 0.000000] ffffffff81807c80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ================================================================== Fix that by using READ_ONCE_NOCHECK when reading the stack in imprecise mode. Fixes: 5d8544e2d007 ("RISC-V: Generic library routines and assembly") Reported-by: Chathura Rajapaksha <chathura.abeyrathne.lk@gmail.com> Link: https://lore.kernel.org/all/CAD7mqryDQCYyJ1gAmtMm8SASMWAQ4i103ptTb0f6Oda=tPY2=A@mail.gmail.com/ Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230308091639.602024-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-09Merge tag 'net-6.3-rc2' of ↵Linus Torvalds6-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and bpf. Current release - regressions: - core: avoid skb end_offset change in __skb_unclone_keeptruesize() - sched: - act_connmark: handle errno on tcf_idr_check_alloc - flower: fix fl_change() error recovery path - ieee802154: prevent user from crashing the host Current release - new code bugs: - eth: bnxt_en: fix the double free during device removal - tools: ynl: - fix enum-as-flags in the generic CLI - fully inherit attrs in subsets - re-license uniformly under GPL-2.0 or BSD-3-clause Previous releases - regressions: - core: use indirect calls helpers for sk_exit_memory_pressure() - tls: - fix return value for async crypto - avoid hanging tasks on the tx_lock - eth: ice: copy last block omitted in ice_get_module_eeprom() Previous releases - always broken: - core: avoid double iput when sock_alloc_file fails - af_unix: fix struct pid leaks in OOB support - tls: - fix possible race condition - fix device-offloaded sendpage straddling records - bpf: - sockmap: fix an infinite loop error - test_run: fix &xdp_frame misplacement for LIVE_FRAMES - fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR - netfilter: tproxy: fix deadlock due to missing BH disable - phylib: get rid of unnecessary locking - eth: bgmac: fix *initial* chip reset to support BCM5358 - eth: nfp: fix csum for ipsec offload - eth: mtk_eth_soc: fix RX data corruption issue Misc: - usb: qmi_wwan: add telit 0x1080 composition" * tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits) tools: ynl: fix enum-as-flags in the generic CLI tools: ynl: move the enum classes to shared code net: avoid double iput when sock_alloc_file fails af_unix: fix struct pid leaks in OOB support eth: fealnx: bring back this old driver net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC net: microchip: sparx5: fix deletion of existing DSCP mappings octeontx2-af: Unlock contexts in the queue context cache in case of fault detection net/smc: fix fallback failed while sendmsg with fastopen ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause mailmap: update entries for Stephen Hemminger mailmap: add entry for Maxim Mikityanskiy nfc: change order inside nfc_se_io error path ethernet: ice: avoid gcc-9 integer overflow warning ice: don't ignore return codes in VSI related code ice: Fix DSCP PFC TLV creation net: usb: qmi_wwan: add Telit 0x1080 composition net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990 netfilter: conntrack: adopt safer max chain length net: tls: fix device-offloaded sendpage straddling records ...
2023-03-09Merge tag 'm68k-for-v6.3-tag2' of ↵Linus Torvalds3-11/+13
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fixes from Geert Uytterhoeven: - Fix systems with memory at end of 32-bit address space - Fix initrd on systems where memory does not start at address zero - Fix 68030 handling of bus errors for addresses in exception tables * tag 'm68k-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Only force 030 bus error if PC not in exception table m68k: mm: Move initrd phys_to_virt handling after paging_init() m68k: mm: Fix systems with memory at end of 32-bit address space
2023-03-09sh: sanitize the flags on sigreturnAl Viro2-0/+4
We fetch %SR value from sigframe; it might have been modified by signal handler, so we can't trust it with any bits that are not modifiable in user mode. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Rich Felker <dalias@libc.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-09eth: fealnx: bring back this old driverJakub Kicinski2-0/+2
This reverts commit d5e2d038dbece821f1af57acbeded3aa9a1832c1. We have a report of this chip being used on a SURECOM EP-320X-S 100/10M Ethernet PCI Adapter which could still have been purchased in some parts of the world 3 years ago. Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217151 Fixes: d5e2d038dbec ("eth: fealnx: delete the driver for Myson MTD-800") Link: https://lore.kernel.org/r/20230307171930.4008454-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08x86/resctl: fix scheduler confusion with 'current'Linus Torvalds4-10/+10
The implementation of 'current' on x86 is very intentionally special: it is a very common thing to look up, and it uses 'this_cpu_read_stable()' to get the current thread pointer efficiently from per-cpu storage. And the keyword in there is 'stable': the current thread pointer never changes as far as a single thread is concerned. Even if when a thread is preempted, or moved to another CPU, or even across an explicit call 'schedule()' that thread will still have the same value for 'current'. It is, after all, the kernel base pointer to thread-local storage. That's why it's stable to begin with, but it's also why it's important enough that we have that special 'this_cpu_read_stable()' access for it. So this is all done very intentionally to allow the compiler to treat 'current' as a value that never visibly changes, so that the compiler can do CSE and combine multiple different 'current' accesses into one. However, there is obviously one very special situation when the currently running thread does actually change: inside the scheduler itself. So the scheduler code paths are special, and do not have a 'current' thread at all. Instead there are _two_ threads: the previous and the next thread - typically called 'prev' and 'next' (or prev_p/next_p) internally. So this is all actually quite straightforward and simple, and not all that complicated. Except for when you then have special code that is run in scheduler context, that code then has to be aware that 'current' isn't really a valid thing. Did you mean 'prev'? Did you mean 'next'? In fact, even if then look at the code, and you use 'current' after the new value has been assigned to the percpu variable, we have explicitly told the compiler that 'current' is magical and always stable. So the compiler is quite free to use an older (or newer) value of 'current', and the actual assignment to the percpu storage is not relevant even if it might look that way. Which is exactly what happened in the resctl code, that blithely used 'current' in '__resctrl_sched_in()' when it really wanted the new process state (as implied by the name: we're scheduling 'into' that new resctl state). And clang would end up just using the old thread pointer value at least in some configurations. This could have happened with gcc too, and purely depends on random compiler details. Clang just seems to have been more aggressive about moving the read of the per-cpu current_task pointer around. The fix is trivial: just make the resctl code adhere to the scheduler rules of using the prev/next thread pointer explicitly, instead of using 'current' in a situation where it just wasn't valid. That same code is then also used outside of the scheduler context (when a thread resctl state is explicitly changed), and then we will just pass in 'current' as that pointer, of course. There is no ambiguity in that case. The fix may be trivial, but noticing and figuring out what went wrong was not. The credit for that goes to Stephane Eranian. Reported-by: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/ Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/ Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Stephane Eranian <eranian@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-08x86/CPU/AMD: Disable XSAVES on AMD family 0x17Andrew Cooper1-0/+9
AMD Erratum 1386 is summarised as: XSAVES Instruction May Fail to Save XMM Registers to the Provided State Save Area This piece of accidental chronomancy causes the %xmm registers to occasionally reset back to an older value. Ignore the XSAVES feature on all AMD Zen1/2 hardware. The XSAVEC instruction (which works fine) is equivalent on affected parts. [ bp: Typos, move it into the F17h-specific function. ] Reported-by: Tavis Ormandy <taviso@gmail.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20230307174643.1240184-1-andrew.cooper3@citrix.com
2023-03-07RISC-V: fix taking the text_mutex twice during sifive errata patchingConor Dooley1-1/+1
Chris pointed out that some bonehead, *cough* me *cough*, added two mutex_locks() to the SiFive errata patching. The second was meant to have been a mutex_unlock(). This results in errors such as Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030 Oops [#1] Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-rc1-starlight-00079-g9493e6f3ce02 #229 Hardware name: BeagleV Starlight Beta (DT) epc : __schedule+0x42/0x500 ra : schedule+0x46/0xce epc : ffffffff8065957c ra : ffffffff80659a80 sp : ffffffff81203c80 gp : ffffffff812d50a0 tp : ffffffff8120db40 t0 : ffffffff81203d68 t1 : 0000000000000001 t2 : 4c45203a76637369 s0 : ffffffff81203cf0 s1 : ffffffff8120db40 a0 : 0000000000000000 a1 : ffffffff81213958 a2 : ffffffff81213958 a3 : 0000000000000000 a4 : 0000000000000000 a5 : ffffffff80a1bd00 a6 : 0000000000000000 a7 : 0000000052464e43 s2 : ffffffff8120db41 s3 : ffffffff80a1ad00 s4 : 0000000000000000 s5 : 0000000000000002 s6 : ffffffff81213938 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000000000001 s10: ffffffff812d7204 s11: ffffffff80d3c920 t3 : 0000000000000001 t4 : ffffffff812e6dd7 t5 : ffffffff812e6dd8 t6 : ffffffff81203bb8 status: 0000000200000100 badaddr: 0000000000000030 cause: 000000000000000d [<ffffffff80659a80>] schedule+0x46/0xce [<ffffffff80659dce>] schedule_preempt_disabled+0x16/0x28 [<ffffffff8065ae0c>] __mutex_lock.constprop.0+0x3fe/0x652 [<ffffffff8065b138>] __mutex_lock_slowpath+0xe/0x16 [<ffffffff8065b182>] mutex_lock+0x42/0x4c [<ffffffff8000ad94>] sifive_errata_patch_func+0xf6/0x18c [<ffffffff80002b92>] _apply_alternatives+0x74/0x76 [<ffffffff80802ee8>] apply_boot_alternatives+0x3c/0xfa [<ffffffff80803cb0>] setup_arch+0x60c/0x640 [<ffffffff80800926>] start_kernel+0x8e/0x99c ---[ end trace 0000000000000000 ]--- Reported-by: Chris Hofstaedtler <zeha@debian.org> Fixes: 9493e6f3ce02 ("RISC-V: take text_mutex during alternative patching") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230302174154.970746-1-conor@kernel.org [Palmer: pick up Geert's bug report from the thread] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-07Merge tag 'for-netdev' of ↵Jakub Kicinski1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-03-06 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 9 files changed, 64 insertions(+), 18 deletions(-). The main changes are: 1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier, that is, keep resolving such instances instead of bailing out, from Lorenz Bauer. 2) Fix BPF test framework with regards to xdp_frame info misplacement in the "live packet" code, from Alexander Lobakin. 3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX, from Liu Jian. 4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n, from Randy Dunlap. 5) Several BPF doc fixes with either broken links or external instead of internal doc links, from Bagas Sanjaya. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: check that modifier resolves after pointer btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES bpf, doc: Link to submitting-patches.rst for general patch submission info bpf, doc: Do not link to docs.kernel.org for kselftest link bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() riscv, bpf: Fix patch_text implicit declaration bpf, docs: Fix link to BTF doc ==================== Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07alpha: fix lazy-FPU mis(merged/applied/whatnot)Al Viro1-2/+2
Looks like a braino that used to be fixed in e.g. #next.alpha had gotten into alpha.git cherry-picked version of that patch. Sure, alpha has no preempt, but preempt_enable() in place of preempt_disable() is actively confusing the readers... Other than that, the cherry-picked variant matches what I have. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-07RISC-V: Stop emitting attributesPalmer Dabbelt2-0/+11
The RISC-V ELF attributes don't contain any useful information. New toolchains ignore them, but they frequently trip up various older/mixed toolchains. So just turn them off. Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230223224605.6995-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-06cpumask: fix incorrect cpumask scanning result checksLinus Torvalds1-1/+1
It turns out that commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations") exposed a number of cases of drivers not checking the result of "cpumask_next()" and friends correctly. The documented correct check for "no more cpus in the cpumask" is to check for the result being equal or larger than the number of possible CPU ids, exactly _because_ we've always done those constant-sized cpumask scans using a widened type before. So the return value of a cpumask scan should be checked with if (cpu >= nr_cpu_ids) ... because the cpumask scan did not necessarily stop exactly *at* that maximum CPU id. But a few cases ended up instead using checks like if (cpu == nr_cpumask_bits) ... which used that internal "widened" number of bits. And that used to work pretty much by accident (ok, in this case "by accident" is simply because it matched the historical internal implementation of the cpumask scanning, so it was more of a "intentionally using implementation details rather than an accident"). But the extended constant-sized optimizations then did that internal implementation differently, and now that code that did things wrong but matched the old implementation no longer worked at all. Which then causes subsequent odd problems due to using what ends up being an invalid CPU ID. Most of these cases require either unusual hardware or special uses to hit, but the random.c one triggers quite easily. All you really need is to have a sufficiently small CONFIG_NR_CPUS value for the bit scanning optimization to be triggered, but not enough CPUs to then actually fill that widened cpumask. At that point, the cpumask scanning will return the NR_CPUS constant, which is _not_ the same as nr_cpumask_bits. This just does the mindless fix with sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/' to fix the incorrect uses. The ones in the SCSI lpfc driver in particular could probably be fixed more cleanly by just removing that repeated pattern entirely, but I am not emptionally invested enough in that driver to care. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/ Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/ Reported-by: Vernon Yang <vernon2gm@gmail.com> Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/ Cc: Yury Norov <yury.norov@gmail.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-06m68k: Only force 030 bus error if PC not in exception tableMichael Schmitz1-1/+3
__get_kernel_nofault() does copy data in supervisor mode when forcing a task backtrace log through /proc/sysrq_trigger. This is expected cause a bus error exception on e.g. NULL pointer dereferencing when logging a kernel task has no workqueue associated. This bus error ought to be ignored. Our 030 bus error handler is ill equipped to deal with this: Whenever ssw indicates a kernel mode access on a data fault, we don't even attempt to handle the fault and instead always send a SEGV signal (or panic). As a result, the check for exception handling at the fault PC (buried in send_sig_fault() which gets called from do_page_fault() eventually) is never used. In contrast, both 040 and 060 access error handlers do not care whether a fault happened on supervisor mode access, and will call do_page_fault() on those, ultimately honoring the exception table. Add a check in bus_error030 to call do_page_fault() in case we do have an entry for the fault PC in our exception table. I had attempted a fix for this earlier in 2019 that did rely on testing pagefault_disabled() (see link below) to achieve the same thing, but this patch should be more generic. Tested on 030 Atari Falcon. Reported-by: Eero Tamminen <oak@helsinkinet.fi> Link: https://lore.kernel.org/r/alpine.LNX.2.21.1904091023540.25@nippy.intranet Link: https://lore.kernel.org/r/63130691-1984-c423-c1f2-73bfd8d3dcd3@gmail.com Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20230301021107.26307-1-schmitzmic@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2023-03-06m68k: mm: Move initrd phys_to_virt handling after paging_init()Geert Uytterhoeven1-5/+5
When booting with an initial ramdisk on platforms where physical memory does not start at address zero (e.g. on Amiga): initrd: 0ef0602c - 0f800000 Zone ranges: DMA [mem 0x0000000008000000-0x000000f7ffffffff] Normal empty Movable zone start for each node Early memory node ranges node 0: [mem 0x0000000008000000-0x000000000f7fffff] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff] Unable to handle kernel access at virtual address (ptrval) Oops: 00000000 Modules linked in: PC: [<00201d3c>] memcmp+0x28/0x56 As phys_to_virt() relies on m68k_memoffset and module_fixup(), it must not be called before paging_init(). Hence postpone the phys_to_virt handling for the initial ramdisk until after calling paging_init(). While at it, reduce #ifdef clutter by using IS_ENABLED() instead. Fixes: 376e3fdecb0dcae2 ("m68k: Enable memtest functionality") Reported-by: Stephen Walsh <vk3heg@vk3heg.net> Link: https://lists.debian.org/debian-68k/2022/09/msg00007.html Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/4f45f05f377bf3f5baf88dbd5c3c8aeac59d94f0.camel@physik.fu-berlin.de Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Finn Thain <fthain@linux-m68k.org> Link: https://lore.kernel.org/r/dff216da09ab7a60217c3fc2147e671ae07d636f.1677528627.git.geert@linux-m68k.org
2023-03-06m68k: mm: Fix systems with memory at end of 32-bit address spaceKars de Jong1-5/+5
The calculation of end addresses of memory chunks overflowed to 0 when a memory chunk is located at the end of 32-bit address space. This is the case for the HP300 architecture. Link: https://lore.kernel.org/linux-m68k/CACz-3rhUo5pgNwdWHaPWmz+30Qo9xCg70wNxdf7o5x-6tXq8QQ@mail.gmail.com/ Signed-off-by: Kars de Jong <jongk@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20230223112349.26675-1-jongk@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2023-03-06cpumask: re-introduce constant-sized cpumask optimizationsLinus Torvalds1-3/+1
Commit aa47a7c215e7 ("lib/cpumask: deprecate nr_cpumask_bits") resulted in the cpumask operations potentially becoming hugely less efficient, because suddenly the cpumask was always considered to be variable-sized. The optimization was then later added back in a limited form by commit 6f9c07be9d02 ("lib/cpumask: add FORCE_NR_CPUS config option"), but that FORCE_NR_CPUS option is not useful in a generic kernel and more of a special case for embedded situations with fixed hardware. Instead, just re-introduce the optimization, with some changes. Instead of depending on CPUMASK_OFFSTACK being false, and then always using the full constant cpumask width, this introduces three different cpumask "sizes": - the exact size (nr_cpumask_bits) remains identical to nr_cpu_ids. This is used for situations where we should use the exact size. - the "small" size (small_cpumask_bits) is the NR_CPUS constant if it fits in a single word and the bitmap operations thus end up able to trigger the "small_const_nbits()" optimizations. This is used for the operations that have optimized single-word cases that get inlined, notably the bit find and scanning functions. - the "large" size (large_cpumask_bits) is the NR_CPUS constant if it is an sufficiently small constant that makes simple "copy" and "clear" operations more efficient. This is arbitrarily set at four words or less. As a an example of this situation, without this fixed size optimization, cpumask_clear() will generate code like movl nr_cpu_ids(%rip), %edx addq $63, %rdx shrq $3, %rdx andl $-8, %edx callq memset@PLT on x86-64, because it would calculate the "exact" number of longwords that need to be cleared. In contrast, with this patch, using a MAX_CPU of 64 (which is quite a reasonable value to use), the above becomes a single movq $0,cpumask instruction instead, because instead of caring to figure out exactly how many CPU's the system has, it just knows that the cpumask will be a single word and can just clear it all. Note that this does end up tightening the rules a bit from the original version in another way: operations that set bits in the cpumask are now limited to the actual nr_cpu_ids limit, whereas we used to do the nr_cpumask_bits thing almost everywhere in the cpumask code. But if you just clear bits, or scan for bits, we can use the simpler compile-time constants. In the process, remove 'cpumask_complement()' and 'for_each_cpu_not()' which were not useful, and which fundamentally have to be limited to 'nr_cpu_ids'. Better remove them now than have somebody introduce use of them later. Of course, on x86-64 with MAXSMP there is no sane small compile-time constant for the cpumask sizes, and we end up using the actual CPU bits, and will generate the above kind of horrors regardless. Please don't use MAXSMP unless you really expect to have machines with thousands of cores. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-05Merge tag 'x86-urgent-2023-03-05' of ↵Linus Torvalds1-7/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Thomas Gleixner: "A small set of updates for x86: - Return -EIO instead of success when the certificate buffer for SEV guests is not large enough - Allow STIPB to be enabled with legacy IBSR. Legacy IBRS is cleared on return to userspace for performance reasons, but the leaves user space vulnerable to cross-thread attacks which STIBP prevents. Update the documentation accordingly" * tag 'x86-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt/sev-guest: Return -EIO if certificate buffer is not large enough Documentation/hw-vuln: Document the interaction between IBRS and STIBP x86/speculation: Allow enabling STIBP with legacy IBRS
2023-03-05Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds11-11/+48
Pull VM_FAULT_RETRY fixes from Al Viro: "Some of the page fault handlers do not deal with the following case correctly: - handle_mm_fault() has returned VM_FAULT_RETRY - there is a pending fatal signal - fault had happened in kernel mode Correct action in such case is not "return unconditionally" - fatal signals are handled only upon return to userland and something like copy_to_user() would end up retrying the faulting instruction and triggering the same fault again and again. What we need to do in such case is to make the caller to treat that as failed uaccess attempt - handle exception if there is an exception handler for faulting instruction or oops if there isn't one. Over the years some architectures had been fixed and now are handling that case properly; some still do not. This series should fix the remaining ones. Status: - m68k, riscv, hexagon, parisc: tested/acked by maintainers. - alpha, sparc32, sparc64: tested locally - bug has been reproduced on the unpatched kernel and verified to be fixed by this series. - ia64, microblaze, nios2, openrisc: build, but otherwise completely untested" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: openrisc: fix livelock in uaccess nios2: fix livelock in uaccess microblaze: fix livelock in uaccess ia64: fix livelock in uaccess sparc: fix livelock in uaccess alpha: fix livelock in uaccess parisc: fix livelock in uaccess hexagon: fix livelock in uaccess riscv: fix livelock in uaccess m68k: fix livelock in uaccess
2023-03-05Remove Intel compiler supportMasahiro Yamada3-172/+2
include/linux/compiler-intel.h had no update in the past 3 years. We often forget about the third C compiler to build the kernel. For example, commit a0a12c3ed057 ("asm goto: eradicate CC_HAS_ASM_GOTO") only mentioned GCC and Clang. init/Kconfig defines CC_IS_GCC and CC_IS_CLANG but not CC_IS_ICC, and nobody has reported any issue. I guess the Intel Compiler support is broken, and nobody is caring about it. Harald Arnesen pointed out ICC (classic Intel C/C++ compiler) is deprecated: $ icc -v icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message. icc version 2021.7.0 (gcc version 12.1.0 compatibility) Arnd Bergmann provided a link to the article, "Intel C/C++ compilers complete adoption of LLVM". lib/zstd/common/compiler.h and lib/zstd/compress/zstd_fast.c were kept untouched for better sync with https://github.com/facebook/zstd Link: https://www.intel.com/content/www/us/en/developer/articles/technical/adoption-of-llvm-complete-icx.html Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-05Merge tag 'mm-hotfixes-stable-2023-03-04-13-12' of ↵Linus Torvalds1-19/+0
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. Eight are for MM and seven are for other parts of the kernel. Seven are cc:stable and eight address post-6.3 issues or were judged unsuitable for -stable backporting" * tag 'mm-hotfixes-stable-2023-03-04-13-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: map Dikshita Agarwal's old address to his current one mailmap: map Vikash Garodia's old address to his current one fs/cramfs/inode.c: initialize file_ra_state fs: hfsplus: fix UAF issue in hfsplus_put_super panic: fix the panic_print NMI backtrace setting lib: parser: update documentation for match_NUMBER functions kasan, x86: don't rename memintrinsics in uninstrumented files kasan: test: fix test for new meminstrinsic instrumentation kasan: treat meminstrinsic as builtins in uninstrumented files kasan: emit different calls for instrumentable memintrinsics ocfs2: fix non-auto defrag path not working issue ocfs2: fix defrag path triggering jbd2 ASSERT mailmap: map Georgi Djakov's old Linaro address to his current one mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON lib/zlib: DFLTCC deflate does not write all available bits for Z_NO_FLUSH mm/damon/paddr: fix missing folio_put() mm/mremap: fix dup_anon_vma() in vma_merge() case 4
2023-03-04Merge tag 'powerpc-6.3-2' of ↵Linus Torvalds2-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Drop orphaned VAS MAINTAINERS entry - Fix build errors with clang and KCSAN - Avoid build errors seen with LD_DEAD_CODE_DATA_ELIMINATION together with recordmcount Thanks to Nathan Chancellor. * tag 'powerpc-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Avoid dead code/data elimination when using recordmcount powerpc/vmlinux.lds: Add .text.asan/tsan sections powerpc: Drop orphaned VAS MAINTAINERS entry
2023-03-03Merge tag 's390-6.3-2' of ↵Linus Torvalds12-99/+128
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Add empty command line parameter handling stubs to kernel for all command line parameters which are handled in the decompressor. This avoids invalid "Unknown kernel command line parameters" messages from the kernel, and also avoids that these will be incorrectly passed to user space. This caused already confusion, therefore add the empty stubs - Add missing phys_to_virt() handling to machine check handler - Introduce and use a union to be used for zcrypt inline assemblies. This makes sure that only a register wide member of the union is passed as input and output parameter to inline assemblies, while usual C code uses other members of the union to access bit fields of it - Add and use a READ_ONCE_ALIGNED_128() macro, which can be used to atomically read a 128-bit value from memory. This replaces the (mis-)use of the 128-bit cmpxchg operation to do the same in cpum_sf code. Currently gcc does not generate the used lpq instruction if __READ_ONCE() is used for aligned 128-bit accesses, therefore use this s390 specific helper - Simplify machine check handler code if a task needs to be killed because of e.g. register corruption due to a machine malfunction - Perform CPU reset to clear pending interrupts and TLB entries on an already stopped target CPU before delegating work to it - Generate arch/s390/boot/vmlinux.map link map for the decompressor, when CONFIG_VMLINUX_MAP is enabled for debugging purposes - Fix segment type handling for dcssblk devices. It incorrectly always returned type "READ/WRITE" even for read-only segements, which can result in a kernel panic if somebody tries to write to a read-only device - Sort config S390 select list again - Fix two kprobe reenter bugs revealed by a recently added kprobe kunit test * tag 's390-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/kprobes: fix current_kprobe never cleared after kprobes reenter s390/kprobes: fix irq mask clobbering on kprobe reenter from post_handler s390/Kconfig: sort config S390 select list again s390/extmem: return correct segment type in __segment_load() s390/decompressor: add link map saving s390/smp: perform cpu reset before delegating work to target cpu s390/mcck: cleanup user process termination path s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchg s390/rwonce: add READ_ONCE_ALIGNED_128() macro s390/ap,zcrypt,vfio: introduce and use ap_queue_status_reg union s390/nmi: fix virtual-physical address confusion s390/setup: do not complain about parameters handled in decompressor
2023-03-03Merge tag 'riscv-for-linus-6.3-mw2' of ↵Linus Torvalds4-17/+27
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Some cleanups and fixes for the Zbb-optimized string routines - Support for custom (vendor or implementation defined) perf events - COMMAND_LINE_SIZE has been increased to 1024 * tag 'riscv-for-linus-6.3-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Bump COMMAND_LINE_SIZE value to 1024 drivers/perf: RISC-V: Allow programming custom firmware events riscv, lib: Fix Zbb strncmp RISC-V: improve string-function assembly
2023-03-03kasan, x86: don't rename memintrinsics in uninstrumented filesMarco Elver1-19/+0
Now that memcpy/memset/memmove are no longer overridden by KASAN, we can just use the normal symbol names in uninstrumented files. Drop the preprocessor redefinitions. Link: https://lkml.kernel.org/r/20230224085942.1791837-4-elver@google.com Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linux Kernel Functional Testing <lkft@linaro.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-03Merge tag 'arm64-fixes' of ↵Linus Torvalds8-13/+22
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - In copy_highpage(), only reset the tag of the destination pointer if KASAN_HW_TAGS is enabled so that user-space MTE does not interfere with KASAN_SW_TAGS (which relies on top-byte-ignore). - Remove warning if SME is detected without SVE, the kernel can cope with such configuration (though none in the field currently). - In cfi_handler(), pass the ESR_EL1 value to die() for consistency with other die() callers. - Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP on arm64 since the pte manipulation from the generic vmemmap_remap_pte() does not follow the required ARM break-before-make sequence (clear the pte, flush the TLBs, set the new pte). It may be re-enabled once this sequence is sorted. - Fix possible memory leak in the arm64 ACPI code if the SMCCC version and conduit checks fail. - Forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE since gcc ignores -falign-functions=N with -Os. - Don't pretend KASLR is enabled if offset < MIN_KIMG_ALIGN as no randomisation would actually take place. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: kaslr: don't pretend KASLR is enabled if offset < MIN_KIMG_ALIGN arm64: ftrace: forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE arm64: acpi: Fix possible memory leak of ffh_ctxt arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP arm64: pass ESR_ELx to die() of cfi_handler arm64/fpsimd: Remove warning for SME without SVE arm64: Reset KASAN tag in copy_highpage with HW tags only
2023-03-02Merge tag 'mips_6.3_1' of ↵Linus Torvalds8-31/+31
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull more MIPS updates from Thomas Bogendoerfer: "A few more cleanups and fixes" * tag 'mips_6.3_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Workaround clang inline compat branch issue mips: dts: ralink: mt7621: add phandle to system controller node for watchdog mips: dts: ralink: mt7621: rename watchdog node from 'wdt' into 'watchdog' mips: ralink: make SOC_MT7621 select PINCTRL mips: remove SYS_HAS_CPU_MIPS32_R1 from RALINK MIPS: cevt-r4k: Offset the value used to clear compare interrupt MIPS: smp-cps: Don't rely on CP0_CMGCRBASE MIPS: Remove DMA_PERDEV_COHERENT
2023-03-02Merge tag 'objtool-core-2023-03-02' of ↵Linus Torvalds5-11/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Shrink 'struct instruction', to improve objtool performance & memory footprint - Other maximum memory usage reductions - this makes the build both faster, and fixes kernel build OOM failures on allyesconfig and similar configs when they try to build the final (large) vmlinux.o - Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying single-byte instruction (PUSH/POP or LEAVE). This requires the extension of the ORC metadata structure with a 'signal' field - Misc fixes & cleanups * tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) objtool: Fix ORC 'signal' propagation objtool: Remove instruction::list x86: Fix FILL_RETURN_BUFFER objtool: Fix overlapping alternatives objtool: Union instruction::{call_dest,jump_table} objtool: Remove instruction::reloc objtool: Shrink instruction::{type,visited} objtool: Make instruction::alts a single-linked list objtool: Make instruction::stack_ops a single-linked list objtool: Change arch_decode_instruction() signature x86/entry: Fix unwinding from kprobe on PUSH/POP instruction x86/unwind/orc: Add 'signal' field to ORC metadata objtool: Optimize layout of struct special_alt objtool: Optimize layout of struct symbol objtool: Allocate multiple structures with calloc() objtool: Make struct check_options static objtool: Make struct entries[] static and const objtool: Fix HOSTCC flag usage objtool: Properly support make V=1 objtool: Install libsubcmd in build ...
2023-03-02openrisc: fix livelock in uaccessAl Viro1-1/+4
openrisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02nios2: fix livelock in uaccessAl Viro1-1/+4
nios2 equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02microblaze: fix livelock in uaccessAl Viro1-1/+4
microblaze equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02ia64: fix livelock in uaccessAl Viro1-1/+4
ia64 equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02sparc: fix livelock in uaccessAl Viro2-2/+10
sparc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02alpha: fix livelock in uaccessAl Viro1-1/+4
alpha equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02parisc: fix livelock in uaccessAl Viro1-1/+6
parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02hexagon: fix livelock in uaccessAl Viro1-1/+4
hexagon equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Acked-by: Brian Cain <bcain@quicinc.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02riscv: fix livelock in uaccessAl Viro1-1/+4
riscv equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Tested-by: Björn Töpel <bjorn@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02m68k: fix livelock in uaccessAl Viro1-1/+4
m68k equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling" If e.g. get_user() triggers a page fault and a fatal signal is caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything to page tables. In such case we must *not* return to the faulting insn - that would repeat the entire thing without making any progress; what we need instead is to treat that as failed (user) memory access. Tested-by: Finn Thain <fthain@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02s390/kprobes: fix current_kprobe never cleared after kprobes reenterVasily Gorbik1-0/+1
Recent test_kprobe_missed kprobes kunit test uncovers the following problem. Once kprobe is triggered from another kprobe (kprobe reenter), all future kprobes on this cpu are considered as kprobe reenter, thus pre_handler and post_handler are not being called and kprobes are counted as "missed". Commit b9599798f953 ("[S390] kprobes: activation and deactivation") introduced a simpler scheme for kprobes (de)activation and status tracking by using push_kprobe/pop_kprobe, which supposed to work for both initial kprobe entry as well as kprobe reentry and helps to avoid handling those two cases differently. The problem is that a sequence of calls in case of kprobes reenter: push_kprobe() <- NULL (current_kprobe) push_kprobe() <- kprobe1 (current_kprobe) pop_kprobe() -> kprobe1 (current_kprobe) pop_kprobe() -> kprobe1 (current_kprobe) leaves "kprobe1" as "current_kprobe" on this cpu, instead of setting it to NULL. In fact push_kprobe/pop_kprobe can only store a single state (there is just one prev_kprobe in kprobe_ctlblk). Which is a hack but sufficient, there is no need to have another prev_kprobe just to store NULL. To make a simple and backportable fix simply reset "prev_kprobe" when kprobe is poped from this "stack". No need to worry about "kprobe_status" in this case, because its value is only checked when current_kprobe != NULL. Cc: stable@vger.kernel.org Fixes: b9599798f953 ("[S390] kprobes: activation and deactivation") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-02s390/kprobes: fix irq mask clobbering on kprobe reenter from post_handlerVasily Gorbik1-2/+1
Recent test_kprobe_missed kprobes kunit test uncovers the following error (reported when CONFIG_DEBUG_ATOMIC_SLEEP is enabled): BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 662, name: kunit_try_catch preempt_count: 0, expected: 0 RCU nest depth: 0, expected: 0 no locks held by kunit_try_catch/662. irq event stamp: 280 hardirqs last enabled at (279): [<00000003e60a3d42>] __do_pgm_check+0x17a/0x1c0 hardirqs last disabled at (280): [<00000003e3bd774a>] kprobe_exceptions_notify+0x27a/0x318 softirqs last enabled at (0): [<00000003e3c5c890>] copy_process+0x14a8/0x4c80 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 46 PID: 662 Comm: kunit_try_catch Tainted: G N 6.2.0-173644-g44c18d77f0c0 #2 Hardware name: IBM 3931 A01 704 (LPAR) Call Trace: [<00000003e60a3a00>] dump_stack_lvl+0x120/0x198 [<00000003e3d02e82>] __might_resched+0x60a/0x668 [<00000003e60b9908>] __mutex_lock+0xc0/0x14e0 [<00000003e60bad5a>] mutex_lock_nested+0x32/0x40 [<00000003e3f7b460>] unregister_kprobe+0x30/0xd8 [<00000003e51b2602>] test_kprobe_missed+0xf2/0x268 [<00000003e51b5406>] kunit_try_run_case+0x10e/0x290 [<00000003e51b7dfa>] kunit_generic_run_threadfn_adapter+0x62/0xb8 [<00000003e3ce30f8>] kthread+0x2d0/0x398 [<00000003e3b96afa>] __ret_from_fork+0x8a/0xe8 [<00000003e60ccada>] ret_from_fork+0xa/0x40 The reason for this error report is that kprobes handling code failed to restore irqs. The problem is that when kprobe is triggered from another kprobe post_handler current sequence of enable_singlestep / disable_singlestep is the following: enable_singlestep <- original kprobe (saves kprobe_saved_imask) enable_singlestep <- kprobe triggered from post_handler (clobbers kprobe_saved_imask) disable_singlestep <- kprobe triggered from post_handler (restores kprobe_saved_imask) disable_singlestep <- original kprobe (restores wrong clobbered kprobe_saved_imask) There is just one kprobe_ctlblk per cpu and both calls saves and loads irq mask to kprobe_saved_imask. To fix the problem simply move resume_execution (which calls disable_singlestep) before calling post_handler. This also fixes the problem that post_handler is called with pt_regs which were not yet adjusted after single-stepping. Cc: stable@vger.kernel.org Fixes: 4ba069b802c2 ("[S390] add kprobes support.") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-02riscv: Bump COMMAND_LINE_SIZE value to 1024Alexandre Ghiti1-0/+8
Increase COMMAND_LINE_SIZE as the current default value is too low for syzbot kernel command line. There has been considerable discussion on this patch that has led to a larger patch set removing COMMAND_LINE_SIZE from the uapi headers on all ports. That's not quite done yet, but it's gotten far enough we're confident this is not a uABI change so this is safe. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20210316193420.904-1-alex@ghiti.fr [Palmer: it's not uabi] Link: https://lore.kernel.org/linux-riscv/874b8076-b0d1-4aaa-bcd8-05d523060152@app.fastmail.com/#t Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-01s390/Kconfig: sort config S390 select list againHeiko Carstens1-3/+3
Keep the config S390 select list sorted. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-01s390/extmem: return correct segment type in __segment_load()Gerald Schaefer1-5/+7
Commit f05f62d04271f ("s390/vmem: get rid of memory segment list") reshuffled the call to vmem_add_mapping() in __segment_load(), which now overwrites rc after it was set to contain the segment type code. As result, __segment_load() will now always return 0 on success, which corresponds to the segment type code SEG_TYPE_SW, i.e. a writeable segment. This results in a kernel crash when loading a read-only segment as dcssblk block device, and trying to write to it. Instead of reshuffling code again, make sure to return the segment type on success, and also describe this rather delicate and unexpected logic in the function comment. Also initialize new segtype variable with invalid value, to prevent possible future confusion. Fixes: f05f62d04271 ("s390/vmem: get rid of memory segment list") Cc: <stable@vger.kernel.org> # 5.9+ Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-01Merge tag 'loongarch-6.3' of ↵Linus Torvalds40-129/+2623
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Make -mstrict-align configurable - Add kernel relocation and KASLR support - Add single kernel image implementation for kdump - Add hardware breakpoints/watchpoints support - Add kprobes/kretprobes/kprobes_on_ftrace support - Add LoongArch support for some selftests. * tag 'loongarch-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (23 commits) selftests/ftrace: Add LoongArch kprobe args string tests support selftests/seccomp: Add LoongArch selftesting support tools: Add LoongArch build infrastructure samples/kprobes: Add LoongArch support LoongArch: Mark some assembler symbols as non-kprobe-able LoongArch: Add kprobes on ftrace support LoongArch: Add kretprobes support LoongArch: Add kprobes support LoongArch: Simulate branch and PC* instructions LoongArch: ptrace: Add hardware single step support LoongArch: ptrace: Add function argument access API LoongArch: ptrace: Expose hardware breakpoints to debuggers LoongArch: Add hardware breakpoints/watchpoints support LoongArch: kdump: Add crashkernel=YM handling LoongArch: kdump: Add single kernel image implementation LoongArch: Add support for kernel address space layout randomization (KASLR) LoongArch: Add support for kernel relocation LoongArch: Add la_abs macro implementation LoongArch: Add JUMP_VIRT_ADDR macro implementation to avoid using la.abs LoongArch: Use la.pcrel instead of la.abs when it's trivially possible ...
2023-03-01Merge tag 'uml-for-linus-6.3-rc1' of ↵Linus Torvalds19-104/+258
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Add support for rust (yay!) - Add support for LTO - Add platform bus support to virtio-pci - Various virtio fixes - Coding style, spelling cleanups * tag 'uml-for-linus-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (27 commits) Documentation: rust: Fix arch support table uml: vector: Remove unused definitions VECTOR_{WRITE,HEADERS} um: virt-pci: properly remove PCI device from bus um: virtio_uml: move device breaking into workqueue um: virtio_uml: mark device as unregistered when breaking it um: virtio_uml: free command if adding to virtqueue failed UML: define RUNTIME_DISCARD_EXIT virt-pci: add platform bus support um-virt-pci: Make max delay configurable um: virt-pci: implement pcibios_get_phb_of_node() um: Support LTO um: put power options in a menu um: Use CFLAGS_vmlinux um: Prevent building modules incompatible with MODVERSIONS um: Avoid pcap multiple definition errors um: Make the definition of cpu_data more compatible x86: um: vdso: Add '%rcx' and '%r11' to the syscall clobber list rust: arch/um: Add support for CONFIG_RUST under x86_64 UML rust: arch/um: Disable FP/SIMD instruction to match x86 rust: arch/um: Use 'pie' relocation mode under UML ...
2023-03-01riscv, lib: Fix Zbb strncmpBjörn Töpel1-1/+3
The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B per iteration, and a slow that does one byte per iteration. The idea is to compare aligned XLEN chunks for most of strings, and do the remainder tail in the slow path. The Zbb strncmp has two issues in the fast path: Incorrect remainder handling (wrong compare): Assume that the string length is 9. On 64b systems, the fast path should do one iteration, and one iteration in the slow path. Instead, both were done in the fast path, which lead to incorrect results. An example: strncmp("/dev/vda", "/dev/", 5); Correct by changing "bgt" to "bge". Missing NULL checks in the second string: This could lead to incorrect results for: strncmp("/dev/vda", "/dev/vda\0", 8); Correct by adding an additional check. Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions") Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230228184211.1585641-1-bjorn@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-01powerpc: dts: t1040rdb: enable both CPU portsVladimir Oltean2-1/+6
Since commit eca70102cfb1 ("net: dsa: felix: add support for changing DSA master") included in kernel v6.1, the driver supports 2 CPU ports, and they can be put in a LAG, for example (see Documentation/networking/dsa/configuration.rst for more details). Defining the second CPU port in the device tree should not cause any compatibility issue, because the default CPU port was &seville_port8 before this change, and still is &seville_port8 now (the numerically first CPU port is used by default). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>