summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2024-02-06LoongArch: vDSO: Disable UBSAN instrumentationKees Cook1-0/+1
The vDSO executes in userspace, so the kernel's UBSAN should not instrument it. Solves these kind of build errors: loongarch64-linux-ld: arch/loongarch/vdso/vgettimeofday.o: in function `vdso_shift_ns': lib/vdso/gettimeofday.c:23:(.text+0x3f8): undefined reference to `__ubsan_handle_shift_out_of_bounds' Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401310530.lZHCj1Zl-lkp@intel.com/ Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Fangrui Song <maskray@google.com> Cc: loongarch@lists.linux.dev Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-06LoongArch: Fix earlycon parameter if KASAN enabledHuacai Chen1-0/+3
The earlycon parameter is based on fixmap, and fixmap addresses are not supposed to be shadowed by KASAN. So return the kasan_early_shadow_page in kasan_mem_to_shadow() if the input address is above FIXADDR_START. Otherwise earlycon cannot work after kasan_init(). Cc: stable@vger.kernel.org Fixes: 5aa4ac64e6add3e ("LoongArch: Add KASAN (Kernel Address Sanitizer) support") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-06LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC]Huacai Chen2-4/+4
With default config, the value of NR_CPUS is 64. When HW platform has more then 64 cpus, system will crash on these platforms. MAX_CORE_PIC is the maximum cpu number in MADT table (max physical number) which can exceed the supported maximum cpu number (NR_CPUS, max logical number), but kernel should not crash. Kernel should boot cpus with NR_CPUS, let the remainder cpus stay in BIOS. The potential crash reason is that the array acpi_core_pic[NR_CPUS] can be overflowed when parsing MADT table, and it is obvious that CORE_PIC should be corresponding to physical core rather than logical core, so it is better to define the array as acpi_core_pic[MAX_CORE_PIC]. With the patch, system can boot up 64 vcpus with qemu parameter -smp 128, otherwise system will crash with the following message. [ 0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000420000004259, era == 90000000037a5f0c, ra == 90000000037a46ec [ 0.000000] Oops[#1]: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc2+ #192 [ 0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 0.000000] pc 90000000037a5f0c ra 90000000037a46ec tp 9000000003c90000 sp 9000000003c93d60 [ 0.000000] a0 0000000000000019 a1 9000000003d93bc0 a2 0000000000000000 a3 9000000003c93bd8 [ 0.000000] a4 9000000003c93a74 a5 9000000083c93a67 a6 9000000003c938f0 a7 0000000000000005 [ 0.000000] t0 0000420000004201 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001 [ 0.000000] t4 0000000000000003 t5 0000000000000000 t6 0000000000000030 t7 0000000000000063 [ 0.000000] t8 0000000000000014 u0 ffffffffffffffff s9 0000000000000000 s0 9000000003caee98 [ 0.000000] s1 90000000041b0480 s2 9000000003c93da0 s3 9000000003c93d98 s4 9000000003c93d90 [ 0.000000] s5 9000000003caa000 s6 000000000a7fd000 s7 000000000f556b60 s8 000000000e0a4330 [ 0.000000] ra: 90000000037a46ec platform_init+0x214/0x250 [ 0.000000] ERA: 90000000037a5f0c efi_runtime_init+0x30/0x94 [ 0.000000] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 0.000000] PRMD: 00000000 (PPLV0 -PIE -PWE) [ 0.000000] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 0.000000] ECFG: 00070800 (LIE=11 VS=7) [ 0.000000] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 0.000000] BADV: 0000420000004259 [ 0.000000] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 0.000000] Modules linked in: [ 0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____)) [ 0.000000] Stack : 9000000003c93a14 9000000003800898 90000000041844f8 90000000037a46ec [ 0.000000] 000000000a7fd000 0000000008290000 0000000000000000 0000000000000000 [ 0.000000] 0000000000000000 0000000000000000 00000000019d8000 000000000f556b60 [ 0.000000] 000000000a7fd000 000000000f556b08 9000000003ca7700 9000000003800000 [ 0.000000] 9000000003c93e50 9000000003800898 9000000003800108 90000000037a484c [ 0.000000] 000000000e0a4330 000000000f556b60 000000000a7fd000 000000000f556b08 [ 0.000000] 9000000003ca7700 9000000004184000 0000000000200000 000000000e02b018 [ 0.000000] 000000000a7fd000 90000000037a0790 9000000003800108 0000000000000000 [ 0.000000] 0000000000000000 000000000e0a4330 000000000f556b60 000000000a7fd000 [ 0.000000] 000000000f556b08 000000000eaae298 000000000eaa5040 0000000000200000 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<90000000037a5f0c>] efi_runtime_init+0x30/0x94 [ 0.000000] [<90000000037a46ec>] platform_init+0x214/0x250 [ 0.000000] [<90000000037a484c>] setup_arch+0x124/0x45c [ 0.000000] [<90000000037a0790>] start_kernel+0x90/0x670 [ 0.000000] [<900000000378b0d8>] kernel_entry+0xd8/0xdc Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-06LoongArch: Select HAVE_ARCH_SECCOMP to use the common SECCOMP menuMasahiro Yamada1-17/+1
LoongArch missed the refactoring made by commit 282a181b1a0d ("seccomp: Move config option SECCOMP to arch/Kconfig") because LoongArch was not mainlined at that time. The 'depends on PROC_FS' statement is stale as described in that commit. Select HAVE_ARCH_SECCOMP, and remove the duplicated config entry. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-06LoongArch: Select ARCH_ENABLE_THP_MIGRATION instead of redefining itMasahiro Yamada1-4/+1
ARCH_ENABLE_THP_MIGRATION is supposed to be selected by arch Kconfig. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-05KVM: x86: Fix KVM_GET_MSRS stack info leakMathias Krause1-10/+5
Commit 6abe9c1386e5 ("KVM: X86: Move ignore_msrs handling upper the stack") changed the 'ignore_msrs' handling, including sanitizing return values to the caller. This was fine until commit 12bc2132b15e ("KVM: X86: Do the same ignore_msrs check for feature msrs") which allowed non-existing feature MSRs to be ignored, i.e. to not generate an error on the ioctl() level. It even tried to preserve the sanitization of the return value. However, the logic is flawed, as '*data' will be overwritten again with the uninitialized stack value of msr.data. Fix this by simplifying the logic and always initializing msr.data, vanishing the need for an additional error exit path. Fixes: 12bc2132b15e ("KVM: X86: Do the same ignore_msrs check for feature msrs") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20240203124522.592778-2-minipli@grsecurity.net Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-05powerpc/kasan: Fix addr error caused by page alignmentJiangfeng Xiao1-0/+1
In kasan_init_region, when k_start is not page aligned, at the begin of for loop, k_cur = k_start & PAGE_MASK is less than k_start, and then `va = block + k_cur - k_start` is less than block, the addr va is invalid, because the memory address space from va to block is not alloced by memblock_alloc, which will not be reserved by memblock_reserve later, it will be used by other places. As a result, memory overwriting occurs. for example: int __init __weak kasan_init_region(void *start, size_t size) { [...] /* if say block(dcd97000) k_start(feef7400) k_end(feeff3fe) */ block = memblock_alloc(k_end - k_start, PAGE_SIZE); [...] for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) { /* at the begin of for loop * block(dcd97000) va(dcd96c00) k_cur(feef7000) k_start(feef7400) * va(dcd96c00) is less than block(dcd97000), va is invalid */ void *va = block + k_cur - k_start; [...] } [...] } Therefore, page alignment is performed on k_start before memblock_alloc() to ensure the validity of the VA address. Fixes: 663c0c9496a6 ("powerpc/kasan: Fix shadow area set up for modules.") Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1705974359-43790-1-git-send-email-xiaojiangfeng@huawei.com
2024-02-05powerpc/6xx: set High BAT Enable flag on G2_LE coresMatthias Schiffer2-1/+21
MMU_FTR_USE_HIGH_BATS is set for G2_LE cores and derivatives like e300cX, but the high BATs need to be enabled in HID2 to work. Add register definitions and add the needed setup to __setup_cpu_603. This fixes boot on CPUs like the MPC5200B with STRICT_KERNEL_RWX enabled on systems where the flag has not been set by the bootloader already. Fixes: e4d6654ebe6e ("powerpc/mm/32s: rework mmu_mapin_ram()") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240124103838.43675-1-matthias.schiffer@ew.tq-group.com
2024-02-05powerpc/64: Set task pt_regs->link to the LR value on scv entryNaveen N Rao1-2/+2
Nysal reported that userspace backtraces are missing in offcputime bcc tool. As an example: $ sudo ./bcc/tools/offcputime.py -uU Tracing off-CPU time (us) of user threads by user stack... Hit Ctrl-C to end. ^C write - python (9107) 8 write - sudo (9105) 9 mmap - python (9107) 16 clock_nanosleep - multipathd (697) 3001604 The offcputime bcc tool attaches a bpf program to a kprobe on finish_task_switch(), which is usually hit on a syscall from userspace. With the switch to system call vectored, we started setting pt_regs->link to zero. This is because system call vectored behaves like a function call with LR pointing to the system call return address, and with no modification to SRR0/SRR1. The LR value does indicate our next instruction, so it is being saved as pt_regs->nip, and pt_regs->link is being set to zero. This is not a problem by itself, but BPF uses perf callchain infrastructure for capturing stack traces, and that stores LR as the second entry in the stack trace. perf has code to cope with the second entry being zero, and skips over it. However, generic userspace unwinders assume that a zero entry indicates end of the stack trace, resulting in a truncated userspace stack trace. Rather than fixing all userspace unwinders to ignore/skip past the second entry, store the real LR value in pt_regs->link so that there continues to be a valid, though duplicate entry in the stack trace. With this change: $ sudo ./bcc/tools/offcputime.py -uU Tracing off-CPU time (us) of user threads by user stack... Hit Ctrl-C to end. ^C write write [unknown] [unknown] [unknown] [unknown] [unknown] PyObject_VectorcallMethod [unknown] [unknown] PyObject_CallOneArg PyFile_WriteObject PyFile_WriteString [unknown] [unknown] PyObject_Vectorcall _PyEval_EvalFrameDefault PyEval_EvalCode [unknown] [unknown] [unknown] _PyRun_SimpleFileObject _PyRun_AnyFileObject Py_RunMain [unknown] Py_BytesMain [unknown] __libc_start_main - python (1293) 7 write write [unknown] sudo_ev_loop_v1 sudo_ev_dispatch_v1 [unknown] [unknown] [unknown] [unknown] __libc_start_main - sudo (1291) 7 syscall syscall bpf_open_perf_buffer_opts [unknown] [unknown] [unknown] [unknown] _PyObject_MakeTpCall PyObject_Vectorcall _PyEval_EvalFrameDefault PyEval_EvalCode [unknown] [unknown] [unknown] _PyRun_SimpleFileObject _PyRun_AnyFileObject Py_RunMain [unknown] Py_BytesMain [unknown] __libc_start_main - python (1293) 11 clock_nanosleep clock_nanosleep nanosleep sleep [unknown] [unknown] __clone - multipathd (698) 3001661 Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: stable@vger.kernel.org Reported-by: "Nysal Jan K.A" <nysal@linux.ibm.com> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240202154316.395276-1-naveen@kernel.org
2024-02-05powerpc/pseries/iommu: Fix iommu initialisation during DLPAR addGaurav Batra3-5/+23
When a PCI device is dynamically added, the kernel oopses with a NULL pointer dereference: BUG: Kernel NULL pointer dereference on read at 0x00000030 Faulting instruction address: 0xc0000000006bbe5c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs xsk_diag bonding nft_compat nf_tables nfnetlink rfkill binfmt_misc dm_multipath rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_umad ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c mlx5_core mlxfw sd_mod t10_pi sg tls ibmvscsi ibmveth scsi_transport_srp vmx_crypto pseries_wdt psample dm_mirror dm_region_hash dm_log dm_mod fuse CPU: 17 PID: 2685 Comm: drmgr Not tainted 6.7.0-203405+ #66 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries NIP: c0000000006bbe5c LR: c000000000a13e68 CTR: c0000000000579f8 REGS: c00000009924f240 TRAP: 0300 Not tainted (6.7.0-203405+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002220 XER: 20040006 CFAR: c000000000a13e64 DAR: 0000000000000030 DSISR: 40000000 IRQMASK: 0 ... NIP sysfs_add_link_to_group+0x34/0x94 LR iommu_device_link+0x5c/0x118 Call Trace: iommu_init_device+0x26c/0x318 (unreliable) iommu_device_link+0x5c/0x118 iommu_init_device+0xa8/0x318 iommu_probe_device+0xc0/0x134 iommu_bus_notifier+0x44/0x104 notifier_call_chain+0xb8/0x19c blocking_notifier_call_chain+0x64/0x98 bus_notify+0x50/0x7c device_add+0x640/0x918 pci_device_add+0x23c/0x298 of_create_pci_dev+0x400/0x884 of_scan_pci_dev+0x124/0x1b0 __of_scan_bus+0x78/0x18c pcibios_scan_phb+0x2a4/0x3b0 init_phb_dynamic+0xb8/0x110 dlpar_add_slot+0x170/0x3b8 [rpadlpar_io] add_slot_store.part.0+0xb4/0x130 [rpadlpar_io] kobj_attr_store+0x2c/0x48 sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x350/0x4a0 ksys_write+0x84/0x140 system_call_exception+0x124/0x330 system_call_vectored_common+0x15c/0x2ec Commit a940904443e4 ("powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains") broke DLPAR add of PCI devices. The above added iommu_device structure to pci_controller. During system boot, PCI devices are discovered and this newly added iommu_device structure is initialized by a call to iommu_device_register(). During DLPAR add of a PCI device, a new pci_controller structure is allocated but there are no calls made to iommu_device_register() interface. Fix is to register the iommu device during DLPAR add as well. Fixes: a940904443e4 ("powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains") Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com> [mpe: Trim oops and tweak some change log wording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240122222407.39603-1-gbatra@linux.ibm.com
2024-02-05x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat sectionArd Biesheuvel2-11/+9
The .compat section is a dummy PE section that contains the address of the 32-bit entrypoint of the 64-bit kernel image if it is bootable from 32-bit firmware (i.e., CONFIG_EFI_MIXED=y) This section is only 8 bytes in size and is only referenced from the loader, and so it is placed at the end of the memory view of the image, to avoid the need for padding it to 4k, which is required for sections appearing in the middle of the image. Unfortunately, this violates the PE/COFF spec, and even if most EFI loaders will work correctly (including the Tianocore reference implementation), PE loaders do exist that reject such images, on the basis that both the file and memory views of the file contents should be described by the section headers in a monotonically increasing manner without leaving any gaps. So reorganize the sections to avoid this issue. This results in a slight padding overhead (< 4k) which can be avoided if desired by disabling CONFIG_EFI_MIXED (which is only needed in rare cases these days) Fixes: 3e3eabe26dc8 ("x86/boot: Increase section and file alignment to 4k/512") Reported-by: Mike Beaton <mjsbeaton@gmail.com> Link: https://lkml.kernel.org/r/CAHzAAWQ6srV6LVNdmfbJhOwhBw5ZzxxZZ07aHt9oKkfYAdvuQQ%40mail.gmail.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-02-05powerpc/pseries/papr-sysparm: use u8 arrays for payloadsNathan Lynch2-2/+2
Some PAPR system parameter values are formatted by firmware as nul-terminated strings (e.g. LPAR name, shared processor attributes). But the values returned for other parameters, such as processor module info and TLB block invalidate characteristics, are binary data with parameter-specific layouts. So char[] isn't the appropriate type for the general case. Use u8/__u8. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 905b9e48786e ("powerpc/pseries/papr-sysparm: Expose character device to user space") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240202-papr-sysparm-ioblock-data-use-u8-v1-1-f5c6c89f65ec@linux.ibm.com
2024-02-04KVM: arm64: Do not source virt/lib/Kconfig twiceMasahiro Yamada1-1/+0
For ARCH=arm64, virt/lib/Kconfig is sourced twice, from arch/arm64/kvm/Kconfig and from drivers/vfio/Kconfig. There is no good reason to parse virt/lib/Kconfig twice. Commit 2412405b3141 ("KVM: arm/arm64: register irq bypass consumer on ARM/ARM64") should not have added this 'source' directive. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240204074305.31492-1-masahiroy@kernel.org
2024-02-03KVM: x86/pmu: Fix type length error when reading pmu->fixed_ctr_ctrlMingwei Zhang1-1/+1
Use a u64 instead of a u8 when taking a snapshot of pmu->fixed_ctr_ctrl when reprogramming fixed counters, as truncating the value results in KVM thinking fixed counter 2 is already disabled (the bug also affects fixed counters 3+, but KVM doesn't yet support those). As a result, if the guest disables fixed counter 2, KVM will get a false negative and fail to reprogram/disable emulation of the counter, which can leads to incorrect counts and spurious PMIs in the guest. Fixes: 76d287b2342e ("KVM: x86/pmu: Drop "u8 ctrl, int idx" for reprogram_fixed_counter()") Cc: stable@vger.kernel.org Signed-off-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20240123221220.3911317-1-mizhang@google.com [sean: rewrite changelog to call out the effects of the bug] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-02Merge tag 'iommu-fixes-v6.8-rc2' of ↵Linus Torvalds1-9/+28
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pul iommu fixes from Joerg Roedel: - Make iommu_ops->default_domain work without CONFIG_IOMMU_DMA to fix initialization of FSL-PAMU devices - Fix for Tegra fbdev initialization failure - Fix for a VFIO device unbinding failure on PowerPC * tag 'iommu-fixes-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: powerpc: iommu: Bring back table group release_ownership() call drm/tegra: Do not assume that a NULL domain means no DMA IOMMU iommu: Allow ops->default_domain to work when !CONFIG_IOMMU_DMA
2024-02-02Merge tag 'arm64-fixes' of ↵Linus Torvalds4-16/+4
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Two small fixes. The first one is an alternative fix for the SCS patching problem we thought we'd fixed in -rc1; it turned out not to be robust with all toolchains/configs, so this is a revert+retry which has seen some more testing. The other one simply removes an unused header file, but I couldn't resist the negative diffstat. - Really fix shadow call stack patching with LTO=full - Remove unused (empty) header file generated from the compat vDSO" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: vdso32: Remove unused vdso32-offsets.h arm64: scs: Disable LTO for SCS patching code arm64: Revert "scs: Work around full LTO issue with dynamic SCS"
2024-02-02powerpc: iommu: Bring back table group release_ownership() callShivaprasad G Bhat1-9/+28
The commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") refactored the code removing the set_platform_dma_ops(). It missed out the table group release_ownership() call which would have got called otherwise during the guest shutdown via vfio_group_detach_container(). On PPC64, this particular call actually sets up the 32-bit TCE table, and enables the 64-bit DMA bypass etc. Now after guest shutdown, the subsequent host driver (e.g megaraid-sas) probe post unbind from vfio-pci fails like, megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0x7fffffffffffffff, table unavailable megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0xffffffff, table unavailable megaraid_sas 0031:01:00.0: Failed to set DMA mask megaraid_sas 0031:01:00.0: Failed from megasas_init_fw 6539 The patch brings back the call to table_group release_ownership() call when switching back to PLATFORM domain from BLOCKED, while also separates the domain_ops for both. Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/170628173462.3742.18330000394415935845.stgit@ltcd48-lp2.aus.stglab.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-02Merge patch series "svnapot fixes"Palmer Dabbelt1-2/+60
Alexandre Ghiti <alexghiti@rivosinc.com> says: While merging riscv napot and arm64 contpte support, I noticed we did not abide by the specification which states that we should clear a napot mapping before setting a new one, called "break before make" in arm64 (patch 1). And also that we did not add the new hugetlb page size added by napot in hugetlb_mask_last_page() (patch 2). * b4-shazam-merge: riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled riscv: Fix set_huge_pte_at() for NAPOT mapping Link: https://lore.kernel.org/r/20240117195741.1926459-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-02riscv: Fix hugetlb_mask_last_page() when NAPOT is enabledAlexandre Ghiti1-0/+20
When NAPOT is enabled, a new hugepage size is available and then we need to make hugetlb_mask_last_page() aware of that. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240117195741.1926459-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-02riscv: Fix set_huge_pte_at() for NAPOT mappingAlexandre Ghiti1-2/+40
As stated by the privileged specification, we must clear a NAPOT mapping and emit a sfence.vma before setting a new translation. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240117195741.1926459-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-01Merge tag 'parisc-for-6.8-rc3' of ↵Linus Torvalds10-74/+118
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: "The current exception handler, which helps on kernel accesses to userspace, may exhibit data corruption. The problem is that it is not guaranteed that the compiler will use the processor register we specified in the source code, but may choose another register which then will lead to silent register- and data corruption. To fix this issue we now use another strategy to help the exception handler to always find and set the error code into the correct CPU register. The other fixes are small: fixing CPU hotplug bringup, fix the page alignment of the RO_DATA section, added a check for the calculated cache stride and fix possible hangups when printing longer output at bootup when running on serial console. Most of the patches are tagged for stable series. - Fix random data corruption triggered by exception handler - Fix crash when setting up BTLB at CPU bringup - Prevent hung tasks when printing inventory on serial console - Make RO_DATA page aligned in vmlinux.lds.S - Add check for valid cache stride size" * tag 'parisc-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: BTLB: Fix crash when setting up BTLB at CPU bringup parisc: Fix random data corruption from exception handler parisc: Drop unneeded semicolon in parse_tree_node() parisc: Prevent hung tasks when printing inventory on serial console parisc: Check for valid stride size for cache flushes parisc: Make RO_DATA page aligned in vmlinux.lds.S
2024-02-01Merge tag 'kbuild-fixes-v6.8' of ↵Linus Torvalds4-10/+12
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix UML build with clang-18 and newer - Avoid using the alias attribute in host programs - Replace tabs with spaces when followed by conditionals for future GNU Make versions - Fix rpm-pkg for the systemd-provided kernel-install tool - Fix the undefined behavior in Kconfig for a 'int' symbol used in a conditional * tag 'kbuild-fixes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: initialize sym->curr.tri to 'no' for all symbol types again kbuild: rpm-pkg: simplify installkernel %post kbuild: Replace tabs with spaces when followed by conditionals modpost: avoid using the alias attribute kbuild: fix W= flags in the help message modpost: Add '.ltext' and '.ltext.*' to TEXT_SECTIONS um: Fix adding '-no-pie' for clang kbuild: defconf: use SRCARCH to find merged configs
2024-02-01KVM: x86: Make gtod_is_based_on_tsc() return 'bool'Vitaly Kuznetsov1-1/+1
gtod_is_based_on_tsc() is boolean in nature, i.e. it returns '1' for good clocksources and '0' otherwise. Moreover, its result is used raw by kvm_get_time_and_clockread()/kvm_get_walltime_and_clockread() which are 'bool'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20240109141121.1619463-6-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-01x86/kvm: Fix SEV check in sev_map_percpu_data()Kirill A. Shutemov1-1/+2
The function sev_map_percpu_data() checks if it is running on an SEV platform by checking the CC_ATTR_GUEST_MEM_ENCRYPT attribute. However, this attribute is also defined for TDX. To avoid false positives, add a cc_vendor check. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Fixes: 4d96f9109109 ("x86/sev: Replace occurrences of sev_active() with cc_platform_has()") Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: David Rientjes <rientjes@google.com> Message-Id: <20240124130317.495519-1-kirill.shutemov@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-02-01KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratumMaciej S. Szmigiero4-0/+59
Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17") kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs. Because KVM CPU caps are initialized from the kernel boot CPU features this makes the XSAVES feature also unavailable for KVM guests in this case. At the same time the XSAVEC feature is left enabled. Unfortunately, having XSAVEC but no XSAVES in CPUID breaks Hyper-V enabled Windows Server 2016 VMs that have more than one vCPU. Let's at least give users hint in the kernel log what could be wrong since these VMs currently simply hang at boot with a black screen - giving no clue what suddenly broke them and how to make them work again. Trigger the kernel message hint based on the particular guest ID written to the Guest OS Identity Hyper-V MSR implemented by KVM. Defer this check to when the L1 Hyper-V hypervisor enables SVM in EFER since we want to limit this message to Hyper-V enabled Windows guests only (Windows session running nested as L2) but the actual Guest OS Identity MSR write is done by L1 and happens before it enables SVM. Fixes: b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17") Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <b83ab45c5e239e5d148b0ae7750133a67ac9575c.1706127425.git.maciej.szmigiero@oracle.com> [Move some checks before mutex_lock(), rename function. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-02-01KVM: x86: Check irqchip mode before create PITTengfei Yu1-0/+3
As the kvm api(https://docs.kernel.org/virt/kvm/api.html) reads, KVM_CREATE_PIT2 call is only valid after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. Without this check, I can create PIT first and enable irqchip-split then, which may cause the PIT invalid because of lacking of in-kernel PIC to inject the interrupt. Signed-off-by: Tengfei Yu <moehanabichan@gmail.com> Message-Id: <20240125050823.4893-1-moehanabichan@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-31riscv: mm: execute local TLB flush after populating vmemmapVincent Chen3-1/+7
The spare_init() calls memmap_populate() many times to create VA to PA mapping for the VMEMMAP area, where all "struct page" are located once CONFIG_SPARSEMEM_VMEMMAP is defined. These "struct page" are later initialized in the zone_sizes_init() function. However, during this process, no sfence.vma instruction is executed for this VMEMMAP area. This omission may cause the hart to fail to perform page table walk because some data related to the address translation is invisible to the hart. To solve this issue, the local_flush_tlb_kernel_range() is called right after the sparse_init() to execute a sfence.vma instruction for this VMEMMAP area, ensuring that all data related to the address translation is visible to the hart. Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem") Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240117140333.2479667-1-vincent.chen@sifive.com Fixes: 7a92fc8b4d20 ("mm: Introduce flush_cache_vmap_early()") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-31KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpuPrasad Pandit1-1/+2
kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI' request for a vcpu even when its 'events->nmi.pending' is zero. Ex: qemu_thread_start kvm_vcpu_thread_fn qemu_wait_io_event qemu_wait_io_event_common process_queued_cpu_work do_kvm_cpu_synchronize_post_init/_reset kvm_arch_put_registers kvm_put_vcpu_events (cpu, level=[2|3]) This leads vCPU threads in QEMU to constantly acquire & release the global mutex lock, delaying the guest boot due to lock contention. Add check to make KVM_REQ_NMI request only if vcpu has NMI pending. Fixes: bdedff263132 ("KVM: x86: Route pending NMIs from userspace through process_nmi()") Cc: stable@vger.kernel.org Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Link: https://lore.kernel.org/r/20240103075343.549293-1-ppandit@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31kbuild: Replace tabs with spaces when followed by conditionalsDmitry Goncharov3-9/+9
This is needed for the future (post make-4.4.1) versions of gnu make. Starting from https://git.savannah.gnu.org/cgit/make.git/commit/?id=07fcee35f058a876447c8a021f9eb1943f902534 gnu make won't allow conditionals to follow recipe prefix. For example there is a tab followed by ifeq on line 324 in the root Makefile. With the new make this conditional causes the following $ make cpu.o /home/dgoncharov/src/linux-kbuild/Makefile:2063: *** missing 'endif'. Stop. make: *** [Makefile:240: __sub-make] Error 2 This patch replaces tabs followed by conditionals with 8 spaces. See https://savannah.gnu.org/bugs/?64185 and https://savannah.gnu.org/bugs/?64259 for details. Signed-off-by: Dmitry Goncharov <dgoncharov@users.sf.net> Reported-by: Martin Dorey <martin.dorey@hitachivantara.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-31parisc: BTLB: Fix crash when setting up BTLB at CPU bringupHelge Deller1-1/+1
When using hotplug and bringing up a 32-bit CPU, ask the firmware about the BTLB information to set up the static (block) TLB entries. For that write access to the static btlb_info struct is needed, but since it is marked __ro_after_init the kernel segfaults with missing write permissions. Fix the crash by dropping the __ro_after_init annotation. Fixes: e5ef93d02d6c ("parisc: BTLB: Initialize BTLB tables at CPU startup") Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v6.6+
2024-01-31KVM: arm64: Fix circular locking dependencySebastian Ene1-10/+17
The rule inside kvm enforces that the vcpu->mutex is taken *inside* kvm->lock. The rule is violated by the pkvm_create_hyp_vm() which acquires the kvm->lock while already holding the vcpu->mutex lock from kvm_vcpu_ioctl(). Avoid the circular locking dependency altogether by protecting the hyp vm handle with the config_lock, much like we already do for other forms of VM-scoped data. Signed-off-by: Sebastian Ene <sebastianene@google.com> Cc: stable@vger.kernel.org Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240124091027.1477174-2-sebastianene@google.com
2024-01-30parisc: Fix random data corruption from exception handlerHelge Deller8-71/+108
The current exception handler implementation, which assists when accessing user space memory, may exhibit random data corruption if the compiler decides to use a different register than the specified register %r29 (defined in ASM_EXCEPTIONTABLE_REG) for the error code. If the compiler choose another register, the fault handler will nevertheless store -EFAULT into %r29 and thus trash whatever this register is used for. Looking at the assembly I found that this happens sometimes in emulate_ldd(). To solve the issue, the easiest solution would be if it somehow is possible to tell the fault handler which register is used to hold the error code. Using %0 or %1 in the inline assembly is not posssible as it will show up as e.g. %r29 (with the "%r" prefix), which the GNU assembler can not convert to an integer. This patch takes another, better and more flexible approach: We extend the __ex_table (which is out of the execution path) by one 32-word. In this word we tell the compiler to insert the assembler instruction "or %r0,%r0,%reg", where %reg references the register which the compiler choosed for the error return code. In case of an access failure, the fault handler finds the __ex_table entry and can examine the opcode. The used register is encoded in the lowest 5 bits, and the fault handler can then store -EFAULT into this register. Since we extend the __ex_table to 3 words we can't use the BUILDTIME_TABLE_SORT config option any longer. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v6.0+
2024-01-30x86/fpu: Stop relying on userspace for info to fault in xsave bufferAndrei Vagin1-8/+5
Before this change, the expected size of the user space buffer was taken from fx_sw->xstate_size. fx_sw->xstate_size can be changed from user-space, so it is possible construct a sigreturn frame where: * fx_sw->xstate_size is smaller than the size required by valid bits in fx_sw->xfeatures. * user-space unmaps parts of the sigrame fpu buffer so that not all of the buffer required by xrstor is accessible. In this case, xrstor tries to restore and accesses the unmapped area which results in a fault. But fault_in_readable succeeds because buf + fx_sw->xstate_size is within the still mapped area, so it goes back and tries xrstor again. It will spin in this loop forever. Instead, fault in the maximum size which can be touched by XRSTOR (taken from fpstate->user_size). [ dhansen: tweak subject / changelog ] Fixes: fcb3635f5018 ("x86/fpu/signal: Handle #PF in the direct restore path") Reported-by: Konstantin Bogomolov <bogomolov@google.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrei Vagin <avagin@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240130063603.3392627-1-avagin%40google.com
2024-01-30arm64: vdso32: Remove unused vdso32-offsets.hKevin Brodsky3-13/+1
Commit 2d071968a405 ("arm64: compat: Remove 32-bit sigreturn code from the vDSO") removed all VDSO_* symbols in the compat vDSO. As a result, vdso32-offsets.h is now empty and therefore unused. Time to remove it. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20240129154748.1727759-1-kevin.brodsky@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-01-30arm64: scs: Disable LTO for SCS patching codeArd Biesheuvel1-0/+6
Full LTO takes the '-mbranch-protection=none' passed to the compiler when generating the dynamic shadow call stack patching code as a hint to stop emitting PAC instructions altogether. (Thin LTO appears unaffected by this) Work around this by disabling LTO for the compilation unit, which appears to convince the linker that it should still use PAC in the rest of the kernel.. Fixes: 3b619e22c460 ("arm64: implement dynamic shadow call stack for Clang") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20240123133052.1417449-6-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2024-01-30arm64: Revert "scs: Work around full LTO issue with dynamic SCS"Ard Biesheuvel1-7/+1
This reverts commit 8c5a19cb17a71e ("arm64: scs: Work around full LTO issue with dynamic SCS"), which did not quite fix the issue as intended. Apparently, -fno-unwind-tables is ignored for the final full LTO link when it is set on any of the objects, resulting in an early boot crash due to the SCS patching code patching itself, and attempting to pop the return address from the shadow stack while the associated push was still a PACIASP instruction when it executed. Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20240123133052.1417449-5-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org>
2024-01-30Merge tag 'mm-hotfixes-stable-2024-01-28-23-21' of ↵Linus Torvalds2-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 11 are cc:stable and the remainder address post-6.7 issues or aren't considered appropriate for backporting" * tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) mm: thp_get_unmapped_area must honour topdown preference mm: huge_memory: don't force huge page alignment on 32 bit userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory scs: add CONFIG_MMU dependency for vfree_atomic() mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range() mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty() selftests/mm: Update va_high_addr_switch.sh to check CPU for la57 flag selftests: mm: fix map_hugetlb failure on 64K page size systems MAINTAINERS: supplement of zswap maintainers update stackdepot: make fast paths lock-less again stackdepot: add stats counters exported via debugfs mm, kmsan: fix infinite recursion due to RCU critical section mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again selftests/mm: switch to bash from sh MAINTAINERS: add man-pages git trees mm: memcontrol: don't throttle dying tasks on memory.high mm: mmap: map MAP_STACK to VM_NOHUGEPAGE uprobes: use pagesize-aligned virtual address when replacing pages selftests/mm: mremap_test: fix build warning ...
2024-01-29x86/lib: Revert to _ASM_EXTABLE_UA() for {get,put}_user() fixupsQiuxu Zhuo2-22/+22
During memory error injection test on kernels >= v6.4, the kernel panics like below. However, this issue couldn't be reproduced on kernels <= v6.3. mce: [Hardware Error]: CPU 296: Machine Check Exception: f Bank 1: bd80000000100134 mce: [Hardware Error]: RIP 10:<ffffffff821b9776> {__get_user_nocheck_4+0x6/0x20} mce: [Hardware Error]: TSC 411a93533ed ADDR 346a8730040 MISC 86 mce: [Hardware Error]: PROCESSOR 0:a06d0 TIME 1706000767 SOCKET 1 APIC 211 microcode 80001490 mce: [Hardware Error]: Run the above through 'mcelog --ascii' mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel Kernel panic - not syncing: Fatal local machine check The MCA code can recover from an in-kernel #MC if the fixup type is EX_TYPE_UACCESS, explicitly indicating that the kernel is attempting to access userspace memory. However, if the fixup type is EX_TYPE_DEFAULT the only thing that is raised for an in-kernel #MC is a panic. ex_handler_uaccess() would warn if users gave a non-canonical addresses (with bit 63 clear) to {get, put}_user(), which was unexpected. Therefore, commit b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") replaced _ASM_EXTABLE_UA() with _ASM_EXTABLE() for {get, put}_user() fixups. However, the new fixup type EX_TYPE_DEFAULT results in a panic. Commit 6014bc27561f ("x86-64: make access_ok() independent of LAM") added the check gp_fault_address_ok() right before the WARN_ONCE() in ex_handler_uaccess() to not warn about non-canonical user addresses due to LAM. With that in place, revert back to _ASM_EXTABLE_UA() for {get,put}_user() exception fixups in order to be able to handle in-kernel MCEs correctly again. [ bp: Massage commit message. ] Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240129063842.61584-1-qiuxu.zhuo@intel.com
2024-01-29riscv: Fix wrong size passed to local_flush_tlb_range_asid()Alexandre Ghiti1-1/+1
local_flush_tlb_range_asid() takes the size as argument, not the end of the range to flush, so fix this by computing the size from the end and the start of the range. Fixes: 7a92fc8b4d20 ("mm: Introduce flush_cache_vmap_early()") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
2024-01-28Merge tag 'mips-fixes_6.8_1' of ↵Linus Torvalds34-230/+83
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix boot issue on single core Lantiq Danube devices - fix boot issue on Loongson64 platforms - fix improper FPU setup - fix missing prototypes issues * tag 'mips-fixes_6.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan MIPS: loongson64: set nid for reserved memblock region Revert "MIPS: loongson64: set nid for reserved memblock region" MIPS: lantiq: register smp_ops on non-smp platforms MIPS: loongson64: set nid for reserved memblock region MIPS: reserve exception vector space ONLY ONCE MIPS: BCM63XX: Fix missing prototypes MIPS: sgi-ip32: Fix missing prototypes MIPS: sgi-ip30: Fix missing prototypes MIPS: fw arc: Fix missing prototypes MIPS: sgi-ip27: Fix missing prototypes MIPS: Alchemy: Fix missing prototypes MIPS: Cobalt: Fix missing prototypes
2024-01-28Merge tag 'x86_urgent_for_v6.8_rc2' of ↵Linus Torvalds5-12/+49
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure 32-bit syscall registers are properly sign-extended - Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater Forest CPU model number - Make a stub function export non-GPL because it is part of the paravirt alternatives and that can be used by non-GPL code * tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5 x86/entry/ia32: Ensure s32 is sign extended to s64 x86/cpu: Add model number for Intel Clearwater Forest processor x86/CPU/AMD: Add X86_FEATURE_ZEN5 x86/paravirt: Make BUG_func() usable by non-GPL modules
2024-01-28parisc: Drop unneeded semicolon in parse_tree_node()Helge Deller1-1/+1
Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Helge Deller <deller@gmx.de> Closes: https://lore.kernel.org/oe-kbuild-all/202401222059.Wli6OGT0-lkp@intel.com/
2024-01-28parisc: Prevent hung tasks when printing inventory on serial consoleHelge Deller1-0/+3
Printing the inventory on a serial console can be quite slow and thus may trigger the hung task detector (CONFIG_DETECT_HUNG_TASK=y) and possibly reboot the machine. Adding a cond_resched() prevents this. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v6.0+
2024-01-28parisc: Check for valid stride size for cache flushesHelge Deller1-0/+4
Report if the calculated cache stride size is zero, otherwise the cache flushing routine will never finish and hang the machine. This can be reproduced with a testcase in qemu, where the firmware reports wrong cache values. Signed-off-by: Helge Deller <deller@gmx.de>
2024-01-28parisc: Make RO_DATA page aligned in vmlinux.lds.SHelge Deller1-1/+1
The rodata_test program for CONFIG_DEBUG_RODATA_TEST=y complains if read-only data does not start at page boundary. Signed-off-by: Helge Deller <deller@gmx.de>
2024-01-27Merge tag 'loongarch-fixes-6.8-1' of ↵Linus Torvalds4-11/+14
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix boot failure on machines with more than 8 nodes, and fix two build errors about KVM" * tag 'loongarch-fixes-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Add returns to SIMD stubs LoongArch: KVM: Fix build due to API changes LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
2024-01-27um: Fix adding '-no-pie' for clangNathan Chancellor1-1/+3
The kernel builds with -fno-PIE, so commit 883354afbc10 ("um: link vmlinux with -no-pie") added the compiler linker flag '-no-pie' via cc-option because '-no-pie' was only supported in GCC 6.1.0 and newer. While this works for GCC, this does not work for clang because cc-option uses '-c', which stops the pipeline right before linking, so '-no-pie' is unconsumed and clang warns, causing cc-option to fail just as it would if the option was entirely unsupported: $ clang -Werror -no-pie -c -o /dev/null -x c /dev/null clang-16: error: argument unused during compilation: '-no-pie' [-Werror,-Wunused-command-line-argument] A recent version of clang exposes this because it generates a relocation under '-mcmodel=large' that is not supported in PIE mode: /usr/sbin/ld: init/main.o: relocation R_X86_64_32 against symbol `saved_command_line' can not be used when making a PIE object; recompile with -fPIE /usr/sbin/ld: failed to set dynamic section sizes: bad value clang: error: linker command failed with exit code 1 (use -v to see invocation) Remove the cc-option check altogether. It is wasteful to invoke the compiler to check for '-no-pie' because only one supported compiler version does not support it, GCC 5.x (as it is supported with the minimum version of clang and GCC 6.1.0+). Use a combination of the gcc-min-version macro and CONFIG_CC_IS_CLANG to unconditionally add '-no-pie' with CONFIG_LD_SCRIPT_DYN=y, so that it is enabled with all compilers that support this. Furthermore, using gcc-min-version can help turn this back into LINK-$(CONFIG_LD_SCRIPT_DYN) += -no-pie when the minimum version of GCC is bumped past 6.1.0. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1982 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-01-27mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nanXi Ruoyao1-0/+6
If we still own the FPU after initializing fcr31, when we are preempted the dirty value in the FPU will be read out and stored into fcr31, clobbering our setting. This can cause an improper floating-point environment after execve(). For example: zsh% cat measure.c #include <fenv.h> int main() { return fetestexcept(FE_INEXACT); } zsh% cc measure.c -o measure -lm zsh% echo $((1.0/3)) # raising FE_INEXACT 0.33333333333333331 zsh% while ./measure; do ; done (stopped in seconds) Call lose_fpu(0) before setting fcr31 to prevent this. Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-27MIPS: loongson64: set nid for reserved memblock regionHuang Pei2-0/+5
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-27Revert "MIPS: loongson64: set nid for reserved memblock region"Thomas Bogendoerfer2-4/+0
This reverts commit ce7b1b97776ec0b068c4dd6b6dbb48ae09a23519. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>