summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2022-08-25MIPS: tlbex: Explicitly compare _PAGE_NO_EXEC against 0Nathan Chancellor1-2/+2
[ Upstream commit 74de14fe05dd6b151d73cb0c73c8ec874cbdcde6 ] When CONFIG_XPA is enabled, Clang warns: arch/mips/mm/tlbex.c:629:24: error: converting the result of '<<' to a boolean; did you mean '(1 << _PAGE_NO_EXEC_SHIFT) != 0'? [-Werror,-Wint-in-bool-context] if (cpu_has_rixi && !!_PAGE_NO_EXEC) { ^ arch/mips/include/asm/pgtable-bits.h:174:28: note: expanded from macro '_PAGE_NO_EXEC' # define _PAGE_NO_EXEC (1 << _PAGE_NO_EXEC_SHIFT) ^ arch/mips/mm/tlbex.c:2568:24: error: converting the result of '<<' to a boolean; did you mean '(1 << _PAGE_NO_EXEC_SHIFT) != 0'? [-Werror,-Wint-in-bool-context] if (!cpu_has_rixi || !_PAGE_NO_EXEC) { ^ arch/mips/include/asm/pgtable-bits.h:174:28: note: expanded from macro '_PAGE_NO_EXEC' # define _PAGE_NO_EXEC (1 << _PAGE_NO_EXEC_SHIFT) ^ 2 errors generated. _PAGE_NO_EXEC can be '0' or '1 << _PAGE_NO_EXEC_SHIFT' depending on the build and runtime configuration, which is what the negation operators are trying to convey. To silence the warning, explicitly compare against 0 so the result of the '<<' operator is not implicitly converted to a boolean. According to its documentation, GCC enables -Wint-in-bool-context with -Wall but this warning is not visible when building the same configuration with GCC. It appears GCC only warns when compiling C++, not C, although the documentation makes no note of this: https://godbolt.org/z/x39q3brxf Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25powerpc/64: Init jump labels before parse_early_param()Zhouyi Zhou1-0/+7
[ Upstream commit ca829e05d3d4f728810cc5e4b468d9ebc7745eb3 ] On 64-bit, calling jump_label_init() in setup_feature_keys() is too late because static keys may be used in subroutines of parse_early_param() which is again subroutine of early_init_devtree(). For example booting with "threadirqs": static_key_enable_cpuslocked(): static key '0xc000000002953260' used before call to jump_label_init() WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xfc/0x120 ... NIP static_key_enable_cpuslocked+0xfc/0x120 LR static_key_enable_cpuslocked+0xf8/0x120 Call Trace: static_key_enable_cpuslocked+0xf8/0x120 (unreliable) static_key_enable+0x30/0x50 setup_forced_irqthreads+0x28/0x40 do_early_param+0xa0/0x108 parse_args+0x290/0x4e0 parse_early_options+0x48/0x5c parse_early_param+0x58/0x84 early_init_devtree+0xd4/0x518 early_setup+0xb4/0x214 So call jump_label_init() just before parse_early_param() in early_init_devtree(). Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> [mpe: Add call trace to change log and minor wording edits.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220726015747.11754-1-zhouzhouyi@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25powerpc/32: Don't always pass -mcpu=powerpc to the compilerChristophe Leroy2-28/+19
[ Upstream commit 446cda1b21d9a6b3697fe399c6a3a00ff4a285f5 ] Since commit 4bf4f42a2feb ("powerpc/kbuild: Set default generic machine type for 32-bit compile"), when building a 32 bits kernel with a bi-arch version of GCC, or when building a book3s/32 kernel, the option -mcpu=powerpc is passed to GCC at all time, relying on it being eventually overriden by a subsequent -mcpu=xxxx. But when building the same kernel with a 32 bits only version of GCC, that is not done, relying on gcc being built with the expected default CPU. This logic has two problems. First, it is a bit fragile to rely on whether the GCC version is bi-arch or not, because today we can have bi-arch versions of GCC configured with a 32 bits default. Second, there are some versions of GCC which don't support -mcpu=powerpc, for instance for e500 SPE-only versions. So, stop relying on this approximative logic and allow the user to decide whether he/she wants to use the toolchain's default CPU or if he/she wants to set one, and allow only possible CPUs based on the selected target. Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d4df724691351531bf46d685d654689e5dfa0d74.1657549153.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25RISC-V: Add fast call path of crash_kexec()Xianting Tian1-0/+4
[ Upstream commit 3f1901110a89b0e2e13adb2ac8d1a7102879ea98 ] Currently, almost all archs (x86, arm64, mips...) support fast call of crash_kexec() when "regs && kexec_should_crash()" is true. But RISC-V not, it can only enter crash system via panic(). However panic() doesn't pass the regs of the real accident scene to crash_kexec(), it caused we can't get accurate backtrace via gdb, $ riscv64-linux-gnu-gdb vmlinux vmcore Reading symbols from vmlinux... [New LWP 95] #0 console_unlock () at kernel/printk/printk.c:2557 2557 if (do_cond_resched) (gdb) bt #0 console_unlock () at kernel/printk/printk.c:2557 #1 0x0000000000000000 in ?? () With the patch we can get the accurate backtrace, $ riscv64-linux-gnu-gdb vmlinux vmcore Reading symbols from vmlinux... [New LWP 95] #0 0xffffffe00063a4e0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81 81 *(int *)p = 0xdead; (gdb) (gdb) bt #0 0xffffffe00064d5c0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81 #1 0x0000000000000000 in ?? () Test code to produce NULL address dereference in test_crash.c, void *p = NULL; *(int *)p = 0xdead; Reviewed-by: Guo Ren <guoren@kernel.org> Tested-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220606082308.2883458-1-xianting.tian@linux.alibaba.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25riscv: mmap with PROT_WRITE but no PROT_READ is invalidCeleste Liu1-3/+2
[ Upstream commit 2139619bcad7ac44cc8f6f749089120594056613 ] As mentioned in Table 4.5 in RISC-V spec Volume 2 Section 4.3, write but not read is "Reserved for future use.". For now, they are not valid. In the current code, -wx is marked as invalid, but -w- is not marked as invalid. This patch refines that judgment. Reported-by: xctan <xc-tan@outlook.com> Co-developed-by: dram <dramforever@live.com> Signed-off-by: dram <dramforever@live.com> Co-developed-by: Ruizhe Pan <c141028@gmail.com> Signed-off-by: Ruizhe Pan <c141028@gmail.com> Signed-off-by: Celeste Liu <coelacanthus@outlook.com> Link: https://lore.kernel.org/r/PH7PR14MB559464DBDD310E755F5B21E8CEDC9@PH7PR14MB5594.namprd14.prod.outlook.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25mips: cavium-octeon: Fix missing of_node_put() in octeon2_usb_clocks_startLiang He1-1/+2
[ Upstream commit 7a9f743ceead60ed454c46fbc3085ee9a79cbebb ] We should call of_node_put() for the reference 'uctl_node' returned by of_get_parent() which will increase the refcount. Otherwise, there will be a refcount leak bug. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25csky/kprobe: reclaim insn_slot on kprobe unregistrationLiao Chang1-0/+4
[ Upstream commit a2310c74d418deca0f1d749c45f1f43162510f51 ] On kprobe registration kernel allocate one insn_slot for new kprobe, but it forget to reclaim the insn_slot on unregistration, leading to a potential leakage. Reported-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Liao Chang <liaochang1@huawei.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25um: add "noreboot" command line option for PANIC_TIMEOUT=-1 setupsJason A. Donenfeld1-1/+16
[ Upstream commit dda520d07b95072a0b63f6c52a8eb566d08ea897 ] QEMU has a -no-reboot option, which halts instead of reboots when the guest asks to reboot. This is invaluable when used with CONFIG_PANIC_TIMEOUT=-1 (and panic_on_warn), because it allows panics and warnings to be caught immediately in CI. Implement this in UML too, by way of a basic setup param. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25powerpc/pci: Fix get_phb_number() lockingMichael Ellerman1-6/+10
commit 8d48562a2729742f767b0fdd994d6b2a56a49c63 upstream. The recent change to get_phb_number() causes a DEBUG_ATOMIC_SLEEP warning on some systems: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by swapper/1: #0: c157efb0 (hose_spinlock){+.+.}-{2:2}, at: pcibios_alloc_controller+0x64/0x220 Preemption disabled at: [<00000000>] 0x0 CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0-yocto-standard+ #1 Call Trace: [d101dc90] [c073b264] dump_stack_lvl+0x50/0x8c (unreliable) [d101dcb0] [c0093b70] __might_resched+0x258/0x2a8 [d101dcd0] [c0d3e634] __mutex_lock+0x6c/0x6ec [d101dd50] [c0a84174] of_alias_get_id+0x50/0xf4 [d101dd80] [c002ec78] pcibios_alloc_controller+0x1b8/0x220 [d101ddd0] [c140c9dc] pmac_pci_init+0x198/0x784 [d101de50] [c140852c] discover_phbs+0x30/0x4c [d101de60] [c0007fd4] do_one_initcall+0x94/0x344 [d101ded0] [c1403b40] kernel_init_freeable+0x1a8/0x22c [d101df10] [c00086e0] kernel_init+0x34/0x160 [d101df30] [c001b334] ret_from_kernel_thread+0x5c/0x64 This is because pcibios_alloc_controller() holds hose_spinlock but of_alias_get_id() takes of_mutex which can sleep. The hose_spinlock protects the phb_bitmap, and also the hose_list, but it doesn't need to be held while get_phb_number() calls the OF routines, because those are only looking up information in the device tree. So fix it by having get_phb_number() take the hose_spinlock itself, only where required, and then dropping the lock before returning. pcibios_alloc_controller() then needs to take the lock again before the list_add() but that's safe, the order of the list is not important. Fixes: 0fe1e96fef0a ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220815065550.1303620-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: add force_successful_syscall_return()Al Viro2-0/+8
commit fd0c153daad135d0ec1a53c5dbe6936a724d6ae1 upstream. If we use the ancient SysV syscall ABI, we'd better have tell the kernel how to claim that a negative return value is a success. Use ->orig_r2 for that - it's inaccessible via ptrace, so it's a fair game for changes and it's normally[*] non-negative on return from syscall. Set to -1; syscall is not going to be restart-worthy by definition, so we won't interfere with that use either. [*] the only exception is rt_sigreturn(), where we skip the entire messing with r1/r2 anyway. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: restarts apply only to the first sigframe we build...Al Viro1-0/+1
commit 411a76b7219555c55867466c82d70ce928d6c9e1 upstream. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: fix syscall restart checksAl Viro1-1/+1
commit 2d631bd58fe0ea3e3350212e23c9aba1fb606514 upstream. sys_foo() returns -512 (aka -ERESTARTSYS) => do_signal() sees 512 in r2 and 1 in r1. sys_foo() returns 512 => do_signal() sees 512 in r2 and 0 in r1. The former is restart-worthy; the latter obviously isn't. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: traced syscall does need to check the syscall numberAl Viro1-3/+8
commit 25ba820ef36bdbaf9884adeac69b6e1821a7df76 upstream. all checks done before letting the tracer modify the register state are worthless... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: don't leave NULLs in sys_call_table[]Al Viro2-1/+1
commit 45ec746c65097c25e77d24eae8fee0def5b6cc5d upstream. fill the gaps in there with sys_ni_syscall, as everyone does... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25nios2: page fault et.al. are *not* restartable syscalls...Al Viro2-4/+3
commit 8535c239ac674f7ead0f2652932d35c52c4123b2 upstream. make sure that ->orig_r2 is negative for everything except the syscalls. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25x86/mm: Use proper mask when setting PUD mappingAaron Lu1-1/+1
commit 88e0a74902f894fbbc55ad3ad2cb23b4bfba555c upstream. Commit c164fbb40c43f("x86/mm: thread pgprot_t through init_memory_mapping()") mistakenly used __pgprot() which doesn't respect __default_kernel_pte_mask when setting PUD mapping. Fix it by only setting the one bit we actually need (PSE) and leaving the other bits (that have been properly masked) alone. Fixes: c164fbb40c43 ("x86/mm: thread pgprot_t through init_memory_mapping()") Signed-off-by: Aaron Lu <aaron.lu@intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21kvm: x86/pmu: Fix the compare function used by the pmu event filterAaron Lewis1-2/+5
commit 4ac19ead0dfbabd8e0bfc731f507cfb0b95d6c99 upstream. When returning from the compare function the u64 is truncated to an int. This results in a loss of the high nybble[1] in the event select and its sign if that nybble is in use. Switch from using a result that can end up being truncated to a result that can only be: 1, 0, -1. [1] bits 35:32 in the event select register and bits 11:8 in the event select. Fixes: 7ff775aca48ad ("KVM: x86/pmu: Use binary search to check filtered events") Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220517051238.2566934-1-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21KVM: x86: Avoid theoretical NULL pointer dereference in ↵Vitaly Kuznetsov1-0/+4
kvm_irq_delivery_to_apic_fast() commit 00b5f37189d24ac3ed46cb7f11742094778c46ce upstream When kvm_irq_delivery_to_apic_fast() is called with APIC_DEST_SELF shorthand, 'src' must not be NULL. Crash the VM with KVM_BUG_ON() instead of crashing the host. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220325132140.25650-3-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Ghinea <stefan.ghinea@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21KVM: x86: Check lapic_in_kernel() before attempting to set a SynIC irqVitaly Kuznetsov1-0/+3
commit 7ec37d1cbe17d8189d9562178d8b29167fe1c31a upstream When KVM_CAP_HYPERV_SYNIC{,2} is activated, KVM already checks for irqchip_in_kernel() so normally SynIC irqs should never be set. It is, however, possible for a misbehaving VMM to write to SYNIC/STIMER MSRs causing erroneous behavior. The immediate issue being fixed is that kvm_irq_delivery_to_apic() (kvm_irq_delivery_to_apic_fast()) crashes when called with 'irq.shorthand = APIC_DEST_SELF' and 'src == NULL'. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220325132140.25650-2-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Ghinea <stefan.ghinea@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21KVM: x86/pmu: Ignore pmu->global_ctrl check if vPMU doesn't support global_ctrlLike Xu1-0/+3
[ Upstream commit 98defd2e17803263f49548fea930cfc974d505aa ] MSR_CORE_PERF_GLOBAL_CTRL is introduced as part of Architecture PMU V2, as indicated by Intel SDM 19.2.2 and the intel_is_valid_msr() function. So in the absence of global_ctrl support, all PMCs are enabled as AMD does. Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20220509102204.62389-1-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: VMX: Mark all PERF_GLOBAL_(OVF)_CTRL bits reserved if there's no vPMUSean Christopherson1-0/+2
[ Upstream commit 93255bf92939d948bc86d81c6bb70bb0fecc5db1 ] Mark all MSR_CORE_PERF_GLOBAL_CTRL and MSR_CORE_PERF_GLOBAL_OVF_CTRL bits as reserved if there is no guest vPMU. The nVMX VM-Entry consistency checks do not check for a valid vPMU prior to consuming the masks via kvm_valid_perf_global_ctrl(), i.e. may incorrectly allow a non-zero mask to be loaded via VM-Enter or VM-Exit (well, attempted to be loaded, the actual MSR load will be rejected by intel_is_valid_msr()). Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220722224409.1336532-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86/pmu: Introduce the ctrl_mask value for fixed counterLike Xu2-1/+6
[ Upstream commit 2c985527dd8d283e786ad7a67e532ef7f6f00fac ] The mask value of fixed counter control register should be dynamic adjusted with the number of fixed counters. This patch introduces a variable that includes the reserved bits of fixed counter control registers. This is a generic code refactoring. Co-developed-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Message-Id: <20220411101946.20262-6-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86/pmu: Use different raw event masks for AMD and IntelJim Mattson4-1/+5
[ Upstream commit 95b065bf5c431c06c68056a03a5853b660640ecc ] The third nybble of AMD's event select overlaps with Intel's IN_TX and IN_TXCP bits. Therefore, we can't use AMD64_RAW_EVENT_MASK on Intel platforms that support TSX. Declare a raw_event_mask in the kvm_pmu structure, initialize it in the vendor-specific pmu_refresh() functions, and use that mask for PERF_TYPE_RAW configurations in reprogram_gp_counter(). Fixes: 710c47651431 ("KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW") Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20220308012452.3468611-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86/pmu: Use binary search to check filtered eventsJim Mattson1-11/+19
[ Upstream commit 7ff775aca48adc854436b92c060e5eebfffb6a4a ] The PMU event filter may contain up to 300 events. Replace the linear search in reprogram_gp_counter() with a binary search. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220115052431.447232-2-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86/pmu: preserve IA32_PERF_CAPABILITIES across CPUID refreshPaolo Bonzini1-6/+10
[ Upstream commit a755753903a40d982f6dd23d65eb96b248a2577a ] Once MSR_IA32_PERF_CAPABILITIES is changed via vmx_set_msr(), the value should not be changed by cpuid(). To ensure that the new value is kept, the default initialization path is moved to intel_pmu_init(). The effective value of the MSR will be 0 if PDCM is clear, however. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4Sean Christopherson1-9/+14
[ Upstream commit c7d855c2aff2d511fd60ee2e356134c4fb394799 ] Inject a #UD if L1 attempts VMXON with a CR0 or CR4 that is disallowed per the associated nested VMX MSRs' fixed0/1 settings. KVM cannot rely on hardware to perform the checks, even for the few checks that have higher priority than VM-Exit, as (a) KVM may have forced CR0/CR4 bits in hardware while running the guest, (b) there may incompatible CR0/CR4 bits that have lower priority than VM-Exit, e.g. CR0.NE, and (c) userspace may have further restricted the allowed CR0/CR4 values by manipulating the guest's nested VMX MSRs. Note, despite a very strong desire to throw shade at Jim, commit 70f3aac964ae ("kvm: nVMX: Remove superfluous VMX instruction fault checks") is not to blame for the buggy behavior (though the comment...). That commit only removed the CR0.PE, EFLAGS.VM, and COMPATIBILITY mode checks (though it did erroneously drop the CPL check, but that has already been remedied). KVM may force CR0.PE=1, but will do so only when also forcing EFLAGS.VM=1 to emulate Real Mode, i.e. hardware will still #UD. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216033 Fixes: ec378aeef9df ("KVM: nVMX: Implement VMXON and VMXOFF") Reported-by: Eric Li <ercli@ucdavis.edu> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220607213604.3346000-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hookSean Christopherson7-21/+34
[ Upstream commit c2fe3cd4604ac87c587db05d41843d667dc43815 ] Split out VMX's checks on CR4.VMXE to a dedicated hook, .is_valid_cr4(), and invoke the new hook from kvm_valid_cr4(). This fixes an issue where KVM_SET_SREGS would return success while failing to actually set CR4. Fixing the issue by explicitly checking kvm_x86_ops.set_cr4()'s return in __set_sregs() is not a viable option as KVM has already stuffed a variety of vCPU state. Note, kvm_valid_cr4() and is_valid_cr4() have different return types and inverted semantics. This will be remedied in a future patch. Fixes: 5e1746d6205d ("KVM: nVMX: Allow setting the VMXE bit in CR4") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20201007014417.29276-5-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: SVM: Drop VMXE check from svm_set_cr4()Sean Christopherson1-3/+0
[ Upstream commit 311a06593b9a3944a63ed176b95cb8d857f7c83b ] Drop svm_set_cr4()'s explicit check CR4.VMXE now that common x86 handles the check by incorporating VMXE into the CR4 reserved bits, via kvm_cpu_caps. SVM obviously does not set X86_FEATURE_VMX. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20201007014417.29276-4-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4()Sean Christopherson1-12/+7
[ Upstream commit a447e38a7fadb2e554c3942dda183e55cccd5df0 ] Drop vmx_set_cr4()'s explicit check on the 'nested' module param now that common x86 handles the check by incorporating VMXE into the CR4 reserved bits, via kvm_cpu_caps. X86_FEATURE_VMX is set in kvm_cpu_caps (by vmx_set_cpu_caps()), if and only if 'nested' is true. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20201007014417.29276-3-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4()Sean Christopherson1-2/+3
[ Upstream commit d3a9e4146a6f79f19430bca3f2a4d6ebaaffe36b ] Drop vmx_set_cr4()'s somewhat hidden guest_cpuid_has() check on VMXE now that common x86 handles the check by incorporating VMXE into the CR4 reserved bits, i.e. in cr4_guest_rsvd_bits. This fixes a bug where KVM incorrectly rejects KVM_SET_SREGS with CR4.VMXE=1 if it's executed before KVM_SET_CPUID{,2}. Fixes: 5e1746d6205d ("KVM: nVMX: Allow setting the VMXE bit in CR4") Reported-by: Stas Sergeev <stsp@users.sourceforge.net> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20201007014417.29276-2-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21um: Allow PM with suspend-to-idleJohannes Berg5-1/+46
[ Upstream commit 92dcd3d31843fbe1a95d880dc912e1f6beac6632 ] In order to be able to experiment with suspend in UML, add the minimal work to be able to suspend (s2idle) an instance of UML, and be able to wake it back up from that state with the USR1 signal sent to the main UML process. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21kexec, KEYS, s390: Make use of built-in and secondary keyring for signature ↵Michal Suchanek1-5/+13
verification [ Upstream commit 0828c4a39be57768b8788e8cbd0d84683ea757e5 ] commit e23a8020ce4e ("s390/kexec_file: Signature verification prototype") adds support for KEXEC_SIG verification with keys from platform keyring but the built-in keys and secondary keyring are not used. Add support for the built-in keys and secondary keyring as x86 does. Fixes: e23a8020ce4e ("s390/kexec_file: Signature verification prototype") Cc: stable@vger.kernel.org Cc: Philipp Rudo <prudo@linux.ibm.com> Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: "Lee, Chun-Yi" <jlee@suse.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: x86: Signal #GP, not -EPERM, on bad WRMSR(MCi_CTL/STATUS)Sean Christopherson1-2/+2
[ Upstream commit 2368048bf5c2ec4b604ac3431564071e89a0bc71 ] Return '1', not '-1', when handling an illegal WRMSR to a MCi_CTL or MCi_STATUS MSR. The behavior of "all zeros' or "all ones" for CTL MSRs is architectural, as is the "only zeros" behavior for STATUS MSRs. I.e. the intent is to inject a #GP, not exit to userspace due to an unhandled emulation case. Returning '-1' gets interpreted as -EPERM up the stack and effecitvely kills the guest. Fixes: 890ca9aefa78 ("KVM: Add MCE support") Fixes: 9ffd986c6e4e ("KVM: X86: #GP when guest attempts to write MCi_STATUS register w/o 0") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20220512222716.4112548-2-seanjc@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21KVM: set_msr_mce: Permit guests to ignore single-bit ECC errorsLev Kujawski1-2/+5
[ Upstream commit 0471a7bd1bca2a47a5f378f2222c5cf39ce94152 ] Certain guest operating systems (e.g., UNIXWARE) clear bit 0 of MC1_CTL to ignore single-bit ECC data errors. Single-bit ECC data errors are always correctable and thus are safe to ignore because they are informational in nature rather than signaling a loss of data integrity. Prior to this patch, these guests would crash upon writing MC1_CTL, with resultant error messages like the following: error: kvm run failed Operation not permitted EAX=fffffffe EBX=fffffffe ECX=00000404 EDX=ffffffff ESI=ffffffff EDI=00000001 EBP=fffdaba4 ESP=fffdab20 EIP=c01333a5 EFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0108 00000000 ffffffff 00c09300 DPL=0 DS [-WA] CS =0100 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA] SS =0108 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0108 00000000 ffffffff 00c09300 DPL=0 DS [-WA] FS =0000 00000000 ffffffff 00c00000 GS =0000 00000000 ffffffff 00c00000 LDT=0118 c1026390 00000047 00008200 DPL=0 LDT TR =0110 ffff5af0 00000067 00008b00 DPL=0 TSS32-busy GDT= ffff5020 000002cf IDT= ffff52f0 000007ff CR0=8001003b CR2=00000000 CR3=0100a000 CR4=00000230 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 EFER=0000000000000000 Code=08 89 01 89 51 04 c3 8b 4c 24 08 8b 01 8b 51 04 8b 4c 24 04 <0f> 30 c3 f7 05 a4 6d ff ff 10 00 00 00 74 03 0f 31 c3 33 c0 33 d2 c3 8d 74 26 00 0f 31 c3 Signed-off-by: Lev Kujawski <lkujaw@member.fsf.org> Message-Id: <20220521081511.187388-1-lkujaw@member.fsf.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21x86/olpc: fix 'logical not is only applied to the left hand side'Alexander Lobakin1-1/+1
commit 3a2ba42cbd0b669ce3837ba400905f93dd06c79f upstream. The bitops compile-time optimization series revealed one more problem in olpc-xo1-sci.c:send_ebook_state(), resulted in GCC warnings: arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state': arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] 83 | if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state) | ^~ arch/x86/platform/olpc/olpc-xo1-sci.c:83:13: note: add parentheses around left hand side expression to silence this warning Despite this code working as intended, this redundant double negation of boolean value, together with comparing to `char` with no explicit conversion to bool, makes compilers think the author made some unintentional logical mistakes here. Make it the other way around and negate the char instead to silence the warnings. Fixes: d2aa37411b8e ("x86/olpc/xo1/sci: Produce wakeup events for buttons and switches") Cc: stable@vger.kernel.org # 3.5+ Reported-by: Guenter Roeck <linux@roeck-us.net> Reported-by: kernel test robot <lkp@intel.com> Reviewed-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21ftrace/x86: Add back ftrace_expected assignmentSteven Rostedt (Google)1-0/+1
commit ac6c1b2ca77e722a1e5d651f12f437f2f237e658 upstream. When a ftrace_bug happens (where ftrace fails to modify a location) it is helpful to have what was at that location as well as what was expected to be there. But with the conversion to text_poke() the variable that assigns the expected for debugging was dropped. Unfortunately, I noticed this when I needed it. Add it back. Link: https://lkml.kernel.org/r/20220726101851.069d2e70@gandalf.local.home Cc: "x86@kernel.org" <x86@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21x86/bugs: Enable STIBP for IBPB mitigated RETBleedKim Phillips1-4/+6
commit e6cfcdda8cbe81eaf821c897369a65fec987b404 upstream. AMD's "Technical Guidance for Mitigating Branch Type Confusion, Rev. 1.0 2022-07-12" whitepaper, under section 6.1.2 "IBPB On Privileged Mode Entry / SMT Safety" says: Similar to the Jmp2Ret mitigation, if the code on the sibling thread cannot be trusted, software should set STIBP to 1 or disable SMT to ensure SMT safety when using this mitigation. So, like already being done for retbleed=unret, and now also for retbleed=ibpb, force STIBP on machines that have it, and report its SMT vulnerability status accordingly. [ bp: Remove the "we" and remove "[AMD]" applicability parameter which doesn't work here. ] Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # 5.10, 5.15, 5.19 Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Link: https://lore.kernel.org/r/20220804192201.439596-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21x86/entry: Build thunk_$(BITS) only if CONFIG_PREEMPTION=yAndrea Righi4-8/+4
[ Upstream commit de979c83574abf6e78f3fa65b716515c91b2613d ] With CONFIG_PREEMPTION disabled, arch/x86/entry/thunk_$(BITS).o becomes an empty object file. With some old versions of binutils (i.e., 2.35.90.20210113-1ubuntu1) the GNU assembler doesn't generate a symbol table for empty object files and objtool fails with the following error when a valid symbol table cannot be found: arch/x86/entry/thunk_64.o: warning: objtool: missing symbol table To prevent this from happening, build thunk_$(BITS).o only if CONFIG_PREEMPTION is enabled. BugLink: https://bugs.launchpad.net/bugs/1911359 Fixes: 320100a5ffe5 ("x86/entry: Remove the TRACE_IRQS cruft") Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/Ys/Ke7EWjcX+ZlXO@arighi-desktop Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21x86/numa: Use cpumask_available instead of hardcoded NULL checkSiddh Raman Pant1-2/+2
[ Upstream commit 625395c4a0f4775e0fe00f616888d2e6c1ba49db ] GCC-12 started triggering a new warning: arch/x86/mm/numa.c: In function ‘cpumask_of_node’: arch/x86/mm/numa.c:916:39: warning: the comparison will always evaluate as ‘false’ for the address of ‘node_to_cpumask_map’ will never be NULL [-Waddress] 916 | if (node_to_cpumask_map[node] == NULL) { | ^~ node_to_cpumask_map is of type cpumask_var_t[]. When CONFIG_CPUMASK_OFFSTACK is set, cpumask_var_t is typedef'd to a pointer for dynamic allocation, else to an array of one element. The "wicked game" can be checked on line 700 of include/linux/cpumask.h. The original code in debug_cpumask_set_cpu() and cpumask_of_node() were probably written by the original authors with CONFIG_CPUMASK_OFFSTACK=y (i.e. dynamic allocation) in mind, checking if the cpumask was available via a direct NULL check. When CONFIG_CPUMASK_OFFSTACK is not set, GCC gives the above warning while compiling the kernel. Fix that by using cpumask_available(), which does the NULL check when CONFIG_CPUMASK_OFFSTACK is set, otherwise returns true. Use it wherever such checks are made. Conditional definitions of cpumask_available() can be found along with the definition of cpumask_var_t. Check the cpumask.h reference mentioned above. Fixes: c032ef60d1aa ("cpumask: convert node_to_cpumask_map[] to cpumask_var_t") Fixes: de2d9445f162 ("x86: Unify node_to_cpumask_map handling between 32 and 64bit") Signed-off-by: Siddh Raman Pant <code@siddh.me> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20220731160913.632092-1-code@siddh.me Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/pci: Fix PHB numbering when using opal-phbidMichael Ellerman1-4/+6
[ Upstream commit f4b39e88b42d13366b831270306326b5c20971ca ] The recent change to the PHB numbering logic has a logic error in the handling of "ibm,opal-phbid". When an "ibm,opal-phbid" property is present, &prop is written to and ret is set to zero. The following call to of_alias_get_id() is skipped because ret == 0. But then the if (ret >= 0) is true, and the body of that if statement sets prop = ret which throws away the value that was just read from "ibm,opal-phbid". Fix the logic by only doing the ret >= 0 check in the of_alias_get_id() case. Fixes: 0fe1e96fef0a ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias") Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220802105723.1055178-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_addressMiaoqian Lin1-0/+1
[ Upstream commit df5d4b616ee76abc97e5bd348e22659c2b095b1c ] of_get_next_parent() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() in the error path to avoid refcount leak. Fixes: ce21b3c9648a ("[CELL] add support for MSI on Axon-based Cell systems") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220605065129.63906-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/xive: Fix refcount leak in xive_get_max_prioMiaoqian Lin1-0/+1
[ Upstream commit 255b650cbec6849443ce2e0cdd187fd5e61c218c ] of_find_node_by_path() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. Add missing of_node_put() to avoid refcount leak. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220605053225.56125-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/spufs: Fix refcount leak in spufs_init_isolated_loaderMiaoqian Lin1-0/+1
[ Upstream commit 6ac059dacffa8ab2f7798f20e4bd3333890c541c ] of_find_node_by_path() returns remote device nodepointer with refcount incremented, we should use of_node_put() on it when done. Add missing of_node_put() to avoid refcount leak. Fixes: 0afacde3df4c ("[POWERPC] spufs: allow isolated mode apps by starting the SPE loader") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220603121543.22884-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and aliasPali Rohár1-8/+19
[ Upstream commit 0fe1e96fef0a5c53b4c0d1500d356f3906000f81 ] Other Linux architectures use DT property 'linux,pci-domain' for specifying fixed PCI domain of PCI controller specified in Device-Tree. And lot of Freescale powerpc boards have defined numbered pci alias in Device-Tree for every PCIe controller which number specify preferred PCI domain. So prefer usage of DT property 'linux,pci-domain' (via function of_get_pci_domain_nr()) and DT pci alias (via function of_alias_get_id()) on powerpc architecture for assigning PCI domain to PCI controller. Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220706102148.5060-2-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/32: Do not allow selection of e5500 or e6500 CPUs on PPC32Christophe Leroy1-2/+2
[ Upstream commit 9be013b2a9ecb29b5168e4b9db0e48ed53acf37c ] Commit 0e00a8c9fd92 ("powerpc: Allow CPU selection also on PPC32") enlarged the CPU selection logic to PPC32 by removing depend to PPC64, and failed to restrict that depend to E5500_CPU and E6500_CPU. Fortunately that got unnoticed because -mcpu=8540 will override the -mcpu=e500mc64 or -mpcu=e6500 as they are ealier, but that's fragile and may no be right in the future. Add back the depend PPC64 on E5500_CPU and E6500_CPU. Fixes: 0e00a8c9fd92 ("powerpc: Allow CPU selection also on PPC32") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8abab4888da69ff78b73a56f64d9678a7bf684e9.1657549153.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21s390/dump: fix old lowcore virtual vs physical address confusionAlexander Gordeev3-2/+5
[ Upstream commit dc306186a130c6d9feb0aabc1c71b8ed1674a3bf ] Virtual addresses of vmcore_info and os_info members are wrongly passed to copy_oldmem_kernel(), while the function expects physical address of the source. Instead, __pa() macro should have been applied. Yet, use of __pa() macro could be somehow confusing, since copy_oldmem_kernel() may treat the source as an offset, not as a direct physical address (that depens from the oldmem availability and location). Fix the virtual vs physical address confusion and make the way the old lowcore is read consistent across all sources. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21powerpc/perf: Optimize clearing the pending PMI and remove WARN_ON for PMI ↵Athira Rajeev1-20/+15
check in power_pmu_disable [ Upstream commit 890005a7d98f7452cfe86dcfb2aeeb7df01132ce ] commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") added a new function "pmi_irq_pending" in hw_irq.h. This function is to check if there is a PMI marked as pending in Paca (PACA_IRQ_PMI).This is used in power_pmu_disable in a WARN_ON. The intention here is to provide a warning if there is PMI pending, but no counter is found overflown. During some of the perf runs, below warning is hit: WARNING: CPU: 36 PID: 0 at arch/powerpc/perf/core-book3s.c:1332 power_pmu_disable+0x25c/0x2c0 Modules linked in: ----- NIP [c000000000141c3c] power_pmu_disable+0x25c/0x2c0 LR [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 Call Trace: [c000000baffcfb90] [c000000000141c8c] power_pmu_disable+0x2ac/0x2c0 (unreliable) [c000000baffcfc10] [c0000000003e2f8c] perf_pmu_disable+0x4c/0x60 [c000000baffcfc30] [c0000000003e3344] group_sched_out.part.124+0x44/0x100 [c000000baffcfc80] [c0000000003e353c] __perf_event_disable+0x13c/0x240 [c000000baffcfcd0] [c0000000003dd334] event_function+0xc4/0x140 [c000000baffcfd20] [c0000000003d855c] remote_function+0x7c/0xa0 [c000000baffcfd50] [c00000000026c394] flush_smp_call_function_queue+0xd4/0x300 [c000000baffcfde0] [c000000000065b24] smp_ipi_demux_relaxed+0xa4/0x100 [c000000baffcfe20] [c0000000000cb2b0] xive_muxed_ipi_action+0x20/0x40 [c000000baffcfe40] [c000000000207c3c] __handle_irq_event_percpu+0x8c/0x250 [c000000baffcfee0] [c000000000207e2c] handle_irq_event_percpu+0x2c/0xa0 [c000000baffcff10] [c000000000210a04] handle_percpu_irq+0x84/0xc0 [c000000baffcff40] [c000000000205f14] generic_handle_irq+0x54/0x80 [c000000baffcff60] [c000000000015740] __do_irq+0x90/0x1d0 [c000000baffcff90] [c000000000016990] __do_IRQ+0xc0/0x140 [c0000009732f3940] [c000000bafceaca8] 0xc000000bafceaca8 [c0000009732f39d0] [c000000000016b78] do_IRQ+0x168/0x1c0 [c0000009732f3a00] [c0000000000090c8] hardware_interrupt_common_virt+0x218/0x220 This means that there is no PMC overflown among the active events in the PMU, but there is a PMU pending in Paca. The function "any_pmc_overflown" checks the PMCs on active events in cpuhw->n_events. Code snippet: <<>> if (any_pmc_overflown(cpuhw)) clear_pmi_irq_pending(); else WARN_ON(pmi_irq_pending()); <<>> Here the PMC overflown is not from active event. Example: When we do perf record, default cycles and instructions will be running on PMC6 and PMC5 respectively. It could happen that overflowed event is currently not active and pending PMI is for the inactive event. Debug logs from trace_printk: <<>> any_pmc_overflown: idx is 5: pmc value is 0xd9a power_pmu_disable: PMC1: 0x0, PMC2: 0x0, PMC3: 0x0, PMC4: 0x0, PMC5: 0xd9a, PMC6: 0x80002011 <<>> Here active PMC (from idx) is PMC5 , but overflown PMC is PMC6(0x80002011). When we handle PMI interrupt for such cases, if the PMC overflown is from inactive event, it will be ignored. Reference commit: commit bc09c219b2e6 ("powerpc/perf: Fix finding overflowed PMC in interrupt") Patch addresses two changes: 1) Fix 1 : Removal of warning ( WARN_ON(pmi_irq_pending()); ) We were printing warning if no PMC is found overflown among active PMU events, but PMI pending in PACA. But this could happen in cases where PMC overflown is not in active PMC. An inactive event could have caused the overflow. Hence the warning is not needed. To know pending PMI is from an inactive event, we need to loop through all PMC's which will cause more SPR reads via mfspr and increase in context switch. Also in existing function: perf_event_interrupt, already we ignore PMI's overflown when it is from an inactive PMC. 2) Fix 2: optimization in clearing pending PMI. Currently we check for any active PMC overflown before clearing PMI pending in Paca. This is causing additional SPR read also. From point 1, we know that if PMI pending in Paca from inactive cases, that is going to be ignored during replay. Hence if there is pending PMI in Paca, just clear it irrespective of PMC overflown or not. In summary, remove the any_pmc_overflown check entirely in power_pmu_disable. ie If there is a pending PMI in Paca, clear it, since we are in pmu_disable. There could be cases where PMI is pending because of inactive PMC ( which later when replayed also will get ignored ), so WARN_ON could give false warning. Hence removing it. Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220522142256.24699-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21um: random: Don't initialise hwrng struct with zeroChristopher Obbard1-1/+1
[ Upstream commit 9e70cbd11b03889c92462cf52edb2bd023c798fa ] Initialising the hwrng struct with zeros causes a compile-time sparse warning: $ ARCH=um make -j10 W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ... CHECK arch/um/drivers/random.c arch/um/drivers/random.c:31:31: sparse: warning: Using plain integer as NULL pointer Fix the warning by not initialising the hwrng struct with zeros as it is initialised anyway during module init. Fixes: 72d3e093afae ("um: random: Register random as hwrng-core device") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21xtensa: iss: fix handling error cases in iss_net_configure()Yang Yingliang1-17/+15
[ Upstream commit 628ccfc8f5f79dd548319408fcc53949fe97b258 ] The 'pdev' and 'netdev' need to be released in error cases of iss_net_configure(). Change the return type of iss_net_configure() to void, because it's not used. Fixes: 7282bee78798 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21xtensa: iss/network: provide release() callbackMax Filippov1-0/+10
[ Upstream commit 8864fb8359682912ee99235db7db916733a1fd7b ] Provide release() callback for the platform device embedded into struct iss_net_private and registered in the iss_net_configure so that platform_device_unregister could be called for it. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>