path: root/arch
Age  Commit message  Author  Files  Lines
2019-01-23  Disable MSI also when pcie-octeon.pcie_disable on  (YunQiang Su, 1 file, -1/+3)
commit a214720cbf50cd8c3f76bbb9c3f5c283910e9d33 upstream. Octeon has a boot-time option to disable pcie. Since MSI depends on PCI-E, we should also disable MSI when this option is on, in order to avoid inadvertently accessing PCIe registers. Signed-off-by: YunQiang Su <ysu@wavecomp.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: pburton@wavecomp.com Cc: linux-mips@vger.kernel.org Cc: aaro.koskinen@iki.fi Cc: stable@vger.kernel.org # v3.3+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-23  arm64: kaslr: ensure randomized quantities are clean to the PoC  (Ard Biesheuvel, 1 file, -2/+6)
commit 1598ecda7b239e9232dda032bfddeed9d89fab6c upstream. kaslr_early_init() is called with the kernel mapped at its link time offset, and if it returns with a non-zero offset, the kernel is unmapped and remapped again at the randomized offset. During its execution, kaslr_early_init() also randomizes the base of the module region and of the linear mapping of DRAM, and sets two variables accordingly. However, since these variables are assigned with the caches on, they may get lost during the cache maintenance that occurs when unmapping and remapping the kernel, so ensure that these values are cleaned to the PoC. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR") Cc: <stable@vger.kernel.org> # v4.6+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
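A minimal C sketch of the idea described above: the two randomized values are written with the caches on, then explicitly cleaned to the PoC so they survive the kernel being unmapped and remapped. The helper and variable names here (clean_to_poc(), module_region_base, linear_map_offset) are illustrative placeholders, not the kernel's actual symbols.
    #include <stddef.h>

    /* Hypothetical stand-in for the kernel's dcache clean-by-VA-to-PoC routine. */
    extern void clean_to_poc(const void *addr, size_t len);

    extern unsigned long module_region_base;   /* illustrative names only */
    extern unsigned long linear_map_offset;

    static void record_randomized_bases(unsigned long mod_base, unsigned long lin_off)
    {
            /* Assigned with the MMU and caches on ... */
            module_region_base = mod_base;
            linear_map_offset  = lin_off;

            /* ... so push the values out to the Point of Coherency, where they
             * remain visible across the cache maintenance done while the kernel
             * is unmapped and remapped at the randomized offset. */
            clean_to_poc(&module_region_base, sizeof(module_region_base));
            clean_to_poc(&linear_map_offset,  sizeof(linear_map_offset));
    }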
2019-01-23  MIPS: lantiq: Fix IPI interrupt handling  (Hauke Mehrtens, 1 file, -63/+5)
commit 2b4dba55b04b212a7fd1f0395b41d79ee3a9801b upstream. This makes SMP on the vrx200 work again, by removing all the MIPS CPU interrupt-specific code and making it fully use the generic MIPS CPU interrupt controller. The mti,cpu-interrupt-controller from irq-mips-cpu.c now handles the CPU interrupts and also the IPI interrupts which are used for communication between the CPUs in an SMP system. The generic interrupt code was already used before, but the interrupt vectors were overwritten again when we called set_vi_handler() in the lantiq interrupt driver, and we also provided our own plat_irq_dispatch() function which overwrote the weak generic implementation. Now the code uses the generic handler for the MIPS CPU interrupts including the IPI interrupts, and registers a handler for the CPU interrupts which are handled by the lantiq ICU with irq_set_chained_handler(), which was already called before. Calling the set_c0_status() function is also not needed any more because the generic MIPS CPU interrupt code already activates the needed bits. Fixes: 1eed40043579 ("MIPS: smp-mt: Use CPU interrupt controller IPI IRQ domain support") Cc: stable@kernel.org # v4.12 Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: jhogan@kernel.org Cc: ralf@linux-mips.org Cc: john@phrozen.org Cc: linux-mips@linux-mips.org Cc: linux-mips@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-23  mips: fix n32 compat_ipc_parse_version  (Arnd Bergmann, 1 file, -0/+1)
commit 5a9372f751b5350e0ce3d2ee91832f1feae2c2e5 upstream. While reading through the sysvipc implementation, I noticed that the n32 semctl/shmctl/msgctl system calls behave differently based on whether o32 support is enabled or not: Without o32, the IPC_64 flag passed by user space is rejected but calls without that flag get IPC_64 behavior. As far as I can tell, this was inadvertently changed by a cleanup patch but never noticed by anyone, possibly nobody has tried using sysvipc on n32 after linux-3.19. Change it back to the old behavior now. Fixes: 78aaf956ba3a ("MIPS: Compat: Fix build error if CONFIG_MIPS32_COMPAT but no compat ABI.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-23  arm64: Don't trap host pointer auth use to EL2  (Mark Rutland, 1 file, -1/+3)
[ Backport of upstream commit b3669b1e1c09890d61109a1a8ece2c5b66804714 ] To allow EL0 (and/or EL1) to use pointer authentication functionality, we must ensure that pointer authentication instructions and accesses to pointer authentication keys are not trapped to EL2. This patch ensures that HCR_EL2 is configured appropriately when the kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK}, ensuring that EL1 can access keys and permit EL0 use of instructions. For VHE kernels host EL0 (TGE && E2H) is unaffected by these settings, and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't bother setting them. This does not enable support for KVM guests, since KVM manages HCR_EL2 itself when running VMs. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon <will.deacon@arm.com> [kristina: backport to 4.14.y: adjust context] Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23  arm64/kvm: consistently handle host HCR_EL2 flags  (Mark Rutland, 3 files, -4/+4)
[ Backport of upstream commit 4eaed6aa2c628101246bcabc91b203bfac1193f8 ] In KVM we define the configuration of HCR_EL2 for a VHE HOST in HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the non-VHE host flags, and open-code HCR_RW. Further, in head.S we open-code the flags for VHE and non-VHE configurations. In future, we're going to want to configure more flags for the host, so let's add a HCR_HOST_NVHE_FLAGS definition, and consistently use both HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S. We now use mov_q to generate the HCR_EL2 value, as we do when configuring other registers in head.S. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon <will.deacon@arm.com> [kristina: backport to 4.14.y: adjust context] Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-17  x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE  (WANG Chao, 1 file, -1/+1)
commit e4f358916d528d479c3c12bd2fd03f2d5a576380 upstream. Commit 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") replaced the RETPOLINE define with CONFIG_RETPOLINE checks. Remove the remaining pieces. [ bp: Massage commit message. ] Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") Signed-off-by: WANG Chao <chao.wang@ucloud.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: linux-kbuild@vger.kernel.org Cc: srinivas.eeda@oracle.com Cc: stable <stable@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181210163725.95977-1-chao.wang@ucloud.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-17  x86,kvm: move qemu/guest FPU switching out to vcpu_run  (Rik van Riel, 2 files, -21/+26)
commit f775b13eedee2f7f3c6fdd4e90fb79090ce5d339 upstream. Currently, every time a VCPU is scheduled out, the host kernel will first save the guest FPU/xstate context, then load the qemu userspace FPU context, only to then immediately save the qemu userspace FPU context back to memory. When scheduling in a VCPU, the same extraneous FPU loads and saves are done. This could be avoided by moving from a model where the guest FPU is loaded and stored with preemption disabled, to a model where the qemu userspace FPU is swapped out for the guest FPU context for the duration of the KVM_RUN ioctl. This is done under the VCPU mutex, which is also taken when other tasks inspect the VCPU FPU context, so the code should already be safe for this change. That should come as no surprise, given that s390 already has this optimization. This can fix a bug where KVM calls get_user_pages while owning the FPU, and the file system ends up requesting the FPU again: [258270.527947] __warn+0xcb/0xf0 [258270.527948] warn_slowpath_null+0x1d/0x20 [258270.527951] kernel_fpu_disable+0x3f/0x50 [258270.527953] __kernel_fpu_begin+0x49/0x100 [258270.527955] kernel_fpu_begin+0xe/0x10 [258270.527958] crc32c_pcl_intel_update+0x84/0xb0 [258270.527961] crypto_shash_update+0x3f/0x110 [258270.527968] crc32c+0x63/0x8a [libcrc32c] [258270.527975] dm_bm_checksum+0x1b/0x20 [dm_persistent_data] [258270.527978] node_prepare_for_write+0x44/0x70 [dm_persistent_data] [258270.527985] dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data] [258270.527988] submit_io+0x170/0x1b0 [dm_bufio] [258270.527992] __write_dirty_buffer+0x89/0x90 [dm_bufio] [258270.527994] __make_buffer_clean+0x4f/0x80 [dm_bufio] [258270.527996] __try_evict_buffer+0x42/0x60 [dm_bufio] [258270.527998] dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio] [258270.528002] shrink_slab.part.40+0x1f5/0x420 [258270.528004] shrink_node+0x22c/0x320 [258270.528006] do_try_to_free_pages+0xf5/0x330 [258270.528008] try_to_free_pages+0xe9/0x190 [258270.528009] __alloc_pages_slowpath+0x40f/0xba0 [258270.528011] __alloc_pages_nodemask+0x209/0x260 [258270.528014] alloc_pages_vma+0x1f1/0x250 [258270.528017] do_huge_pmd_anonymous_page+0x123/0x660 [258270.528021] handle_mm_fault+0xfd3/0x1330 [258270.528025] __get_user_pages+0x113/0x640 [258270.528027] get_user_pages+0x4f/0x60 [258270.528063] __gfn_to_pfn_memslot+0x120/0x3f0 [kvm] [258270.528108] try_async_pf+0x66/0x230 [kvm] [258270.528135] tdp_page_fault+0x130/0x280 [kvm] [258270.528149] kvm_mmu_page_fault+0x60/0x120 [kvm] [258270.528158] handle_ept_violation+0x91/0x170 [kvm_intel] [258270.528162] vmx_handle_exit+0x1ca/0x1400 [kvm_intel] No performance changes were detected in quick ping-pong tests on my 4 socket system, which is expected since an FPU+xstate load is on the order of 0.1us, while ping-ponging between CPUs is on the order of 20us, and somewhat noisy. Cc: stable@vger.kernel.org Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Fixed a bug where reset_vcpu called put_fpu without preceding load_fpu, which happened inside from KVM_CREATE_VCPU ioctl. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
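The model change reads naturally as code; the sketch below uses hypothetical helper names (not KVM's real functions) to show the guest/user FPU swap moving from the preemption path to the boundaries of the KVM_RUN ioctl.
    struct fpu_state;                         /* opaque in this sketch */
    struct vcpu_sketch {
            struct fpu_state *user_fpu;       /* qemu's userspace FPU/xstate */
            struct fpu_state *guest_fpu;      /* the guest's FPU/xstate      */
    };

    extern void save_fpu_state(struct fpu_state *st);   /* hypothetical helpers */
    extern void load_fpu_state(struct fpu_state *st);
    extern int  vcpu_run_loop(struct vcpu_sketch *v);

    static int kvm_run_sketch(struct vcpu_sketch *v)
    {
            int ret;

            save_fpu_state(v->user_fpu);      /* park qemu's FPU state once ...   */
            load_fpu_state(v->guest_fpu);     /* ... and install the guest's once */

            ret = vcpu_run_loop(v);           /* preemption no longer swaps FPU state */

            save_fpu_state(v->guest_fpu);     /* store guest state ...              */
            load_fpu_state(v->user_fpu);      /* ... and give qemu its context back */
            return ret;
    }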
2019-01-13  powerpc/tm: Set MSR[TS] just prior to recheckpoint  (Breno Leitao, 2 files, -15/+49)
commit e1c3743e1a20647c53b719dbf28b48f45d23f2cd upstream. On a signal handler return, the user could set a context with MSR[TS] bits set, and these bits would be copied to task regs->msr. At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set, several __get_user() are called and then a recheckpoint is executed. This is a problem since a page fault (in kernel space) could happen when calling __get_user(). If it happens, the process MSR[TS] bits were already set, but recheckpoint was not executed, and SPRs are still invalid. The page fault can cause the current process to be de-scheduled, with MSR[TS] active and without tm_recheckpoint() being called. More importantly, without TEXASR[FS] bit set also. Since TEXASR might not have the FS bit set, and when the process is scheduled back, it will try to reclaim, which will be aborted because of the CPU is not in the suspended state, and, then, recheckpoint. This recheckpoint will restore thread->texasr into TEXASR SPR, which might be zero, hitting a BUG_ON(). kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0] pc: c000000000054550: restore_gprs+0xb0/0x180 lr: 0000000000000000 sp: c00000041f157950 msr: 8000000100021033 current = 0xc00000041f143000 paca = 0xc00000000fb86300 softe: 0 irq_happened: 0x01 pid = 1021, comm = kworker/11:1 kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) enter ? for help [c00000041f157b30] c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0 [c00000041f157b70] c00000000001d184 __switch_to+0x1e4/0x4c0 [c00000041f157bd0] c00000000082eeb8 __schedule+0x2f8/0x990 [c00000041f157cb0] c00000000082f598 schedule+0x48/0xc0 [c00000041f157ce0] c0000000000f0d28 worker_thread+0x148/0x610 [c00000041f157d80] c0000000000f96b0 kthread+0x120/0x140 [c00000041f157e30] c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c This patch simply delays the MSR[TS] set, so, if there is any page fault in the __get_user() section, it does not have regs->msr[TS] set, since the TM structures are still invalid, thus avoiding doing TM operations for in-kernel exceptions and possible process reschedule. With this patch, the MSR[TS] will only be set just before recheckpointing and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in invalid state. Other than that, if CONFIG_PREEMPT is set, there might be a preemption just after setting MSR[TS] and before tm_recheckpoint(), thus, this block must be atomic from a preemption perspective, thus, calling preempt_disable/enable() on this code. It is not possible to move tm_recheckpoint to happen earlier, because it is required to get the checkpointed registers from userspace, with __get_user(), thus, the only way to avoid this undesired behavior is delaying the MSR[TS] set. The 32-bits signal handler seems to be safe this current issue, but, it might be exposed to the preemption issue, thus, disabling preemption in this chunk of code. Changes from v2: * Run the critical section with preempt_disable. Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals") Cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
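A rough C sketch of the ordering the patch establishes, with hypothetical helper names standing in for the real powerpc signal/TM routines: every user access that may fault happens first, and MSR[TS] only becomes visible inside a preempt-disabled window immediately before the recheckpoint.
    #define MSR_TS_BITS_SKETCH (3UL << 33)    /* illustrative mask for the MSR[TS] field */

    struct regs_sketch { unsigned long msr; };

    extern int  copy_ckpt_regs_from_user(void *uctx);          /* the __get_user() calls */
    extern void tm_recheckpoint_sketch(struct regs_sketch *r);  /* also sets TEXASR[FS]   */
    extern void preempt_disable(void);
    extern void preempt_enable(void);

    static int restore_tm_sketch(void *uctx, struct regs_sketch *regs,
                                 unsigned long user_msr)
    {
            if (copy_ckpt_regs_from_user(uctx))        /* may fault and sleep: do it first */
                    return -1;                         /* -EFAULT in the real code */

            preempt_disable();
            regs->msr |= user_msr & MSR_TS_BITS_SKETCH;   /* MSR[TS] becomes visible only now */
            tm_recheckpoint_sketch(regs);                 /* ... and we recheckpoint at once  */
            preempt_enable();
            return 0;
    }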
2019-01-13  arm64: relocatable: fix inconsistencies in linker script and options  (Ard Biesheuvel, 2 files, -5/+6)
commit 3bbd3db86470c701091fb1d67f1fab6621debf50 upstream. readelf complains about the section layout of vmlinux when building with CONFIG_RELOCATABLE=y (for KASLR): readelf: Warning: [21]: Link field (0) should index a symtab section. readelf: Warning: [21]: Info field (0) should index a relocatable section. Also, it seems that our use of '-pie -shared' is contradictory, and thus ambiguous. In general, the way KASLR is wired up at the moment is highly tailored to how ld.bfd happens to implement (and conflate) PIE executables and shared libraries, so given the current effort to support other toolchains, let's fix some of these issues as well. - Drop the -pie linker argument and just leave -shared. In ld.bfd, the differences between them are unclear (except for the ELF type of the produced image [0]) but lld chokes on seeing both at the same time. - Rename the .rela output section to .rela.dyn, as is customary for shared libraries and PIE executables, so that it is not misidentified by readelf as a static relocation section (producing the warnings above). - Pass the -z notext and -z norelro options to explicitly instruct the linker to permit text relocations, and to omit the RELRO program header (which requires a certain section layout that we don't adhere to in the kernel). These are the defaults for current versions of ld.bfd. - Discard .eh_frame and .gnu.hash sections to avoid them from being emitted between .head.text and .text, screwing up the section layout. These changes only affect the ELF image, and produce the same binary image. [0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Smith <peter.smith@linaro.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13  arm64: drop linker script hack to hide __efistub_ symbols  (Ard Biesheuvel, 1 file, -27/+17)
commit dd6846d774693bfa27d7db4dae5ea67dfe373fa1 upstream. Commit 1212f7a16af4 ("scripts/kallsyms: filter arm64's __efistub_ symbols") updated the kallsyms code to filter out symbols with the __efistub_ prefix explicitly, so we no longer require the hack in our linker script to emit them as absolute symbols. Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> [ND: adjusted for context] Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13  powerpc/boot: Set target when cross-compiling for clang  (Joel Stanley, 1 file, -0/+5)
commit 813af51f5d30a2da6a2523c08465f9726e51772e upstream. Clang needs to be told which target it is building for when cross compiling. Link: https://github.com/ClangBuiltLinux/linux/issues/259 Signed-off-by: Joel Stanley <joel@jms.id.au> Tested-by: Daniel Axtens <dja@axtens.net> # powerpc 64-bit BE Acked-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> [nc: Use 'ifeq ($(cc-name),clang)' instead of 'ifdef CONFIG_CC_IS_CLANG' because that config does not exist in 4.14; the Kconfig rewrite that added that config happened in 4.18] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13  powerpc: Disable -Wbuiltin-requires-header when setjmp is used  (Joel Stanley, 2 files, -1/+7)
commit aea447141c7e7824b81b49acd1bc785506fba46e upstream. The powerpc kernel uses setjmp which causes a warning when building with clang: In file included from arch/powerpc/xmon/xmon.c:51: ./arch/powerpc/include/asm/setjmp.h:15:13: error: declaration of built-in function 'setjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern long setjmp(long *); ^ ./arch/powerpc/include/asm/setjmp.h:16:13: error: declaration of built-in function 'longjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern void longjmp(long *, long); ^ This *is* the header and we're not using the built-in setjmp but rather the one in arch/powerpc/kernel/misc.S. As the compiler warning does not make sense, disable it for the files where setjmp is used. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> [mpe: Move subdir-ccflags in xmon/Makefile to not clobber -Werror] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13  powerpc: avoid -mno-sched-epilog on GCC 4.9 and newer  (Nicholas Piggin, 1 file, -1/+6)
commit 6977f95e63b9b3fb4a5973481a800dd9f48a1338 upstream. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [nc: Adjust context due to lack of f2910f0e6835 and 2a056f58fd33] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13  x86/dump_pagetables: Fix LDT remap address marker  (Kirill A. Shutemov, 1 file, -5/+2)
[ Upstream commit 254eb5505ca0ca749d3a491fc6668b6c16647a99 ] The LDT remap placement has been changed. It's now placed before the direct mapping in the kernel virtual address space for both paging modes. Change address markers order accordingly. Fixes: d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: dave.hansen@linux.intel.com Cc: luto@kernel.org Cc: peterz@infradead.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: bhe@redhat.com Cc: hans.van.kranenburg@mendix.com Cc: linux-mm@kvack.org Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20181130202328.65359-3-kirill.shutemov@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13  x86/mm: Fix guard hole handling  (Kirill A. Shutemov, 3 files, -9/+15)
[ Upstream commit 16877a5570e0c5f4270d5b17f9bab427bcae9514 ] There is a guard hole at the beginning of the kernel address space, also used by hypervisors. It occupies 16 PGD entries. This reserved range is not defined explicitly; it is calculated relative to other entities: direct mapping and user space ranges. The calculation got broken by recent changes of the kernel memory layout: the LDT remap range is now mapped before the direct mapping and makes the calculation invalid. The breakage leads to a crash on Xen dom0 boot[1]. Define the reserved range explicitly. It's part of kernel ABI (hypervisors expect it to be stable) and must not depend on changes in the rest of kernel memory layout. [1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03313.html Fixes: d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: dave.hansen@linux.intel.com Cc: luto@kernel.org Cc: peterz@infradead.org Cc: boris.ostrovsky@oracle.com Cc: bhe@redhat.com Cc: linux-mm@kvack.org Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20181130202328.65359-2-kirill.shutemov@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13  ARM: dts: imx7d-nitrogen7: Fix the description of the Wifi clock  (Fabio Estevam, 1 file, -2/+7)
[ Upstream commit f15096f12a4e9340168df5fdd9201aa8ed60d59e ] According to bindings/regulator/fixed-regulator.txt the 'clocks' and 'clock-names' properties are not valid ones. In order to turn on the Wifi clock the correct location for describing the CLKO2 clock is via a mmc-pwrseq handle, so do it accordingly. Fixes: 56354959cfec ("ARM: dts: imx: add Boundary Devices Nitrogen7 board") Signed-off-by: Fabio Estevam <festevam@gmail.com> Acked-by: Troy Kisky <troy.kisky@boundarydevices.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13  ARM: imx: update the cpu power up timing setting on i.mx6sx  (Anson Huang, 1 file, -1/+1)
[ Upstream commit 1e434b703248580b7aaaf8a115d93e682f57d29f ] The sw2iso count should cover the ARM LDO ramp-up time, which may be more than 100us on some boards. This patch sets sw2iso to 0xf (~384us), which is the reset value and is much safer for covering different boards, since we have observed that some customer boards failed with the current setting of 0x2. Fixes: 05136f0897b5 ("ARM: imx: support arm power off in cpuidle for i.mx6sx") Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13  powerpc/mm: Fix linux page tables build with some configs  (Michael Ellerman, 1 file, -0/+1)
[ Upstream commit 462951cd32e1496dc64b00051dfb777efc8ae5d8 ] For some configs the build fails with: arch/powerpc/mm/dump_linuxpagetables.c: In function 'populate_markers': arch/powerpc/mm/dump_linuxpagetables.c:306:39: error: 'PKMAP_BASE' undeclared (first use in this function) arch/powerpc/mm/dump_linuxpagetables.c:314:50: error: 'LAST_PKMAP' undeclared (first use in this function) These come from highmem.h, including that fixes the build. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13  powerpc: Fix COFF zImage booting on old powermacs  (Paul Mackerras, 1 file, -1/+3)
[ Upstream commit 5564597d51c8ff5b88d95c76255e18b13b760879 ] Commit 6975a783d7b4 ("powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN", 2011-04-12) changed the procedure descriptor at the start of crt0.S to have a hard-coded start address of 0x500000 rather than a reference to _zimage_start, presumably because having a reference to a symbol introduced a relocation which is awkward to handle in a position-independent executable. Unfortunately, what is at 0x500000 in the COFF image is not the first instruction, but the procedure descriptor itself, that is, a word containing 0x500000, which is not a valid instruction. Hence, booting a COFF zImage results in a "DEFAULT CATCH!, code=FFF00700" message from Open Firmware. This fixes the problem by (a) putting the procedure descriptor in the data section and (b) adding a branch to _zimage_start as the first instruction in the program. Fixes: 6975a783d7b4 ("powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-09  MIPS: Only include mmzone.h when CONFIG_NEED_MULTIPLE_NODES=y  (Paul Burton, 2 files, -3/+4)
commit 66a4059ba72c23ae74a7c702894ff76c4b7eac1f upstream. MIPS' asm/mmzone.h includes the machine/platform mmzone.h unconditionally, and since commit bb53fdf395ee ("MIPS: c-r4k: Add r4k_blast_scache_node for Loongson-3") it is included by asm/r4kcache.h for all r4k-style configs regardless of CONFIG_NEED_MULTIPLE_NODES. This is problematic when CONFIG_NEED_MULTIPLE_NODES=n because both the loongson3 & ip27 mmzone.h headers unconditionally define the NODE_DATA preprocessor macro which is already defined by linux/mmzone.h, resulting in the following build error: In file included from ./arch/mips/include/asm/mmzone.h:10, from ./arch/mips/include/asm/r4kcache.h:23, from arch/mips/mm/c-r4k.c:33: ./arch/mips/include/asm/mach-loongson64/mmzone.h:48: error: "NODE_DATA" redefined [-Werror] #define NODE_DATA(n) (&__node_data[(n)]->pglist) In file included from ./include/linux/topology.h:32, from ./include/linux/irq.h:19, from ./include/asm-generic/hardirq.h:13, from ./arch/mips/include/asm/hardirq.h:16, from ./include/linux/hardirq.h:9, from arch/mips/mm/c-r4k.c:11: ./include/linux/mmzone.h:907: note: this is the location of the previous definition #define NODE_DATA(nid) (&contig_page_data) Resolve this by only including the machine mmzone.h when CONFIG_NEED_MULTIPLE_NODES=y, which also removes the need for the empty mach-generic version of the header, which we delete. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: bb53fdf395ee ("MIPS: c-r4k: Add r4k_blast_scache_node for Loongson-3") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  arm64: KVM: Avoid setting the upper 32 bits of VTCR_EL2 to 1  (Will Deacon, 1 file, -1/+1)
commit df655b75c43fba0f2621680ab261083297fd6d16 upstream. Although bit 31 of VTCR_EL2 is RES1, we inadvertently end up setting all of the upper 32 bits to 1 as well because we define VTCR_EL2_RES1 as signed, which is sign-extended when assigning to kvm->arch.vtcr. Lucky for us, the architecture currently treats these upper bits as RES0 so, whilst we've been naughty, we haven't set fire to anything yet. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
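The underlying C pitfall is easy to reproduce in isolation: a signed constant with bit 31 set sign-extends when widened into a 64-bit field, while an unsigned constant does not. The standalone program below demonstrates the effect; it is not the kernel's VTCR_EL2_RES1 macro itself.
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            int32_t  res1_signed   = INT32_MIN;    /* bit 31 as a signed 32-bit constant    */
            uint32_t res1_unsigned = 0x80000000u;  /* bit 31 as an unsigned 32-bit constant */

            uint64_t vtcr_bad  = res1_signed;      /* sign-extends: upper 32 bits become 1s */
            uint64_t vtcr_good = res1_unsigned;    /* zero-extends: only bit 31 is set      */

            printf("signed:   0x%016llx\n", (unsigned long long)vtcr_bad);   /* ...ffff80000000 */
            printf("unsigned: 0x%016llx\n", (unsigned long long)vtcr_good);  /* ...000080000000 */
            return 0;
    }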
2019-01-09  MIPS: OCTEON: mark RGMII interface disabled on OCTEON III  (Aaro Koskinen, 1 file, -1/+2)
commit edefae94b7b9f10d5efe32dece5a36e9d9ecc29e upstream. Commit 885872b722b7 ("MIPS: Octeon: Add Octeon III CN7xxx interface detection") added RGMII interface detection for OCTEON III, but it results in the following logs: [ 7.165984] ERROR: Unsupported Octeon model in __cvmx_helper_rgmii_probe [ 7.173017] ERROR: Unsupported Octeon model in __cvmx_helper_rgmii_probe The current RGMII routines are valid only for older OCTEONS that use GMX/ASX hardware blocks. On later chips AGL should be used, but support for that is missing in the mainline. Until that is added, mark the interface as disabled. Fixes: 885872b722b7 ("MIPS: Octeon: Add Octeon III CN7xxx interface detection") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  MIPS: Expand MIPS32 ASIDs to 64 bits  (Paul Burton, 4 files, -9/+7)
commit ff4dd232ec45a0e45ea69f28f069f2ab22b4908a upstream. ASIDs have always been stored as unsigned longs, ie. 32 bits on MIPS32 kernels. This is problematic because it is feasible for the ASID version to overflow & wrap around to zero. We currently attempt to handle this overflow by simply setting the ASID version to 1, using asid_first_version(), but we make no attempt to account for the fact that there may be mm_structs with stale ASIDs that have versions which we now reuse due to the overflow & wrap around. Encountering this requires that: 1) A struct mm_struct X is active on CPU A using ASID (V,n). 2) That mm is not used on CPU A for the length of time that it takes for CPU A's asid_cache to overflow & wrap around to the same version V that the mm had in step 1. During this time tasks using the mm could either be sleeping or only scheduled on other CPUs. 3) Some other mm Y becomes active on CPU A and is allocated the same ASID (V,n). 4) mm X now becomes active on CPU A again, and now incorrectly has the same ASID as mm Y. Where struct mm_struct ASIDs are represented above in the format (version, EntryHi.ASID), and on a typical MIPS32 system version will be 24 bits wide & EntryHi.ASID will be 8 bits wide. The length of time required in step 2 is highly dependent upon the CPU & workload, but for a hypothetical 2GHz CPU running a workload which generates a new ASID every 10000 cycles this period is around 248 days. Due to this long period of time & the fact that tasks need to be scheduled in just the right (or wrong, depending upon your inclination) way, this is obviously a difficult bug to encounter but it's entirely possible as evidenced by reports. In order to fix this, simply extend ASIDs to 64 bits even on MIPS32 builds. This will extend the period of time required for the hypothetical system above to encounter the problem from 28 days to around 3 trillion years, which feels safely outside of the realms of possibility. The cost of this is slightly more generated code in some commonly executed paths, but this is pretty minimal: | Code Size Gain | Percentage -----------------------|----------------|------------- decstation_defconfig | +270 | +0.00% 32r2el_defconfig | +652 | +0.01% 32r6el_defconfig | +1000 | +0.01% I have been unable to measure any change in performance of the LMbench lat_ctx or lat_proc tests resulting from the 64b ASIDs on either 32r2el_defconfig+interAptiv or 32r6el_defconfig+I6500 systems. Signed-off-by: Paul Burton <paul.burton@mips.com> Suggested-by: James Hogan <jhogan@kernel.org> References: https://lore.kernel.org/linux-mips/80B78A8B8FEE6145A87579E8435D78C30205D5F3@fzex.ruijie.com.cn/ References: https://lore.kernel.org/linux-mips/1488684260-18867-1-git-send-email-jiwei.sun@windriver.com/ Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: Yu Huabing <yhb@ruijie.com.cn> Cc: stable@vger.kernel.org # 2.6.12+ Cc: linux-mips@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
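The (version, EntryHi.ASID) packing described above can be sketched in plain C. With a 32-bit counter the version field silently wraps once it overflows, while the same arithmetic on a 64-bit counter keeps counting; the field widths below follow the 8-bit EntryHi.ASID example in the text and are illustrative only.
    #include <stdint.h>
    #include <stdio.h>

    #define ASID_BITS 8   /* EntryHi.ASID width from the example above */

    /* The version lives in the upper bits, the hardware ASID in the low 8. */
    static uint64_t asid_version(uint64_t asid) { return asid >> ASID_BITS; }
    static uint64_t asid_hw(uint64_t asid)      { return asid & ((1u << ASID_BITS) - 1); }

    int main(void)
    {
            uint32_t cache32 = 0xffffff01u;   /* 32-bit asid_cache about to overflow */
            uint64_t cache64 = 0xffffff01u;   /* the same value in a 64-bit counter  */

            cache32 += 1u << ASID_BITS;       /* new version: the 24-bit field wraps to 0 */
            cache64 += 1u << ASID_BITS;       /* new version: simply keeps counting       */

            printf("32-bit version after bump: %u\n", cache32 >> ASID_BITS);
            printf("64-bit version after bump: %llu\n",
                   (unsigned long long)asid_version(cache64));
            printf("hardware ASID stays: %llu\n", (unsigned long long)asid_hw(cache64));
            return 0;
    }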
2019-01-09  MIPS: Align kernel load address to 64KB  (Huacai Chen, 1 file, -3/+4)
commit bec0de4cfad21bd284dbddee016ed1767a5d2823 upstream. KEXEC needs the new kernel's load address to be aligned on a page boundary (see sanity_check_segment_list()), but on MIPS the default vmlinuz load address is only explicitly aligned to 16 bytes. Since the largest PAGE_SIZE supported by MIPS kernels is 64KB, increase the alignment calculated by calc_vmlinuz_load_addr to 64KB. Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/21131/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <james.hogan@mips.com> Cc: Steven J . Hill <Steven.Hill@cavium.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: <stable@vger.kernel.org> # 2.6.36+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
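The change itself boils down to rounding the computed load address up to the next 64 KB boundary; a small standalone sketch of that arithmetic (not the calc_vmlinuz_load_addr code):
    #include <stdint.h>
    #include <stdio.h>

    /* Round an address up to the next 64 KB boundary (the largest MIPS PAGE_SIZE). */
    static uint64_t align_up_64k(uint64_t addr)
    {
            return (addr + 0xffffULL) & ~0xffffULL;
    }

    int main(void)
    {
            uint64_t load = 0x80010010ULL;   /* aligned to 16 bytes only */

            printf("0x%llx -> 0x%llx\n", (unsigned long long)load,
                   (unsigned long long)align_up_64k(load));   /* prints 0x80020000 */
            return 0;
    }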
2019-01-09  MIPS: Ensure pmd_present() returns false after pmd_mknotpresent()  (Huacai Chen, 1 file, -0/+5)
commit 92aa0718c9fa5160ad2f0e7b5bffb52f1ea1e51a upstream. This patch is borrowed from ARM64 to ensure pmd_present() returns false after pmd_mknotpresent(). This is needed for THP. References: 5bb1cc0ff9a6 ("arm64: Ensure pmd_present() returns false after pmd_mknotpresent()") Reviewed-by: James Hogan <jhogan@kernel.org> Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/21135/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <james.hogan@mips.com> Cc: Steven J . Hill <Steven.Hill@cavium.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: <stable@vger.kernel.org> # 3.8+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  MIPS: c-r4k: Add r4k_blast_scache_node for Loongson-3  (Huacai Chen, 5 files, -7/+70)
commit bb53fdf395eed103f85061bfff3b116cee123895 upstream. For multi-node Loongson-3 (NUMA configuration), r4k_blast_scache() can only flush Node-0's scache. So we add r4k_blast_scache_node() by using (CAC_BASE | (node_id << NODE_ADDRSPACE_SHIFT)) instead of CKSEG0 as the start address. Signed-off-by: Huacai Chen <chenhc@lemote.com> [paul.burton@mips.com: Include asm/mmzone.h from asm/r4kcache.h for nid_to_addrbase(). Add asm/mach-generic/mmzone.h to allow inclusion for all platforms.] Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/21129/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <james.hogan@mips.com> Cc: Steven J . Hill <Steven.Hill@cavium.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: <stable@vger.kernel.org> # 3.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  MIPS: math-emu: Write-protect delay slot emulation pages  (Paul Burton, 2 files, -20/+22)
commit adcc81f148d733b7e8e641300c5590a2cdc13bf3 upstream. Mapping the delay slot emulation page as both writeable & executable presents a security risk, in that if an exploit can write to & jump into the page then it can be used as an easy way to execute arbitrary code. Prevent this by mapping the page read-only for userland, and using access_process_vm() with the FOLL_FORCE flag to write to it from mips_dsemul(). This will likely be less efficient due to copy_to_user_page() performing cache maintenance on a whole page, rather than a single line as in the previous use of flush_cache_sigtramp(). However this delay slot emulation code ought not to be running in any performance critical paths anyway so this isn't really a problem, and we can probably do better in copy_to_user_page() anyway in future. A major advantage of this approach is that the fix is small & simple to backport to stable kernels. Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions") Cc: stable@vger.kernel.org # v4.8+ Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Rich Felker <dalias@libc.org> Cc: David Daney <david.daney@cavium.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
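A kernel-style sketch of the write path described above, assuming the frame is pushed into the now read-only page via access_process_vm() with FOLL_FORCE | FOLL_WRITE; this is a simplified illustration, not the exact mips_dsemul() code.
    /* Sketch only: the emulation page is mapped read-only + executable for
     * userland, so the kernel writes the frame through access_process_vm(),
     * with FOLL_FORCE overriding the missing write permission. */
    static int write_emul_frame(unsigned long user_addr, void *frame, int len)
    {
            int written;

            written = access_process_vm(current, user_addr, frame, len,
                                        FOLL_FORCE | FOLL_WRITE);
            return written == len ? 0 : -EFAULT;
    }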
2019-01-09  KVM: nVMX: Free the VMREAD/VMWRITE bitmaps if alloc_kvm_area() fails  (Sean Christopherson, 1 file, -2/+5)
commit 1b3ab5ad1b8ad99bae76ec583809c5f5a31c707c upstream. Fixes: 34a1cd60d17f ("kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup()") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup  (Sean Christopherson, 1 file, -1/+1)
commit e81434995081fd7efb755fd75576b35dbb0850b1 upstream. ____kvm_handle_fault_on_reboot() provides a generic exception fixup handler that is used to cleanly handle faults on VMX/SVM instructions during reboot (or at least try to). If there isn't a reboot in progress, ____kvm_handle_fault_on_reboot() treats any exception as fatal to KVM and invokes kvm_spurious_fault(), which in turn generates a BUG() to get a stack trace and die. When it was originally added by commit 4ecac3fd6dc2 ("KVM: Handle virtualization instruction #UD faults during reboot"), the "call" to kvm_spurious_fault() was handcoded as PUSH+JMP, where the PUSH'd value is the RIP of the faulting instructing. The PUSH+JMP trickery is necessary because the exception fixup handler code lies outside of its associated function, e.g. right after the function. An actual CALL from the .fixup code would show a slightly bogus stack trace, e.g. an extra "random" function would be inserted into the trace, as the return RIP on the stack would point to no known function (and the unwinder will likely try to guess who owns the RIP). Unfortunately, the JMP was replaced with a CALL when the macro was reworked to not spin indefinitely during reboot (commit b7c4145ba2eb "KVM: Don't spin on virt instruction faults during reboot"). This causes the aforementioned behavior where a bogus function is inserted into the stack trace, e.g. my builds like to blame free_kvm_area(). Revert the CALL back to a JMP. The changelog for commit b7c4145ba2eb ("KVM: Don't spin on virt instruction faults during reboot") contains nothing that indicates the switch to CALL was deliberate. This is backed up by the fact that the PUSH <insn RIP> was left intact. Note that an alternative to the PUSH+JMP magic would be to JMP back to the "real" code and CALL from there, but that would require adding a JMP in the non-faulting path to avoid calling kvm_spurious_fault() and would add no value, i.e. the stack trace would be the same. Using CALL: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 4 PID: 1057 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc900004bbcc8 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888273fd8000 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000371fb0 R13: 0000000000000000 R14: 000000026d763cf4 R15: ffff888273fd8000 FS: 00007f3d69691700(0000) GS:ffff888277800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f89bc56fe0 CR3: 0000000271a5a001 CR4: 0000000000362ee0 Call Trace: free_kvm_area+0x1044/0x43ea [kvm_intel] ? vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? 
entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace 9775b14b123b1713 ]--- Using JMP: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 6 PID: 1067 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc90000497cd0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88827058bd40 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000369fb0 R13: 0000000000000000 R14: 00000003c8fc6642 R15: ffff88827058bd40 FS: 00007f3d7219e700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3d64001000 CR3: 0000000271c6b004 CR4: 0000000000362ee0 Call Trace: vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace f9daedb85ab3ddba ]--- Fixes: b7c4145ba2eb ("KVM: Don't spin on virt instruction faults during reboot") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()  (Dan Williams, 1 file, -6/+0)
commit ba6f508d0ec4adb09f0a939af6d5e19cdfa8667d upstream. Commit: f77084d96355 "x86/mm/pat: Disable preemption around __flush_tlb_all()" addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path which is also used for persistent memory enabling: WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460 RIP: 0010:__flush_tlb_all+0x1b/0x3a [..] Call Trace: phys_pud_init+0x29c/0x2bb kernel_physical_mapping_init+0xfc/0x219 init_memory_mapping+0x1a5/0x3b0 arch_add_memory+0x2c/0x50 devm_memremap_pages+0x3aa/0x610 pmem_attach_disk+0x585/0x700 [nd_pmem] Andy wondered why a path that can sleep was using __flush_tlb_all() [1] and Dave confirmed the expectation for TLB flush is for modifying / invalidating existing PTE entries, but not initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path is only ever populating empty entries for the linear map. Note, at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings. [1]: https://lkml.kernel.org/r/9DFD717D-857D-493D-A606-B635D72BAC21@amacapital.net [2]: https://lkml.kernel.org/r/749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com [ mingo: Minor readability edits. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@intel.com Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") Link: http://lkml.kernel.org/r/154395944713.32119.15611079023837132638.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off  (Michal Hocko, 2 files, -2/+3)
commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream. Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever the system is deemed affected by the L1TF vulnerability. Even though the limit is quite high for most deployments, it seems to be too restrictive for deployments which are willing to live with the mitigation disabled. We have a customer deploying 8x 6.4TB PCIe/NVMe SSD swap devices, which is clearly over the limit. Drop the swap restriction when l1tf=off is specified. It also doesn't make much sense to warn about too much memory for the l1tf mitigation when it is forcefully disabled by the administrator. [ tglx: Folded the documentation delta change ] Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: <linux-mm@kvack.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181113184910.26697-1-mhocko@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09  s390/pci: fix sleeping in atomic during hotplug  (Sebastian Ott, 1 file, -1/+1)
commit 98dfd32620e970eb576ebce5ea39d905cb005e72 upstream. When triggered by pci hotplug (PEC 0x306) clp_get_state is called with spinlocks held resulting in the following warning: zpci: n/a: Event 0x306 reconfigured PCI function 0x0 BUG: sleeping function called from invalid context at mm/page_alloc.c:4324 in_atomic(): 1, irqs_disabled(): 0, pid: 98, name: kmcheck 2 locks held by kmcheck/98: Change the allocation to use GFP_ATOMIC. Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-29  x86/mtrr: Don't copy uninitialized gentry fields back to userspace  (Colin Ian King, 1 file, -0/+2)
commit 32043fa065b51e0b1433e48d118821c71b5cd65d upstream. Currently the copy_to_user of data in the gentry struct is copying uninitialized data in field _pad from the stack to userspace. Fix this by explicitly memset'ing gentry to zero; this will also zero any compiler-added padding fields that may be in the struct (currently there are none). Detected by CoverityScan, CID#200783 ("Uninitialized scalar variable") Fixes: b263b31e8ad6 ("x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Cc: security@kernel.org Link: https://lkml.kernel.org/r/20181218172956.1440-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
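The general pattern behind the fix is worth spelling out: zero the whole structure (padding included) before filling the fields that are meant to reach userspace. A standalone sketch of the idea, using an illustrative struct rather than the real mtrr gentry:
    #include <stdint.h>
    #include <string.h>

    struct gentry_like {
            uint64_t base;
            uint32_t size;
            uint32_t regnum;
            uint32_t type;
            uint32_t _pad;   /* explicit padding, like the mtrr gentry's _pad field */
    };

    /* Build a reply destined for userspace without leaking stack contents. */
    static void fill_reply(struct gentry_like *out, uint64_t base, uint32_t size)
    {
            memset(out, 0, sizeof(*out));   /* zeroes _pad and any compiler padding */
            out->base = base;
            out->size = size;
            /* the caller then copies *out to userspace, e.g. with copy_to_user() */
    }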
2018-12-29  KVM: Fix UAF in nested posted interrupt processing  (Cfir Cohen, 1 file, -0/+2)
commit c2dd5146e9fe1f22c77c1b011adf84eea0245806 upstream. nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It caches the kmap()ed page object and pointer, however, it doesn't handle errors correctly: it's possible to cache a valid pointer, then release the page and later dereference the dangling pointer. I was able to reproduce with the following steps: 1. Call vmlaunch with valid posted_intr_desc_addr but an invalid MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed pi_desc_page and pi_desc. Later the invalid EFER value fails check_vmentry_postreqs() which fails the first vmlaunch. 2. Call vmlaunch with a valid EFER but an invalid posted_intr_desc_addr (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages pi_desc_page is unmapped and released and pi_desc_page is set to NULL (the "shouldn't happen" clause). Due to the invalid posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and nested_get_vmcs12_pages() returns. It doesn't return an error value so vmlaunch proceeds. Note that at this time we have a dangling pointer in vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs. 3. Issue an IPI in L2 guest code. This triggers a call to vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which dereferences the dangling pointer. Vulnerable code requires nested and enable_apicv variables to be set to true. The host CPU must also support posted interrupts. Fixes: 5e2f30b756a37 "KVM: nVMX: get rid of nested_get_page()" Cc: stable@vger.kernel.org Reviewed-by: Andy Honig <ahonig@google.com> Signed-off-by: Cfir Cohen <cfir@google.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-29  kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs  (Eduardo Habkost, 2 files, -0/+3)
commit 0e1b869fff60c81b510c2d00602d778f8f59dd9a upstream. Some guests OSes (including Windows 10) write to MSR 0xc001102c on some cases (possibly while trying to apply a CPU errata). Make KVM ignore reads and writes to that MSR, so the guest won't crash. The MSR is documented as "Execution Unit Configuration (EX_CFG)", at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h Models 00h-0Fh Processors". Cc: stable@vger.kernel.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21  bpf, arm: fix emit_ldx_r and emit_mov_i using TMP_REG_1  (Nicolas Schichan, 1 file, -1/+1)
emit_ldx_r() and emit_a32_mov_i() were both using TMP_REG_1 and clashing with each other. Using TMP_REG_2 in emit_ldx_r() fixes the issue. Fixes: ec19e02b343 ("ARM: net: bpf: fix LDX instructions") Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-21  ARM: 8815/1: V7M: align v7m_dma_inv_range() with v7 counterpart  (Vladimir Murzin, 1 file, -5/+9)
[ Upstream commit 3d0358d0ba048c5afb1385787aaec8fa5ad78fcc ] Chris has discovered and reported that v7_dma_inv_range() may corrupt memory if the address range is not aligned to cache line size. Since the whole cache-v7m.S was lifted from cache-v7.S the same observation applies to v7m_dma_inv_range(). So the fix just mirrors what has been done for v7, with a small M-class-specific adjustment. Cc: Chris Cole <chris@sageembedded.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-21  ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling  (Chris Cole, 1 file, -3/+5)
[ Upstream commit a1208f6a822ac29933e772ef1f637c5d67838da9 ] This patch addresses possible memory corruption when v7_dma_inv_range(start_address, end_address) address parameters are not aligned to whole cache lines. This function issues "invalidate" cache management operations to all cache lines from start_address (inclusive) to end_address (exclusive). When start_address and/or end_address are not aligned, the start and/or end cache lines are first issued "clean & invalidate" operation. The assumption is this is done to ensure that any dirty data at addresses outside the address range (but part of the first or last cache lines) are cleaned/flushed so that data is not lost, which could happen if just an invalidate is issued. The problem is that these first/last partial cache lines are issued "clean & invalidate" and then "invalidate". This second "invalidate" is not required and worse can cause "lost" writes to addresses outside the address range but part of the cache line. If another component writes to its part of the cache line between the "clean & invalidate" and "invalidate" operations, the write can get lost. This fix is to remove the extra "invalidate" operation when unaligned addresses are used. A kernel module is available that has a stress test to reproduce the issue and a unit test of the updated v7_dma_inv_range(). It can be downloaded from http://ftp.sageembedded.com/outgoing/linux/cache-test-20181107.tgz. v7_dma_inv_range() is called by dmac_[un]map_area(addr, len, direction) when the direction is DMA_FROM_DEVICE. One can (I believe) successfully argue that DMA from a device to main memory should use buffers aligned to cache line size, because the "clean & invalidate" might overwrite data that the device just wrote using DMA. But if a driver does use unaligned buffers, at least this fix will prevent memory corruption outside the buffer. Signed-off-by: Chris Cole <chris@sageembedded.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
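The per-line bookkeeping the patch fixes can be sketched in C, with hypothetical helpers standing in for the actual clean+invalidate and invalidate-only cache maintenance instructions; the partial first and last lines get clean & invalidate only, and are not touched again by the invalidate loop.
    #define CACHE_LINE 32UL   /* example line size */

    /* Hypothetical per-line helpers standing in for the clean+invalidate and
     * invalidate-only maintenance operations used by the real assembly. */
    extern void clean_and_inv_line(unsigned long addr);
    extern void inv_line(unsigned long addr);

    /* Invalidate [start, end): partial first/last lines get clean & invalidate
     * ONLY, so dirty data outside the range is preserved and never re-invalidated. */
    static void dma_inv_range_sketch(unsigned long start, unsigned long end)
    {
            unsigned long addr = start & ~(CACHE_LINE - 1);

            if (start & (CACHE_LINE - 1)) {
                    clean_and_inv_line(addr);                       /* partial first line */
                    addr += CACHE_LINE;
            }
            if (end & (CACHE_LINE - 1))
                    clean_and_inv_line(end & ~(CACHE_LINE - 1));    /* partial last line */

            while (addr < (end & ~(CACHE_LINE - 1))) {
                    inv_line(addr);                                 /* full lines only */
                    addr += CACHE_LINE;
            }
    }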
2018-12-21  ARC: io.h: Implement reads{x}()/writes{x}()  (Jose Abreu, 1 file, -0/+72)
[ Upstream commit 10d443431dc2bb733cf7add99b453e3fb9047a2e ] Some ARC CPU's do not support unaligned loads/stores. Currently, generic implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC. This can lead to misfunction of some drivers as generic functions do a plain dereference of a pointer that can be unaligned. Let's use {get/put}_unaligned() helpers instead of plain dereference of pointer in order to fix. The helpers allow to get and store data from an unaligned address whilst preserving the CPU internal alignment. According to [1], the use of these helpers are costly in terms of performance so we added an initial check for a buffer already aligned so that the usage of the helpers can be avoided, when possible. [1] Documentation/unaligned-memory-access.txt Cc: Alexey Brodkin <abrodkin@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David Laight <David.Laight@ACULAB.COM> Tested-by: Vitor Soares <soares@synopsys.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
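A simplified, portable sketch of the approach: keep the fast path for buffers that are already aligned, and otherwise move each element through a memcpy-based unaligned accessor (the portable equivalent of get_unaligned()/put_unaligned()). This illustrates the technique rather than reproducing the ARC io.h code.
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Portable stand-in for get_unaligned(): a small memcpy that the compiler
     * lowers to whatever unaligned-safe access the CPU actually supports. */
    static inline uint32_t get_unaligned_u32(const void *p)
    {
            uint32_t v;
            memcpy(&v, p, sizeof(v));
            return v;
    }

    /* writesl()-style helper: write 'count' 32-bit words from a possibly
     * unaligned buffer to a single (aligned) MMIO register address. */
    static void writesl_sketch(volatile uint32_t *reg, const void *buf, size_t count)
    {
            if (((uintptr_t)buf & (sizeof(uint32_t) - 1)) == 0) {
                    const uint32_t *p = buf;          /* fast path: buffer already aligned  */
                    while (count--)
                            *reg = *p++;
            } else {
                    const uint8_t *p = buf;           /* slow path: one unaligned load each */
                    while (count--) {
                            *reg = get_unaligned_u32(p);
                            p += sizeof(uint32_t);
                    }
            }
    }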
2018-12-21  x86/earlyprintk/efi: Fix infinite loop on some screen widths  (YiFei Zhu, 1 file, -1/+1)
[ Upstream commit 79c2206d369b87b19ac29cb47601059b6bf5c291 ] An affected screen resolution is 1366 x 768, whose width is not divisible by 8, the default font width. On such screens, when longer lines are earlyprintk'ed, overflow-to-next-line can never trigger, due to the left-most x-coordinate of the next character always being less than the screen width. Earlyprintk will loop forever trying to print the rest of the string, unable to do so because the line is full. This patch makes the trigger consider the right-most x-coordinate, instead of the left-most, as the value to compare against the screen width threshold. Signed-off-by: YiFei Zhu <zhuyifei1999@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-12-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
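The arithmetic is easy to see in a toy model: with an 8-pixel font on a 1366-pixel-wide screen, a cursor at x = 1360 passes a left-edge check even though the glyph's right edge (1368) does not fit, so the wrap never triggers; comparing the right-most coordinate does trigger it. The snippet below is a standalone illustration, not the EFI earlyprintk code.
    #include <stdio.h>

    #define FONT_W   8
    #define SCREEN_W 1366   /* not divisible by FONT_W */

    int main(void)
    {
            int x = 1360;   /* left edge of the next character cell */

            /* Old check: left edge only -- does not trigger, glyph overflows the line. */
            printf("left-edge check wraps:  %s\n", (x >= SCREEN_W) ? "yes" : "no");

            /* Fixed check: consider the glyph's right-most pixel instead. */
            printf("right-edge check wraps: %s\n", (x + FONT_W > SCREEN_W) ? "yes" : "no");
            return 0;
    }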
2018-12-21locking/qspinlock, x86: Provide liveness guaranteePeter Zijlstra1-0/+21
commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream. On x86 we cannot do fetch_or() with a single instruction and thus end up using a cmpxchg loop, which reduces determinism. Replace the fetch_or() with a composite operation: tas-pending + load. Using two instructions of course opens a window we previously did not have. Consider the scenario: CPU0 CPU1 CPU2 1) lock trylock -> (0,0,1) 2) lock trylock /* fail */ 3) unlock -> (0,0,0) 4) lock trylock -> (0,0,1) 5) tas-pending -> (0,1,1) load-val <- (0,1,0) from 3 6) clear-pending-set-locked -> (0,0,1) FAIL: _2_ owners where 5) is our new composite operation. When we consider each part of the qspinlock state as a separate variable (as we can when _Q_PENDING_BITS == 8), then the above is entirely possible, because tas-pending will only RmW the pending byte, so the later load is able to observe prior tail and lock state (but not earlier than its own trylock, which operates on the whole word, due to coherence). To avoid this we need 2 things: - the load must come after the tas-pending (obviously, otherwise it can trivially observe prior state). - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB, for example, such that we cannot observe other state prior to setting pending. On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending followed by a regular load. Note that observing later state is not a problem: - if we fail to observe a later unlock, we'll simply spin-wait for that store to become visible. - if we observe a later xchg_tail(), there is no difference from that xchg_tail() having taken place before the tas-pending. Suggested-by: Will Deacon <will.deacon@arm.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath") Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> [bigeasy: GEN_BINARY_RMWcc macro redo] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
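A simplified inline-asm sketch of the "tas-pending + load" composite on x86. The real helper, queued_fetch_set_pending_acquire(), is generated from the GEN_BINARY_RMWcc macro and also hands back the old pending bit (from the carry flag), which this sketch drops for brevity; PENDING_BIT is assumed to mirror _Q_PENDING_OFFSET:

    #define PENDING_BIT 8   /* assumed to mirror _Q_PENDING_OFFSET */

    static inline unsigned int tas_pending_then_load(unsigned int *lockword)
    {
        /* 1) full-word RmW: set the pending bit with "lock bts", so no
         *    other state in the word can be observed "early". */
        asm volatile("lock btsl %1, %0"
                     : "+m" (*lockword)
                     : "Ir" (PENDING_BIT)
                     : "memory", "cc");

        /* 2) only now load the whole word: any tail/locked bits seen here
         *    are at least as new as our own bts above. */
        return *(volatile unsigned int *)lockword;
    }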
2018-12-21locking/qspinlock/x86: Increase _Q_PENDING_LOOPS upper boundWill Deacon1-0/+2
commit b247be3fe89b6aba928bf80f4453d1c4ba8d2063 upstream. On x86, atomic_cond_read_relaxed will busy-wait with a cpu_relax() loop, so it is desirable to increase the number of times we spin on the qspinlock lockword when it is found to be transitioning from pending to locked. According to Waiman Long: "Ideally, the spinning times should be at least a few times the typical cacheline load time from memory, which I think can be down to 100ns or so for each cacheline load with the newest systems, or up to several hundred ns for older systems", which in his benchmarking corresponded to 512 iterations. Suggested-by: Waiman Long <longman@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-5-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
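The change itself is a one-line override of the spin bound in the x86 qspinlock header; matching the 512-iteration figure above, the upstream value is 1 << 9 (sketch of the define; the generic slowpath falls back to 1 when an architecture does not provide it):

    /* arch/x86/include/asm/qspinlock.h */
    #define _Q_PENDING_LOOPS (1 << 9)   /* spin up to 512 times waiting for pending -> locked */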
2018-12-21locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'Will Deacon2-3/+2
commit 625e88be1f41b53cec55827c984e4a89ea8ee9f9 upstream. 'struct __qspinlock' provides a handy union of fields so that subcomponents of the lockword can be accessed by name, without having to manage shifts and masks explicitly and take endianness into account. This is useful in qspinlock.h and also potentially in arch headers, so move the 'struct __qspinlock' into 'struct qspinlock' and kill the extra definition. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
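For reference, the merged layout looks roughly like the following on a little-endian configuration with _Q_PENDING_BITS == 8 (a sketch assuming kernel types; the authoritative definition, including the big-endian and single-pending-bit variants, lives in include/asm-generic/qspinlock_types.h):

    typedef struct qspinlock {
        union {
            atomic_t val;                   /* the whole 32-bit lockword */
            struct {                        /* byte-addressable pieces  */
                u8  locked;                 /* bits  0- 7 */
                u8  pending;                /* bits  8-15 */
            };
            struct {
                u16 locked_pending;         /* bits  0-15 */
                u16 tail;                   /* bits 16-31 */
            };
        };
    } arch_spinlock_t;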
2018-12-21x86/build: Fix compiler support check for CONFIG_RETPOLINEMasahiro Yamada1-3/+7
commit 25896d073d8a0403b07e6dec56f58e6c33678207 upstream. It is troublesome to add a diagnostic like this to the Makefile parse stage because the top-level Makefile could be parsed with a stale include/config/auto.conf. Once you are hit by the error about a non-retpoline compiler, the compilation still breaks even after disabling CONFIG_RETPOLINE. The easiest fix is to move this check to the "archprepare" target, as this commit did: 829fe4aa9ac1 ("x86: Allow generating user-space headers without a compiler") Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com> Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") Link: http://lkml.kernel.org/r/1543991239-18476-1-git-send-email-yamada.masahiro@socionext.com Link: https://lkml.org/lkml/2018/12/4/206 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Cc: Gi-Oh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21powerpc/msi: Fix NULL pointer access in teardown codeRadu Rendec1-1/+6
commit 78e7b15e17ac175e7eed9e21c6f92d03d3b0a6fa upstream. The arch_teardown_msi_irqs() function assumes that controller ops pointers were already checked in arch_setup_msi_irqs(), but this assumption is wrong: arch_teardown_msi_irqs() can be called even when arch_setup_msi_irqs() returns an error (-ENOSYS). This can happen in the following scenario: - msi_capability_init() calls pci_msi_setup_msi_irqs() - pci_msi_setup_msi_irqs() returns -ENOSYS - msi_capability_init() notices the error and calls free_msi_irqs() - free_msi_irqs() calls pci_msi_teardown_msi_irqs() This is easier to see when CONFIG_PCI_MSI_IRQ_DOMAIN is not set and pci_msi_setup_msi_irqs() and pci_msi_teardown_msi_irqs() are just aliases to arch_setup_msi_irqs() and arch_teardown_msi_irqs(). The call to free_msi_irqs() upon pci_msi_setup_msi_irqs() failure seems legit, as it does additional cleanup; e.g. list_del(&entry->list) and kfree(entry) inside free_msi_irqs() do happen (MSI descriptors are allocated before pci_msi_setup_msi_irqs() is called and need to be cleaned up if that fails). Fixes: 6b2fd7efeb88 ("PCI/MSI/PPC: Remove arch_msi_check_device()") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Radu Rendec <radu.rendec@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
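A sketch of the resulting defensive teardown path, simplified from arch/powerpc/kernel/msi.c and assuming the usual powerpc PCI headers (pci_bus_to_host(), struct pci_controller); details such as walking the MSI descriptors are elided and may differ from the actual patch:

    void arch_teardown_msi_irqs(struct pci_dev *dev)
    {
        struct pci_controller *phb = pci_bus_to_host(dev->bus);

        /*
         * We can get here even when arch_setup_msi_irqs() bailed out with
         * -ENOSYS, so the ops pointer has to be re-checked here instead of
         * being assumed valid.
         */
        if (phb && phb->controller_ops.teardown_msi_irqs)
            phb->controller_ops.teardown_msi_irqs(dev);
    }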
2018-12-21ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dtLubomir Rintel1-2/+4
commit 76f4e2c3b6a560cdd7a75b87df543e04d05a9e5f upstream. cpu_is_mmp2() was equivalent to cpu_is_pj4(), which wouldn't be correct for multiplatform kernels. Fix it by also considering mmp_chip_id, as is done for cpu_is_pxa168() and cpu_is_pxa910() above. Moreover, it is only available with CONFIG_CPU_MMP2 and thus doesn't work on DT-based MMP2 machines. Enable it on CONFIG_MACH_MMP2_DT too. Note: CONFIG_CPU_MMP2 is only used for machines that use board files instead of DT. It should perhaps be renamed. I'm not doing it now, because I don't have a better idea. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
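The shape of the fix is roughly the following (sketch only; MMP2_CHIP_ID_BITS is a placeholder for whatever chip-id value(s) the actual patch matches, and the pattern mirrors the cpu_is_pxa168()/cpu_is_pxa910() macros mentioned above):

    /* Gate on both config options and combine the PJ4 core check with a
     * chip-id match, instead of relying on cpu_is_pj4() alone. */
    #if defined(CONFIG_CPU_MMP2) || defined(CONFIG_MACH_MMP2_DT)
    #define cpu_is_mmp2() \
        (cpu_is_pj4() && ((mmp_chip_id & 0xfff) == MMP2_CHIP_ID_BITS))
    #else
    #define cpu_is_mmp2()  (0)
    #endif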
2018-12-21arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearingRobin Murphy1-1/+1
commit 3238c359acee4ab57f15abb5a82b8ab38a661ee7 upstream. We need to invalidate the caches *before* clearing the buffer via the non-cacheable alias, else in the worst case __dma_flush_area() may write back dirty lines over the top of our nice new zeros. Fixes: dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag") Cc: <stable@vger.kernel.org> # 4.18.x- Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
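The fix is purely an ordering change in the FORCE_CONTIGUOUS allocation path. An illustrative helper capturing just that constraint, with names following arm64's dma-mapping code (addr is the newly created non-cacheable alias, page the backing allocation; the surrounding allocation and remap logic is elided):

    static void clear_new_buffer(void *addr, struct page *page,
                                 size_t size, bool coherent)
    {
        if (!coherent)
            __dma_flush_area(page_to_virt(page), size); /* first: drop dirty lines */

        memset(addr, 0, size); /* then: zero through the non-cacheable alias */
    }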
2018-12-17Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin2-80/+4
[ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ] This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as the "System RAM" resource got assigned under a new "Unusable memory" resource in the IO/Mem tree, any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources at the 1st level only. The original issue that commit tried to work around, fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)"), was later amended by 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict"), which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17x86/kvm/vmx: fix old-style function declarationYi Wang1-4/+4
[ Upstream commit 1e4329ee2c52692ea42cc677fb2133519718b34a ] An inline keyword that is not at the beginning of the function declaration may trigger the following build warnings, so let's fix it: arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
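The change is purely mechanical: move inline to the front of the declaration specifiers. A minimal illustration with a hypothetical function (not the actual vmx.c code):

    /* Triggers -Wold-style-declaration: 'inline' follows the return type. */
    static void inline vmx_example_old(void) { }

    /* Fixed ordering: no warning. */
    static inline void vmx_example_new(void) { }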