summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm
AgeCommit message (Collapse)AuthorFilesLines
2020-06-13Merge tag 'kbuild-v5.8-2' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - fix build rules in binderfs sample - fix build errors when Kbuild recurses to the top Makefile - covert '---help---' in Kconfig to 'help' * tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: treewide: replace '---help---' in Kconfig files with 'help' kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables samples: binderfs: really compile this sample and fix build issues
2020-06-13treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada1-3/+3
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds9-68/+137
Pull more KVM updates from Paolo Bonzini: "The guest side of the asynchronous page fault work has been delayed to 5.9 in order to sync with Thomas's interrupt entry rework, but here's the rest of the KVM updates for this merge window. MIPS: - Loongson port PPC: - Fixes ARM: - Fixes x86: - KVM_SET_USER_MEMORY_REGION optimizations - Fixes - Selftest fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits) KVM: x86: do not pass poisoned hva to __kvm_set_memory_region KVM: selftests: fix sync_with_host() in smm_test KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected KVM: async_pf: Cleanup kvm_setup_async_pf() kvm: i8254: remove redundant assignment to pointer s KVM: x86: respect singlestep when emulating instruction KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check KVM: nVMX: Consult only the "basic" exit reason when routing nested exit KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts KVM: arm64: Remove host_cpu_context member from vcpu structure KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr KVM: arm64: Handle PtrAuth traps early KVM: x86: Unexport x86_fpu_cache and make it static KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K KVM: arm64: Save the host's PtrAuth keys in non-preemptible context KVM: arm64: Stop save/restoring ACTLR_EL1 KVM: arm64: Add emulation for 32bit guests accessing ACTLR2 ...
2020-06-11Merge tag 'kvmarm-fixes-5.8-1' of ↵Paolo Bonzini9-63/+137
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for Linux 5.8, take #1 * 32bit VM fixes: - Fix embarassing mapping issue between AArch32 CSSELR and AArch64 ACTLR - Add ACTLR2 support for AArch32 - Get rid of the useless ACTLR_EL1 save/restore - Fix CP14/15 accesses for AArch32 guests on BE hosts - Ensure that we don't loose any state when injecting a 32bit exception when running on a VHE host * 64bit VM fixes: - Fix PtrAuth host saving happening in preemptible contexts - Optimize PtrAuth lazy enable - Drop vcpu to cpu context pointer - Fix sparse warnings for HYP per-CPU accesses
2020-06-10Merge branch 'kvm-arm64/ptrauth-fixes' into kvmarm-master/nextMarc Zyngier7-53/+81
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-10KVM: arm64: Synchronize sysreg state on injecting an AArch32 exceptionMarc Zyngier1-0/+28
On a VHE system, the EL1 state is left in the CPU most of the time, and only syncronized back to memory when vcpu_put() is called (most of the time on preemption). Which means that when injecting an exception, we'd better have a way to either: (1) write directly to the EL1 sysregs (2) synchronize the state back to memory, and do the changes there For an AArch64, we already do (1), so we are safe. Unfortunately, doing the same thing for AArch32 would be pretty invasive. Instead, we can easily implement (2) by calling the put/load architectural backends, and keep preemption disabled. We can then reload the state back into EL1. Cc: stable@vger.kernel.org Reported-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09mmap locking API: convert mmap_sem call sites missed by coccinelleMichel Lespinasse1-7/+7
Convert the last few remaining mmap_sem rwsem calls to use the new mmap locking API. These were missed by coccinelle for some reason (I think coccinelle does not support some of the preprocessor constructs in these files ?) [akpm@linux-foundation.org: convert linux-next leftovers] [akpm@linux-foundation.org: more linux-next leftovers] [akpm@linux-foundation.org: more linux-next leftovers] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-6-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09KVM: arm64: Remove host_cpu_context member from vcpu structureMarc Zyngier5-16/+11
For very long, we have kept this pointer back to the per-cpu host state, despite having working per-cpu accessors at EL2 for some time now. Recent investigations have shown that this pointer is easy to abuse in preemptible context, which is a sure sign that it would better be gone. Not to mention that a per-cpu pointer is faster to access at all times. Reported-by: Andrew Scull <ascull@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com Reviewed-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09KVM: arm64: Handle PtrAuth traps earlyMarc Zyngier4-38/+70
The current way we deal with PtrAuth is a bit heavy handed: - We forcefully save the host's keys on each vcpu_load() - Handling the PtrAuth trap forces us to go all the way back to the exit handling code to just set the HCR bits Overall, this is pretty cumbersome. A better approach would be to handle it the same way we deal with the FPSIMD registers: - On vcpu_load() disable PtrAuth for the guest - On first use, save the host's keys, enable PtrAuth in the guest Crucially, this can happen as a fixup, which is done very early on exit. We can then reenter the guest immediately without leaving the hypervisor role. Another thing is that it simplify the rest of the host handling: exiting all the way to the host means that the only possible outcome for this trap is to inject an UNDEF. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09KVM: arm64: Save the host's PtrAuth keys in non-preemptible contextMarc Zyngier2-18/+19
When using the PtrAuth feature in a guest, we need to save the host's keys before allowing the guest to program them. For that, we dump them in a per-CPU data structure (the so called host context). But both call sites that do this are in preemptible context, which may end up in disaster should the vcpu thread get preempted before reentering the guest. Instead, save the keys eagerly on each vcpu_load(). This has an increased overhead, but is at least safe. Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09KVM: arm64: Stop save/restoring ACTLR_EL1James Morse2-4/+0
KVM sets HCR_EL2.TACR via HCR_GUEST_FLAGS. This means ACTLR* accesses from the guest are always trapped, and always return the value in the sys_regs array. The guest can't change the value of these registers, so we are save restoring the reset value, which came from the host. Stop save/restoring this register. Keep the storage for this register in sys_regs[] as this is how the value is exposed to user-space, removing it would break migration. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200529150656.7339-4-james.morse@arm.com
2020-06-09KVM: arm64: Add emulation for 32bit guests accessing ACTLR2James Morse1-0/+10
ACTLR_EL1 is a 64bit register while the 32bit ACTLR is obviously 32bit. For 32bit software, the extra bits are accessible via ACTLR2... which KVM doesn't emulate. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200529150656.7339-3-james.morse@arm.com
2020-06-09KVM: arm64: Stop writing aarch32's CSSELR into ACTLRJames Morse1-2/+8
aarch32 has pairs of registers to access the high and low parts of 64bit registers. KVM has a union of 64bit sys_regs[] and 32bit copro[]. The 32bit accessors read the high or low part of the 64bit sys_reg[] value through the union. Both sys_reg_descs[] and cp15_regs[] list access_csselr() as the accessor for CSSELR{,_EL1}. access_csselr() is only aware of the 64bit sys_regs[], and expects r->reg to be 'CSSELR_EL1' in the enum, index 2 of the 64bit array. cp15_regs[] uses the 32bit copro[] alias of sys_regs[]. Here CSSELR is c0_CSSELR which is the same location in sys_reg[]. r->reg is 'c0_CSSELR', index 4 in the 32bit array. access_csselr() uses the 32bit r->reg value to access the 64bit array, so reads and write the wrong value. sys_regs[4], is ACTLR_EL1, which is subsequently save/restored when we enter the guest. ACTLR_EL1 is supposed to be read-only for the guest. This register only affects execution at EL1, and the host's value is restored before we return to host EL1. Convert the 32bit register index back to the 64bit version. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200529150656.7339-2-james.morse@arm.com
2020-06-05arm64: add support for folded p4d page tablesMike Rapoport1-32/+177
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and remove __ARCH_USE_5LEVEL_HACK. [arnd@arndb.de: fix gcc-10 shift warning] Link: http://lkml.kernel.org/r/20200429185657.4085975-1-arnd@arndb.de Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: James Morse <james.morse@arm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04KVM: let kvm_destroy_vm_debugfs clean up vCPU debugfs directoriesPaolo Bonzini1-5/+0
After commit 63d0434 ("KVM: x86: move kvm_create_vcpu_debugfs after last failure point") we are creating the pre-vCPU debugfs files after the creation of the vCPU file descriptor. This makes it possible for userspace to reach kvm_vcpu_release before kvm_create_vcpu_debugfs has finished. The vcpu->debugfs_dentry then does not have any associated inode anymore, and this causes a NULL-pointer dereference in debugfs_create_file. The solution is simply to avoid removing the files; they are cleaned up when the VM file descriptor is closed (and that must be after KVM_CREATE_VCPU returns). We can stop storing the dentry in struct kvm_vcpu too, because it is not needed anywhere after kvm_create_vcpu_debugfs returns. Reported-by: syzbot+705f4401d5a93a59b87d@syzkaller.appspotmail.com Fixes: 63d04348371b ("KVM: x86: move kvm_create_vcpu_debugfs after last failure point") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds41-439/+20032
Pull kvm updates from Paolo Bonzini: "ARM: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches x86: - Rework of TLB flushing - Rework of event injection, especially with respect to nested virtualization - Nested AMD event injection facelift, building on the rework of generic code and fixing a lot of corner cases - Nested AMD live migration support - Optimization for TSC deadline MSR writes and IPIs - Various cleanups - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree) - Interrupt-based delivery of asynchronous "page ready" events (host side) - Hyper-V MSRs and hypercalls for guest debugging - VMX preemption timer fixes s390: - Cleanups Generic: - switch vCPU thread wakeup from swait to rcuwait The other architectures, and the guest side of the asynchronous page fault work, will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits) KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test KVM: check userspace_addr for all memslots KVM: selftests: update hyperv_cpuid with SynDBG tests x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls x86/kvm/hyper-v: enable hypercalls regardless of hypercall page x86/kvm/hyper-v: Add support for synthetic debugger interface x86/hyper-v: Add synthetic debugger definitions KVM: selftests: VMX preemption timer migration test KVM: nVMX: Fix VMX preemption timer migration x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit KVM: x86/pmu: Support full width counting KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT KVM: x86: acknowledgment mechanism for async pf page ready notifications KVM: x86: interrupt based APF 'page ready' event delivery KVM: introduce kvm_read_guest_offset_cached() KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously" KVM: VMX: Replace zero-length array with flexible-array ...
2020-06-02Merge tag 'arm64-upstream' of ↵Linus Torvalds5-28/+66
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "A sizeable pile of arm64 updates for 5.8. Summary below, but the big two features are support for Branch Target Identification and Clang's Shadow Call stack. The latter is currently arm64-only, but the high-level parts are all in core code so it could easily be adopted by other architectures pending toolchain support Branch Target Identification (BTI): - Support for ARMv8.5-BTI in both user- and kernel-space. This allows branch targets to limit the types of branch from which they can be called and additionally prevents branching to arbitrary code, although kernel support requires a very recent toolchain. - Function annotation via SYM_FUNC_START() so that assembly functions are wrapped with the relevant "landing pad" instructions. - BPF and vDSO updates to use the new instructions. - Addition of a new HWCAP and exposure of BTI capability to userspace via ID register emulation, along with ELF loader support for the BTI feature in .note.gnu.property. - Non-critical fixes to CFI unwind annotations in the sigreturn trampoline. Shadow Call Stack (SCS): - Support for Clang's Shadow Call Stack feature, which reserves platform register x18 to point at a separate stack for each task that holds only return addresses. This protects function return control flow from buffer overruns on the main stack. - Save/restore of x18 across problematic boundaries (user-mode, hypervisor, EFI, suspend, etc). - Core support for SCS, should other architectures want to use it too. - SCS overflow checking on context-switch as part of the existing stack limit check if CONFIG_SCHED_STACK_END_CHECK=y. CPU feature detection: - Removed numerous "SANITY CHECK" errors when running on a system with mismatched AArch32 support at EL1. This is primarily a concern for KVM, which disabled support for 32-bit guests on such a system. - Addition of new ID registers and fields as the architecture has been extended. Perf and PMU drivers: - Minor fixes and cleanups to system PMU drivers. Hardware errata: - Unify KVM workarounds for VHE and nVHE configurations. - Sort vendor errata entries in Kconfig. Secure Monitor Call Calling Convention (SMCCC): - Update to the latest specification from Arm (v1.2). - Allow PSCI code to query the SMCCC version. Software Delegated Exception Interface (SDEI): - Unexport a bunch of unused symbols. - Minor fixes to handling of firmware data. Pointer authentication: - Add support for dumping the kernel PAC mask in vmcoreinfo so that the stack can be unwound by tools such as kdump. - Simplification of key initialisation during CPU bringup. BPF backend: - Improve immediate generation for logical and add/sub instructions. vDSO: - Minor fixes to the linker flags for consistency with other architectures and support for LLVM's unwinder. - Clean up logic to initialise and map the vDSO into userspace. ACPI: - Work around for an ambiguity in the IORT specification relating to the "num_ids" field. - Support _DMA method for all named components rather than only PCIe root complexes. - Minor other IORT-related fixes. Miscellaneous: - Initialise debug traps early for KGDB and fix KDB cacheflushing deadlock. - Minor tweaks to early boot state (documentation update, set TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections). - Refactoring and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h KVM: arm64: Check advertised Stage-2 page size capability arm64/cpufeature: Add get_arm64_ftr_reg_nowarn() ACPI/IORT: Remove the unused __get_pci_rid() arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register arm64/cpufeature: Add remaining feature bits in ID_PFR0 register arm64/cpufeature: Introduce ID_MMFR5 CPU register arm64/cpufeature: Introduce ID_DFR1 CPU register arm64/cpufeature: Introduce ID_PFR2 CPU register arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0 arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register arm64: mm: Add asid_gen_match() helper firmware: smccc: Fix missing prototype warning for arm_smccc_version_init arm64: vdso: Fix CFI directives in sigreturn trampoline arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction ...
2020-06-01Merge tag 'kvmarm-5.8' of ↵Paolo Bonzini41-426/+20022
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.8: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches
2020-05-31KVM: arm64: Flush the instruction cache if not unmapping the VM on rebootMarc Zyngier1-4/+10
On a system with FWB, we don't need to unmap Stage-2 on reboot, as even if userspace takes this opportunity to repaint the whole of memory, FWB ensures that the data side stays consistent even if the guest uses non-cacheable mappings. However, the I-side is not necessarily coherent with the D-side if CTR_EL0.DIC is 0. In this case, invalidate the i-cache to preserve coherency. Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Fixes: 892713e97ca1 ("KVM: arm64: Sidestep stage2_unmap_vm() on vcpu reset when S2FWB is supported") Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28Merge branch 'for-next/kvm/errata' into for-next/coreWill Deacon3-10/+13
KVM CPU errata rework (Andrew Scull and Marc Zyngier) * for-next/kvm/errata: KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE}
2020-05-28KVM: arm64: Check advertised Stage-2 page size capabilityMarc Zyngier1-2/+35
With ARMv8.5-GTG, the hardware (or more likely a hypervisor) can advertise the supported Stage-2 page sizes. Let's check this at boot time. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-05-28KVM: arm64: Parametrize exception entry with a target ELMarc Zyngier1-37/+38
We currently assume that an exception is delivered to EL1, always. Once we emulate EL2, this no longer will be the case. To prepare for this, add a target_mode parameter. While we're at it, merge the computing of the target PC and PSTATE in a single function that updates both PC and CPSR after saving their previous values in the corresponding ELR/SPSR. This ensures that they are updated in the correct order (a pretty common source of bugs...). Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28KVM: arm64: Don't use empty structures as CPU reset stateMarc Zyngier1-12/+9
Keeping empty structure as the vcpu state initializer is slightly wasteful: we only want to set pstate, and zero everything else. Just do that. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28KVM: arm64: Move sysreg reset check to boot timeMarc Zyngier1-37/+35
Our sysreg reset check has become a bit silly, as it only checks whether a reset callback actually exists for a given sysreg entry, and apply the method if available. Doing the check at each vcpu reset is pretty dumb, as the tables never change. It is thus perfectly possible to do the same checks at boot time. This also allows us to introduce a sparse sys_regs[] array, something that will be required with ARMv8.4-NV. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28KVM: arm64: Add missing reset handlers for PMU emulationMarc Zyngier1-3/+3
As we're about to become a bit more harsh when it comes to the lack of reset callbacks, let's add the missing PMU reset handlers. Note that these only cover *CLR registers that were always covered by their *SET counterpart, so there is no semantic change here. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28KVM: arm64: Refactor vcpu_{read,write}_sys_regMarc Zyngier1-57/+71
Extract the direct HW accessors for later reuse. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28KVM: arm64: vgic-v3: Take cpu_if pointer directly instead of vcpuChristoffer Dall5-46/+44
If we move the used_lrs field to the version-specific cpu interface structure, the following functions only operate on the struct vgic_v3_cpu_if and not the full vcpu: __vgic_v3_save_state __vgic_v3_restore_state __vgic_v3_activate_traps __vgic_v3_deactivate_traps __vgic_v3_save_aprs __vgic_v3_restore_aprs This is going to be very useful for nested virt, so move the used_lrs field and change the prototypes and implementations of these functions to take the cpu_if parameter directly. No functional change. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-25KVM: arm64: Remove obsolete kvm_virt_to_phys abstractionAndrew Scull1-3/+3
This abstraction was introduced to hide the difference between arm and arm64 but, with the former no longer supported, this abstraction can be removed and the canonical kernel API used directly instead. Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> CC: Marc Zyngier <maz@kernel.org> CC: James Morse <james.morse@arm.com> CC: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200519104036.259917-1-ascull@google.com
2020-05-25KVM: arm64: Clean up cpu_init_hyp_mode()David Brazdil1-5/+27
Pull bits of code to the only place where it is used. Remove empty function __cpu_init_stage2(). Remove redundant has_vhe() check since this function is nVHE-only. No functional changes intended. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200515152056.83158-1-dbrazdil@google.com
2020-05-21arm64/cpufeature: Introduce ID_MMFR5 CPU registerAnshuman Khandual1-1/+1
This adds basic building blocks required for ID_MMFR5 CPU register which provides information about the implemented memory model and memory management support in AArch32 state. This is added per ARM DDI 0487F.a specification. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1589881254-10082-7-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21arm64/cpufeature: Introduce ID_DFR1 CPU registerAnshuman Khandual1-1/+1
This adds basic building blocks required for ID_DFR1 CPU register which provides top level information about the debug system in AArch32 state. We hide the register from KVM guests, as we don't emulate the 'MTPMU' feature. This is added per ARM DDI 0487F.a specification. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Will Deacon <will@kernel.org> Reviewed-by : Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1589881254-10082-6-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21arm64/cpufeature: Introduce ID_PFR2 CPU registerAnshuman Khandual1-1/+1
This adds basic building blocks required for ID_PFR2 CPU register which provides information about the AArch32 programmers model which must be interpreted along with ID_PFR0 and ID_PFR1 CPU registers. This is added per ARM DDI 0487F.a specification. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1589881254-10082-5-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-20arm64/cpufeature: Drop open encodings while extracting parangeAnshuman Khandual1-3/+8
Currently there are multiple instances of parange feature width mask open encodings while fetching it's value. Even the width mask value (0x7) itself is not accurate. It should be (0xf) per ID_AA64MMFR0_EL1.PARange[3:0] as in ARM ARM (0487F.a). Replace them with cpuid_feature_extract_unsigned_field() which can extract given standard feature (4 bits width i.e 0xf mask) field. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1589360614-1164-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-20arm64/cpufeature: Validate hypervisor capabilities during CPU hotplugAnshuman Khandual1-0/+5
This validates hypervisor capabilities like VMID width, IPA range for any hot plug CPU against system finalized values. KVM's view of the IPA space is used while allowing a given CPU to come up. While here, it factors out get_vmid_bits() for general use. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-kernel@vger.kernel.org Suggested-by: Suzuki Poulose <suzuki.poulose@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1589248647-22925-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-16KVM: arm64: Make KVM_CAP_MAX_VCPUS compatible with the selected GIC versionMarc Zyngier1-5/+10
KVM_CAP_MAX_VCPUS always return the maximum possible number of VCPUs, irrespective of the selected interrupt controller. This is pretty misleading for userspace that selects a GICv2 on a GICv3 system that supports v2 compat: It always gets a maximum of 512 VCPUs, even if the effective limit is 8. The 9th VCPU will fail to be created, which is unexpected as far as userspace is concerned. Fortunately, we already have the right information stashed in the kvm structure, and we can return it as requested. Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20200427141507.284985-1-maz@kernel.org
2020-05-16KVM: arm64: Support enabling dirty log gradually in small chunksKeqian Zhu1-2/+10
There is already support of enabling dirty log gradually in small chunks for x86 in commit 3c9bd4006bfc ("KVM: x86: enable dirty log gradually in small chunks"). This adds support for arm64. x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET is enabled. However, for arm64, both huge pages and normal pages can be write protected gradually by userspace. Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G Linux VMs with different page size. The memory pressure is 127G in each case. The time taken of memory_global_dirty_log_start in QEMU is listed below: Page Size Before After Optimization 4K 650ms 1.8ms 2M 4ms 1.8ms 1G 2ms 1.8ms Besides the time reduction, the biggest improvement is that we will minimize the performance side effect (because of dissolving huge pages and marking memslots dirty) on guest after enabling dirty log. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
2020-05-16KVM: arm64: Unify handling THP backed host memorySuzuki K Poulose1-55/+60
We support mapping host memory backed by PMD transparent hugepages at stage2 as huge pages. However the checks are now spread across two different places. Let us unify the handling of the THPs to keep the code cleaner (and future proof for PUD THP support). This patch moves transparent_hugepage_adjust() closer to the caller to avoid a forward declaration for fault_supports_stage2_huge_mappings(). Also, since we already handle the case where the host VA and the guest PA may not be aligned, the explicit VM_BUG_ON() is not required. Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200507123546.1875-3-yuzenghui@huawei.com
2020-05-16KVM: arm64: Clean up the checking for huge mappingSuzuki K Poulose1-1/+5
If we are checking whether the stage2 can map PAGE_SIZE, we don't have to do the boundary checks as both the host VMA and the guest memslots are page aligned. Bail the case easily. While we're at it, fixup a typo in the comment below. Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200507123546.1875-2-yuzenghui@huawei.com
2020-05-16KVM: arm/arm64: Release kvm->mmu_lock in loop to prevent starvationJiang Yi1-0/+3
Do cond_resched_lock() in stage2_flush_memslot() like what is done in unmap_stage2_range() and other places holding mmu_lock while processing a possibly large range of memory. Signed-off-by: Jiang Yi <giangyi@amazon.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200415084229.29992-1-giangyi@amazon.com
2020-05-16KVM: arm64: Sidestep stage2_unmap_vm() on vcpu reset when S2FWB is supportedZenghui Yu1-1/+4
stage2_unmap_vm() was introduced to unmap user RAM region in the stage2 page table to make the caches coherent. E.g., a guest reboot with stage1 MMU disabled will access memory using non-cacheable attributes. If the RAM and caches are not coherent at this stage, some evicted dirty cache line may go and corrupt guest data in RAM. Since ARMv8.4, S2FWB feature is mandatory and KVM will take advantage of it to configure the stage2 page table and the attributes of memory access. So we ensure that guests always access memory using cacheable attributes and thus, the caches always be coherent. So on CPUs that support S2FWB, we can safely reset the vcpu without a heavy stage2 unmapping. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200415072835.1164-1-yuzenghui@huawei.com
2020-05-16KVM: Fix spelling in code commentsFuad Tabba9-20/+20
Fix spelling and typos (e.g., repeated words) in comments. Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200401140310.29701-1-tabba@google.com
2020-05-16KVM: arm64: Simplify __kvm_timer_set_cntvoff implementationMarc Zyngier2-13/+2
Now that this function isn't constrained by the 32bit PCS, let's simplify it by taking a single 64bit offset instead of two 32bit parameters. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-16KVM: arm64: Clean up kvm makefilesFuad Tabba2-36/+17
Consolidate references to the CONFIG_KVM configuration item to encompass entire folders rather than per line. Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200505154520.194120-5-tabba@google.com
2020-05-16KVM: arm64: Change CONFIG_KVM to a menuconfig entryWill Deacon1-5/+11
Changing CONFIG_KVM to be a 'menuconfig' entry in Kconfig mean that we can straightforwardly enumerate optional features, such as the virtual PMU device as dependent options. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200505154520.194120-4-tabba@google.com
2020-05-16KVM: arm64: Update help textWill Deacon1-2/+0
arm64 KVM supports 16k pages since 02e0b7600f83 ("arm64: kvm: Add support for 16K pages"), so update the Kconfig help text accordingly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200505154520.194120-3-tabba@google.com
2020-05-16KVM: arm64: Kill off CONFIG_KVM_ARM_HOSTWill Deacon3-43/+37
CONFIG_KVM_ARM_HOST is just a proxy for CONFIG_KVM, so remove it in favour of the latter. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200505154520.194120-2-tabba@google.com
2020-05-16KVM: arm64: Move virt/kvm/arm to arch/arm64Marc Zyngier35-242/+19810
Now that the 32bit KVM/arm host is a distant memory, let's move the whole of the KVM/arm64 code into the arm64 tree. As they said in the song: Welcome Home (Sanitarium). Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
2020-05-15kvm: add halt-polling cpu usage statsDavid Matlack1-0/+2
Two new stats for exposing halt-polling cpu usage: halt_poll_success_ns halt_poll_fail_ns Thus sum of these 2 stats is the total cpu time spent polling. "success" means the VCPU polled until a virtual interrupt was delivered. "fail" means the VCPU had to schedule out (either because the maximum poll time was reached or it needed to yield the CPU). To avoid touching every arch's kvm_vcpu_stat struct, only update and export halt-polling cpu usage stats if we're on x86. Exporting cpu usage as a u64 and in nanoseconds means we will overflow at ~500 years, which seems reasonably large. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Jon Cargille <jcargill@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200508182240.68440-1-jcargill@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13Merge branch 'kvm-amd-fixes' into HEADPaolo Bonzini5-17/+33
2020-05-04arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE}Andrew Scull3-10/+13
Errata 1165522, 1319367 and 1530923 each allow TLB entries to be allocated as a result of a speculative AT instruction. In order to avoid mandating VHE on certain affected CPUs, apply the workaround to both the nVHE and the VHE case for all affected CPUs. Signed-off-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> CC: Marc Zyngier <maz@kernel.org> CC: James Morse <james.morse@arm.com> CC: Suzuki K Poulose <suzuki.poulose@arm.com> CC: Will Deacon <will@kernel.org> CC: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200504094858.108917-1-ascull@google.com Signed-off-by: Will Deacon <will@kernel.org>