summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp
AgeCommit message (Collapse)AuthorFilesLines
2024-06-11KVM: arm64: FFA: Release hyp rx bufferVincent Donnefort1-0/+12
According to the FF-A spec (Buffer states and ownership), after a producer has written into a buffer, it is "full" and now owned by the consumer. The producer won't be able to use that buffer, until the consumer hands it over with an invocation such as RX_RELEASE. It is clear in the following paragraph (Transfer of buffer ownership), that MEM_RETRIEVE_RESP is transferring the ownership from producer (in our case SPM) to consumer (hypervisor). RX_RELEASE is therefore mandatory here. It is less clear though what is happening with MEM_FRAG_TX. But this invocation, as a response to MEM_FRAG_RX writes into the same hypervisor RX buffer (see paragraph "Transmission of transaction descriptor in fragments"). Also this is matching the TF-A implementation where the RX buffer is marked "full" during a MEM_FRAG_RX. Release the RX hypervisor buffer in those two cases. This will unblock later invocations using this buffer which would otherwise fail. (RETRIEVE_REQ, MEM_FRAG_RX and PARTITION_INFO_GET). Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20240611175317.1220842-1-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Ensure that SME controls are disabled in protected modeFuad Tabba1-0/+11
KVM (and pKVM) do not support SME guests. Therefore KVM ensures that the host's SME state is flushed and that SME controls for enabling access to ZA storage and for streaming are disabled. pKVM needs to protect against a buggy/malicious host. Ensure that it wouldn't run a guest when protected mode is enabled should any of the SME controls be enabled. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-10-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx formatFuad Tabba3-8/+6
When setting/clearing CPACR bits for EL0 and EL1, use the ELx format of the bits, which covers both. This makes the code clearer, and reduces the chances of accidentally missing a bit. No functional change intended. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVMFuad Tabba3-13/+0
Now that we have introduced finalize_init_hyp_mode(), lets consolidate the initializing of the host_data fpsimd_state and sve state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240603122852.3923848-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Eagerly restore host fpsimd/sve state in pKVMFuad Tabba4-5/+93
When running in protected mode we don't want to leak protected guest state to the host, including whether a guest has used fpsimd/sve. Therefore, eagerly restore the host state on guest exit when running in protected mode, which happens only if the guest has used fpsimd/sve. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVMFuad Tabba2-0/+26
Protected mode needs to maintain (save/restore) the host's sve state, rather than relying on the host kernel to do that. This is to avoid leaking information to the host about guests and the type of operations they are performing. As a first step towards that, allocate memory mapped at hyp, per cpu, for the host sve state. The following patch will use this memory to save/restore the host state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Specialize handling of host fpsimd state on trapFuad Tabba3-1/+13
In subsequent patches, n/vhe will diverge on saving the host fpsimd/sve state when taking a guest fpsimd/sve trap. Add a specialized helper to handle it. No functional change intended. Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helperFuad Tabba2-19/+5
The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be toggled in different parts of the code, but the exact bits and their polarity differ between these two formats and the mode (vhe/nvhe/hvhe). To reduce the amount of duplicated code and the chance of getting the wrong bit/polarity or missing a field, abstract the set/clear of CPTR_EL2 bits behind a helper. Since (h)VHE is the way of the future, use the CPACR_EL1 format, which is a subset of the VHE CPTR_EL2, as a reference. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_stateFuad Tabba1-1/+2
Since the prototypes for __sve_save_state/__sve_restore_state at hyp were added, the underlying macro has acquired a third parameter for saving/restoring ffr. Fix the prototypes to account for the third parameter, and restore the ffr for the guest since it is saved. Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240603122852.3923848-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Reintroduce __sve_save_stateFuad Tabba1-0/+6
Now that the hypervisor is handling the host sve state in protected mode, it needs to be able to save it. This reverts commit e66425fc9ba3 ("KVM: arm64: Remove unused __sve_save_state"). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-27KVM: arm64: AArch32: Fix spurious trapping of conditional instructionsMarc Zyngier1-2/+16
We recently upgraded the view of ESR_EL2 to 64bit, in keeping with the requirements of the architecture. However, the AArch32 emulation code was left unaudited, and the (already dodgy) code that triages whether a trap is spurious or not (because the condition code failed) broke in a subtle way: If ESR_EL2.ISS2 is ever non-zero (unlikely, but hey, this is the ARM architecture we're talking about), the hack that tests the top bits of ESR_EL2.EC will break in an interesting way. Instead, use kvm_vcpu_trap_get_class() to obtain the EC, and list all the possible ECs that can fail a condition code check. While we're at it, add SMC32 to the list, as it is explicitly listed as being allowed to trap despite failing a condition code check (as described in the HCR_EL2.TSC documentation). Fixes: 0b12620fddb8 ("KVM: arm64: Treat ESR_EL2 as a 64-bit register") Cc: stable@vger.kernel.org Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240524141956.1450304-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-18Merge tag 'kbuild-v6.10' of ↵Linus Torvalds2-14/+1
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Avoid 'constexpr', which is a keyword in C23 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of 'dt_binding_check' - Fix weak references to avoid GOT entries in position-independent code generation - Convert the last use of 'optional' property in arch/sh/Kconfig - Remove support for the 'optional' property in Kconfig - Remove support for Clang's ThinLTO caching, which does not work with the .incbin directive - Change the semantics of $(src) so it always points to the source directory, which fixes Makefile inconsistencies between upstream and downstream - Fix 'make tar-pkg' for RISC-V to produce a consistent package - Provide reasonable default coverage for objtool, sanitizers, and profilers - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc. - Remove the last use of tristate choice in drivers/rapidio/Kconfig - Various cleanups and fixes in Kconfig * tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits) kconfig: use sym_get_choice_menu() in sym_check_prop() rapidio: remove choice for enumeration kconfig: lxdialog: remove initialization with A_NORMAL kconfig: m/nconf: merge two item_add_str() calls kconfig: m/nconf: remove dead code to display value of bool choice kconfig: m/nconf: remove dead code to display children of choice members kconfig: gconf: show checkbox for choice correctly kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal Makefile: remove redundant tool coverage variables kbuild: provide reasonable defaults for tool coverage modules: Drop the .export_symbol section from the final modules kconfig: use menu_list_for_each_sym() in sym_check_choice_deps() kconfig: use sym_get_choice_menu() in conf_write_defconfig() kconfig: add sym_get_choice_menu() helper kconfig: turn defaults and additional prompt for choice members into error kconfig: turn missing prompt for choice members into error kconfig: turn conf_choice() into void function kconfig: use linked list in sym_set_changed() kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED kconfig: gconf: remove debug code ...
2024-05-14Makefile: remove redundant tool coverage variablesMasahiro Yamada1-13/+0
Now Kbuild provides reasonable defaults for objtool, sanitizers, and profilers. Remove redundant variables. Note: This commit changes the coverage for some objects: - include arch/mips/vdso/vdso-image.o into UBSAN, GCOV, KCOV - include arch/sparc/vdso/vdso-image-*.o into UBSAN - include arch/sparc/vdso/vma.o into UBSAN - include arch/x86/entry/vdso/extable.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso-image-*.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso32-setup.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vma.o into GCOV, KCOV - include arch/x86/um/vdso/vma.o into KASAN, GCOV, KCOV I believe these are positive effects because all of them are kernel space objects. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
2024-05-09kbuild: use $(src) instead of $(srctree)/$(src) for source directoryMasahiro Yamada1-1/+1
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-08Merge branch kvm-arm64/misc-6.10 into kvmarm-master/nextMarc Zyngier1-1/+0
* kvm-arm64/misc-6.10: : . : Misc fixes and updates targeting 6.10 : : - Improve boot-time diagnostics when the sysreg tables : are not correctly sorted : : - Allow FFA_MSG_SEND_DIRECT_REQ in the FFA proxy : : - Fix duplicate XNX field in the ID_AA64MMFR1_EL1 : writeable mask : : - Allocate PPIs and SGIs outside of the vcpu structure, allowing : for smaller EL2 mapping and some flexibility in implementing : more or less than 32 private IRQs. : : - Use bitmap_gather() instead of its open-coded equivalent : : - Make protected mode use hVHE if available : : - Purge stale mpidr_data if a vcpu is created after the MPIDR : map has been created : . KVM: arm64: Destroy mpidr_data for 'late' vCPU creation KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support KVM: arm64: Fix hvhe/nvhe early alias parsing KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() KVM: arm64: vgic: Allocate private interrupts on demand KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylist KVM: arm64: Improve out-of-order sysreg table diagnostics Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03Merge branch kvm-arm64/pkvm-6.10 into kvmarm-master/nextMarc Zyngier12-79/+185
* kvm-arm64/pkvm-6.10: (25 commits) : . : At last, a bunch of pKVM patches, courtesy of Fuad Tabba. : From the cover letter: : : "This series is a bit of a bombay-mix of patches we've been : carrying. There's no one overarching theme, but they do improve : the code by fixing existing bugs in pKVM, refactoring code to : make it more readable and easier to re-use for pKVM, or adding : functionality to the existing pKVM code upstream." : . KVM: arm64: Force injection of a data abort on NISV MMIO exit KVM: arm64: Restrict supported capabilities for protected VMs KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap() KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst KVM: arm64: Rename firmware pseudo-register documentation file KVM: arm64: Reformat/beautify PTP hypercall documentation KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit KVM: arm64: Introduce and use predicates that check for protected VMs KVM: arm64: Add is_pkvm_initialized() helper KVM: arm64: Simplify vgic-v3 hypercalls KVM: arm64: Move setting the page as dirty out of the critical section KVM: arm64: Change kvm_handle_mmio_return() return polarity KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() KVM: arm64: Prevent kmemleak from accessing .hyp.data KVM: arm64: Do not map the host fpsimd state to hyp in pKVM KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE KVM: arm64: Support TLB invalidation in guest context KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE KVM: arm64: Check for PTE validity when checking for executable/cacheable KVM: arm64: Avoid BUG-ing from the host abort path ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03Merge branch kvm-arm64/nv-eret-pauth into kvmarm-master/nextMarc Zyngier3-71/+91
* kvm-arm64/nv-eret-pauth: : . : Add NV support for the ERETAA/ERETAB instructions. From the cover letter: : : "Although the current upstream NV support has *some* support for : correctly emulating ERET, that support is only partial as it doesn't : support the ERETAA and ERETAB variants. : : Supporting these instructions was cast aside for a long time as it : involves implementing some form of PAuth emulation, something I wasn't : overly keen on. But I have reached a point where enough of the : infrastructure is there that it actually makes sense. So here it is!" : . KVM: arm64: nv: Work around lack of pauth support in old toolchains KVM: arm64: Drop trapping of PAuth instructions/keys KVM: arm64: nv: Advertise support for PAuth KVM: arm64: nv: Handle ERETA[AB] instructions KVM: arm64: nv: Add emulation for ERETAx instructions KVM: arm64: nv: Add kvm_has_pauth() helper KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0 KVM: arm64: nv: Handle HCR_EL2.{API,APK} independently KVM: arm64: nv: Honor HFGITR_EL2.ERET being set KVM: arm64: nv: Fast-track 'InHost' exception returns KVM: arm64: nv: Add trap forwarding for ERET and SMC KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2 KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag KVM: arm64: Constraint PAuth support to consistent implementations KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET* KVM: arm64: Harden __ctxt_sys_reg() against out-of-range values Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03Merge branch kvm-arm64/host_data into kvmarm-master/nextMarc Zyngier9-33/+32
* kvm-arm64/host_data: : . : Rationalise the host-specific data to live as part of the per-CPU state. : : From the cover letter: : : "It appears that over the years, we have accumulated a lot of cruft in : the kvm_vcpu_arch structure. Part of the gunk is data that is strictly : host CPU specific, and this result in two main problems: : : - the structure itself is stupidly large, over 8kB. With the : arch-agnostic kvm_vcpu, we're above 10kB, which is insane. This has : some ripple effects, as we need physically contiguous allocation to : be able to map it at EL2 for !VHE. There is more to it though, as : some data structures, although per-vcpu, could be allocated : separately. : : - We lose track of the life-cycle of this data, because we're : guaranteed that it will be around forever and we start relying on : wrong assumptions. This is becoming a maintenance burden. : : This series rectifies some of these things, starting with the two main : offenders: debug and FP, a lot of which gets pushed out to the per-CPU : host structure. Indeed, their lifetime really isn't that of the vcpu, : but tied to the physical CPU the vpcu runs on. : : This results in a small reduction of the vcpu size, but mainly a much : clearer understanding of the life-cycle of these structures." : . KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHE KVM: arm64: Exclude FP ownership from kvm_vcpu_arch KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_arch KVM: arm64: Exclude host_debug_data from vcpu_arch KVM: arm64: Add accessor for per-CPU state Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHEMarc Zyngier1-1/+4
The per-CPU host context structure contains a __hyp_running_vcpu that serves as a replacement for kvm_get_current_vcpu() in contexts where we cannot make direct use of it (such as in the nVHE hypervisor). Since there is a lot of common code between nVHE and VHE, the latter also populates this field even if kvm_get_running_vcpu() always works. We currently pretty inconsistent when populating __hyp_running_vcpu to point to the currently running vcpu: - on {n,h}VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run and clear it on exit. - on VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run_vhe and never clear it, effectively leaving a dangling pointer... VHE is obviously the odd one here. Although we could make it behave just like nVHE, this wouldn't match the behaviour of KVM with VHE, where the load phase is where most of the context-switch gets done. So move all the __hyp_running_vcpu management to the VHE-specific load/put phases, giving us a bit more sanity and matching the behaviour of kvm_get_running_vcpu(). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154030.3011995-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Introduce and use predicates that check for protected VMsFuad Tabba2-4/+7
In order to determine whether or not a VM or vcpu are protected, introduce helpers to query this state. While at it, use the vcpu helper to check vcpus protected state instead of the kvm one. Co-authored-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-19-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Simplify vgic-v3 hypercallsMarc Zyngier2-22/+29
Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore hypercalls so that all of the EL2 GICv3 state is covered by a single pair of hypercalls. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-17-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()Fuad Tabba1-1/+1
Fix the comment to clarify that __pkvm_vcpu_init_traps() initializes traps for all VMs in protected mode, and not only for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-14-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHEFuad Tabba1-13/+13
Rename __tlb_switch_to_{guest,host}() to {enter,exit}_vmid_context() in VHE code to maintain symmetry between the nVHE and VHE TLB invalidations. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-11-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Support TLB invalidation in guest contextWill Deacon1-24/+91
Typically, TLB invalidation of guest stage-2 mappings using nVHE is performed by a hypercall originating from the host. For the invalidation instruction to be effective, therefore, __tlb_switch_to_{guest,host}() swizzle the active stage-2 context around the TLBI instruction. With guest-to-host memory sharing and unsharing hypercalls originating from the guest under pKVM, there is need to support both guest and host VMID invalidations issued from guest context. Replace the __tlb_switch_to_{guest,host}() functions with a more general {enter,exit}_vmid_context() implementation which supports being invoked from guest context and acts as a no-op if the target context matches the running context. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-10-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTEWill Deacon1-0/+15
Break-before-make (BBM) can be expensive, as transitioning via an invalid mapping (i.e. the "break" step) requires the completion of TLB invalidation and can also cause other agents to fault concurrently on the invalid mapping. Since BBM is not required when changing only the software bits of a PTE, avoid the sequence in this case and just update the PTE directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Check for PTE validity when checking for executable/cacheableMarc Zyngier1-3/+3
Don't just assume that the PTE is valid when checking whether it describes an executable or cacheable mapping. This makes sure that we don't issue CMOs for invalid mappings. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Avoid BUG-ing from the host abort pathQuentin Perret1-1/+7
Under certain circumstances __get_fault_info() may resolve the faulting address using the AT instruction. Given that this is being done outside of the host lock critical section, it is racy and the resolution via AT may fail. We currently BUG() in this situation, which is obviously less than ideal. Moving the address resolution to the critical section may have a performance impact, so let's keep it where it is, but bail out and return to the host to try a second time. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Issue CMOs when tearing down guest s2 pagesQuentin Perret1-0/+1
On the guest teardown path, pKVM will zero the pages used to back the guest data structures before returning them to the host as they may contain secrets (e.g. in the vCPU registers). However, the zeroing is done using a cacheable alias, and CMOs are missing, hence giving the host a potential opportunity to read the original content of the guest structs from memory. Fix this by issuing CMOs after zeroing the pages. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Refactor checks for FP state ownershipFuad Tabba3-3/+3
To avoid direct comparison against the fp_owner enum, add a new function that performs the check, host_owns_fp_regs(), to complement the existing guest_owns_fp_regs(). To check for fpsimd state ownership, use the helpers instead of directly using the enums. No functional change intended. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Move guest_owns_fp_regs() to increase its scopeFuad Tabba3-8/+2
guest_owns_fp_regs() will be used to check fpsimd state ownership across kvm/arm64. Therefore, move it to kvm_host.h to widen its scope. Moreover, the host state is not per-vcpu anymore, the vcpu parameter isn't used, so remove it as well. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVMFuad Tabba3-0/+13
Since the host_fpsimd_state has been removed from kvm_vcpu_arch, it isn't pointing to the hyp's version of the host fp_regs in protected mode. Initialize the host_data fpsimd_state point to the host_data's context fp_regs on pKVM initialization. Fixes: 51e09b5572d6 ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch") Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: Drop trapping of PAuth instructions/keysMarc Zyngier3-85/+3
We currently insist on disabling PAuth on vcpu_load(), and get to enable it on first guest use of an instruction or a key (ignoring the NV case for now). It isn't clear at all what this is trying to achieve: guests tend to use PAuth when available, and nothing forces you to expose it to the guest if you don't want to. This also isn't totally free: we take a full GPR save/restore between host and guest, only to write ten 64bit registers. The "value proposition" escapes me. So let's forget this stuff and enable PAuth eagerly if exposed to the guest. This results in much simpler code. Performance wise, that's not bad either (tested on M2 Pro running a fully automated Debian installer as the workload): - On a non-NV guest, I can see reduction of 0.24% in the number of cycles (measured with perf over 10 consecutive runs) - On a NV guest (L2), I see a 2% reduction in wall-clock time (measured with 'time', as M2 doesn't have a PMUv3 and NV doesn't support it either) So overall, a much reduced complexity and a (small) performance improvement. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-16-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Handle ERETA[AB] instructionsMarc Zyngier1-2/+11
Now that we have some emulation in place for ERETA[AB], we can plug it into the exception handling machinery. As for a bare ERET, an "easy" ERETAx instruction is processed as a fixup, while something that requires a translation regime transition or an exception delivery is left to the slow path. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-14-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Handle HCR_EL2.{API,APK} independentlyMarc Zyngier1-5/+27
Although KVM couples API and APK for simplicity, the architecture makes no such requirement, and the two can be independently set or cleared. Check for which of the two possible reasons we have trapped here, and if the corresponding L1 control bit isn't set, delegate the handling for forwarding. Otherwise, set this exact bit in HCR_EL2 and resume the guest. Of course, in the non-NV case, we keep setting both bits and be done with it. Note that the entry core already saves/restores the keys should any of the two control bits be set. This results in a bit of rework, and the removal of the (trivial) vcpu_ptrauth_enable() helper. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-10-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Honor HFGITR_EL2.ERET being setMarc Zyngier1-1/+2
If the L1 hypervisor decides to trap ERETs while running L2, make sure we don't try to emulate it, just like we wouldn't if it had its NV bit set. The exception will be reinjected from the core handler. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-9-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Fast-track 'InHost' exception returnsMarc Zyngier1-0/+44
A significant part of the FEAT_NV extension is to trap ERET instructions so that the hypervisor gets a chance to switch from a vEL2 L1 guest to an EL1 L2 guest. But this also has the unfortunate consequence of trapping ERET in unsuspecting circumstances, such as staying at vEL2 (interrupt handling while being in the guest hypervisor), or returning to host userspace in the case of a VHE guest. Although we already make some effort to handle these ERET quicker by not doing the put/load dance, it is still way too far down the line for it to be efficient enough. For these cases, it would ideal to ERET directly, no question asked. Of course, we can't do that. But the next best thing is to do it as early as possible, in fixup_guest_exit(), much as we would handle FPSIMD exceptions. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-8-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2Marc Zyngier3-5/+36
Add the HCR_EL2 configuration for FEAT_NV2, adding the required bits for running a guest hypervisor, and overall merging the allowed bits provided by the guest. This heavily replies on unavaliable features being sanitised when the HCR_EL2 shadow register is accessed, and only a couple of bits must be explicitly disabled. Non-NV guests are completely unaffected by any of this. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flagMarc Zyngier1-6/+1
It has become obvious that HCR_EL2.NV serves the exact same use as VCPU_HYP_CONTEXT, only in an architectural way. So just drop the flag for good. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-5-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-14KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylistSebastian Ene1-1/+0
The denylist is blocking the 32 bit version of the call but is allowing the 64 bit version of it. There is no reason for blocking only one of them and the hypervisor should support these calls. Signed-off-by: Sebastian Ene <sebastianene@google.com> Link: https://lore.kernel.org/r/20240411135700.2140550-1-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Exclude FP ownership from kvm_vcpu_archMarc Zyngier4-7/+5
In retrospect, it is fairly obvious that the FP state ownership is only meaningful for a given CPU, and that locating this information in the vcpu was just a mistake. Move the ownership tracking into the host data structure, and rename it from fp_state to fp_owner, which is a better description (name suggested by Mark Brown). Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_archMarc Zyngier2-2/+1
As the name of the field indicates, host_fpsimd_state is strictly a host piece of data, and we reset this pointer on each PID change. So let's move it where it belongs, and set it at load-time. Although this is slightly more often, it is a well defined life-cycle which matches other pieces of data. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_archMarc Zyngier1-2/+2
As for the rest of the host debug state, the host copy of mdcr_el2 has little to do in the vcpu, and is better placed in the host_data structure. Reviewed-by : Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Exclude host_debug_data from vcpu_archMarc Zyngier2-6/+6
Keeping host_debug_state on a per-vcpu basis is completely pointless. The lifetime of this data is only that of the inner run-loop, which means it is never accessed outside of the core EL2 code. Move the structure into kvm_host_data, and save over 500 bytes per vcpu. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Add accessor for per-CPU stateMarc Zyngier7-15/+14
In order to facilitate the introduction of new per-CPU state, add a new host_data_ptr() helped that hides some of the per-CPU verbosity, and make it easier to move that state around in the future. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-01KVM: arm64: Ensure target address is granule-aligned for range TLBIWill Deacon1-4/+7
When zapping a table entry in stage2_try_break_pte(), we issue range TLB invalidation for the region that was mapped by the table. However, we neglect to align the base address down to the granule size and so if we ended up reaching the table entry via a misaligned address then we will accidentally skip invalidation for some prefix of the affected address range. Align 'ctx->addr' down to the granule size when performing TLB invalidation for an unmapped table in stage2_try_break_pte(). Cc: Raghavendra Rao Ananta <rananta@google.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Quentin Perret <qperret@google.com> Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-04-01KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()Will Deacon2-2/+4
Commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2") updated the __tlbi_level() macro to take the target level as an argument, with TLBI_TTL_UNKNOWN (rather than 0) indicating that the caller cannot provide level information. Unfortunately, the two implementations of __kvm_tlb_flush_vmid_range() were not updated and so now ask for an level 0 invalidation if FEAT_LPA2 is implemented. Fix the problem by passing TLBI_TTL_UNKNOWN instead of 0 as the level argument to __flush_s2_tlb_range_op() in __kvm_tlb_flush_vmid_range(). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240327124853.11206-4-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-04-01KVM: arm64: Don't pass a TLBI level hint when zapping table entriesWill Deacon1-5/+7
The TLBI level hints are for leaf entries only, so take care not to pass them incorrectly after clearing a table entry. Cc: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Fixes: 82bb02445de5 ("KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2") Fixes: 6d9d2115c480 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240327124853.11206-3-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-04-01KVM: arm64: Don't defer TLB invalidation when zapping table entriesWill Deacon1-1/+3
Commit 7657ea920c54 ("KVM: arm64: Use TLBI range-based instructions for unmap") introduced deferred TLB invalidation for the stage-2 page-table so that range-based invalidation can be used for the accumulated addresses. This works fine if the structure of the page-tables remains unchanged, but if entire tables are zapped and subsequently freed then we transiently leave the hardware page-table walker with a reference to freed memory thanks to the translation walk caches. For example, stage2_unmap_walker() will free page-table pages: if (childp) mm_ops->put_page(childp); and issue the TLB invalidation later in kvm_pgtable_stage2_unmap(): if (stage2_unmap_defer_tlb_flush(pgt)) /* Perform the deferred TLB invalidations */ kvm_tlb_flush_vmid_range(pgt->mmu, addr, size); For now, take the conservative approach and invalidate the TLB eagerly when we clear a table entry. Note, however, that the existing level hint passed to __kvm_tlb_flush_vmid_ipa() is incorrect and will be fixed in a subsequent patch. Cc: Raghavendra Rao Ananta <rananta@google.com> Cc: Shaoqin Huang <shahuang@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240327124853.11206-2-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-11Merge tag 'kvmarm-6.9' of ↵Paolo Bonzini8-82/+120
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.9 - Infrastructure for building KVM's trap configuration based on the architectural features (or lack thereof) advertised in the VM's ID registers - Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to x86's WC) at stage-2, improving the performance of interacting with assigned devices that can tolerate it - Conversion of KVM's representation of LPIs to an xarray, utilized to address serialization some of the serialization on the LPI injection path - Support for _architectural_ VHE-only systems, advertised through the absence of FEAT_E2H0 in the CPU's ID register - Miscellaneous cleanups, fixes, and spelling corrections to KVM and selftests
2024-03-07Merge branch kvm-arm64/kerneldoc into kvmarm/nextOliver Upton2-3/+3
* kvm-arm64/kerneldoc: : kerneldoc warning fixes, courtesy of Randy Dunlap : : Fixes addressing the widespread misuse of kerneldoc-style comments : throughout KVM/arm64. KVM: arm64: vgic: fix a kernel-doc warning KVM: arm64: vgic-its: fix kernel-doc warnings KVM: arm64: vgic-init: fix a kernel-doc warning KVM: arm64: sys_regs: fix kernel-doc warnings KVM: arm64: PMU: fix kernel-doc warnings KVM: arm64: mmu: fix a kernel-doc warning KVM: arm64: vhe: fix a kernel-doc warning KVM: arm64: hyp/aarch32: fix kernel-doc warnings KVM: arm64: guest: fix kernel-doc warnings KVM: arm64: debug: fix kernel-doc warnings Signed-off-by: Oliver Upton <oliver.upton@linux.dev>