summaryrefslogtreecommitdiff
path: root/virt/kvm/arm/vgic/vgic-v3.c
AgeCommit message (Collapse)AuthorFilesLines
2019-03-19KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memoryMarc Zyngier1-2/+2
When halting a guest, QEMU flushes the virtual ITS caches, which amounts to writing to the various tables that the guest has allocated. When doing this, we fail to take the srcu lock, and the kernel shouts loudly if running a lockdep kernel: [ 69.680416] ============================= [ 69.680819] WARNING: suspicious RCU usage [ 69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted [ 69.682096] ----------------------------- [ 69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [ 69.683225] [ 69.683225] other info that might help us debug this: [ 69.683225] [ 69.683975] [ 69.683975] rcu_scheduler_active = 2, debug_locks = 1 [ 69.684598] 6 locks held by qemu-system-aar/4097: [ 69.685059] #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [ 69.686087] #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [ 69.686919] #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.687698] #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.688475] #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.689978] #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.690729] [ 69.690729] stack backtrace: [ 69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18 [ 69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [ 69.692831] Call trace: [ 69.694072] lockdep_rcu_suspicious+0xcc/0x110 [ 69.694490] gfn_to_memslot+0x174/0x190 [ 69.694853] kvm_write_guest+0x50/0xb0 [ 69.695209] vgic_its_save_tables_v0+0x248/0x330 [ 69.695639] vgic_its_set_attr+0x298/0x3a0 [ 69.696024] kvm_device_ioctl_attr+0x9c/0xd8 [ 69.696424] kvm_device_ioctl+0x8c/0xf8 [ 69.696788] do_vfs_ioctl+0xc8/0x960 [ 69.697128] ksys_ioctl+0x8c/0xa0 [ 69.697445] __arm64_sys_ioctl+0x28/0x38 [ 69.697817] el0_svc_common+0xd8/0x138 [ 69.698173] el0_svc_handler+0x38/0x78 [ 69.698528] el0_svc+0x8/0xc The fix is to obviously take the srcu lock, just like we do on the read side of things since bf308242ab98. One wonders why this wasn't fixed at the same time, but hey... Fixes: bf308242ab98 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-2/+2
Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...
2019-02-20arm/arm64: KVM: Introduce kvm_call_hyp_ret()Marc Zyngier1-2/+2
Until now, we haven't differentiated between HYP calls that have a return value and those who don't. As we're about to change this, introduce kvm_call_hyp_ret(), and change all call sites that actually make use of a return value. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2019-01-24KVM: arm/arm64: vgic: Make vgic_irq->irq_lock a raw_spinlockJulien Thierry1-4/+4
vgic_irq->irq_lock must always be taken with interrupts disabled as it is used in interrupt context. For configurations such as PREEMPT_RT_FULL, this means that it should be a raw_spinlock since RT spinlocks are interruptible. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2018-08-12KVM: arm/arm64: vgic: Do not use spin_lock_irqsave/restore with irq disabledJia He1-3/+4
kvm_vgic_sync_hwstate is only called with IRQ being disabled. There is thus no need to call spin_lock_irqsave/restore in vgic_fold_lr_state and vgic_prune_ap_list. This patch replace them with the non irq-safe version. Signed-off-by: Jia He <jia.he@hxt-semitech.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Signal IRQs using their configured groupChristoffer Dall1-5/+1
Now when we have a group configuration on the struct IRQ, use this state when populating the LR and signaling interrupts as either group 0 or group 1 to the VM. Depending on the model of the emulated GIC, and the guest's configuration of the VMCR, interrupts may be signaled as IRQs or FIQs to the VM. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm/arm64: Drop resource size check for GICV windowArd Biesheuvel1-5/+0
When booting a 64 KB pages kernel on a ACPI GICv3 system that implements support for v2 emulation, the following warning is produced GICV size 0x2000 not a multiple of page size 0x10000 and support for v2 emulation is disabled, preventing GICv2 VMs from being able to run on such hosts. The reason is that vgic_v3_probe() performs a sanity check on the size of the window (it should be a multiple of the page size), while the ACPI MADT parsing code hardcodes the size of the window to 8 KB. This makes sense, considering that ACPI does not bother to describe the size in the first place, under the assumption that platforms implementing ACPI will follow the architecture and not put anything else in the same 64 KB window. So let's just drop the sanity check altogether, and assume that the window is at least 64 KB in size. Fixes: 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-01Merge tag 'kvmarm-for-v4.18' of ↵Paolo Bonzini1-17/+82
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.18 - Lazy context-switching of FPSIMD registers on arm64 - Allow virtual redistributors to be part of two or more MMIO ranges
2018-05-25KVM: arm/arm64: Implement KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONEric Auger1-0/+14
Now all the internals are ready to handle multiple redistributor regions, let's allow the userspace to register them. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Check all vcpu redistributors are set on map_resourcesEric Auger1-5/+14
On vcpu first run, we eventually know the actual number of vcpus. This is a synchronization point to check all redistributors were assigned. On kvm_vgic_map_resources() we check both dist and redist were set, eventually check potential base address inconsistencies. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Adapt vgic_v3_check_base to multiple rdist regionsEric Auger1-17/+32
vgic_v3_check_base() currently only handles the case of a unique legacy redistributor region whose size is not explicitly set but inferred, instead, from the number of online vcpus. We adapt it to handle the case of multiple redistributor regions with explicitly defined size. We rely on two new helpers: - vgic_v3_rdist_overlap() is used to detect overlap with the dist region if defined - vgic_v3_rd_region_size computes the size of the redist region, would it be a legacy unique region or a new explicitly sized region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Helper to locate free rdist indexEric Auger1-0/+23
We introduce vgic_v3_rdist_free_slot to help identifying where we can place a new 2x64KB redistributor. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Replace the single rdist region by a listEric Auger1-8/+12
At the moment KVM supports a single rdist region. We want to support several separate rdist regions so let's introduce a list of them. This patch currently only cares about a single entry in this list as the functionality to register several redist regions is not yet there. So this only translates the existing code into something functionally similar using that new data struct. The redistributor region handle is stored in the vgic_cpu structure to allow later computation of the TYPER last bit. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-15KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() callsAndre Przywara1-2/+2
kvm_read_guest() will eventually look up in kvm_memslots(), which requires either to hold the kvm->slots_lock or to be inside a kvm->srcu critical section. In contrast to x86 and s390 we don't take the SRCU lock on every guest exit, so we have to do it individually for each kvm_read_guest() call. Use the newly introduced wrapper for that. Cc: Stable <stable@vger.kernel.org> # 4.12+ Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-04-27KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGIMarc Zyngier1-20/+29
Now that we make sure we don't inject multiple instances of the same GICv2 SGI at the same time, we've made another bug more obvious: If we exit with an active SGI, we completely lose track of which vcpu it came from. On the next entry, we restore it with 0 as a source, and if that wasn't the right one, too bad. While this doesn't seem to trouble GIC-400, the architectural model gets offended and doesn't deactivate the interrupt on EOI. Another connected issue is that we will happilly make pending an interrupt from another vcpu, overriding the above zero with something that is just as inconsistent. Don't do that. The final issue is that we signal a maintenance interrupt when no pending interrupts are present in the LR. Assuming we've fixed the two issues above, we end-up in a situation where we keep exiting as soon as we've reached the active state, and not be able to inject the following pending. The fix comes in 3 parts: - GICv2 SGIs have their source vcpu saved if they are active on exit, and restored on entry - Multi-SGIs cannot go via the Pending+Active state, as this would corrupt the source field - Multi-SGIs are converted to using MI on EOI instead of NPIE Fixes: 16ca6a607d84bef0 ("KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid") Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-26KVM: arm/arm64: vgic: Disallow Active+Pending for level interruptsMarc Zyngier1-24/+30
It was recently reported that VFIO mediated devices, and anything that VFIO exposes as level interrupts, do no strictly follow the expected logic of such interrupts as it only lowers the input line when the guest has EOId the interrupt at the GIC level, rather than when it Acked the interrupt at the device level. THe GIC's Active+Pending state is fundamentally incompatible with this behaviour, as it prevents KVM from observing the EOI, and in turn results in VFIO never dropping the line. This results in an interrupt storm in the guest, which it really never expected. As we cannot really change VFIO to follow the strict rules of level signalling, let's forbid the A+P state altogether, as it is in the end only an optimization. It ensures that we will transition via an invalid state, which we can use to notify VFIO of the EOI. Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19Merge tag 'kvm-arm-fixes-for-v4.16-2' into HEADMarc Zyngier1-1/+8
Resolve conflicts with current mainline
2018-03-19KVM: arm/arm64: Avoid VGICv3 save/restore on VHE with no IRQsChristoffer Dall1-0/+6
We can finally get completely rid of any calls to the VGICv3 save/restore functions when the AP lists are empty on VHE systems. This requires carefully factoring out trap configuration from saving and restoring state, and carefully choosing what to do on the VHE and non-VHE path. One of the challenges is that we cannot save/restore the VMCR lazily because we can only write the VMCR when ICC_SRE_EL1.SRE is cleared when emulating a GICv2-on-GICv3, since otherwise all Group-0 interrupts end up being delivered as FIQ. To solve this problem, and still provide fast performance in the fast path of exiting a VM when no interrupts are pending (which also optimized the latency for actually delivering virtual interrupts coming from physical interrupts), we orchestrate a dance of only doing the activate/deactivate traps in vgic load/put for VHE systems (which can have ICC_SRE_EL1.SRE cleared when running in the host), and doing the configuration on every round-trip on non-VHE systems. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm/arm64: Move VGIC APR save/restore to vgic put/loadChristoffer Dall1-0/+5
The APRs can only have bits set when the guest acknowledges an interrupt in the LR and can only have a bit cleared when the guest EOIs an interrupt in the LR. Therefore, if we have no LRs with any pending/active interrupts, the APR cannot change value and there is no need to clear it on every exit from the VM (hint: it will have already been cleared when we exited the guest the last time with the LRs all EOIed). The only case we need to take care of is when we migrate the VCPU away from a CPU or migrate a new VCPU onto a CPU, or when we return to userspace to capture the state of the VCPU for migration. To make sure this works, factor out the APR save/restore functionality into separate functions called from the VCPU (and by extension VGIC) put/load hooks. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm/arm64: Get rid of vgic_elrsrChristoffer Dall1-1/+0
There is really no need to store the vgic_elrsr on the VGIC data structures as the only need we have for the elrsr is to figure out if an LR is inactive when we save the VGIC state upon returning from the guest. We can might as well store this in a temporary local variable. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-14KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintidMarc Zyngier1-1/+8
The vgic code is trying to be clever when injecting GICv2 SGIs, and will happily populate LRs with the same interrupt number if they come from multiple vcpus (after all, they are distinct interrupt sources). Unfortunately, this is against the letter of the architecture, and the GICv2 architecture spec says "Each valid interrupt stored in the List registers must have a unique VirtualID for that virtual CPU interface.". GICv3 has similar (although slightly ambiguous) restrictions. This results in guests locking up when using GICv2-on-GICv3, for example. The obvious fix is to stop trying so hard, and inject a single vcpu per SGI per guest entry. After all, pending SGIs with multiple source vcpus are pretty rare, and are mostly seen in scenario where the physical CPUs are severely overcomitted. But as we now only inject a single instance of a multi-source SGI per vcpu entry, we may delay those interrupts for longer than strictly necessary, and run the risk of injecting lower priority interrupts in the meantime. In order to address this, we adopt a three stage strategy: - If we encounter a multi-source SGI in the AP list while computing its depth, we force the list to be sorted - When populating the LRs, we prevent the injection of any interrupt of lower priority than that of the first multi-source SGI we've injected. - Finally, the injection of a multi-source SGI triggers the request of a maintenance interrupt when there will be no pending interrupt in the LRs (HCR_NPIE). At the point where the last pending interrupt in the LRs switches from Pending to Active, the maintenance interrupt will be delivered, allowing us to add the remaining SGIs using the same process. Cc: stable@vger.kernel.org Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework") Acked-by: Christoffer Dall <cdall@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-01-02KVM: arm/arm64: vgic: Support level-triggered mapped interruptsChristoffer Dall1-0/+29
Level-triggered mapped IRQs are special because we only observe rising edges as input to the VGIC, and we don't set the EOI flag and therefore are not told when the level goes down, so that we can re-queue a new interrupt when the level goes up. One way to solve this problem is to side-step the logic of the VGIC and special case the validation in the injection path, but it has the unfortunate drawback of having to peak into the physical GIC state whenever we want to know if the interrupt is pending on the virtual distributor. Instead, we can maintain the current semantics of a level triggered interrupt by sort of treating it as an edge-triggered interrupt, following from the fact that we only observe an asserting edge. This requires us to be a bit careful when populating the LRs and when folding the state back in though: * We lower the line level when populating the LR, so that when subsequently observing an asserting edge, the VGIC will do the right thing. * If the guest never acked the interrupt while running (for example if it had masked interrupts at the CPU level while running), we have to preserve the pending state of the LR and move it back to the line_level field of the struct irq when folding LR state. If the guest never acked the interrupt while running, but changed the device state and lowered the line (again with interrupts masked) then we need to observe this change in the line_level. Both of the above situations are solved by sampling the physical line and set the line level when folding the LR back. * Finally, if the guest never acked the interrupt while running and sampling the line reveals that the device state has changed and the line has been lowered, we must clear the physical active state, since we will otherwise never be told when the interrupt becomes asserted again. This has the added benefit of making the timer optimization patches (https://lists.cs.columbia.edu/pipermail/kvmarm/2017-July/026343.html) a bit simpler, because the timer code doesn't have to clear the active state on the sync anymore. It also potentially improves the performance of the timer implementation because the GIC knows the state or the LR and only needs to clear the active state when the pending bit in the LR is still set, where the timer has to always clear it when returning from running the guest with an injected timer interrupt. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-11-29KVM: arm/arm64: vgic: Preserve the revious read from the pending tableMarc Zyngier1-1/+1
The current pending table parsing code assumes that we keep the previous read of the pending bits, but keep that variable in the current block, making sure it is discarded on each loop. We end-up using whatever is on the stack. Who knows, it might just be the right thing... Fixes: 280771252c1ba ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES") Cc: stable@vger.kernel.org # 4.12 Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-11-10KVM: arm/arm64: GICv4: Enable VLPI supportMarc Zyngier1-0/+14
All it takes is the has_v4 flag to be set in gic_kvm_info as well as "kvm-arm.vgic_v4_enable=1" being passed on the command line for GICv4 to be enabled in KVM. Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-11-06KVM: arm/arm64: Support calling vgic_update_irq_pending from irq contextChristoffer Dall1-5/+7
We are about to optimize our timer handling logic which involves injecting irqs to the vgic directly from the irq handler. Unfortunately, the injection path can take any AP list lock and irq lock and we must therefore make sure to use spin_lock_irqsave where ever interrupts are enabled and we are taking any of those locks, to avoid deadlocking between process context and the ISR. This changes a lot of the VGIC code, but the good news are that the changes are mostly mechanical. Acked-by: Marc Zyngier <marc,zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: vgic-v3: Log which GICv3 system registers are trappedMarc Zyngier1-1/+4
In order to facilitate debug, let's log which class of GICv3 system registers are trapped. Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: Enable GICv3 common sysreg trapping via command-lineMarc Zyngier1-1/+10
Now that we're able to safely handle common sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.common_trap). Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15arm64: Add workaround for Cavium Thunder erratum 30115David Daney1-0/+7
Some Cavium Thunder CPUs suffer a problem where a KVM guest may inadvertently cause the host kernel to quit receiving interrupts. Use the Group-0/1 trapping in order to deal with it. [maz]: Adapted patch to the Group-0/1 trapping, reworked commit log Tested-by: Alexander Graf <agraf@suse.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-lineMarc Zyngier1-0/+6
Now that we're able to safely handle Group-0 sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.group0_trap). Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: vgic-v3: Enable trapping of Group-0 system registersMarc Zyngier1-1/+4
In order to be able to trap Group-0 GICv3 system registers, we need to set ICH_HCR_EL2.TALL0 begore entering the guest. This is conditionnaly done after having restored the guest's state, and cleared on exit. Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Acked-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-lineMarc Zyngier1-0/+11
Now that we're able to safely handle Group-1 sysreg access, let's give the user the opportunity to enable it by passing a specific command-line option (vgic_v3.group1_trap). Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: vgic-v3: Enable trapping of Group-1 system registersMarc Zyngier1-0/+4
In order to be able to trap Group-1 GICv3 system registers, we need to set ICH_HCR_EL2.TALL1 before entering the guest. This is conditionally done after having restored the guest's state, and cleared on exit. Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-15KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2Marc Zyngier1-0/+2
In order to start handling guest access to GICv3 system registers, let's add a hook that will get called when we trap a system register access. This is gated by a new static key (vgic_v3_cpuif_trap). Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-05-24KVM: arm/arm64: Fix isues with GICv2 on GICv3 migrationChristoffer Dall1-14/+33
We have been a little loose with our intermediate VMCR representation where we had a 'ctlr' field, but we failed to differentiate between the GICv2 GICC_CTLR and ICC_CTLR_EL1 layouts, and therefore ended up mapping the wrong bits into the individual fields of the ICH_VMCR_EL2 when emulating a GICv2 on a GICv3 system. Fix this by using explicit fields for the VMCR bits instead. Cc: Eric Auger <eric.auger@redhat.com> Reported-by: wanghaibin <wanghaibin.wang@huawei.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com>
2017-05-15KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interruptMarc Zyngier1-0/+7
When an interrupt is injected with the HW bit set (indicating that deactivation should be propagated to the physical distributor), special care must be taken so that we never mark the corresponding LR with the Active+Pending state (as the pending state is kept in the physycal distributor). Cc: stable@vger.kernel.org Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-05-09KVM: arm/arm64: Register ITS iodev when setting base addressChristoffer Dall1-8/+0
We have to register the ITS iodevice before running the VM, because in migration scenarios, we may be restoring a live device that wishes to inject MSIs before the VCPUs have started. All we need to register the ITS io device is the base address of the ITS, so we can simply register that when the base address of the ITS is set. [ Code to fix concurrency issues when setting the ITS base address and to fix the undef base address check written by Marc Zyngier ] Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
2017-05-09KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUsChristoffer Dall1-6/+0
Instead of waiting with registering KVM iodevs until the first VCPU is run, we can actually create the iodevs when the redist base address is set. The only downside is that we must now also check if we need to do this for VCPUs which are created after creating the VGIC, because there is no enforced ordering between creating the VGIC (and setting its base addresses) and creating the VCPUs. Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
2017-05-09KVM: arm/arm64: Make vgic_v3_check_base more broadly usableChristoffer Dall1-4/+15
As we are about to fiddle with the IO device registration mechanism, let's be a little more careful when setting base addresses as early as possible. When setting a base address, we can check that there's address space enough for its scope and when the last of the two base addresses (dist and redist) get set, we can also check if the regions overlap at that time. This allows us to provide error messages to the user at time when trying to set the base address, as opposed to later when trying to run the VM. To do this, we make vgic_v3_check_base available in the core vgic-v3 code as well as in the other parts of the GICv3 code, namely the MMIO config code. We also return true for undefined base addresses so that the function can be used before all base addresses are set; all callers already check for uninitialized addresses before calling this function. Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
2017-05-09KVM: arm/arm64: Refactor vgic_register_redist_iodevsChristoffer Dall1-1/+1
Split out the function to register all the redistributor iodevs into a function that handles a single redistributor at a time in preparation for being able to call this per VCPU as these get created. Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
2017-05-08KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLESEric Auger1-0/+51
This patch adds a new attribute to GICV3 KVM device KVM_DEV_ARM_VGIC_GRP_CTRL group. This allows userspace to flush all GICR pending tables into guest RAM. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2017-05-08KVM: arm64: vgic-v3: vgic_v3_lpi_sync_pending_statusEric Auger1-0/+44
this new helper synchronizes the irq pending_latch with the LPI pending bit status found in rdist pending table. As the status is consumed, we reset the bit in pending table. As we need the PENDBASER_ADDRESS() in vgic-v3, let's move its definition in the irqchip header. We restore the full length of the field, ie [51:16]. Same for PROPBASER_ADDRESS with full field length of [51:12]. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org>
2017-04-19KVM: arm/arm64: vgic-v3: De-optimize VMCR save/restore when emulating a GICv2Marc Zyngier1-2/+9
When emulating a GICv2-on-GICv3, special care must be taken to only save/restore VMCR_EL2 when ICC_SRE_EL1.SRE is cleared. Otherwise, all Group-0 interrupts end-up being delivered as FIQ, which is probably not what the guest expects, as demonstrated here with an unhappy EFI: FIQ Exception at 0x000000013BD21CC4 This means that we cannot perform the load/put trick when dealing with VMCR_EL2 (because the host has SRE set), and we have to deal with it in the world-switch. Fortunately, this is not the most common case (modern guests should be able to deal with GICv3 directly), and the performance is not worse than what it was before the VMCR optimization. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09KVM: arm/arm64: vgic: Improve sync_hwstate performanceChristoffer Dall1-2/+5
There is no need to call any functions to fold LRs when we don't use any LRs and we don't need to mess with overflow flags, take spinlocks, or prune the AP list if the AP list is empty. Note: list_empty is a single atomic read (uses READ_ONCE) and can therefore check if a list is empty or not without the need to take the spinlock protecting the list. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09KVM: arm/arm64: vgic: Get rid of unnecessary process_maintenance operationChristoffer Dall1-38/+13
Since we always read back the LRs that we wrote to the guest and the MISR and EISR registers simply provide a summary of the configuration of the bits in the LRs, there is really no need to read back those status registers and process them. We might as well just signal the notifyfd when folding the LR state and save some cycles in the process. We now clear the underflow bit in the fold_lr_state functions as we only need to clear this bit if we had used all the LRs, so this is as good a place as any to do that work. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/putChristoffer Dall1-2/+20
We don't have to save/restore the VMCR on every entry to/from the guest, since on GICv2 we can access the control interface from EL1 and on VHE systems with GICv3 we can access the control interface from KVM running in EL2. GICv3 systems without VHE becomes the rare case, which has to save/restore the register on each round trip. Note that userspace accesses may see out-of-date values if the VCPU is running while accessing the VGIC state via the KVM device API, but this is already the case and it is up to userspace to quiesce the CPUs before reading the CPU registers from the GIC for an up-to-date view. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-03-06KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypassMarc Zyngier1-1/+4
Our GICv3 emulation always presents ICC_SRE_EL1 with DIB/DFB set to zero, which implies that there is a way to bypass the GIC and inject raw IRQ/FIQ by driving the CPU pins. Of course, we don't allow that when the GIC is configured, but we fail to indicate that to the guest. The obvious fix is to set these bits (and never let them being changed again). Reported-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30KVM: arm/arm64: vgic: Implement VGICv3 CPU interface accessVijaya Kumar K1-0/+8
VGICv3 CPU interface registers are accessed using KVM_DEV_ARM_VGIC_CPU_SYSREGS ioctl. These registers are accessed as 64-bit. The cpu MPIDR value is passed along with register id. It is used to identify the cpu for registers access. The VM that supports SEIs expect it on destination machine to handle guest aborts and hence checked for ICC_CTLR_EL1.SEIS compatibility. Similarly, VM that supports Affinity Level 3 that is required for AArch64 mode, is required to be supported on destination machine. Hence checked for ICC_CTLR_EL1.A3V compatibility. The arch/arm64/kvm/vgic-sys-reg-v3.c handles read and write of VGIC CPU registers for AArch64. For AArch32 mode, arch/arm/kvm/vgic-v3-coproc.c file is created but APIs are not implemented. Updated arch/arm/include/uapi/asm/kvm.h with new definitions required to compile for AArch32. The version of VGIC v3 specification is defined here Documentation/virtual/kvm/devices/arm-vgic-v3.txt Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-30KVM: arm/arm64: vgic: Introduce VENG0 and VENG1 fields to vmcr structVijaya Kumar K1-2/+18
ICC_VMCR_EL2 supports virtual access to ICC_IGRPEN1_EL1.Enable and ICC_IGRPEN0_EL1.Enable fields. Add grpen0 and grpen1 member variables to struct vmcr to support read and write of these fields. Also refactor vgic_set_vmcr and vgic_get_vmcr() code. Drop ICH_VMCR_CTLR_SHIFT and ICH_VMCR_CTLR_MASK macros and instead use ICH_VMCR_EOI* and ICH_VMCR_CBPR* macros. Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-01-25KVM: arm/arm64: Remove struct vgic_irq pending fieldChristoffer Dall1-7/+5
One of the goals behind the VGIC redesign was to get rid of cached or intermediate state in the data structures, but we decided to allow ourselves to precompute the pending value of an IRQ based on the line level and pending latch state. However, this has now become difficult to base proper GICv3 save/restore on, because there is a potential to modify the pending state without knowing if an interrupt is edge or level configured. See the following post and related message for more background: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-January/023195.html This commit gets rid of the precomputed pending field in favor of a function that calculates the value when needed, irq_is_pending(). The soft_pending field is renamed to pending_latch to represent that this latch is the equivalent hardware latch which gets manipulated by the input signal for edge-triggered interrupts and when writing to the SPENDR/CPENDR registers. After this commit save/restore code should be able to simply restore the pending_latch state, line_level state, and config state in any order and get the desired result. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-01-13KVM: arm/arm64: vgic: Fix deadlock on error handlingMarc Zyngier1-2/+0
Dmitry Vyukov reported that the syzkaller fuzzer triggered a deadlock in the vgic setup code when an error was detected, as the cleanup code tries to take a lock that is already held by the setup code. The fix is to avoid retaking the lock when cleaning up, by telling the cleanup function that we already hold it. Cc: stable@vger.kernel.org Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>