summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTao Su <tao1.su@linux.intel.com>2023-09-14 08:55:04 +0300
committerSean Christopherson <seanjc@google.com>2023-09-28 20:42:16 +0300
commit629d3698f6958ee6f8131ea324af794f973b12ac (patch)
tree3915616575024f18234b3f440cf20bf24bbb3c66
parent9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 (diff)
downloadlinux-629d3698f6958ee6f8131ea324af794f973b12ac.tar.xz
KVM: x86: Clear bit12 of ICR after APIC-write VM-exit
When IPI virtualization is enabled, a WARN is triggered if bit12 of ICR MSR is set after APIC-write VM-exit. The reason is kvm_apic_send_ipi() thinks the APIC_ICR_BUSY bit should be cleared because KVM has no delay, but kvm_apic_write_nodecode() doesn't clear the APIC_ICR_BUSY bit. Under the x2APIC section, regarding ICR, the SDM says: It remains readable only to aid in debugging; however, software should not assume the value returned by reading the ICR is the last written value. I.e. the guest is allowed to set bit 12. However, the SDM also gives KVM free reign to do whatever it wants with the bit, so long as KVM's behavior doesn't confuse userspace or break KVM's ABI. Clear bit 12 so that it reads back as '0'. This approach is safer than "do nothing" and is consistent with the case where IPI virtualization is disabled or not supported, i.e., handle_fastpath_set_x2apic_icr_irqoff() -> kvm_x2apic_icr_write() Opportunistically replace the TODO with a comment calling out that eating the write is likely faster than a conditional branch around the busy bit. Link: https://lore.kernel.org/all/ZPj6iF0Q7iynn62p@google.com/ Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") Cc: stable@vger.kernel.org Signed-off-by: Tao Su <tao1.su@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20230914055504.151365-1-tao1.su@linux.intel.com [sean: tweak changelog, replace TODO with comment, drop local "val"] Signed-off-by: Sean Christopherson <seanjc@google.com>
-rw-r--r--arch/x86/kvm/lapic.c26
1 files changed, 13 insertions, 13 deletions
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c6a5a5945021..953700e68cf6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2444,22 +2444,22 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
{
struct kvm_lapic *apic = vcpu->arch.apic;
- u64 val;
/*
- * ICR is a single 64-bit register when x2APIC is enabled. For legacy
- * xAPIC, ICR writes need to go down the common (slightly slower) path
- * to get the upper half from ICR2.
+ * ICR is a single 64-bit register when x2APIC is enabled, all others
+ * registers hold 32-bit values. For legacy xAPIC, ICR writes need to
+ * go down the common path to get the upper half from ICR2.
+ *
+ * Note, using the write helpers may incur an unnecessary write to the
+ * virtual APIC state, but KVM needs to conditionally modify the value
+ * in certain cases, e.g. to clear the ICR busy bit. The cost of extra
+ * conditional branches is likely a wash relative to the cost of the
+ * maybe-unecessary write, and both are in the noise anyways.
*/
- if (apic_x2apic_mode(apic) && offset == APIC_ICR) {
- val = kvm_lapic_get_reg64(apic, APIC_ICR);
- kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
- trace_kvm_apic_write(APIC_ICR, val);
- } else {
- /* TODO: optimize to just emulate side effect w/o one more write */
- val = kvm_lapic_get_reg(apic, offset);
- kvm_lapic_reg_write(apic, offset, (u32)val);
- }
+ if (apic_x2apic_mode(apic) && offset == APIC_ICR)
+ kvm_x2apic_icr_write(apic, kvm_lapic_get_reg64(apic, APIC_ICR));
+ else
+ kvm_lapic_reg_write(apic, offset, kvm_lapic_get_reg(apic, offset));
}
EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);