summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)AuthorFilesLines
2024-04-24locking/pvqspinlock/x86: Use _Q_LOCKED_VAL in PV_UNLOCK_ASM macroUros Bizjak1-1/+1
Use _Q_LOCKED_VAL instead of hardcoded $0x1 in PV_UNLOCK_ASM macro. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Boqun Feng <boqun.feng@gmail.com> Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240422151752.53997-1-ubizjak@gmail.com
2024-04-24locking/qspinlock/x86: Micro-optimize virt_spin_lock()Uros Bizjak1-4/+9
Optimize virt_spin_lock() to use simpler and faster: atomic_try_cmpxchg(*ptr, &val, new) instead of: atomic_cmpxchg(*ptr, val, new) == val The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after the CMPXCHG. Also optimize retry loop a bit. atomic_try_cmpxchg() fails iff &lock->val != 0, so there is no need to load and compare the lock value again - cpu_relax() can be unconditinally called in this case. This allows us to generate optimized: 1f: ba 01 00 00 00 mov $0x1,%edx 24: 8b 03 mov (%rbx),%eax 26: 85 c0 test %eax,%eax 28: 75 63 jne 8d <...> 2a: f0 0f b1 13 lock cmpxchg %edx,(%rbx) 2e: 75 5d jne 8d <...> ... 8d: f3 90 pause 8f: eb 93 jmp 24 <...> instead of: 1f: ba 01 00 00 00 mov $0x1,%edx 24: 8b 03 mov (%rbx),%eax 26: 85 c0 test %eax,%eax 28: 75 13 jne 3d <...> 2a: f0 0f b1 13 lock cmpxchg %edx,(%rbx) 2e: 85 c0 test %eax,%eax 30: 75 f2 jne 24 <...> ... 3d: f3 90 pause 3f: eb e3 jmp 24 <...> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240422120054.199092-1-ubizjak@gmail.com
2024-04-24locking/atomic/x86: Merge __arch{,_try}_cmpxchg64_emu_local() with ↵Uros Bizjak1-46/+10
__arch{,_try}_cmpxchg64_emu() Macros __arch{,_try}_cmpxchg64_emu() are almost identical to their local variants __arch{,_try}_cmpxchg64_emu_local(), differing only by lock prefixes. Merge these two macros by introducing additional macro parameters to pass lock location and lock prefix from their respective static inline functions. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240417175830.161561-1-ubizjak@gmail.com
2024-04-23crash: add a new kexec flag for hotplug supportSourabh Jain1-9/+2
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()") introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses this flag to indicate to the kernel that it is safe to modify the elfcorehdr of the kdump image loaded using the kexec_load system call. However, it is possible that architectures may need to update kexec segments other then elfcorehdr. For example, FDT (Flatten Device Tree) on PowerPC. Introducing a new kexec flag for every new kexec segment may not be a good solution. Hence, a generic kexec flag bit, `KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory hotplug support intent between the kexec tool and the kernel for the kexec_load system call. Now we have two kexec flags that enables crash hotplug support for kexec_load system call. First is KEXEC_UPDATE_ELFCOREHDR (only used in x86), and second is KEXEC_CRASH_HOTPLUG_SUPPORT (for all architectures). To simplify the process of finding and reporting the crash hotplug support the following changes are introduced. 1. Define arch specific function to process the kexec flags and determine crash hotplug support 2. Rename the @update_elfcorehdr member of struct kimage to @hotplug_support and populate it for both kexec_load and kexec_file_load syscalls, because architecture can update more than one kexec segment 3. Let generic function crash_check_hotplug_support report hotplug support for loaded kdump image based on value of @hotplug_support To bring the x86 crash hotplug support in line with the above points, the following changes have been made: - Introduce the arch_crash_hotplug_support function to process kexec flags and determine crash hotplug support - Remove the arch_crash_hotplug_[cpu|memory]_support functions Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240326055413.186534-3-sourabhjain@linux.ibm.com
2024-04-23crash: forward memory_notify arg to arch crash hotplug handlerSourabh Jain1-1/+1
In the event of memory hotplug or online/offline events, the crash memory hotplug notifier `crash_memhp_notifier()` receives a `memory_notify` object but doesn't forward that object to the generic and architecture-specific crash hotplug handler. The `memory_notify` object contains the starting PFN (Page Frame Number) and the number of pages in the hot-removed memory. This information is necessary for architectures like PowerPC to update/recreate the kdump image, specifically `elfcorehdr`. So update the function signature of `crash_handle_hotplug_event()` and `arch_crash_handle_hotplug_event()` to accept the `memory_notify` object as an argument from crash memory hotplug notifier. Since no such object is available in the case of CPU hotplug event, the crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the crash hotplug handler. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240326055413.186534-2-sourabhjain@linux.ibm.com
2024-04-22x86/cpu/vfm: Update arch/x86/include/asm/intel-family.hTony Luck1-0/+84
New CPU #defines encode vendor and family as well as model. Update the example usage comment in arch/x86/kernel/cpu/match.c Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240416211941.9369-4-tony.luck@intel.com
2024-04-22x86/cpu/vfm: Add new macros to work with (vendor/family/model) valuesTony Luck1-0/+93
To avoid adding a slew of new macros for each new Intel CPU family switch over from providing CPU model number #defines to a new scheme that encodes vendor, family, and model in a single number. [ bp: s/casted/cast/g ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240416211941.9369-3-tony.luck@intel.com
2024-04-22x86/cpu/vfm: Add/initialize x86_vfm field to struct cpuinfo_x86Tony Luck1-3/+17
Refactor struct cpuinfo_x86 so that the vendor, family, and model fields are overlaid in a union with a 32-bit field that combines all three (together with a one byte reserved field in the upper byte). This will make it easy, cheap, and reliable to check all three values at once. See https://lore.kernel.org/r/Zgr6kT8oULbnmEXx@agluck-desk3 for why the ordering is (low-to-high bits): (vendor, family, model) [ bp: Move comments over the line, add the backstory about the particular order of the fields. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240416211941.9369-2-tony.luck@intel.com
2024-04-21Merge tag 'sched_urgent_for_v6.9_rc5' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: - Add a missing memory barrier in the concurrency ID mm switching * tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Add missing memory barrier in switch_mm_cid
2024-04-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-0/+2
Pull kvm fixes from Paolo Bonzini: "This is a bit on the large side, mostly due to two changes: - Changes to disable some broken PMU virtualization (see below for details under "x86 PMU") - Clean up SVM's enter/exit assembly code so that it can be compiled without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched return thunk in use. This should not happen!" when running KVM selftests. Everything else is small bugfixes and selftest changes: - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure where KVM would allow userspace to refresh the cache with a bogus GPA. The bug has existed for quite some time, but was exposed by a new sanity check added in 6.9 (to ensure a cache is either GPA-based or HVA-based). - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that got left behind during a 6.9 cleanup. - Fix a math goof in x86's hugepage logic for KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow (detected by KASAN). - Fix a bug where KVM incorrectly clears root_role.direct when userspace sets guest CPUID. - Fix a dirty logging bug in the where KVM fails to write-protect SPTEs used by a nested guest, if KVM is using Page-Modification Logging and the nested hypervisor is NOT using EPT. x86 PMU: - Drop support for virtualizing adaptive PEBS, as KVM's implementation is architecturally broken without an obvious/easy path forward, and because exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak host kernel addresses to the guest. - Set the enable bits for general purpose counters in PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD processors. - Disable LBR virtualization on CPUs that don't support LBR callstacks, as KVM unconditionally uses PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and would fail on such CPUs. Tests: - Fix a flaw in the max_guest_memory selftest that results in it exhausting the supply of ucall structures when run with more than 256 vCPUs. - Mark KVM_MEM_READONLY as supported for RISC-V in set_memory_region_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits) KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start() KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}() KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks perf/x86/intel: Expose existence of callback support to KVM KVM: VMX: Snapshot LBR capabilities during module initialization KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run() KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run() KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding KVM: SVM: Remove a useless zeroing of allocated memory ...
2024-04-19KVM, x86: add architectural support code for #VEPaolo Bonzini1-0/+12
Dump the contents of the #VE info data structure and assert that #VE does not happen, but do not yet do anything with it. No functional change intended, separated for clarity only. Extracted from a patch by Isaku Yamahata <isaku.yamahata@intel.com>. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-19KVM: x86/mmu: Track shadow MMIO value on a per-VM basisSean Christopherson1-0/+2
TDX will use a different shadow PTE entry value for MMIO from VMX. Add a member to kvm_arch and track value for MMIO per-VM instead of a global variable. By using the per-VM EPT entry value for MMIO, the existing VMX logic is kept working. Introduce a separate setter function so that guest TD can use a different value later. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-Id: <229a18434e5d83f45b1fcd7bf1544d79db1becb6.1705965635.git.isaku.yamahata@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-19KVM: x86/mmu: Add Suppress VE bit to EPT shadow_mmio_mask/shadow_present_maskIsaku Yamahata1-0/+1
To make use of the same value of shadow_mmio_mask and shadow_present_mask for TDX and VMX, add Suppress-VE bit to shadow_mmio_mask and shadow_present_mask so that they can be common for both VMX and TDX. TDX will require shadow_mmio_mask and shadow_present_mask to include VMX_SUPPRESS_VE for shared GPA so that EPT violation is triggered for shared GPA. For VMX, VMX_SUPPRESS_VE doesn't matter for MMIO because the spte value is defined so as to cause EPT misconfig. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-Id: <97cc616b3563cd8277be91aaeb3e14bce23c3649.1705965635.git.isaku.yamahata@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-19Merge x86 bugfixes from Linux 6.9-rc3Paolo Bonzini11-17/+48
Pull fix for SEV-SNP late disable bugs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-16Merge tag 'kvm-x86-fixes-6.9-rcN' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini1-0/+1
- Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure where KVM would allow userspace to refresh the cache with a bogus GPA. The bug has existed for quite some time, but was exposed by a new sanity check added in 6.9 (to ensure a cache is either GPA-based or HVA-based). - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that got left behind during a 6.9 cleanup. - Disable support for virtualizing adaptive PEBS, as KVM's implementation is architecturally broken and can leak host LBRs to the guest. - Fix a bug where KVM neglects to set the enable bits for general purpose counters in PERF_GLOBAL_CTRL when initializing the virtual PMU. Both Intel and AMD architectures require the bits to be set at RESET in order for v2 PMUs to be backwards compatible with software that was written for v1 PMUs, i.e. for software that will never manually set the global enables. - Disable LBR virtualization on CPUs that don't support LBR callstacks, as KVM unconditionally uses PERF_SAMPLE_BRANCH_CALL_STACK when creating the virtual LBR perf event, i.e. KVM will always fail to create LBR events on such CPUs. - Fix a math goof in x86's hugepage logic for KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow (detected by KASAN). - Fix a flaw in the max_guest_memory selftest that results in it exhausting the supply of ucall structures when run with more than 256 vCPUs. - Mark KVM_MEM_READONLY as supported for RISC-V in set_memory_region_test. - Fix a bug where KVM incorrectly thinks a TDP MMU root is an indirect shadow root due KVM unnecessarily clobbering root_role.direct when userspace sets guest CPUID. - Fix a dirty logging bug in the where KVM fails to write-protect TDP MMU SPTEs used for L2 if Page-Modification Logging is enabled for L1 and the L1 hypervisor is NOT using EPT (if nEPT is enabled, KVM doesn't use the TDP MMU to run L2). For simplicity, KVM always disables PML when running L2, but the TDP MMU wasn't accounting for root-specific conditions that force write- protect based dirty logging.
2024-04-16sched: Add missing memory barrier in switch_mm_cidMathieu Desnoyers1-0/+3
Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb() which the core scheduler code has depended upon since commit: commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid") If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can unset the actively used cid when it fails to observe active task after it sets lazy_put. There *is* a memory barrier between storing to rq->curr and _return to userspace_ (as required by membarrier), but the rseq mm_cid has stricter requirements: the barrier needs to be issued between store to rq->curr and switch_mm_cid(), which happens earlier than: - spin_unlock(), - switch_to(). So it's fine when the architecture switch_mm() happens to have that barrier already, but less so when the architecture only provides the full barrier in switch_to() or spin_unlock(). It is a bug in the rseq switch_mm_cid() implementation. All architectures that don't have memory barriers in switch_mm(), but rather have the full barrier either in finish_lock_switch() or switch_to() have them too late for the needs of switch_mm_cid(). Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the generic barrier.h header, and use it in switch_mm_cid() for scheduler transitions where switch_mm() is expected to provide a memory barrier. Architectures can override smp_mb__after_switch_mm() if their switch_mm() implementation provides an implicit memory barrier. Override it with a no-op on x86 which implicitly provide this memory barrier by writing to CR3. Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid") Reported-by: levi.yun <yeoreum.yun@arm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64 Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86 Cc: <stable@vger.kernel.org> # 6.4.x Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240415152114.59122-2-mathieu.desnoyers@efficios.com
2024-04-14locking/atomic/x86: Introduce arch_try_cmpxchg64_local()Uros Bizjak2-0/+40
Introduce arch_try_cmpxchg64_local() for 64-bit and 32-bit targets to improve code using cmpxchg64_local(). On 64-bit targets, the generated assembly improves from: 3e28: 31 c0 xor %eax,%eax 3e2a: 4d 0f b1 7d 00 cmpxchg %r15,0x0(%r13) 3e2f: 48 85 c0 test %rax,%rax 3e32: 0f 85 9f 00 00 00 jne 3ed7 <...> to: 3e28: 31 c0 xor %eax,%eax 3e2a: 4d 0f b1 7d 00 cmpxchg %r15,0x0(%r13) 3e2f: 0f 85 9f 00 00 00 jne 3ed4 <...> where a TEST instruction after CMPXCHG is saved. The improvements for 32-bit targets are even more noticeable, because double-word compare after CMPXCHG8B gets eliminated. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20240414161257.49145-1-ubizjak@gmail.com
2024-04-14x86/pat: Introduce lookup_address_in_pgd_attr()Juergen Gross1-0/+2
Add lookup_address_in_pgd_attr() doing the same as the already existing lookup_address_in_pgd(), but returning the effective settings of the NX and RW bits of all walked page table levels, too. This will be needed in order to match hardware behavior when looking for effective access rights, especially for detecting writable code pages. In order to avoid code duplication, let lookup_address_in_pgd() call lookup_address_in_pgd_attr() with dummy parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240412151258.9171-2-jgross@suse.com
2024-04-12Merge branch 'x86/urgent' into x86/cpu, to resolve conflictIngo Molnar5-9/+38
There's a new conflict between this commit pending in x86/cpu: 63edbaa48a57 x86/cpu/topology: Add support for the AMD 0x80000026 leaf And these fixes in x86/urgent: c064b536a8f9 x86/cpu/amd: Make the NODEID_MSR union actually work 1b3108f6898e x86/cpu/amd: Make the CPUID 0x80000008 parser correct Resolve them. Conflicts: arch/x86/kernel/cpu/topology_amd.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-12locking/pvqspinlock/x86: Remove redundant CMP after CMPXCHG in ↵Uros Bizjak1-3/+2
__raw_callee_save___pv_queued_spin_unlock() x86 CMPXCHG instruction returns success in the ZF flag. Remove redundant CMP instruction after CMPXCHG that performs the same check. Also update the function comment to mention the modern version of the equivalent C code. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240412083908.282802-1-ubizjak@gmail.com
2024-04-12KVM: x86: Split core of hypercall emulation to helper functionSean Christopherson1-0/+4
By necessity, TDX will use a different register ABI for hypercalls. Break out the core functionality so that it may be reused for TDX. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-Id: <5134caa55ac3dec33fb2addb5545b52b3b52db02.1705965635.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11perf/x86/intel: Expose existence of callback support to KVMSean Christopherson1-0/+1
Add a "has_callstack" field to the x86_pmu_lbr structure used to pass information to KVM, and set it accordingly in x86_perf_get_lbr(). KVM will use has_callstack to avoid trying to create perf LBR events with PERF_SAMPLE_BRANCH_CALL_STACK on CPUs that don't support callstacks. Reviewed-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20240307011344.835640-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-04-11KVM: SEV: introduce KVM_SEV_INIT2 operationPaolo Bonzini1-0/+9
The idea that no parameter would ever be necessary when enabling SEV or SEV-ES for a VM was decidedly optimistic. In fact, in some sense it's already a parameter whether SEV or SEV-ES is desired. Another possible source of variability is the desired set of VMSA features, as that affects the measurement of the VM's initial state and cannot be changed arbitrarily by the hypervisor. Create a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct, and put the new op to work by including the VMSA features as a field of the struct. The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of supported VMSA features for backwards compatibility. The struct also includes the usual bells and whistles for future extensibility: a flags field that must be zero for now, and some padding at the end. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20240404121327.3107131-13-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA timePaolo Bonzini1-0/+3
SEV-ES allows passing custom contents for x87, SSE and AVX state into the VMSA. Allow userspace to do that with the usual KVM_SET_XSAVE API and only mark FPU contents as confidential after it has been copied and encrypted into the VMSA. Since the XSAVE state for AVX is the first, it does not need the compacted-state handling of get_xsave_addr(). However, there are other parts of XSAVE state in the VMSA that currently are not handled, and the validation logic of get_xsave_addr() is pointless to duplicate in KVM, so move get_xsave_addr() to public FPU API; it is really just a facility to operate on XSAVE state and does not expose any internal details of arch/x86/kernel/fpu. Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20240404121327.3107131-12-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: SEV: define VM types for SEV and SEV-ESPaolo Bonzini1-0/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20240404121327.3107131-11-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: x86: add fields to struct kvm_arch for CoCo featuresPaolo Bonzini1-2/+5
Some VM types have characteristics in common; in fact, the only use of VM types right now is kvm_arch_has_private_mem and it assumes that _all_ nonzero VM types have private memory. We will soon introduce a VM type for SEV and SEV-ES VMs, and at that point we will have two special characteristics of confidential VMs that depend on the VM type: not just if memory is private, but also whether guest state is protected. For the latter we have kvm->arch.guest_state_protected, which is only set on a fully initialized VM. For VM types with protected guest state, we can actually fix a problem in the SEV-ES implementation, where ioctls to set registers do not cause an error even if the VM has been initialized and the guest state encrypted. Make sure that when using VM types that will become an error. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20240209183743.22030-7-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-ID: <20240404121327.3107131-8-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: SEV: publish supported VMSA featuresPaolo Bonzini1-2/+7
Compute the set of features to be stored in the VMSA when KVM is initialized; move it from there into kvm_sev_info when SEV is initialized, and then into the initial VMSA. The new variable can then be used to return the set of supported features to userspace, via the KVM_GET_DEVICE_ATTR ioctl. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-ID: <20240404121327.3107131-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: introduce new vendor op for KVM_GET_DEVICE_ATTRPaolo Bonzini2-0/+2
Allow vendor modules to provide their own attributes on /dev/kvm. To avoid proliferation of vendor ops, implement KVM_HAS_DEVICE_ATTR and KVM_GET_DEVICE_ATTR in terms of the same function. You're not supposed to use KVM_GET_DEVICE_ATTR to do complicated computations, especially on /dev/kvm. Reviewed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-ID: <20240404121327.3107131-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatibleSean Christopherson1-0/+1
Add kvm_vcpu_arch.is_amd_compatible to cache if a vCPU's vendor model is compatible with AMD, i.e. if the vCPU vendor is AMD or Hygon, along with helpers to check if a vCPU is compatible AMD vs. Intel. To handle Intel vs. AMD behavior related to masking the LVTPC entry, KVM will need to check for vendor compatibility on every PMI injection, i.e. querying for AMD will soon be a moderately hot path. Note! This subtly (or maybe not-so-subtly) makes "Intel compatible" KVM's default behavior, both if userspace omits (or never sets) CPUID 0x0 and if userspace sets a completely unknown vendor. One could argue that KVM should treat such vCPUs as not being compatible with Intel *or* AMD, but that would add useless complexity to KVM. KVM needs to do *something* in the face of vendor specific behavior, and so unless KVM conjured up a magic third option, choosing to treat unknown vendors as neither Intel nor AMD means that checks on AMD compatibility would yield Intel behavior, and checks for Intel compatibility would yield AMD behavior. And that's far worse as it would effectively yield random behavior depending on whether KVM checked for AMD vs. Intel vs. !AMD vs. !Intel. And practically speaking, all x86 CPUs follow either Intel or AMD architecture, i.e. "supporting" an unknown third architecture adds no value. Deliberately don't convert any of the existing guest_cpuid_is_intel() checks, as the Intel side of things is messier due to some flows explicitly checking for exactly vendor==Intel, versus some flows assuming anything that isn't "AMD compatible" gets Intel behavior. The Intel code will be cleaned up in the future. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240405235603.1173076-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-11Merge tag 'v6.9-rc3' into x86/boot, to pick up fixes before queueing up more ↵Ingo Molnar18-107/+173
changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-10locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() ↵Uros Bizjak2-20/+4
functions There is no need to implement arch_atomic_sub() family of inline functions, corresponding macros can be directly implemented using arch_atomic_add() inlines with negated argument. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240410062957.322614-4-ubizjak@gmail.com
2024-04-10locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() ↵Uros Bizjak1-25/+18
functions Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions to use arch_atomic64_try_cmpxchg(). This implementation avoids one extra trip through the CMPXCHG loop. The value preload before the cmpxchg loop does not need to be atomic. Use arch_atomic64_read_nonatomic(v) to load the value from atomic_t location in a non-atomic way. The generated code improves from: 1917d5: 31 c9 xor %ecx,%ecx 1917d7: 31 db xor %ebx,%ebx 1917d9: 89 4c 24 3c mov %ecx,0x3c(%esp) 1917dd: 8b 74 24 24 mov 0x24(%esp),%esi 1917e1: 89 c8 mov %ecx,%eax 1917e3: 89 5c 24 34 mov %ebx,0x34(%esp) 1917e7: 8b 7c 24 28 mov 0x28(%esp),%edi 1917eb: 21 ce and %ecx,%esi 1917ed: 89 74 24 4c mov %esi,0x4c(%esp) 1917f1: 21 df and %ebx,%edi 1917f3: 89 de mov %ebx,%esi 1917f5: 89 7c 24 50 mov %edi,0x50(%esp) 1917f9: 8b 54 24 4c mov 0x4c(%esp),%edx 1917fd: 8b 7c 24 2c mov 0x2c(%esp),%edi 191801: 8b 4c 24 50 mov 0x50(%esp),%ecx 191805: 89 d3 mov %edx,%ebx 191807: 89 f2 mov %esi,%edx 191809: f0 0f c7 0f lock cmpxchg8b (%edi) 19180d: 89 c1 mov %eax,%ecx 19180f: 8b 74 24 34 mov 0x34(%esp),%esi 191813: 89 d3 mov %edx,%ebx 191815: 89 44 24 4c mov %eax,0x4c(%esp) 191819: 8b 44 24 3c mov 0x3c(%esp),%eax 19181d: 89 df mov %ebx,%edi 19181f: 89 54 24 44 mov %edx,0x44(%esp) 191823: 89 ca mov %ecx,%edx 191825: 31 de xor %ebx,%esi 191827: 31 c8 xor %ecx,%eax 191829: 09 f0 or %esi,%eax 19182b: 75 ac jne 1917d9 <...> to: 1912ba: 8b 06 mov (%esi),%eax 1912bc: 8b 56 04 mov 0x4(%esi),%edx 1912bf: 89 44 24 3c mov %eax,0x3c(%esp) 1912c3: 89 c1 mov %eax,%ecx 1912c5: 23 4c 24 34 and 0x34(%esp),%ecx 1912c9: 89 d3 mov %edx,%ebx 1912cb: 23 5c 24 38 and 0x38(%esp),%ebx 1912cf: 89 54 24 40 mov %edx,0x40(%esp) 1912d3: 89 4c 24 2c mov %ecx,0x2c(%esp) 1912d7: 89 5c 24 30 mov %ebx,0x30(%esp) 1912db: 8b 5c 24 2c mov 0x2c(%esp),%ebx 1912df: 8b 4c 24 30 mov 0x30(%esp),%ecx 1912e3: f0 0f c7 0e lock cmpxchg8b (%esi) 1912e7: 0f 85 f3 02 00 00 jne 1915e0 <...> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240410062957.322614-3-ubizjak@gmail.com
2024-04-10locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32Uros Bizjak1-0/+26
Introduce arch_atomic64_read_nonatomic() for 32-bit targets to load the value from atomic64_t location in a non-atomic way. This function is intended to be used in cases where a subsequent atomic operation will handle the torn value, and can be used to prime the first iteration of unconditional try_cmpxchg() loops. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240410062957.322614-2-ubizjak@gmail.com
2024-04-10locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32Uros Bizjak1-2/+8
Introduce arch_atomic64_try_cmpxchg() for 32-bit targets to use optimized target specific implementation instead of a generic one. This implementation eliminates dual-word compare after cmpxchg8b instruction and improves generated asm code from: 2273: f0 0f c7 0f lock cmpxchg8b (%edi) 2277: 8b 74 24 2c mov 0x2c(%esp),%esi 227b: 89 d3 mov %edx,%ebx 227d: 89 c2 mov %eax,%edx 227f: 89 5c 24 10 mov %ebx,0x10(%esp) 2283: 8b 7c 24 30 mov 0x30(%esp),%edi 2287: 89 44 24 1c mov %eax,0x1c(%esp) 228b: 31 f2 xor %esi,%edx 228d: 89 d0 mov %edx,%eax 228f: 89 da mov %ebx,%edx 2291: 31 fa xor %edi,%edx 2293: 09 d0 or %edx,%eax 2295: 0f 85 a5 00 00 00 jne 2340 <...> to: 2270: f0 0f c7 0f lock cmpxchg8b (%edi) 2274: 0f 85 a6 00 00 00 jne 2320 <...> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20240410062957.322614-1-ubizjak@gmail.com
2024-04-10Merge branch 'linus' into x86/urgent, to pick up dependent commitsIngo Molnar4-8/+36
Prepare to fix aspects of the new BHI code. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-09KVM: x86: Move nEPT exit_qualification field from kvm_vcpu_arch to x86_exceptionSean Christopherson1-3/+0
Move the exit_qualification field that is used to track information about in-flight nEPT violations from "struct kvm_vcpu_arch" to "x86_exception", i.e. associate the information with the actual nEPT violation instead of the vCPU. To handle bits that are pulled from vmcs.EXIT_QUALIFICATION, i.e. that are propagated from the "original" EPT violation VM-Exit, simply grab them from the VMCS on-demand when injecting a nEPT Violation or a PML Full VM-exit. Aside from being ugly, having an exit_qualification field in kvm_vcpu_arch is outright dangerous, e.g. see commit d7f0a00e438d ("KVM: VMX: Report up-to-date exit qualification to userspace"). Opportunstically add a comment to call out that PML Full and EPT Violation VM-Exits use the same bit to report NMI blocking information. Link: https://lore.kernel.org/r/20240209221700.393189-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-04-09x86/alternatives: Use a temporary buffer when optimizing NOPsBorislav Petkov (AMD)1-1/+1
Instead of optimizing NOPs in-place, use a temporary buffer like the usual alternatives patching flow does. This obviates the need to grab locks when patching, see 6778977590da ("x86/alternatives: Disable interrupts and sync when optimizing NOPs in place") While at it, add nomenclature definitions clarifying and simplifying the naming of function-local variables in the alternatives code. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240130105941.19707-2-bp@alien8.de
2024-04-09x86/alternatives: Catch late X86_FEATURE modifiersBorislav Petkov (AMD)1-2/+6
After alternatives have been patched, changes to the X86_FEATURE flags won't take effect and could potentially even be wrong. Warn about it. This is something which has been long overdue. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Srikanth Aithal <sraithal@amd.com> Link: https://lore.kernel.org/r/20240327154317.29909-3-bp@alien8.de
2024-04-09x86/alternatives: Remove a superfluous newline in _static_cpu_has()Borislav Petkov (AMD)1-2/+1
No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240327154317.29909-2-bp@alien8.de
2024-04-09x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful placePawan Gupta1-5/+4
The ARCH_CAP_XAPIC_DISABLE bit of MSR_IA32_ARCH_CAP is not in the correct sorted order. Move it where it belongs. No functional change. [ bp: Massage commit message. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/243317ff6c8db307b7701a45f71e5c21da80194b.1705632532.git.pawan.kumar.gupta@linux.intel.com
2024-04-09locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64Uros Bizjak1-0/+28
Commit: 6d12c8d308e68 ("percpu: Wire up cmpxchg128") improved emulated cmpxchg8b_emu() library function to return success/failure in a ZF flag. Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets to override the generic archy_try_cmpxchg() with an optimized target specific implementation that handles ZF flag. The assembly code at the call sites improves from: bf56d: e8 fc ff ff ff call cmpxchg8b_emu bf572: 8b 74 24 28 mov 0x28(%esp),%esi bf576: 89 c3 mov %eax,%ebx bf578: 89 d1 mov %edx,%ecx bf57a: 8b 7c 24 2c mov 0x2c(%esp),%edi bf57e: 89 f0 mov %esi,%eax bf580: 89 fa mov %edi,%edx bf582: 31 d8 xor %ebx,%eax bf584: 31 ca xor %ecx,%edx bf586: 09 d0 or %edx,%eax bf588: 0f 84 e3 01 00 00 je bf771 <...> to: bf572: e8 fc ff ff ff call cmpxchg8b_emu bf577: 0f 84 b6 01 00 00 je bf733 <...> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-4-ubizjak@gmail.com
2024-04-09locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()Uros Bizjak1-79/+100
Commit: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") introduced arch_{,try_}_cmpxchg128{,_local}() for x86_64 targets. Modernize existing x86_32 arch_{,try_}_cmpxchg64{,_local}() definitions to follow the same structure as the definitions introduced by the above commit. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-3-ubizjak@gmail.com
2024-04-09locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()Uros Bizjak1-1/+1
Correct the definition of __arch_try_cmpxchg128(), introduced by: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-2-ubizjak@gmail.com
2024-04-09x86/mce: Implement recovery for errors in TDX/SEAM non-root modeTony Luck1-0/+2
Machine check SMIs (MSMI) signaled during SEAM operation (typically inside TDX guests), on a system with Intel eMCA enabled, might eventually be reported to the kernel #MC handler with the saved RIP on the stack pointing to the instruction in kernel code after the SEAMCALL instruction that entered the SEAM operation. Linux currently says that is a fatal error and shuts down. There is a new bit in IA32_MCG_STATUS that, when set to 1, indicates that the machine check didn't originally occur at that saved RIP, but during SEAM non-root operation. Add new entries to the severity table to detect this for both data load and instruction fetch that set the severity to "AR" (action required). Increase the width of the mcgmask/mcgres fields in "struct severity" from unsigned char to unsigned short since the new bit is in position 12. Action required for these errors is just mark the page as poisoned and return from the machine check handler. HW ABI notes: ============= The SEAM_NR bit in IA32_MCG_STATUS hasn't yet made it into the Intel Software Developers' Manual. But it is described in section 16.5.2 of "Intel(R) Trust Domain Extensions (Intel(R) TDX) Module Base Architecture Specification" downloadable from: https://cdrdv2.intel.com/v1/dl/getContent/733575 Backport notes: =============== Little value in backporting this patch to stable or LTS kernels as this is only relevant with support for TDX, which I assume won't be backported. But for anyone taking this to v6.1 or older, you also need commit: a51cbd0d86d3 ("x86/mce: Use severity table to handle uncorrected errors in kernel") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240408180944.44638-1-tony.luck@intel.com
2024-04-09Merge tag 'v6.9-rc3' into x86/cpu, to pick up fixesIngo Molnar18-107/+173
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-09Merge tag 'nativebhi' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-8/+36
Pull x86 mitigations from Thomas Gleixner: "Mitigations for the native BHI hardware vulnerabilty: Branch History Injection (BHI) attacks may allow a malicious application to influence indirect branch prediction in kernel by poisoning the branch history. eIBRS isolates indirect branch targets in ring0. The BHB can still influence the choice of indirect branch predictor entry, and although branch predictor entries are isolated between modes when eIBRS is enabled, the BHB itself is not isolated between modes. Add mitigations against it either with the help of microcode or with software sequences for the affected CPUs" [ This also ends up enabling the full mitigation by default despite the system call hardening, because apparently there are other indirect calls that are still sufficiently reachable, and the 'auto' case just isn't hardened enough. We'll have some more inevitable tweaking in the future - Linus ] * tag 'nativebhi' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: x86: Add BHI_NO x86/bhi: Mitigate KVM by default x86/bhi: Add BHI mitigation knob x86/bhi: Enumerate Branch History Injection (BHI) bug x86/bhi: Define SPEC_CTRL_BHI_DIS_S x86/bhi: Add support for clearing branch history at syscall entry x86/syscall: Don't force use of indirect calls for system calls x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file
2024-04-08x86/bhi: Mitigate KVM by defaultPawan Gupta2-0/+6
BHI mitigation mode spectre_bhi=auto does not deploy the software mitigation by default. In a cloud environment, it is a likely scenario where userspace is trusted but the guests are not trusted. Deploying system wide mitigation in such cases is not desirable. Update the auto mode to unconditionally mitigate against malicious guests. Deploy the software sequence at VMexit in auto mode also, when hardware mitigation is not available. Unlike the force =on mode, software sequence is not deployed at syscalls in auto mode. Suggested-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-04-08x86/bhi: Add BHI mitigation knobPawan Gupta1-0/+1
Branch history clearing software sequences and hardware control BHI_DIS_S were defined to mitigate Branch History Injection (BHI). Add cmdline spectre_bhi={on|off|auto} to control BHI mitigation: auto - Deploy the hardware mitigation BHI_DIS_S, if available. on - Deploy the hardware mitigation BHI_DIS_S, if available, otherwise deploy the software sequence at syscall entry and VMexit. off - Turn off BHI mitigation. The default is auto mode which does not deploy the software sequence mitigation. This is because of the hardening done in the syscall dispatch path, which is the likely target of BHI. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-04-08x86/bhi: Enumerate Branch History Injection (BHI) bugPawan Gupta2-0/+5
Mitigation for BHI is selected based on the bug enumeration. Add bits needed to enumerate BHI bug. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-04-08x86/bhi: Define SPEC_CTRL_BHI_DIS_SDaniel Sneddon2-1/+5
Newer processors supports a hardware control BHI_DIS_S to mitigate Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel from userspace BHI attacks without having to manually overwrite the branch history. Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL. Mitigation is enabled later. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>