summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-02-01KVM: x86/pmu: Add macros to iterate over all PMCs given a bitmapSean Christopherson3-24/+15
Add and use kvm_for_each_pmc() to dedup a variety of open coded for-loops that iterate over valid PMCs given a bitmap (and because seeing checkpatch whine about bad macro style is always amusing). No functional change intended. Link: https://lore.kernel.org/r/20231110022857.1273836-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-01KVM: x86/pmu: Snapshot and clear reprogramming bitmap before reprogrammingSean Christopherson2-23/+30
Refactor the handling of the reprogramming bitmap to snapshot and clear to-be-processed bits before doing the reprogramming, and then explicitly set bits for PMCs that need to be reprogrammed (again). This will allow adding a macro to iterate over all valid PMCs without having to add special handling for the reprogramming bit, which (a) can have bits set for non-existent PMCs and (b) needs to clear such bits to avoid wasting cycles in perpetuity. Note, the existing behavior of clearing bits after reprogramming does NOT have a race with kvm_vm_ioctl_set_pmu_event_filter(). Setting a new PMU filter synchronizes SRCU _before_ setting the bitmap, i.e. guarantees that the vCPU isn't in the middle of reprogramming with a stale filter prior to setting the bitmap. Link: https://lore.kernel.org/r/20231110022857.1273836-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-01KVM: x86/pmu: Move pmc_idx => pmc translation helper to common codeSean Christopherson5-24/+36
Add a common helper for *internal* PMC lookups, and delete the ops hook and Intel's implementation. Keep AMD's implementation, but rename it to amd_pmu_get_pmc() to make it somewhat more obvious that it's suited for both KVM-internal and guest-initiated lookups. Because KVM tracks all counters in a single bitmap, getting a counter when iterating over a bitmap, e.g. of all valid PMCs, requires a small amount of math, that while simple, isn't super obvious and doesn't use the same semantics as PMC lookups from RDPMC! Although AMD doesn't support fixed counters, the common PMU code still behaves as if there a split, the high half of which just happens to always be empty. Opportunstically add a comment to explain both what is going on, and why KVM uses a single bitmap, e.g. the boilerplate for iterating over separate bitmaps could be done via macros, so it's not (just) about deduplicating code. Link: https://lore.kernel.org/r/20231110022857.1273836-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-01KVM: x86/pmu: Add common define to capture fixed counters offsetSean Christopherson3-11/+13
Add a common define to "officially" solidify KVM's split of counters, i.e. to commit to using bits 31:0 to track general purpose counters and bits 63:32 to track fixed counters (which only Intel supports). KVM already bleeds this behavior all over common PMU code, and adding a KVM- defined macro allows clarifying that the value is a _base_, as oppposed to the _flag_ that is used to access fixed PMCs via RDPMC (which perf confusingly calls INTEL_PMC_FIXED_RDPMC_BASE). No functional change intended. Link: https://lore.kernel.org/r/20231110022857.1273836-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-01KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabledSean Christopherson2-16/+20
Move the purging of common PMU metadata from intel_pmu_refresh() to kvm_pmu_refresh(), and invoke the vendor refresh() hook if and only if the VM is supposed to have a vPMU. KVM already denies access to the PMU based on kvm->arch.enable_pmu, as get_gp_pmc_amd() returns NULL for all PMCs in that case, i.e. KVM already violates AMD's architecture by not virtualizing a PMU (kernels have long since learned to not panic when the PMU is unavailable). But configuring the PMU as if it were enabled causes unwanted side effects, e.g. calls to kvm_pmu_trigger_event() waste an absurd number of cycles due to the all_valid_pmc_idx bitmap being non-zero. Fixes: b1d66dad65dc ("KVM: x86/svm: Add module param to control PMU virtualization") Reported-by: Konstantin Khorenko <khorenko@virtuozzo.com> Closes: https://lore.kernel.org/all/20231109180646.2963718-2-khorenko@virtuozzo.com Link: https://lore.kernel.org/r/20231110022857.1273836-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSRSean Christopherson1-0/+41
Extend the read/write PMU counters subtest to verify that RDPMC also reads back the written value. Opportunsitically verify that attempting to use the "fast" mode of RDPMC fails, as the "fast" flag is only supported by non-architectural PMUs, which KVM doesn't virtualize. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-30-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add helpers for safe and safe+forced RDMSR, RDPMC, and XGETBVSean Christopherson1-12/+26
Add helpers for safe and safe-with-forced-emulations versions of RDMSR, RDPMC, and XGETBV. Use macro shenanigans to eliminate the rather large amount of boilerplate needed to get values in and out of registers. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-29-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()Sean Christopherson1-2/+28
Add KVM_ASM_SAFE_FEP() to allow forcing emulation on an instruction that might fault. Note, KVM skips RIP past the FEP prefix before injecting an exception, i.e. the fixup needs to be on the instruction itself. Do not check for FEP support, that is firmly the responsibility of whatever code wants to use KVM_ASM_SAFE_FEP(). Sadly, chaining variadic arguments that contain commas doesn't work, thus the unfortunate amount of copy+paste. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-28-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Test PMC virtualization with forced emulationSean Christopherson1-14/+30
Extend the PMC counters test to use forced emulation to verify that KVM emulates counter events for instructions retired and branches retired. Force emulation for only a subset of the measured code to test that KVM does the right thing when mixing perf events with emulated events. Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-27-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Move KVM_FEP macro into common library headerSean Christopherson2-2/+3
Move the KVM_FEP definition, a.k.a. the KVM force emulation prefix, into processor.h so that it can be used for other tests besides the MSR filter test. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-26-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Query module param to detect FEP in MSR filtering testSean Christopherson2-18/+14
Add a helper to detect KVM support for forced emulation by querying the module param, and use the helper to detect support for the MSR filtering test instead of throwing a noodle/NOP at KVM to see if it sticks. Cc: Aaron Lewis <aaronlewis@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-25-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add helpers to read integer module paramsSean Christopherson2-6/+60
Add helpers to read integer module params, which is painfully non-trivial because the pain of dealing with strings in C is exacerbated by the kernel inserting a newline. Don't bother differentiating between int, uint, short, etc. They all fit in an int, and KVM (thankfully) doesn't have any integer params larger than an int. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-24-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add a helper to query if the PMU module param is enabledSean Christopherson4-3/+8
Add a helper to probe KVM's "enable_pmu" param, open coding strings in multiple places is just asking for false negatives and/or runtime errors due to typos. Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-23-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Expand PMU counters test to verify LLC eventsSean Christopherson1-19/+40
Expand the PMU counters test to verify that LLC references and misses have non-zero counts when the code being executed while the LLC event(s) is active is evicted via CFLUSH{,OPT}. Note, CLFLUSH{,OPT} requires a fence of some kind to ensure the cache lines are flushed before execution continues. Use MFENCE for simplicity (performance is not a concern). Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-22-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add functional test for Intel's fixed PMU countersJinrong Liang1-1/+30
Extend the fixed counters test to verify that supported counters can actually be enabled in the control MSRs, that unsupported counters cannot, and that enabled counters actually count. Co-developed-by: Like Xu <likexu@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> [sean: fold into the rd/wr access test, massage changelog] Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-21-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Test consistency of CPUID with num of fixed countersJinrong Liang1-3/+57
Extend the PMU counters test to verify KVM emulation of fixed counters in addition to general purpose counters. Fixed counters add an extra wrinkle in the form of an extra supported bitmask. Thus quoth the SDM: fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i) Test that KVM handles a counter being available through either method. Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Co-developed-by: Like Xu <likexu@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-20-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Test consistency of CPUID with num of gp countersJinrong Liang1-0/+99
Add a test to verify that KVM correctly emulates MSR-based accesses to general purpose counters based on guest CPUID, e.g. that accesses to non-existent counters #GP and accesses to existent counters succeed. Note, for compatibility reasons, KVM does not emulate #GP when MSR_P6_PERFCTR[0|1] is not present (writes should be dropped). Co-developed-by: Like Xu <likexu@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Test Intel PMU architectural events on fixed countersJinrong Liang1-9/+45
Extend the PMU counters test to validate architectural events using fixed counters. The core logic is largely the same, the biggest difference being that if a fixed counter exists, its associated event is available (the SDM doesn't explicitly state this to be true, but it's KVM's ABI and letting software program a fixed counter that doesn't actually count would be quite bizarre). Note, fixed counters rely on PERF_GLOBAL_CTRL. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Co-developed-by: Like Xu <likexu@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Test Intel PMU architectural events on gp countersJinrong Liang2-0/+322
Add test cases to verify that Intel's Architectural PMU events work as expected when they are available according to guest CPUID. Iterate over a range of sane PMU versions, with and without full-width writes enabled, and over interesting combinations of lengths/masks for the bit vector that enumerates unavailable events. Test up to vPMU version 5, i.e. the current architectural max. KVM only officially supports up to version 2, but the behavior of the counters is backwards compatible, i.e. KVM shouldn't do something completely different for a higher, architecturally-defined vPMU version. Verify KVM behavior against the effective vPMU version, e.g. advertising vPMU 5 when KVM only supports vPMU 2 shouldn't magically unlock vPMU 5 features. According to Intel SDM, the number of architectural events is reported through CPUID.0AH:EAX[31:24] and the architectural event x is supported if EBX[x]=0 && EAX[31:24]>x. Handcode the entirety of the measured section so that the test can precisely assert on the number of instructions and branches retired. Co-developed-by: Like Xu <likexu@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assetsJinrong Liang4-97/+173
Add a PMU library for x86 selftests to help eliminate open-coded event encodings, and to reduce the amount of copy+paste between PMU selftests. Use the new common macro definitions in the existing PMU event filter test. Cc: Aaron Lewis <aaronlewis@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed countersSean Christopherson1-18/+47
Extend the kvm_x86_pmu_feature framework to allow querying for fixed counters via {kvm,this}_pmu_has(). Like architectural events, checking for a fixed counter annoyingly requires checking multiple CPUID fields, as a fixed counter exists if: FxCtr[i]_is_supported := ECX[i] || (EDX[4:0] > i); Note, KVM currently doesn't actually support exposing fixed counters via the bitmask, but that will hopefully change sooner than later, and Intel's SDM explicitly "recommends" checking both the number of counters and the mask. Rename the intermedate "anti_feature" field to simply 'f' since the fixed counter bitmask (thankfully) doesn't have reversed polarity like the architectural events bitmask. Note, ideally the helpers would use BUILD_BUG_ON() to assert on the incoming register, but the expected usage in PMU tests can't guarantee the inputs are compile-time constants. Opportunistically define macros for all of the known architectural events and fixed counters. Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()Sean Christopherson1-2/+2
Drop the "name" parameter from KVM_X86_PMU_FEATURE(), it's unused and the name is redundant with the macro, i.e. it's truly useless. Reviewed-by: Jim Mattson <jmattson@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: selftests: Add vcpu_set_cpuid_property() to set propertiesJinrong Liang3-5/+16
Add vcpu_set_cpuid_property() helper function for setting properties, and use it instead of open coding an equivalent for MAX_PHY_ADDR. Future vPMU testcases will also need to stuff various CPUID properties. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC typesSean Christopherson1-6/+15
Explicitly check for attempts to read unsupported PMC types instead of letting the bounds check fail. Functionally, letting the check fail is ok, but it's unnecessarily subtle and does a poor job of documenting the architectural behavior that KVM is emulating. Reviewed-by: Dapeng Mi  <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Treat "fixed" PMU type in RDPMC as index as a value, not flagSean Christopherson1-3/+11
Refactor KVM's handling of ECX for RDPMC to treat the FIXED modifier as an explicit value, not a flag (minus one wart). While non-architectural PMUs do use bit 31 as a flag (for "fast" reads), architectural PMUs use the upper half of ECX to encode the type. From the SDM: ECX[31:16] specifies type of PMC while ECX[15:0] specifies the index of the PMC to be read within that type Note, that the known supported types are 4000H and 2000H, i.e. look a lot like flags, doesn't contradict the above statement that ECX[31:16] holds the type, at least not by any sane reading of the SDM. Keep the explicitly clearing of the FIXED "flag", as KVM subtly relies on that behavior to disallow unsupported types while allowing the correct indices for fixed counters. This wart will be cleaned up in short order. Opportunistically grab the per-type bitmask in the if-else blocks to eliminate the one-off usage of the local "fixed" bool. Reported-by: Jim Mattson <jmattson@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Disallow "fast" RDPMC for architectural Intel PMUsSean Christopherson1-4/+18
Inject #GP on RDPMC if the "fast" flag is set for architectural Intel PMUs, i.e. if the PMU version is non-zero. Per Intel's SDM, and confirmed on bare metal, the "fast" flag is supported only for non-architectural PMUs, and is reserved for architectural PMUs. If the processor does not support architectural performance monitoring (CPUID.0AH:EAX[7:0]=0), ECX[30:0] specifies the index of the PMC to be read. Setting ECX[31] selects “fast” read mode if supported. In this mode, RDPMC returns bits 31:0 of the PMC in EAX while clearing EDX to zero. If the processor does support architectural performance monitoring (CPUID.0AH:EAX[7:0] ≠ 0), ECX[31:16] specifies type of PMC while ECX[15:0] specifies the index of the PMC to be read within that type. The following PMC types are currently defined: — General-purpose counters use type 0. The index x (to read IA32_PMCx) must be less than the value enumerated by CPUID.0AH.EAX[15:8] (thus ECX[15:8] must be zero). — Fixed-function counters use type 4000H. The index x (to read IA32_FIXED_CTRx) can be used if either CPUID.0AH.EDX[4:0] > x or CPUID.0AH.ECX[x] = 1 (thus ECX[15:5] must be 0). — Performance metrics use type 2000H. This type can be used only if IA32_PERF_CAPABILITIES.PERF_METRICS_AVAILABLE[bit 15]=1. For this type, the index in ECX[15:0] is implementation specific. Opportunistically WARN if KVM ever actually tries to complete RDPMC for a non-architectural PMU, and drop the non-existent "support" for fast RDPMC, as KVM doesn't support such PMUs, i.e. kvm_pmu_rdpmc() should reject the RDPMC before getting to the Intel code. Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests") Fixes: 67f4d4288c35 ("KVM: x86: rdpmc emulation checks the counter incorrectly") Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Apply "fast" RDPMC only to Intel PMUsSean Christopherson2-4/+15
Move the handling of "fast" RDPMC instructions, which drop bits 63:32 of the count, to Intel. The "fast" flag, and all modifiers for that matter, are Intel-only and aren't supported by AMD. Opportunistically replace open coded bit crud with proper #defines, and add comments to try and disentangle the flags vs. values mess for non-architectural vs. architectural PMUs. Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Prioritize VMX interception over #GP on RDPMC due to bad indexSean Christopherson8-29/+27
Apply the pre-intercepts RDPMC validity check only to AMD, and rename all relevant functions to make it as clear as possible that the check is not a standard PMC index check. On Intel, the basic rule is that only invalid opcodes and privilege/permission/mode checks have priority over VM-Exit, i.e. RDPMC with an invalid index should VM-Exit, not #GP. While the SDM doesn't explicitly call out RDPMC, it _does_ explicitly use RDMSR of a non-existent MSR as an example where VM-Exit has priority over #GP, and RDPMC is effectively just a variation of RDMSR. Manually testing on various Intel CPUs confirms this behavior, and the inverted priority was introduced for SVM compatibility, i.e. was not an intentional change for Intel PMUs. On AMD, *all* exceptions on RDPMC have priority over VM-Exit. Check for a NULL kvm_pmu_ops.check_rdpmc_early instead of using a RET0 static call so as to provide a convenient location to document the difference between Intel and AMD, and to again try to make it as obvious as possible that the early check is a one-off thing, not a generic "is this PMC valid?" helper. Fixes: 8061252ee0d2 ("KVM: SVM: Add intercept checks for remaining twobyte instructions") Cc: Jim Mattson <jmattson@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Don't ignore bits 31:30 for RDPMC index on AMDSean Christopherson1-3/+1
Stop stripping bits 31:30 prior to validating/consuming the RDPMC index on AMD. Per the APM's documentation of RDPMC, *values* greater than 27 are reserved. The behavior of upper bits being flags is firmly Intel-only. Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Get eventsel for fixed counters from perfSean Christopherson1-13/+17
Get the event selectors used to effectively request fixed counters for perf events from perf itself instead of hardcoding them in KVM and hoping that they match the underlying hardware. While fixed counters 0 and 1 use architectural events, as of ffbe4ab0beda ("perf/x86/intel: Extend the ref-cycles event to GP counters") fixed counter 2 (reference TSC cycles) may use a software-defined pseudo-encoding or a real hardware-defined encoding. Reported-by: Kan Liang <kan.liang@linux.intel.com> Closes: https://lkml.kernel.org/r/4281eee7-6423-4ec8-bb18-c6aeee1faf2c%40linux.intel.com Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Setup fixed counters' eventsel during PMU initializationSean Christopherson1-11/+5
Set the eventsel for all fixed counters during PMU initialization, the eventsel is hardcoded and consumed if and only if the counter is supported, i.e. there is no reason to redo the setup every time the PMU is refreshed. Configuring all KVM-supported fixed counter also eliminates a potential pitfall if/when KVM supports discontiguous fixed counters, in which case configuring only nr_arch_fixed_counters will be insufficient (ignoring the fact that KVM will need many other changes to support discontiguous fixed counters). Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural encodingsSean Christopherson1-49/+23
Drop KVM's enumeration of Intel's architectural event encodings, and instead open code the three encodings (of which only two are real) that KVM uses to emulate fixed counters. Now that KVM doesn't incorrectly enforce the availability of architectural encodings, there is no reason for KVM to ever care about the encodings themselves, at least not in the current format of an array indexed by the encoding's position in CPUID. Opportunistically add a comment to explain why KVM cares about eventsel values for fixed counters. Suggested-by: Jim Mattson <jmattson@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Allow programming events that match unsupported arch eventsSean Christopherson5-47/+0
Remove KVM's bogus restriction that the guest can't program an event whose encoding matches an unsupported architectural event. The enumeration of an architectural event only says that if a CPU supports an architectural event, then the event can be programmed using the architectural encoding. The enumeration does NOT say anything about the encoding when the CPU doesn't report support the architectural event. Preventing the guest from counting events whose encoding happens to match an architectural event breaks existing functionality whenever Intel adds an architectural encoding that was *ever* used for a CPU that doesn't enumerate support for the architectural event, even if the encoding is for the exact same event! E.g. the architectural encoding for Top-Down Slots is 0x01a4. Broadwell CPUs, which do not support the Top-Down Slots architectural event, 0x01a4 is a valid, model-specific event. Denying guest usage of 0x01a4 if/when KVM adds support for Top-Down slots would break any Broadwell-based guest. Reported-by: Kan Liang <kan.liang@linux.intel.com> Closes: https://lore.kernel.org/all/2004baa6-b494-462c-a11f-8104ea152c6a@linux.intel.com Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event") Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-31KVM: x86/pmu: Always treat Fixed counters as available when supportedSean Christopherson1-1/+14
Treat fixed counters as available when they are supported, i.e. don't silently ignore an enabled fixed counter just because guest CPUID says the associated general purpose architectural event is unavailable. KVM originally treated fixed counters as always available, but that got changed as part of a fix to avoid confusing REF_CPU_CYCLES, which does NOT map to an architectural event, with the actual architectural event used associated with bit 7, TOPDOWN_SLOTS. The commit justified the change with: If the event is marked as unavailable in the Intel guest CPUID 0AH.EBX leaf, we need to avoid any perf_event creation, whether it's a gp or fixed counter. but that justification doesn't mesh with reality. The Intel SDM uses "architectural events" to refer to both general purpose events (the ones with the reverse polarity mask in CPUID.0xA.EBX) and the events for fixed counters, e.g. the SDM makes statements like: Each of the fixed-function PMC can count only one architectural performance event. but the fact that fixed counter 2 (TSC reference cycles) doesn't have an associated general purpose architectural makes trying to apply the mask from CPUID.0xA.EBX impossible. Furthermore, the lack of enumeration for an architectural event in CPUID only means the CPU doesn't officially support the architectural encoding, i.e. it doesn't mean using the architectural encoding _won't_ work, it sipmly means there are no guarantees that it will work as expected. E.g. if KVM is running in a VM that advertises a fixed counters but not the corresponding architectural event encoding, and perf decides to use a general purpose counter instead of a fixed counter, odds are very good that the underlying hardware actually does support the architectrual encoding, and that programming the encoding will count the right thing. In other words, asking perf to count the event will probably work, whereas intentionally doing nothing is obviously guaranteed to fail. Note, at the time of the change, KVM didn't enforce hardware support, i.e. didn't prevent userspace from enumerating support in guest CPUID.0xA.EBX for architectural events that aren't supported in hardware. I.e. silently dropping the fixed counter didn't somehow protection against counting the wrong event, it just enforced guest CPUID. And practically speaking, this issue is almost certainly limited to running KVM on a funky virtual CPU model. No known real hardware has an asymmetric PMU where a fixed counter is supported but the associated architectural event is not. Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event") Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240109230250.424295-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-01-29Linux 6.8-rc2v6.8-rc2Linus Torvalds1-1/+1
2024-01-29Merge tag 'cxl-fixes-6.8-rc2' of ↵Linus Torvalds5-13/+23
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "A build regression fix, a device compatibility fix, and an original bug preventing creation of large (16 device) interleave sets: - Fix unit test build regression fallout from global "missing-prototypes" change - Fix compatibility with devices that do not support interrupts - Fix overflow when calculating the capacity of large interleave sets" * tag 'cxl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region:Fix overflow issue in alloc_hpa() cxl/pci: Skip irq features if MSI/MSI-X are not supported tools/testing/nvdimm: Disable "missing prototypes / declarations" warnings tools/testing/cxl: Disable "missing prototypes / declarations" warnings
2024-01-28Merge tag 'mips-fixes_6.8_1' of ↵Linus Torvalds34-230/+83
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix boot issue on single core Lantiq Danube devices - fix boot issue on Loongson64 platforms - fix improper FPU setup - fix missing prototypes issues * tag 'mips-fixes_6.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan MIPS: loongson64: set nid for reserved memblock region Revert "MIPS: loongson64: set nid for reserved memblock region" MIPS: lantiq: register smp_ops on non-smp platforms MIPS: loongson64: set nid for reserved memblock region MIPS: reserve exception vector space ONLY ONCE MIPS: BCM63XX: Fix missing prototypes MIPS: sgi-ip32: Fix missing prototypes MIPS: sgi-ip30: Fix missing prototypes MIPS: fw arc: Fix missing prototypes MIPS: sgi-ip27: Fix missing prototypes MIPS: Alchemy: Fix missing prototypes MIPS: Cobalt: Fix missing prototypes
2024-01-28Merge tag 'locking_urgent_for_v6.8_rc2' of ↵Linus Torvalds2-6/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: - Prevent an inconsistent futex operation leading to stale state exposure * tag 'locking_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Prevent the reuse of stale pi_state
2024-01-28Merge tag 'irq_urgent_for_v6.8_rc2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Borislav Petkov: - Initialize the resend node of each IRQ descriptor, not only the first one * tag 'irq_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Initialize resend_node hlist for all interrupt descriptors
2024-01-28Merge tag 'timers_urgent_for_v6.8_rc2' of ↵Linus Torvalds2-1/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Preserve the number of idle calls and sleep entries across CPU hotplug events in order to be able to compute correct averages - Limit the duration of the clocksource watchdog checking interval as too long intervals lead to wrongly marking the TSC as unstable * tag 'timers_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/sched: Preserve number of idle sleeps across CPU hotplug events clocksource: Skip watchdog check for large watchdog intervals
2024-01-28Merge tag 'x86_urgent_for_v6.8_rc2' of ↵Linus Torvalds6-12/+50
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure 32-bit syscall registers are properly sign-extended - Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater Forest CPU model number - Make a stub function export non-GPL because it is part of the paravirt alternatives and that can be used by non-GPL code * tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5 x86/entry/ia32: Ensure s32 is sign extended to s64 x86/cpu: Add model number for Intel Clearwater Forest processor x86/CPU/AMD: Add X86_FEATURE_ZEN5 x86/paravirt: Make BUG_func() usable by non-GPL modules
2024-01-28Merge tag 'fixes-2024-01-28' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix crash when reserved memory is not added to memory. When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, the initialization of reserved pages may cause access of NODE_DATA() with invalid nid and crash. Add a fall back to early_pfn_to_nid() in memmap_init_reserved_pages() to ensure a valid node id is always passed to init_reserved_page()" * tag 'fixes-2024-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock: fix crash when reserved memory is not added to memory
2024-01-27Merge tag 'platform-drivers-x86-v6.8-2' of ↵Linus Torvalds14-177/+460
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - WMI bus driver fixes - Second attempt (previously reverted) at P2SB PCI rescan deadlock fix - AMD PMF driver improvements - MAINTAINERS updates - Misc other small fixes and hw-id additions * tag 'platform-drivers-x86-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: touchscreen_dmi: Add info for the TECLAST X16 Plus tablet platform/x86/intel/ifs: Call release_firmware() when handling errors. platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data() platform/x86/amd/pmf: Get ambient light information from AMD SFH driver platform/x86/amd/pmf: Get Human presence information from AMD SFH driver platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events platform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full MAINTAINERS: remove defunct acpi4asus project info from asus notebooks section MAINTAINERS: add Luke Jones as maintainer for asus notebooks MAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainer platform/x86: silicom-platform: Add missing "Description:" for power_cycle sysfs attr platform/x86: intel-wmi-sbl-fw-update: Fix function name in error message platform/x86: p2sb: Use pci_resource_n() in p2sb_read_bar0() platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe platform/x86: intel-uncore-freq: Fix types in sysfs callbacks platform/x86: wmi: Fix wmi_dev_probe() platform/x86: wmi: Fix notify callback locking platform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list platform/x86: wmi: Return immediately if an suitable WMI event is found platform/x86: wmi: Fix error handling in legacy WMI notify handler functions
2024-01-27Merge tag 'loongarch-fixes-6.8-1' of ↵Linus Torvalds4-11/+14
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix boot failure on machines with more than 8 nodes, and fix two build errors about KVM" * tag 'loongarch-fixes-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Add returns to SIMD stubs LoongArch: KVM: Fix build due to API changes LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
2024-01-27Merge tag 'xfs-6.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-10/+17
Pull xfs fix from Chandan Babu: - Fix read only mounts when using fsopen mount API * tag 'xfs-6.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: read only mounts with fsopen mount API are busted
2024-01-27Merge tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefsLinus Torvalds10-32/+42
Pull bcachefs fixes from Kent Overstreet: - fix for REQ_OP_FLUSH usage; this fixes filesystems going read only with -EOPNOTSUPP from the block layer. (this really should have gone in with the block layer patch causing the -EOPNOTSUPP, or should have gone in before). - fix an allocation in non-sleepable context - fix one source of srcu lock latency, on devices with terrible discard latency - fix a reattach_inode() issue in fsck * tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefs: bcachefs: __lookup_dirent() works in snapshot, not subvol bcachefs: discard path uses unlock_long() bcachefs: fix incorrect usage of REQ_OP_FLUSH bcachefs: Add gfp flags param to bch2_prt_task_backtrace()
2024-01-27Merge tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds3-3/+6
Pull smb server fixes from Steve French: - Fix netlink OOB - Minor kernel doc fix * tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix global oob in ksmbd_nl_policy smb: Fix some kernel-doc comments
2024-01-27Merge tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds13-74/+467
Pull smb client fixes from Steve French: "Nine cifs/smb client fixes - Four network error fixes (three relating to replays of requests that need to be retried, and one fixing some places where we were returning the wrong rc up the stack on network errors) - Two multichannel fixes including locking fix and case where subset of channels need reconnect - netfs integration fixup: share remote i_size with netfslib - Two small cleanups (one for addressing a clang warning)" * tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix stray unlock in cifs_chan_skip_or_disable cifs: set replay flag for retries of write command cifs: commands that are retried should have replay flag set cifs: helper function to check replayable error codes cifs: translate network errors on send to -ECONNABORTED cifs: cifs_pick_channel should try selecting active channels cifs: Share server EOF pos with netfslib smb: Work around Clang __bdos() type confusion smb: client: delete "true", "false" defines
2024-01-27mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nanXi Ruoyao1-0/+6
If we still own the FPU after initializing fcr31, when we are preempted the dirty value in the FPU will be read out and stored into fcr31, clobbering our setting. This can cause an improper floating-point environment after execve(). For example: zsh% cat measure.c #include <fenv.h> int main() { return fetestexcept(FE_INEXACT); } zsh% cc measure.c -o measure -lm zsh% echo $((1.0/3)) # raising FE_INEXACT 0.33333333333333331 zsh% while ./measure; do ; done (stopped in seconds) Call lose_fpu(0) before setting fcr31 to prevent this. Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-27MIPS: loongson64: set nid for reserved memblock regionHuang Pei2-0/+5
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>