Age | Commit message (Collapse) | Author | Files | Lines |
|
Much of turbostat can now run with perf, rather than using the MSR driver
Some of turbostat can now run as a regular non-root user.
Add some new output columns for some new GFX hardware.
[This patch updates the version, but otherwise changes no function;
it touches up some checkpatch issues from previous patches]
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Xe graphics driver uses different graphics sysfs knobs including
/sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms
/sys/class/drm/card0/device/tile0/gt0/freq0/cur_freq
/sys/class/drm/card0/device/tile0/gt0/freq0/act_freq
/sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms
/sys/class/drm/card0/device/tile0/gt1/freq0/cur_freq
/sys/class/drm/card0/device/tile0/gt1/freq0/act_freq
Plus that,
/sys/class/drm/card0/device/tile0/gt<n>/gtidle/name
returns either gt<n>-rc or gt<n>-mc. rc is for GFX and mc is SA Media.
Enhance turbostat to prefer the Xe sysfs knobs when they are available.
Export gt<n>-rc via BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz.
Export gt<n>-mc via BIC_SMA_mc6/BIC_SMAMHz/BIC_SMAACTMHz.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
On Meteorlake platform, i915 driver supports the traditional graphics
sysfs knobs including
/sys/class/drm/card0/power/rc6_residency_ms
/sys/class/drm/card0/gt_cur_freq_mhz
/sys/class/drm/card0/gt_act_freq_mhz
At the same time, it also supports
/sys/class/drm/card0/gt/gt0/rc6_residency_ms
/sys/class/drm/card0/gt/gt0/rps_cur_freq_mhz
/sys/class/drm/card0/gt/gt0/rps_act_freq_mhz
/sys/class/drm/card0/gt/gt1/rc6_residency_ms
/sys/class/drm/card0/gt/gt1/rps_cur_freq_mhz
/sys/class/drm/card0/gt/gt1/rps_act_freq_mhz
gt0 is for GFX and gt1 is for SA Media.
Enhance turbostat to prefer the i915 new sysfs knobs.
Export gt0 via BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz.
Export gt1 via BIC_SMA_mc6/BIC_SMAMHz/BIC_SMAACTMHz.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Graphics driver (i915/Xe) on mordern platforms splits GFX and SA Media
information via different sysfs knobs.
Existing BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz columns can be reused for
GFX.
Introduce BIC_SAM_mc6/BIC_SAMMHz/BIC_SAMACTMHz columns for SA Media.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Running turbostat on a 16 socket HPE Scale-up Compute 3200 (SapphireRapids) fails with:
turbostat: /sys/devices/system/cpu/intel_uncore_frequency/package_010_die_00/current_freq_khz: open failed: No such file or directory
We observe the sysfs uncore frequency directories named:
...
package_09_die_00/
package_10_die_00/
package_11_die_00/
...
package_15_die_00/
The culprit is an incorrect sprintf format string "package_0%d_die_0%d" used
with each instance of reading uncore frequency files. uncore-frequency-common.c
creates the sysfs directory with the format "package_%02d_die_%02d". Once the
package value reaches double digits, the formats diverge.
Change each instance of "package_0%d_die_0%d" to "package_%02d_die_%02d".
[lenb: deleted the probe part of this patch, as it was already fixed]
Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Graphics sysfs snapshots share similar logic.
Combine them into one function to avoid code duplication.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Graphics drivers (i915/Xe) have different sysfs knobs on different
platforms, and it is possible that different sysfs knobs fit into the
same turbostat columns.
Instead of specifying different sysfs knobs every time, detect them
once and cache the path for future use.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Enable Core C1 hardware residency counter (MSR_CORE_C1_RES) on ICX.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Some of the future Intel platforms will require reading the RAPL
counters via perf and not MSR. On current platforms we can still read
them using both ways.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
If user request --no-msr or is not able to access the MSRs,
turbostat should clear all the counters added with --add.
Because MSR access permission checks are done after the cmdline is
parsed, the decision has to be defered up until the transition into
no-msr mode happen.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Checking early if the permissions are even needed gets rid of the
warnings about some of them missing. Earlier we issued a warning in case
of missing MSR and/or perf permissions, even when user never asked for
counters that require those.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
To allow unprivileged user to run turbostat seamlessly.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
By using the perf API we spend less time in between the reads of the
counters, resulting in more accurate calculations of the dependent
metrics.
Using perf API is also usually faster overall, although cache miss, if
we get one, is more costly when using perf vs MSR driver.
We would fallback to the msr reads if the sysfs isn't there or when in
--no-perf mode.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add the --no-perf option to allow users to run turbostat without
accessing perf.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add --no-msr option to allow users to run turbostat without
accessing MSRs via the MSR driver.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Eliminate redundant debug output for core and package scope counters.
Include name and path for all "ADDED" counters.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Previously a failed read of /dev/cpu_dma_latency erroneously complained
turbostat: capget(CAP_SYS_ADMIN) failed, try "# setcap cap_sys_admin=ep ./turbostat
This went unnoticed because this file is typically visible to root,
and turbostat was typically run as root.
Going forward, when a non-root user can run turbostat...
Complain about failed read access to this file only if --debug is used.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
If MSRs cannot be read, values can be obtained from cpuid.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
If the MSR read were to fail, turbostat would print "microcode 0x0"
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Print current frequency along with the current (and initial) limits
Probe and print uncore config also for machines using the new cluster API
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
turbostat prints the abnormal SYS%LPI across suspend-to-idle:
SYS%LPI = 114479815993277.50
This is reproduced by:
Run a freeze cycle, e.g. "sleepgraph -m freeze -rtcwake 15".
Then do a reboot. After boot up, launch the suspend-idle-idle
and check the SYS%LPI field.
The slp_so residence counter is in LPIT table, and BIOS does not
clears this register across reset. The PMC expects the OS to calculate
the LPI residency based on the delta. However, there is an firmware
issue that the LPIT gets cleared to 0 during the second suspend
to idle after the reboot, which brings negative delta value.
[lenb: updated to print "neg" upon this BIOS failure]
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
The code calculates Bzy_MHz by multiplying TSC_delta * APERF_delta/MPERF_delta
The man page erroneously showed that TSC_delta was divided.
Signed-off-by: Peng Liu <liupeng17@lenovo.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
When running turbostat, a system with 512 cpus reaches the limit for
maximum number of file descriptors that can be opened. To solve this
problem, the limit is raised to 2^15, which is a large enough number.
Below data is collected from AMD server systems while running turbostat:
|-----------+-------------------------------|
| # of cpus | # of opened fds for turbostat |
|-----------+-------------------------------|
| 128 | 260 |
|-----------+-------------------------------|
| 192 | 388 |
|-----------+-------------------------------|
| 512 | 1028 |
|-----------+-------------------------------|
So, the new max limit would be sufficient up to 2^14 cpus (but this
also depends on how many counters are enabled).
Reviewed-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
When using --Summary mode, added MSRs in raw mode always
print zeros. Print the actual register contents.
Example, with patch:
note the added column:
--add msr0x64f,u32,package,raw,REASON
Where:
0x64F is MSR_CORE_PERF_LIMIT_REASONS
Busy% Bzy_MHz PkgTmp PkgWatt CorWatt REASON
0.00 4800 35 1.42 0.76 0x00000000
0.00 4801 34 1.42 0.76 0x00000000
80.08 4531 66 108.17 107.52 0x08000000
98.69 4530 66 133.21 132.54 0x08000000
99.28 4505 66 128.26 127.60 0x0c000400
99.65 4486 68 124.91 124.25 0x0c000400
99.63 4483 68 124.90 124.25 0x0c000400
79.34 4481 41 99.80 99.13 0x0c000000
0.00 4801 41 1.40 0.73 0x0c000000
Where, for the test processor (i5-10600K):
PKG Limit #1: 125.000 Watts, 8.000000 sec
MSR bit 26 = log; bit 10 = status
PKG Limit #2: 136.000 Watts, 0.002441 sec
MSR bit 27 = log; bit 11 = status
Example, without patch:
Busy% Bzy_MHz PkgTmp PkgWatt CorWatt REASON
0.01 4800 35 1.43 0.77 0x00000000
0.00 4801 35 1.39 0.73 0x00000000
83.49 4531 66 112.71 112.06 0x00000000
98.69 4530 68 133.35 132.69 0x00000000
99.31 4500 67 127.96 127.30 0x00000000
99.63 4483 69 124.91 124.25 0x00000000
99.61 4481 69 124.90 124.25 0x00000000
99.61 4481 71 124.92 124.25 0x00000000
59.35 4479 42 75.03 74.37 0x00000000
0.00 4800 42 1.39 0.73 0x00000000
0.00 4801 42 1.42 0.76 0x00000000
c000000
[lenb: simplified patch to apply only to package scope]
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Turbostat features are now table-driven (Rui Zhang)
Add support for some new platforms (Sumeet Pawnikar, Rui Zhang)
Gracefully run in configs when CPUs are limited (Rui Zhang, Srinivas Pandruvada)
misc minor fixes.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
turbostat --show IPC
displays "inf" for the IPC column
turbostat was missing the explicit dependency of IPC on APERF,
and thus neglected to collect APERF when only IPC was requested.
typcial use:
turbostat --quiet --show CPU,IPC
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add initial support for LunarLake platform.
It shares the same features with CannonLake.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
|
|
Add initial support for ArrowLake platform.
It shares the same features with CannonLake.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
|
|
Add initial support for GrandRidge.
It shares the same features as SierraForest, except that it does not
support PC2/PC6.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Add initial support for SierraForest.
It shares the same features with SapphireRapids, except that it has
MSR_MODULE_C6_RES_MS support.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Add initial support for GraniteRapids.
It shares the same features with SapphireRapids.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Add MSR_CORE_C1_RES support for spr_features because both Sapphirerapids
and Emeraldrapids support this MSR.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
When available CPUs are reduced via cgroup cpuset controller, turbostat
will exit with errors (For example):
get_counters: Could not migrate to CPU 0
turbostat: re-initialized with num_cpus 20
get_counters: Could not migrate to CPU 0
turbostat: re-initialized with num_cpus 20
Move the turbostat to root cgroup, which has every CPU.
Writing the value 0 to a cgroup.procs file causes the writing
process to be moved to the corresponding cgroup.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
|
|
CPUs can be isolated via cgroup settings and turbostat should avoid
migrating to these CPUs, just like it does for the '-c' cpus.
Introduce cpu_effective_set to save the cgroup cpu limitation info from
/sys/fs/cgroup/cpuset.cpus.effective. And use cpu_allowed_set as the
intersection of cpu_present_set, cpu_effective_set and cpu_subset.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Abstract parse_cpu_str() which can update any specified cpu_set by a
given cpu string. This can be used to handle further CPU limitations
from other sources like cgroup.
The cpu string parsing code is also enhanced to handle the strings that
have an extra '\n' before string terminator.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
It is possible that the cpu_subset contains offlined CPUs.
If this happens during start, exit immediately because this is likely an
operator error that is best fixed by re-invoking.
If this happens at runtime, give a warning only because turbostat should
do its best effort to continue running.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
System summary should summarize the information for allowed CPUs instead
of all the present CPUs.
Introduce topology information for allowed CPUs, and use them to
get system summary.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Thread_id doesn't tell if a CPU is allowed or not.
Detect allowed CPUs only and use the first detected thread/core as the
primary thread/core of a core/package.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
When detecting the primary thread/core in a core/package, current code
doesn't handle the allowed CPUs.
Abstract several functions for further fix of this issue.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Set turbostat CPU affinity to make sure turbostat is running on one of
the allowed CPUs.
Set base_cpu to the first allowed CPU so that some platform information
is dumped using one of the allowed CPUs.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
for_all_cpus/for_all_cpus_2 are used for accessing the per CPU counters,
and they should follow the cpu_allowed_set instead of cpu_present_set.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Turbostat supports "-c" parameter which limits output to system summary
plus the specified cpu-set. But some code still uses cpu_present_set to
read and dump the counters.
Introduce cpu_allowed_set for code that should obey the specified cpu-set.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
|
|
Compared with other platforms that share cnl_features, ADL/RPL don't
have PC7/PC9.
Clone a new platform feature set from cnl_features for ADL/RPL, with
PC7/PC9 removed.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
All recent Intel client platforms have MSR_CORE_C1_RES. Enable the
support on these platforms, including CNL/ICL/LKF/RKL/TGL/ADL/RPL/MTL.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
Feature probe has nothing to do with CPUID, thus it should not be in
process_cpuids().
Introduce probe_pm_features() and move all feature probing functions
into it.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
Relocate more feature probing code outside of process_cpuids() into the
corresponding probing functions.
This improves the readability of code and the turbostat output.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
Reorder some functions to solve code depdency introduced by next patch.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
Introduce probe_thermal(), and move all thermal probing related code
into it.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|
|
Introduce probe_lpi(), and move all lpi probing related code into it.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
|