summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2023-03-14perf tools bpf: Add vmlinux.h to .gitignoreArnaldo Carvalho de Melo1-1/+2
Now that BPF skel based tools will be built by default if the toolchain pieces that are needed are available, building directly on the source tree will produce a vmlinux.h from the BTF info that needs to get ignored. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf lock contention: Show lock type with addressNamhyung Kim1-0/+2
Show lock type names after the symbol of locks if any. This can be useful especially when it doesn't show the lock symbols. The indentation before the lock type parenthesis is to recognize lock symbols more easily. $ sudo ./perf lock con -abl -- sleep 1 contended total wait max wait avg wait address symbol 44 6.13 ms 284.49 us 139.28 us ffffffff92e06080 tasklist_lock (rwlock) 159 983.38 us 12.38 us 6.18 us ffff8cc717c90000 siglock (spinlock) 10 679.90 us 153.35 us 67.99 us ffff8cdc2872aaf8 mmap_lock (rwsem) 9 558.11 us 180.67 us 62.01 us ffff8cd647914038 mmap_lock (rwsem) 78 228.56 us 7.82 us 2.93 us ffff8cc700061c00 (spinlock) 5 41.60 us 16.93 us 8.32 us ffffd853acb41468 (spinlock) 10 37.24 us 5.87 us 3.72 us ffff8cd560b5c200 siglock (spinlock) 4 11.17 us 3.97 us 2.79 us ffff8d053ddf0c80 rq_lock (spinlock) 1 7.86 us 7.86 us 7.86 us ffff8cd64791404c (spinlock) 1 4.13 us 4.13 us 4.13 us ffff8d053d930c80 rq_lock (spinlock) 7 3.98 us 1.67 us 568 ns ffff8ccb92479440 (mutex) 2 2.62 us 2.33 us 1.31 us ffff8cc702e6ede0 (rwlock) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf lock contention: Show per-cpu rq_lock with addressNamhyung Kim3-2/+61
Using the BPF_PROG_RUN mechanism, we can run a raw_tp BPF program to collect some semi-global locks like per-cpu locks. Let's add runqueue locks using bpf_per_cpu_ptr() helper. $ sudo ./perf lock con -abl -- sleep 1 contended total wait max wait avg wait address symbol 248 3.25 ms 32.23 us 13.10 us ffff8cc75cfd2940 siglock 60 217.91 us 9.69 us 3.63 us ffff8cc700061c00 8 70.23 us 13.86 us 8.78 us ffff8cc703629484 4 56.32 us 35.81 us 14.08 us ffff8cc78b66f778 mmap_lock 4 16.70 us 5.18 us 4.18 us ffff8cc7036a0684 3 4.99 us 2.65 us 1.66 us ffff8d053da30c80 rq_lock 2 3.44 us 2.28 us 1.72 us ffff8d053dcf0c80 rq_lock 9 2.51 us 371 ns 278 ns ffff8ccb92479440 2 2.11 us 1.24 us 1.06 us ffff8d053db30c80 rq_lock 2 2.06 us 1.69 us 1.03 us ffff8d053d970c80 rq_lock Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf lock contention: Track and show siglock with addressNamhyung Kim3-3/+13
Likewise, we can display siglock by following the pointer like current->sighand->siglock. $ sudo ./perf lock con -abl -- sleep 1 contended total wait max wait avg wait address symbol 16 2.18 ms 305.35 us 136.34 us ffffffff92e06080 tasklist_lock 28 521.78 us 31.16 us 18.63 us ffff8cc703783ec4 7 119.03 us 23.55 us 17.00 us ffff8ccb92479440 15 88.29 us 10.06 us 5.89 us ffff8cd560b5f380 siglock 7 37.67 us 9.16 us 5.38 us ffff8d053daf0c80 5 8.81 us 4.92 us 1.76 us ffff8d053d6b0c80 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf lock contention: Track and show mmap_lock with addressNamhyung Kim2-0/+47
Sometimes there are severe contentions on the mmap_lock and we want see it in the -l/--lock-addr output. However it cannot symbolize the mmap_lock because it's allocated dynamically without symbols. Stephane and Hao gave me an idea separately to display mmap_lock by following the current->mm pointer. I added a flag to mark mmap_lock after comparing the lock address so that it can show them differently. With this change it can show mmap_lock like below: $ sudo ./perf lock con -abl -- sleep 10 contended total wait max wait avg wait address symbol ... 16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640 17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0 3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock 3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58 1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70 9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock 14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0 3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock 16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560 11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock 1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8 1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock 581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058 5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070 112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120 381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock 255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80 Note that mmap_lock was renamed some time ago and it needs to support old kernels with a different name 'mmap_sem'. Suggested-by: Hao Luo <haoluo@google.com> Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf symbol: Add abi::__cxa_demangle C++ demangling supportIan Rogers4-30/+74
Refactor C++ demangling out of symbol-elf into its own files similar to other languages. Add abi::__cxa_demangle support. As the other demanglers are not shippable with distributions, this brings back C++ demangling in a common case. It isn't perfect as the support for optionally demangling arguments and modifiers isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf util: Remove weak sched_getcpuIan Rogers2-13/+16
sched_getcpu may not be present and so a feature test and definition exist to workaround this in the build. The feature test is used to define HAVE_SCHED_GETCPU_SUPPORT and so this is sufficient to know whether the local sched_getcpu is needed and a weak symbol can be avoided. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-14perf bpf: Remove pre libbpf 1.0 conditional logicIan Rogers3-102/+0
Tests are no longer applicable as libbpf 1.0 can be assumed. Signed-off-by: Ian Rogers <irogers@google.com> Acked/Tested-by: Jiri Olsa <jolsa@kernel.org> Acked/Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Christy Lee <christylee@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230116010115.490713-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf lock contention: Fix compiler builtin detectionIan Rogers1-1/+7
__has_builtin was passed the macro rather than the actual builtin feature. The builtin test isn't sufficient and a clang version test also needs to be performed. Fixes: 1bece1351c653c3d ("perf lock contention: Support old rw_semaphore type") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230308003020.3653271-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf parse-events: Warn when events are regroupedIan Rogers4-14/+36
Use if an event is reordered or the number of groups increases to signal that regrouping has happened and warn about it. Disable the warning in the case wild card PMU names are used and for metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf evlist: Remove nr_groupsIan Rogers7-16/+8
Maintaining the number of groups during event parsing is problematic and since changing to sort/regroup events can only be computed by a linear pass over the evlist. As the value is generally only used in tests, rather than hold it in a variable compute it by passing over the evlist when necessary. This change highlights that libpfm's counting of groups with a single entry disagreed with regular event parsing. The libpfm tests are updated accordingly. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf evsel: Remove use_uncore_aliasIan Rogers5-17/+11
This flag used to be used when regrouping uncore events in particular due to wildcard matches. This is now handled by sorting evlist and so the flag is redundant. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf parse-events: Sort and group parsed eventsIan Rogers4-129/+120
This change is intended to be a no-op for most current cases, the default sort order is the order the events were parsed. Where it varies is in how groups are handled. Previously an uncore and core event that are grouped would most often cause the group to be removed: ``` $ perf stat -e '{instructions,uncore_imc_free_running_0/data_total/}' -a sleep 1 WARNING: grouped events cpus do not match, disabling group: anon group { instructions, uncore_imc_free_running_0/data_total/ } ... ``` However, when wildcards are used the events should be re-sorted and re-grouped in parse_events__set_leader, but this currently fails for simple examples: ``` $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1 Performance counter stats for 'system wide': <not counted> MiB uncore_imc_free_running/data_read/ <not counted> MiB uncore_imc_free_running/data_write/ 1.000996992 seconds time elapsed ``` A futher failure mode, fixed in this patch, is to force topdown events into a group. This change moves sorting the evsels in the evlist after parsing. It requires parsing to set up groups. First the evsels are sorted respecting the existing groupings and parse order, but also reordering to ensure evsels of the same PMU and group appear together. So that software and aux events respect groups, their pmu_name is taken from the group leader. The sorting is done with list_sort removing a memory allocation. After sorting a pass is done to correct the group leaders and for topdown events ensuring they have a group leader. This fixes the problems seen before: ``` $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1 Performance counter stats for 'system wide': 727.42 MiB uncore_imc_free_running/data_read/ 81.84 MiB uncore_imc_free_running/data_write/ 1.000948615 seconds time elapsed ``` As well as making groups not fail for cases like: ``` $ perf stat -e '{imc_free_running_0/data_total/,imc_free_running_1/data_total/}' -a sleep 1 Performance counter stats for 'system wide': 256.47 MiB imc_free_running_0/data_total/ 256.48 MiB imc_free_running_1/data_total/ 1.001165442 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf parse-events: Pass ownership of the group nameIan Rogers2-2/+3
Pass ownership of the group name rather than copying and freeing the original. This saves a memory allocation and copy. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf evsel: Add function to compute group PMU nameIan Rogers3-0/+30
The computed name respects software events and aux event groups, such that the pmu_name is changed to be that of the aux event leader or group leader for software events. This is done as a later change will split events that are in different PMUs into different groups. Committer notes: Added a stub for this new function so that 'perf test python' passes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf evsel: Allow const evsel for certain accessesIan Rogers4-8/+8
List sorting, added later to evlist, passes const elements requiring helper functions to also be const. Make the argument to evsel__find_pmu, evsel__is_aux_event and evsel__leader const. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf stat: Modify the group testIan Rogers1-1/+1
Currently nr_members is 0 for an event with no group, however, they are always a leader of their own group. A later change will make that count 1 because the event is its own leader. Make the find_stat logic consistent with this, an improvement suggested by Namhyung Kim. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf ftrace: Reuse target::initial_delayChangbin Du1-1/+0
Replace perf_ftrace::initial_delay with target::initial_delay. Specifying a negative initial_delay is meaningless for ftrace in practice but allowed here. Signed-off-by: Changbin Du <changbin.du@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hui Wang <hw.huiwang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230302031146.2801588-4-changbin.du@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13perf record: Reuse target::initial_delayChangbin Du3-5/+4
This just simply replace record_opts::initial_delay with target::initial_delay. Nothing else is changed. Signed-off-by: Changbin Du <changbin.du@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hui Wang <hw.huiwang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-11Merge remote-tracking branch 'acme/perf-tools' into perf-tools-nextArnaldo Carvalho de Melo3-6/+13
To pick up perf-tools fixes just merged upstream. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-02perf stat: Fix counting when initial delay configuredChangbin Du3-6/+13
When creating counters with initial delay configured, the enable_on_exec field is not set. So we need to enable the counters later. The problem is, when a workload is specified the target__none() is true. So we also need to check stat_config.initial_delay. In this change, we add a new field 'initial_delay' for struct target which could be shared by other subcommands. And define target__enable_on_exec() which returns whether enable_on_exec should be set on normal cases. Before this fix the event is not counted: $ ./perf stat -e instructions -D 100 sleep 2 Events disabled Events enabled Performance counter stats for 'sleep 2': <not counted> instructions 1.901661124 seconds time elapsed 0.001602000 seconds user 0.000000000 seconds sys After fix it works: $ ./perf stat -e instructions -D 100 sleep 2 Events disabled Events enabled Performance counter stats for 'sleep 2': 404,214 instructions 1.901743475 seconds time elapsed 0.001617000 seconds user 0.000000000 seconds sys Fixes: c587e77e100fa40e ("perf stat: Do not delay the workload with --delay") Signed-off-by: Changbin Du <changbin.du@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hui Wang <hw.huiwang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230302031146.2801588-2-changbin.du@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-22perf test: Fix offcpu test prev_state checkNamhyung Kim1-1/+1
On Fedora 36, the 'perf record' offcpu profiling tests are failing. It was because the BPF checks the prev task's state being S or D but actually it has more bits set. Let's check the LSB 8 bits for the purpose of offcpu profiling. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230218162724.1292657-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Remove saved_value/runtime_statIan Rogers3-226/+0
As saved_value/runtime_stat are only written to and not read, remove the associated logic and writes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-52-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Use counts rather than saved_valueIan Rogers2-266/+270
Switch the hard coded metrics to use the aggregate value rather than from saved_value. When computing a metric like IPC the aggregate count comes from instructions then cycles is looked up and if present IPC computed. Rather than lookup from the saved_value rbtree, search the counter's evlist for the desired counter. A new helper evsel__stat_type is used to both quickly find a metric function and to identify when a counter is the one being sought. So that both total and miss counts can be sought, the stat_type enum is expanded. The ratio functions are rewritten to share a common helper with the ratios being directly passed rather than computed from an enum value. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-51-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf metric: Directly use counts rather than saved_valueIan Rogers1-37/+19
Bugs with double aggregation have been introduced because of aggregation of counters and again with saved_value. Remove the generic metric use-case. Update parse-metric and pmu-events tests to update aggregate rather than saved_value counts. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-50-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Add cpu_aggr_map for loopIan Rogers5-125/+132
Rename variables, add a comment and add a cpu_aggr_map__for_each_idx to aid the readability of the stat-display code. In particular, try to make sure aggr_idx is used consistently to differentiate from other kinds of index. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-49-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Hide runtime_statIan Rogers4-114/+75
runtime_stat is only shared for the sake of tests that don't care about its value. Move the definition into stat-shadow.c and have the tests also use the global version. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-48-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Move enums from headerIan Rogers2-27/+25
The enums are only used in stat-shadow.c, so narrow their scope by moving to the C file. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-47-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Remove perf_stat_evsel_idIan Rogers2-43/+0
perf_stat_evsel_id was used for hard coded metrics. These have now migrated to json metrics and so the id values are no longer necessary. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-46-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Use metrics for --smi-costIan Rogers3-36/+0
Rather than parsing events for --smi-cost, use the json metric group 'smi'. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-45-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Remove hard coded transaction eventsIan Rogers3-58/+1
The metric group "transaction" is now present for Intel architectures so the legacy hard coded approach won't be used. Remove the associated logic. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-44-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Remove topdown event special handlingIan Rogers3-385/+0
Now the events are computed from json metrics, the hard coded logic can be removed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-42-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Implement --topdown using json metricsIan Rogers4-74/+37
Request the topdown metric group of a level with the metrics in the group 'TopdownL<level>' rather than through specific events. As more topdown levels are supported this way, such as 6 on Intel Ice Lake, default to just showing the level 1 metrics. This can be overridden using '--td-level'. Rather than determine the maximum topdown level from sysfs, use the metric group names. Remove some now unused topdown code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-41-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf stat: Add TopdownL1 metric as a default if presentIan Rogers1-4/+2
When there are no events and on Intel, the topdown events will be added by default if present. To display the metrics associated with these request special handling in stat-shadow.c. To more easily update these metrics use the json metric version via the TopdownL1 group. This makes the handling less platform specific. Modify the metricgroup__has_metric code to also cover metric groups. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-40-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf metric: Add --metric-no-threshold optionIan Rogers4-26/+43
Thresholds may need additional events, this can impact things like sharing groups of events to avoid multiplexing. Add a flag to make the threshold calculations optional. The threshold will still be computed if no additional events are necessary. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-39-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf expr: More explicit NAN handlingIan Rogers1-2/+10
Comparison and logical operations on NAN won't ensure the result is NAN. Ensure NANs are propogated so that threshold expressions like "tma_fetch_latency > 0.1 & tma_frontend_bound > 0.15" don't yield a number when the components are NAN. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-38-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf metric: Compute and print threshold valuesIan Rogers3-11/+38
Compute the threshold metric and use it to color the metric value as red or green. The threshold expression is used to generate the set of events as the threshold may require additional events. A later patch make this behavior optional with a --metric-no-threshold flag. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-37-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf list: Support for printing metric thresholdsIan Rogers2-0/+4
Add to json output by default. For regular output, only enable with the --detail flag. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-36-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf pmu-events: Make the metric_constraint an enumIan Rogers1-7/+12
Rename metric_constraint to event_grouping to better explain what the variable is used for. Switch to use an enum for encoding instead of a string. Rather than just no constraint/grouping information or "NO_NMI_WATCHDOG", have 4 enum values. The values encode whether to group or not, and two cases where the behavior is dependent on either the NMI watchdog being enabled or SMT being enabled. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf expr: Make the online topology accessible globallyIan Rogers5-22/+36
Knowing the topology of online CPUs is useful for more than just expr literals. Move to a global function that caches the value. An additional upside is that this may also avoid computing the CPU topology in some situations. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf pmu-events: Change perpkg to be a boolIan Rogers1-7/+4
Switch to a more natural bool rather than string encoding, where NULL implicitly meant false. The only value of 'PerPkg' in the event json is '1'. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf pmu-events: Change deprecated to be a boolIan Rogers1-6/+4
Switch to a more natural bool rather than string encoding, where NULL implicitly meant false. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf metrics: Improve variable namesIan Rogers1-23/+22
has_constraint implies the NMI_WATCHDOG_CONSTRAINT and if the constraint is detected it causes events not to be grouped. Most of the code cares about whether events are grouped or not, so rename has_constraint to group_events. Also remove group from metricgroup___watchdog_constraint_hint as the warning is specific to a metric. Make the warning message agree with this too. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf tools: Ensure evsel name is initializedIan Rogers1-1/+1
Use the evsel__name accessor as otherwise name may be NULL resulting in a segv. This was observed with the perf stat shell test. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf test: Fix offcpu test prev_state checkNamhyung Kim1-1/+1
On Fedora 36, the 'perf record' offcpu profiling tests are failing. It was because the BPF checks the prev task's state being S or D but actually it has more bits set. Let's check the LSB 8 bits for the purpose of offcpu profiling. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230218162724.1292657-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-17perf intel-pt: Synthesize cycle eventsSteinar H. Gunderson3-8/+75
There is no good reason why we cannot synthesize "cycle" events from Intel PT just as we can synthesize "instruction" events, in particular when CYC packets are available. This enables using PT to getting much more accurate cycle profiles than regular sampling (record -e cycles) when the work last for very short periods (<10 ms). Thus, add support for this, based off of the existing IPC calculation framework. The new option to --itrace is "y" (for cYcles), as c was taken for calls. Cycle and instruction events can be synthesized together, and are by default. The only real caveat is that CYC packets are only emitted whenever some other packet is, which in practice is when a branch instruction is encountered (and not even all branches). Thus, even at no subsampling (e.g. --itrace=y0ns), it is impossible to get more accuracy than a single basic block, and all cycles spent executing that block will get attributed to the branch instruction that ends the packet. Thus, one cannot know whether the cycles came from e.g. a specific load, a mispredicted branch, or something else. When subsampling (which is the default), the cycle events will get smeared out even more, but will still be generally useful to attribute cycle counts to functions. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220322082452.1429091-1-sesse@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-16perf c2c: Add report option to show false sharing in adjacent cachelinesFeng Tang3-8/+31
Many platforms have feature of adjacent cachelines prefetch, when it is enabled, for data in RAM of 2 cachelines (2N and 2N+1) granularity, if one is fetched to cache, the other one could likely be fetched too, which sort of extends the cacheline size to double, thus the false sharing could happens in adjacent cachelines. 0Day has captured performance changed related with this [1], and some commercial software explicitly makes its hot global variables 128 bytes aligned (2 cache lines) to avoid this kind of extended false sharing. So add an option "--double-cl" for 'perf c2c report' to show false sharing in double cache line granularity, which acts just like the cacheline size is doubled. There is no change to c2c record. The hardware events of shared cacheline are still per cacheline, and this option just changes the granularity of how events are grouped and displayed. In the 'perf c2c report' output below (will-it-scale's 'pagefault2' case on old kernel): ---------------------------------------------------------------------- 26 31 2 0 0 0 0xffff888103ec6000 ---------------------------------------------------------------------- 35.48% 50.00% 0.00% 0.00% 0.00% 0x10 0 1 0xffffffff8133148b 1153 66 971 3748 74 [k] get_mem_cgroup_from_mm 6.45% 0.00% 0.00% 0.00% 0.00% 0x10 0 1 0xffffffff813396e4 570 0 1531 879 75 [k] mem_cgroup_charge 25.81% 50.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81331472 949 70 593 3359 74 [k] get_mem_cgroup_from_mm 19.35% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81339686 1352 0 1073 1022 74 [k] mem_cgroup_charge 9.68% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff813396d6 1401 0 863 768 74 [k] mem_cgroup_charge 3.23% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81333106 618 0 804 11 9 [k] uncharge_batch The offset 0x10 and 0x54 used to displayed in 2 groups, and now they are listed together to give users a hint of extended false sharing. [1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/ Committer notes: Link: https://lore.kernel.org/r/Y+wvVNWqXb70l4uy@feng-clx Removed -a, leaving just as --double-cl, as this probably is not used so frequently and perhaps will be even auto-detected if we manage to record the MSR where this is configured. Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Feng Tang <feng.tang@intel.com> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Joe Mario <jmario@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230214075823.246414-1-feng.tang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-10perf stat: Avoid merging/aggregating metric counts twiceIan Rogers1-1/+1
The added perf_stat_merge_counters combines uncore counters. When metrics are enabled, the counts are merged into a metric_leader via the stat-shadow saved_value logic. As the leader now is passed an aggregated count, it leads to all counters being added together twice and counts appearing approximately doubled in metrics. This change disables the saved_value merging of counts for evsels that are merged. It is recommended that later changes remove the saved_value entirely as the two layers of aggregation in the code is confusing. Fixes: 942c5593393d9418 ("perf stat: Add perf_stat_merge_counters()") Reported-by: Perry Taylor <perry.taylor@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230209064447.83733-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08perf tools: Fix perf tool build error in util/pfm.cThomas Richter1-4/+2
I have downloaded linux-next and build the perf tool using # make LIBPFM4=1 to have libpfm4 support built into perf. The build fails: # make LIBPFM4=1 .... INSTALL libbpf_headers CC util/pfm.o util/pfm.c: In function ‘print_libpfm_event’: util/pfm.c:189:9: error: too many arguments to function ‘print_cb->print_event’ 189 | print_cb->print_event(print_state, | ^~~~~~~~ util/pfm.c:220:25: error: too many arguments to function ‘print_cb->print_event’ 220 | print_cb->print_event(print_state, The build error is caused by commit d9dc8874d6ce46cc ("perf pmu-events: Remove now unused event and metric variables") which changes the function prototype of struct print_callbacks { ... void (*print_event)(...); --> last two parameters removed. }; but does not adjust the usage of this function prototype in util/pfm.c. In file util/pfm.c function print_event() is still invoked with 13 parameters instead of 11. The compile fails. When I adjust the file util/pfm.c as in this patch, the build works file. Please check this patch for correctness, I have just fixed the compile issue. Fixes: d9dc8874d6ce46cc ("perf pmu-events: Remove now unused event and metric variables") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: egorenar@linux.ibm.com Cc: linux-kernel-next@vger.kernel.org Link: https://lore.kernel.org/r/20230207140447.1827741-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08perf lock contention: Support old rw_semaphore typeNamhyung Kim1-13/+44
The old kernel has a different type of the owner field in rwsem. We can check it using bpf_core_type_matches() builtin in clang but it also needs its own version check since it's available on recent versions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230207002403.63590-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>