diff options
author | German Gomez <german.gomez@arm.com> | 2023-03-20 18:15:08 +0300 |
---|---|---|
committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2023-03-21 01:28:21 +0300 |
commit | ea15483e7c55f73809cd9e208fff511966539ee5 (patch) | |
tree | 8b4221249351637e01ddf6df8ecd287a64cf59d7 /tools/perf/Documentation | |
parent | 03a6c16ebf0162b57acd16c2dd2baa79d8db2a97 (diff) | |
download | linux-ea15483e7c55f73809cd9e208fff511966539ee5.tar.xz |
perf report: Add 'simd' sort field
Add 'simd' sort field to visualize SIMD ops in 'perf report'.
Rows are labeled with the SIMD ISA, and the type of predicate (if any):
- [p] partial predicate
- [e] empty predicate (no elements in the vector being used)
Example with Arm SPE and SVE (Scalable Vector Extension):
#include <arm_sve.h>
double src[1025], dst[1025];
int main(void) {
svfloat64_t vc = svdup_f64(1);
for(;;)
for(int i = 0; i < 1025; i += svcntd())
{
svbool_t pg = svwhilelt_b64(i, 1025);
svfloat64_t vsrc = svld1(pg, &src[i]);
svfloat64_t vdst = svadd_x(pg, vsrc, vc);
svst1(pg, &dst[i], vdst);
}
return 0;
}
... compiled using "gcc-11 -march=armv8-a+sve -O3"
Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1:
$ perf record -e arm_spe_0// -- ./a.out
$ perf report --itrace=i1i -s overhead,pid,simd,sym
Overhead Pid:Command Simd Symbol
........ ................ ....... ......................
53.76% 10758:program [.] main
46.14% 10758:program [.] SVE [.] main
0.09% 10758:program [p] SVE [.] main
The report shows 0.09% of the sampled SVE operations use partial
predicates due to src and dst arrays not being multiples of the vector
register lengths.
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman.Khandual@arm.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index c242e8da6b1a..cfd502f7e6da 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -117,6 +117,7 @@ OPTIONS - addr: (Full) virtual address of the sampled instruction - retire_lat: On X86, this reports pipeline stall of this instruction compared to the previous instruction in cycles. And currently supported only on X86 + - simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) |