summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2015-10-29perf unwind: Pass symbol source to libunwindRabin Vincent1-1/+4
Even if --symfs is used to point to the debug binaries, we send in the non-debug filenames to libunwind, which leads to libunwind not finding the debug frame. Fix this by preferring the file in --symfs, if it is available. Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rabin Vincent <rabinv@axis.com> Link: http://lkml.kernel.org/r/1446104978-26429-1-git-send-email-rabin.vincent@axis.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf tools: Compile scriptlets to BPF objects when passing '.c' to --eventWang Nan7-9/+83
This patch provides infrastructure for passing source files to --event directly using: # perf record --event bpf-file.c command This patch does following works: 1) Allow passing '.c' file to '--event'. parse_events_load_bpf() is expanded to allow caller tell it whether the passed file is source file or object. 2) llvm__compile_bpf() is called to compile the '.c' file, the result is saved into memory. Use bpf_object__open_buffer() to load the in-memory object. Introduces a bpf-script-example.c so we can manually test it: # perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1 Note that '--clang-opt' must put before '--event'. Futher patches will merge it into a testcase so can be tested automatically. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-10-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf record: Add clang options for compiling BPF scriptsWang Nan2-0/+13
Although previous patch allows setting BPF compiler related options in perfconfig, on some ad-hoc situation it still requires passing options through cmdline. This patch introduces 2 options to 'perf record' for this propose: --clang-path and --clang-opt. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-9-git-send-email-wangnan0@huawei.com [ Add the new options to the 'record' man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf bpf: Attach eBPF filter to perf eventWang Nan3-0/+24
This is the final patch which makes basic BPF filter work. After applying this patch, users are allowed to use BPF filter like: # perf record --event ./hello_world.o ls A bpf_fd field is appended to 'struct evsel', and setup during the callback function add_bpf_event() for each 'probe_trace_event'. PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF program to a newly created perf event. The file descriptor of the eBPF program is passed to perf record using previous patches, and stored into evsel->bpf_fd. It is possible that different perf event are created for one kprobe events for different CPUs. In this case, when trying to call the ioctl, EEXIST will be return. This patch doesn't treat it as an error. Committer note: The bpf proggie used so far: __attribute__((section("fork=_do_fork"), used)) int fork(void *ctx) { return 0; } char _license[] __attribute__((section("license"), used)) = "GPL"; int _version __attribute__((section("version"), used)) = 0x40300; failed to produce any samples, even with forks happening and it being running in system wide mode. That is because now the filter is being associated, and the code above always returns zero, meaning that all forks will be probed but filtered away ;-/ Change it to 'return 1;' instead and after that: # trace --no-syscalls --event /tmp/foo.o 0.000 perf_bpf_probe:fork:(ffffffff8109be30)) 2.333 perf_bpf_probe:fork:(ffffffff8109be30)) 3.725 perf_bpf_probe:fork:(ffffffff8109be30)) 4.550 perf_bpf_probe:fork:(ffffffff8109be30)) ^C# And it works with all tools, including 'perf trace'. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf tools: Make sure fixdep is built before libbpfJiri Olsa1-1/+1
While doing 'make -C tools/perf build-test': LD fixdep-in.o LINK fixdep /bin/sh: /home/acme/git/linux/tools/build/fixdep: Permission denied make[6]: *** [bpf.o] Error 1 make[5]: *** [libbpf-in.o] Error 2 make[4]: *** [/home/acme/git/linux/tools/lib/bpf/libbpf.a] Error 2 make[4]: *** Waiting for unfinished jobs.... The fixdep tool needs to be built as the first binary. Libraries are built in paralel, so each of them needs to depend on fixdep target. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151028204450.GA25553@krava.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf script: Enable printing of branch stackStephane Eranian2-3/+93
This patch improves perf script by enabling printing of the branch stack via the 'brstack' and 'brstacksym' arguments to the field selection option -F. The option is off by default and operates only if the perf.data file has branch stack content. The branches are printed in to/from pairs. The most recent branch is printed first. The number of branch entries vary based on the underlying hardware and filtering used. The brstack prints FROM/TO addresses in raw hexadecimal format. The brstacksym prints FROM/TO addresses in symbolic form wherever possible. $ perf script -F ip,brstack 5d3000 0x401aa0/0x5d2000/M/-/-/-/0 ... $ perf script -F ip,brstacksym 4011e0 noploop+0x0/noploop+0x0/P/-/-/0 The notation F/T/M/X/A/C describes the attributes of the branch. F=from, T=to, M/P=misprediction/prediction, X=TSX, A=TSX abort, C=cycles (SKL) Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yuanfang Chen <cyfmxc@gmail.com> Link: http://lkml.kernel.org/r/1441039273-16260-5-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-29perf trace: Add cmd string table to decode sys_bpf first argArnaldo Carvalho de Melo1-0/+7
# perf trace -e bpf perf record -e /tmp/foo.o -a 362.779 (0.130 ms): perf/3451 bpf(cmd: PROG_LOAD, uattr: 0x7ffe9a6825d0, size: 48) = 3 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2b0nknu53baz9e0wj4thcdd8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf bpf: Collect perf_evsel in BPF object filesWang Nan3-7/+99
This patch creates a 'struct perf_evsel' for every probe in a BPF object file(s) and fills 'struct evlist' with them. The previously introduced dummy event is now removed. After this patch, the following command: # perf record --event filter.o ls Can trace on each of the probes defined in filter.o. The core of this patch is bpf__foreach_tev(), which calls a callback function for each 'struct probe_trace_event' event for a bpf program with each associated file descriptors. The add_bpf_event() callback creates evsels by calling parse_events_add_tracepoint(). Since bpf-loader.c will not be built if libbpf is turned off, an empty bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors. Committer notes: Before: # /tmp/oldperf record --event /tmp/foo.o -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.198 MB perf.data ] # perf evlist /tmp/foo.o # perf evlist -v /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 I.e. we create just the PERF_TYPE_SOFTWARE (type: 1), PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch: # perf record --event /tmp/foo.o -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.210 MB perf.data ] # perf evlist -v perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # We now have a PERF_TYPE_SOFTWARE (type: 1), but the config states 0x6bd, which is how, after setting up the event via the kprobes interface, the 'perf_bpf_probe:fork' event is accessible via the perf_event_open syscall. This is all transient, as soon as the 'perf record' session ends, these probes will go away. To see how it looks like, lets try doing a neverending session, one that expects a control+C to end: # perf record --event /tmp/foo.o -a So, with that in place, we can use 'perf probe' to see what is in place: # perf probe -l perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c) We also can use debugfs: [root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events p:perf_bpf_probe/fork _text+638512 Ok, now lets stop and see if we got some forks: [root@felicio linux]# perf record --event /tmp/foo.o -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ] [root@felicio linux]# perf script sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30) sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30) sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30) sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30) <SNIP> Sure enough, we have 111 forks :-) Callchains seems to work as well: # perf report --stdio --no-child # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 562 of event 'perf_bpf_probe:fork' # Event count (approx.): 562 # # Overhead Command Shared Object Symbol # ........ ........ ................ ............ # 44.66% sh [kernel.vmlinux] [k] _do_fork | ---_do_fork entry_SYSCALL_64_fastpath __libc_fork make_child 26.16% make [kernel.vmlinux] [k] _do_fork <SNIP> # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Load eBPF object into kernelWang Nan3-0/+39
This patch utilizes bpf_object__load() provided by libbpf to load all objects into kernel. Committer notes: Testing it: When using an incorrect kernel version number, i.e., having this in your eBPF proggie: int _version __attribute__((section("version"), used)) = 0x40100; For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event parsing time, to provide a better error report to the user: # perf record --event /tmp/foo.o sleep 1 libbpf: load bpf program failed: Invalid argument libbpf: -- BEGIN DUMP LOG --- libbpf: libbpf: -- END LOG -- libbpf: failed to load program 'fork=_do_fork' libbpf: failed to load object '/tmp/foo.o' event syntax error: '/tmp/foo.o' \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel? (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+ kernel, the whole process goes thru: # perf record --event /tmp/foo.o -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.202 MB perf.data ] # perf evlist -v /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Create probe points for BPF programsWang Nan3-1/+268
This patch introduces bpf__{un,}probe() functions to enable callers to create kprobe points based on section names a BPF program. It parses the section names in the program and creates corresponding 'struct perf_probe_event' structures. The parse_perf_probe_command() function is used to do the main parsing work. The resuling 'struct perf_probe_event' is stored into program private data for further using. By utilizing the new probing API, this patch creates probe points during event parsing. To ensure probe points be removed correctly, register an atexit hook so even perf quit through exit() bpf__clear() is still called, so probing points are cleared. Note that bpf_clear() should be registered before bpf__probe() is called, so failure of bpf__probe() can still trigger bpf__clear() to remove probe points which are already probed. strerror style error reporting scaffold is created by this patch. bpf__strerror_probe() is the first error reporting function in bpf-loader.c. Committer note: Trying it: To build a test eBPF object file: I am testing using a script I built from the 'perf test -v LLVM' output: $ cat ~/bin/hello-ebpf export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h" export WORKING_DIR=/lib/modules/4.2.0/build export CLANG_SOURCE=- export CLANG_OPTIONS=-xc OBJ=/tmp/foo.o rm -f $OBJ echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \ clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ --- First asking to put a probe in a function not present in the kernel (misses the initial _): $ perf record --event /tmp/foo.o sleep 1 Probe point 'do_fork' not found. event syntax error: '/tmp/foo.o' \___ You need to check probing points in BPF file (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events $ --- Now, with "__attribute__((section("fork=_do_fork"), used)): $ grep _do_fork /proc/kallsyms ffffffff81099ab0 T _do_fork $ perf record --event /tmp/foo.o sleep 1 Failed to open kprobe_events: Permission denied event syntax error: '/tmp/foo.o' \___ Permission denied --- Cool, we need to provide some better hints, "kprobe_events" is too low level, one doesn't strictly need to know the precise details of how these things are put in place, so something that shows the command needed to fix the permissions would be more helpful. Lets try as root instead: # perf record --event /tmp/foo.o sleep 1 Lowering default frequency rate to 1000. Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data ] # perf evlist /tmp/foo.o [root@felicio ~]# perf evlist -v /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 --- Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Enable passing bpf object file to --eventWang Nan6-1/+88
By introducing new rules in tools/perf/util/parse-events.[ly], this patch enables 'perf record --event bpf_file.o' to select events by an eBPF object file. It calls parse_events_load_bpf() to load that file, which uses bpf__prepare_load() and finally calls bpf_object__open() for the object files. After applying this patch, commands like: # perf record --event foo.o sleep become possible. However, at this point it is unable to link any useful things onto the evsel list because the creating of probe points and BPF program attaching have not been implemented. Before real events are possible to be extracted, to avoid perf report error because of empty evsel list, this patch link a dummy evsel. The dummy event related code will be removed when probing and extracting code is ready. Commiter notes: Using it: $ ls -la foo.o ls: cannot access foo.o: No such file or directory $ perf record --event foo.o sleep libbpf: failed to open foo.o: No such file or directory event syntax error: 'foo.o' \___ BPF object file 'foo.o' is invalid (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events $ $ file /tmp/build/perf/perf.o /tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped $ perf record --event /tmp/build/perf/perf.o sleep libbpf: /tmp/build/perf/perf.o is not an eBPF object file event syntax error: '/tmp/build/perf/perf.o' \___ BPF object file '/tmp/build/perf/perf.o' is invalid (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events $ $ file /tmp/foo.o /tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped $ perf record --event /tmp/foo.o sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data ] $ perf evlist /tmp/foo.o $ perf evlist -v /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok. $ perf report --stdio Error: The perf.data file has no samples! # To display the perf.data header info, please use --header/--header-only options. # $ Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf ebpf: Add the libbpf glueWang Nan2-0/+86
The 'bpf-loader.[ch]' files are introduced in this patch. Which will be the interface between perf and libbpf. bpf__prepare_load() resides in bpf-loader.c. Following patches will enrich these two files. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Make perf depend on libbpfWang Nan4-4/+43
By adding libbpf into perf's Makefile, this patch enables perf to build libbpf if libelf is found and neither NO_LIBELF nor NO_LIBBPF is set. The newly introduced code is similar to how libapi and libtraceevent are wired into Makefile.perf. MANIFEST is also updated for 'make perf-*-src-pkg'. Append make_no_libbpf to tools/perf/tests/make. The 'bpf' feature check is appended into default FEATURE_TESTS and FEATURE_DISPLAY, so perf will check the API version of bpf in /path/to/kernel/include/uapi/linux/bpf.h. Which should not fail except when we are trying to port this code to an old kernel. Error messages are also updated to notify users about the lack of BPF support in 'perf record' if libelf is missing or the BPF API check failed. tools/lib/bpf is added to TAG_FOLDERS to allow us to navigate libbpf files when working on perf using tools/perf/tags. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-2-git-send-email-wangnan0@huawei.com [ Document NO_LIBBPF in Makefile.perf, noted by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf symbols: Fix endless loop in dso__split_kallsyms_for_kcoreJiri Olsa1-1/+1
Currently we split symbols based on the map comparison, but symbols are stored within dso objects and maps could point into same dso objects (kernel maps). Hence we could end up changing rbtree we are currently iterating and mess it up. It's easily reproduced on s390x by running: $ perf record -a -- sleep 3 $ perf buildid-list -i perf.data --with-hits The fix is to compare dso objects instead. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf tools: Enable pre-event inherit setting by config termsWang Nan5-0/+29
This patch allows perf record setting event's attr.inherit bit by config terms like: # perf record -e cycles/no-inherit/ ... # perf record -e cycles/inherit/ ... So user can control inherit bit for each event separately. In following example, a.out fork()s in main then do some complex CPU intensive computations in both of its children. Basic result with and without inherit: # perf record -e cycles -e instructions ./a.out [ perf record: Woken up 9 times to write data ] [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ] # perf report --stdio # ... # Samples: 23K of event 'cycles' # Event count (approx.): 23641752891 ... # Samples: 24K of event 'instructions' # Event count (approx.): 30428312415 # perf record -i -e cycles -e instructions ./a.out [ perf record: Woken up 5 times to write data ] [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ] ... # Samples: 12K of event 'cycles' # Event count (approx.): 11699501775 ... # Samples: 12K of event 'instructions' # Event count (approx.): 15058023559 Cancel inherit for one event when globally enable: # perf record -e cycles/no-inherit/ -e instructions ./a.out [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ] ... # Samples: 12K of event 'cycles/no-inherit/' # Event count (approx.): 11895759282 ... # Samples: 24K of event 'instructions' # Event count (approx.): 30668000441 Enable inherit for one event when globally disable: # perf record -i -e cycles/inherit/ -e instructions ./a.out [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ] ... # Samples: 23K of event 'cycles/inherit/' # Event count (approx.): 23285400229 ... # Samples: 11K of event 'instructions' # Event count (approx.): 14969050259 Committer note: One can check if the bit was set, in addition to seeing the result in the perf.data file size as above by doing one of: # perf record -e cycles -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 # So, the inherit bit was set in both, now, if we disable it globally using --no-inherit: # perf record --no-inherit -e cycles -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 No inherit bit set, then disabling it and setting just on the cycles event: # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ] # perf evlist -v cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1 # We can see it as well in by using a more verbose level of debug messages in the tool that sets up the perf_event_attr, 'perf record' in this case: [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1 ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 ------------------------------------------------------------ perf_event_attr: size 112 config 0x1 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 <SNIP> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf symbols: we can now read separate debug-info files based on a build IDDima Kogan1-0/+9
Recent GDB (at least on a vanilla Debian box) looks for debug information in /usr/lib/debug/.build-id/nn/nnnnnnn where nn/nnnnnn is the build-id of the stripped ELF binary. This is documented here: https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html This was not working in perf because we didn't read the build id until AFTER we searched for the separate debug information file. This patch reads the build ID and THEN does the search. Signed-off-by: Dima Kogan <dima@secretsauce.net> Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28perf symbols: Fix type error when reading a build-idDima Kogan1-1/+1
This was benign, but wrong. The build-id should live in a char[], not a char*[] Signed-off-by: Dima Kogan <dima@secretsauce.net> Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf tools: Search for more options when passing args to -hArnaldo Carvalho de Melo1-1/+14
Recently 'perf <tool> -h' was made aware of arguments and would show just the help for the arguments specified, but that required a strict form, i.e.: $ perf -h --tui worked, but: $ perf -h tui didn't. Make it support both cases and also look at the option help when neither matches, so that he following examples works: $ perf report -h interface Usage: perf report [<options>] --gtk Use the GTK2 interface --stdio Use the stdio interface --tui Use the TUI interface $ perf report -h stack Usage: perf report [<options>] -g, --call-graph <print_type,threshold[,print_limit],order, sort_key[,branch]> Display call graph (stack chain/backtrace): print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: graph,0.5,caller,function --max-stack <n> Set the maximum stack depth when parsing the callchain, anything beyond the specified depth will be ignored. Default: 127 $ Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf stat: Cache aggregated map entries in extra cpumapJiri Olsa1-4/+55
Currently any time we need to access socket or core id for given cpu, we access the sysfs topology file. Adding a cpus_aggr_map cpu_map to cache those entries. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445784728-21732-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf cpu_map: Add cpu_map__empty_new functionJiri Olsa2-0/+18
Adding cpu_map__empty_new interface to create empty cpumap with given size. The cpumap entries are initialized with -1. It'll be used for caching cpu_map in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf evsel: Move id_offset out of struct perf_evsel union memberJiri Olsa1-1/+1
Because the 'perf stat record' patches will use the id_offset member together with the priv pointer. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf tools: Introduce usage_with_options_msg()Namhyung Kim9-28/+62
Now usage_with_options() setup a pager before printing message so normal printf() or pr_err() will not be shown. The usage_with_options_msg() can be used to print some help message before usage strings. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-26perf tools: Setup pager when printing usage and helpNamhyung Kim1-2/+13
It's annoying to see error or help message when command has many options like in perf record, report or top. So setup pager when print parser error or help message - it should be OK since no UI is enabled at the parsing time. The usage_with_options() already disables it by calling exit_browser() anyway. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445701767-12731-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-26perf report: Rename to --show-cpu-utilizationNamhyung Kim3-2/+5
So that it can be more consistent with other --show-* options. The old name (--showcpuutilization) is provided only for compatibility. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445701767-12731-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-26perf tools: Improve ambiguous option help messageNamhyung Kim1-9/+8
Currently if an option name is ambiguous it only prints first two matched option names but no help. It'd be better it could show all possible names and help messages too. Before: $ perf report --show Error: Ambiguous option: show (could be --show-total-period or --show-ref-call-graph) Usage: perf report [<options>] After: $ perf report --show Error: Ambiguous option: show (could be --show-total-period or --show-ref-call-graph) Usage: perf report [<options>] -n, --show-nr-samples Show a column with the number of samples --showcpuutilization Show sample percentage for different cpu modes -I, --show-info Display extended information about perf.data file --show-total-period Show a column with the sum of periods --show-ref-call-graph Show callgraph from reference event Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445701767-12731-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-24perf tools: Provide help for subset of optionsArnaldo Carvalho de Melo1-9/+33
Some tools have a lot of options, so, providing a way to show help just for some of them may come handy: $ perf report -h --tui Usage: perf report [<options>] --tui Use the TUI interface $ perf report -h --tui --showcpuutilization -b -c Usage: perf report [<options>] -b, --branch-stack use branch records for per branch histogram filling -c, --comms <comm[,comm...]> only consider symbols in these comms --showcpuutilization Show sample percentage for different cpu modes --tui Use the TUI interface $ Using it with perf bash completion is also handy, just make sure you source the needed file: $ . ~/git/linux/tools/perf/perf-completion.sh Then press tab/tab after -- to see a list of options, put them after -h and only the options chosen will have its help presented: $ perf report -h -- --asm-raw --demangle-kernel --group --kallsyms --pretty --stdio --branch-history --disassembler-style --gtk --max-stack --showcpuutilization --symbol-filter --branch-stack --dsos --header --mem-mode --show-info --symbols --call-graph --dump-raw-trace --header-only --modules --show-nr-samples --symfs --children --exclude-other --hide-unresolved --objdump --show-ref-call-graph --threads --column-widths --fields --ignore-callees --parent --show-total-period --tid --comms --field-separator --input --percentage --socket-filter --tui --cpu --force --inverted --percent-limit --sort --verbose --demangle --full-source-path --itrace --pid --source --vmlinux $ perf report -h --socket-filter Usage: perf report [<options>] --socket-filter <n> only show processor socket that match with this filter Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-83mcdd3wj0379jcgea8w0fxa@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-24perf tools: Show tool command line options orderedArnaldo Carvalho de Melo1-0/+48
When asking for a listing of the options, be it using -h or when an unknown option is passed, order it by one-letter options, then the ones having just long names. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-41qh68t35n4ehrpsuazp1dx8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-23perf annotate: Don't die() when finding an invalid config optionArnaldo Carvalho de Melo1-3/+3
The perf_config() infrastructure we inherited from git calls die() when the provided config callback returns -1, meaning some key in a config section is unexpected, that seems ok for a stdio based tool, but in --tui we end up messing up the output, so just tell the user about the error, wait for a keystroke and return 0, being more resilient and proceeding with what we managed to parse. That die() needs to die, tho :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pqtsffh2kwr5mwm4qg9kgotu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf ui tui: Register the error callbacks before initializing the widgetsArnaldo Carvalho de Melo1-4/+4
I.e. we want to tell the user about errors found during, for instance, the ui_browser initialization, so that a call to ui__warning() appears as a window waiting for a key to be pressed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ederrwizcl6mfz10vfobl5qq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf annotate: Fix 'annotate.use_offset' config variable usageNamhyung Kim1-1/+1
The annotate__configs should be sorted so that it can use bsearch(3). However commit 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") added a new config item at the end. This resulted in the 'annotate.use_offset' config variable cannot be found and perf terminated like below: $ perf report bad config file line 6 in ~/.perfconfig Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin Liška <mliska@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") Link: http://lkml.kernel.org/r/1445396240-3428-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf tools: Improve call graph documents and help messagesNamhyung Kim6-30/+62
The --call-graph option is complex so we should provide better guide for users. Also change help message to be consistent with config option names. Now perf top will show help like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]> setup and enables call-graph (stack chain/backtrace): record_mode: call graph recording mode (fp|dwarf|lbr) record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>) default: 8192 (bytes) print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: fp,graph,0.5,caller,function Requested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf tools: Defaults to 'caller' callchain order only if --children is enabledNamhyung Kim5-1/+9
The caller callchain order is useful with --children option since it can show 'overview' style output, but other commands which don't use --children feature like 'perf script' or even 'perf report/top' without --children are better to keep callee order. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445499946-29817-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf top: Support call-graph display options alsoNamhyung Kim4-10/+62
Currently 'perf top --call-graph' option is same as 'perf record'. But 'perf top' also need to receive display options in 'perf report'. To do that, change parse_callchain_report_opt() to allow record options too. Now perf top can receive display options like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]> setup and enables call-graph (stack chain/backtrace) recording: fp dwarf lbr, output_type (graph, flat, fractal, or none), min percent threshold, optional print limit, callchain order, key (function or address), add branches $ perf top --call-graph callee,graph,fp Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445495330-25416-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf tools: Move callchain help messages to callchain.hNamhyung Kim3-10/+20
These messages will be used by 'perf top' in the next patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf annotate: Add debug message for out of bounds sampleArnaldo Carvalho de Melo1-1/+4
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-q0lde9ajs84oi38nlyjcqbwg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-22perf evsel: Print branch filter state with -vvAndi Kleen1-0/+1
Add a missing field to the perf_event_attr debug output. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1445366797-30894-4-git-send-email-andi@firstfloor.org [ Print it between config2 and sample_regs_user (peterz)] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf cpu_map: Fix core dump caused by per-socket/core system-wide statKan Liang1-1/+1
Perf will core dump if --per-socket/core -a are applied for perf stat. The root cause is that cpu_map__build_map set refcnt of evlist's cpu_map to 1. It should set refcnt for the newly created cpu_map, not evlist's cpu_map. Here is the example: # perf stat -e cycles --per-socket -a sleep 1 Performance counter stats for 'system wide': S0 36 30,196,257 cycles S1 28 15,823,536 cycles 1.001126828 seconds time elapsed *** Error in `./perf': corrupted double-linked list: 0x00000000021f9090 *** ======= Backtrace: ========= /lib64/libc.so.6[0x3002e7bbe7] /lib64/libc.so.6[0x3002e7d2b5] ./perf(perf_evsel__delete+0x28)[0x485bdd] ./perf[0x4800e8] ./perf(perf_evlist__delete+0x5e)[0x482cd5] ./perf(cmd_stat+0xf25)[0x432328] ./perf[0x4768e0] ./perf[0x476ad6] ./perf[0x476b41] ./perf(main+0x1d0)[0x476db2] /lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45] ./perf[0x4202c5] Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1444388363-35936-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf record: Add ability to sample call branchesStephane Eranian2-0/+2
This patch add a new branch type sampling filter to perf record. It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples direct call branches only, unlike 'any_call' which includes indirect calls as well. $ perf record -j call -e cycles ..... The man page is updated accordingly. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-5-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf bench: Use named initializers in the trailer tooArnaldo Carvalho de Melo1-2/+2
To avoid this splat with gcc 4.4.7: cc1: warnings being treated as errors bench/mem-functions.c:273: error: missing initializer bench/mem-functions.c:273: error: (near initialization for ‘memcpy_functions[4].desc’) bench/mem-functions.c:366: error: missing initializer bench/mem-functions.c:366: error: (near initialization for ‘memset_functions[4].desc’) Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-0s8o6tgw1pdwvdv02llb9tkd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf script: Check output fields only for samplesJiri Olsa1-1/+4
There's no need to check sampling output fields for events without perf_event_attr::sample_type field set. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-51-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf cpu_map: Add data arg to cpu_map__build_map callbackJiri Olsa5-15/+27
Adding data arg to cpu_map__build_map callback, so we could pass data along to the callback. It'll be needed in following patches to retrieve topology info from perf.data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-41-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf cpu_map: Make cpu_map__build_map globalJiri Olsa2-2/+4
We'll need to call it from perf stat in the stat_script patchkit Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-40-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf stat: Add AGGR_UNSET modeJiri Olsa3-0/+7
Adding AGGR_UNSET mode, so we could distinguish unset aggr_mode in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-30-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20perf stat: Rename perf_stat struct into perf_stat_evselJiri Olsa3-8/+8
It's used as the perf_evsel::priv data, so the name suits better. Also we'll need the perf_stat name free for more generic struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-29-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf help: Change 'usage' to 'Usage' for consistencyYunlong Song2-3/+3
Capitalize 'usage' to make it consistent with all the other 'Usage' in the codes, e.g., usage_builtin. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ramkumar Ramachandra <artagnon@gmail.com> Cc: Sriram Raghunathan <sriram.r@nokia.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1444894792-2338-3-git-send-email-yunlong.song@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf bench: Run benchmarks, don't test themIngo Molnar1-4/+4
So right now we output this text: memcpy: Benchmark for memcpy() functions memset: Benchmark for memset() functions all: Test all memory access benchmarks But the right verb to use with benchmarks is to 'run' them, not 'test' them. So change this (and all similar texts) to: memcpy: Benchmark for memcpy() functions memset: Benchmark for memset() functions all: Run all memory access benchmarks Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-15-git-send-email-mingo@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf bench mem: Rename 'routine' to 'function'Ingo Molnar2-38/+38
So right now there's a somewhat inconsistent mess of the benchmarking code and options sometimes calling benchmarked functions 'functions', sometimes calling them 'routines'. Name them 'functions' consistently. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-14-git-send-email-mingo@kernel.org [ Updated perf-bench man page, pointed out by David Ahern ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf bench: Harmonize all the -l/--nr_loops optionsIngo Molnar4-23/+23
We have three benchmarking subsystems that specify some sort of 'number of loops' parameter - but all of them do it inconsistently: numa: -l/--nr_loops sched messaging: -l/--loops mem memset/memcpy: -i/--iterations Harmonize them to -l/--nr_loops by picking the numa variant - which is also the most likely one to have existing scripting which we don't want to break. Plus improve the parameter help texts to indicate the default value for the nr_loops variable to keep users from guessing ... Also propagate the naming to internal variables. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-13-git-send-email-mingo@kernel.org [ Let the harmonisation reach the perf-bench man page as well ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf bench mem: Reorganize the code a bitIngo Molnar1-19/+19
Reorder functions a bit, so that we synchronize the layout of the memcpy() and memset() portions of the code. This improves the code, especially after we'll add an strlcpy() variant as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-12-git-send-email-mingo@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf bench mem: Improve user visible stringsIngo Molnar2-15/+20
- fix various typos in user visible output strings - make the output consistent (wrt. capitalization and spelling) - offer the list of routines to benchmark on '-r help'. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-11-git-send-email-mingo@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>