path: root/tools
Age | Commit message | Author | Files | Lines
2017-03-30 | Merge branch 'linus' into perf/core, to pick up fixes | Ingo Molnar | 2 | -12/+36
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-28 | Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core | Ingo Molnar | 62 | -208/+739
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

- Handle inline functions in callchains (Jin Yao)
- Enable sorting by srcline as key (Milian Wolff)

Fixes:

- Fix no_size logic in addr_filter__resolve_kernel_syms() in the auxtrace code (Adrian Hunter)
- Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)
- Fix divide by zero when calculating percent for an event in a group in the annotate by source line code (Taeung Song)
- build-id files are no longer symlinks, their parent directories are, so readlink the latter (Taeung Song)
- Assorted fixes for null termination problems, mostly related to readlink, detected by valgrind (Tommi Rantala)

Infrastructure changes:

- Make vfs_getname probe point logic in 'perf trace' more robust wrt length of pathname (Arnaldo Carvalho de Melo)
- Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)
- Show 'perf list sdt' option in man page (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-28 | Merge branch 'perf/urgent' into perf/core, to pick up fixes | Ingo Molnar | 1 | -1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27 | perf utils: Readlink /proc/self/exe to find the perf binary | Tommi Rantala | 1 | -6/+2
Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-7-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf utils: Null terminate buf in read_ftrace_printk() | Tommi Rantala | 1 | -1/+3
Ensure that the string that we read from the data file is null terminated.

Valgrind was complaining:

  ==31357== Invalid read of size 1
  ==31357==    at 0x4EC8C1: __strtok_r_1c (string2.h:200)
  ==31357==    by 0x4EC8C1: parse_ftrace_printk (trace-event-parse.c:161)
  ==31357==    by 0x4F82A8: read_ftrace_printk (trace-event-read.c:204)
  ==31357==    by 0x4F82A8: trace_report (trace-event-read.c:468)
  ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
  ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
  ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
  ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
  ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
  ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
  ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
  ==31357==    by 0x497150: run_builtin (perf.c:359)
  ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==31357==    by 0x428CE0: run_argv (perf.c:467)
  ==31357==    by 0x428CE0: main (perf.c:614)
  ==31357==  Address 0x8ac0efb is 0 bytes after a block of size 1,963 alloc'd
  ==31357==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
  ==31357==    by 0x4F827B: read_ftrace_printk (trace-event-read.c:195)
  ==31357==    by 0x4F827B: trace_report (trace-event-read.c:468)
  ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
  ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
  ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
  ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
  ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
  ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
  ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
  ==31357==    by 0x497150: run_builtin (perf.c:359)
  ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==31357==    by 0x428CE0: run_argv (perf.c:467)
  ==31357==    by 0x428CE0: main (perf.c:614)

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-6-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
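The fix described above amounts to allocating one extra byte and writing the terminator by hand. A minimal sketch of that pattern follows; `read_string_section()` is a hypothetical helper written for illustration, not the code in trace-event-read.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not perf's actual code): read exactly `size` bytes
 * from `fp` into a freshly allocated buffer and append a '\0', so that
 * string functions such as strtok_r() cannot read past the allocation.
 * Returns NULL on allocation failure or short read; the caller frees.
 */
static char *read_string_section(FILE *fp, size_t size)
{
    char *buf;

    assert(fp != NULL);

    buf = malloc(size + 1);     /* one extra byte for the terminator */
    if (buf == NULL)
        return NULL;

    if (fread(buf, 1, size, fp) != size) {
        free(buf);
        return NULL;
    }

    buf[size] = '\0';           /* the data itself carries no terminator */
    return buf;
}
```

Without the extra byte, the first string function to scan the buffer reads one byte past the block, which is the "0 bytes after a block" report in the trace.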
2017-03-27 | perf utils: use sizeof(buf) - 1 in readlink() call | Tommi Rantala | 1 | -1/+1
Ensure that we have space for the null byte in buf.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-5-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf tests: Do not assume that readlink() returns a null terminated string | Tommi Rantala | 1 | -1/+1
Ensure that the string in buf is null terminated.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-4-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf buildid: Do not assume that readlink() returns a null terminated string | Tommi Rantala | 1 | -1/+5
Valgrind was complaining:

  $ valgrind ./perf list >/dev/null
  ==11643== Memcheck, a memory error detector
  ==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
  ==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
  ==11643== Command: ./perf list
  ==11643==
  ==11643== Conditional jump or move depends on uninitialised value(s)
  ==11643==    at 0x4C30620: rindex (vg_replace_strmem.c:199)
  ==11643==    by 0x49DAA9: build_id_cache__origname (build-id.c:198)
  ==11643==    by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
  ==11643==    by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
  ==11643==    by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
  ==11643==    by 0x4BB0B3: print_events (parse-events.c:2313)
  ==11643==    by 0x439501: cmd_list (builtin-list.c:53)
  ==11643==    by 0x497150: run_builtin (perf.c:359)
  ==11643==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==11643==    by 0x428CE0: run_argv (perf.c:467)
  ==11643==    by 0x428CE0: main (perf.c:614)
  [...]

Additionally, a zero length result from readlink() is not very interesting.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
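The readlink() fixes in this series share one pattern: readlink(2) does not append a '\0', so the caller passes sizeof(buf) - 1 and terminates the result itself. The sketch below illustrates that pattern; `read_link_string()` is a hypothetical wrapper, and treating a zero-length result as a failure is an assumption based on the remark above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: readlink(2) does not write a terminating '\0',
 * so reserve one byte for it and add it by hand. A failed or empty
 * result is reported as -1 and leaves buf as an empty string.
 */
static ssize_t read_link_string(const char *path, char *buf, size_t size)
{
    ssize_t len;

    assert(path != NULL && buf != NULL && size > 0);

    len = readlink(path, buf, size - 1);    /* keep room for '\0' */
    if (len <= 0) {
        buf[0] = '\0';
        return -1;
    }

    buf[len] = '\0';
    return len;
}
```

A caller would use it as `read_link_string("/proc/self/exe", buf, sizeof(buf))` and can then pass `buf` to functions such as strrchr() safely.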
2017-03-27 | perf buildid: Do not update SDT cache with null filename | Tommi Rantala | 1 | -1/+1
Valgrind was complaining:

  ==2633== Syscall param open(filename) points to unaddressable byte(s)
  ==2633==    at 0x5281CC0: __open_nocancel (syscall-template.S:84)
  ==2633==    by 0x537D38: open (fcntl2.h:53)
  ==2633==    by 0x537D38: get_sdt_note_list (symbol-elf.c:2017)
  ==2633==    by 0x5396FD: probe_cache__scan_sdt (probe-file.c:700)
  ==2633==    by 0x49EA2C: build_id_cache__add_sdt_cache (build-id.c:625)
  ==2633==    by 0x49EA2C: build_id_cache__add_s (build-id.c:697)
  ==2633==    by 0x49EE72: build_id_cache__add_b (build-id.c:717)
  ==2633==    by 0x49EE72: dso__cache_build_id (build-id.c:782)
  ==2633==    by 0x49F190: __dsos__cache_build_ids (build-id.c:793)
  ==2633==    by 0x49F190: machine__cache_build_ids (build-id.c:801)
  ==2633==    by 0x49F190: perf_session__cache_build_ids (build-id.c:815)
  ==2633==    by 0x4CD4F2: write_build_id (header.c:165)
  ==2633==    by 0x4D26F7: do_write_feat (header.c:2296)
  ==2633==    by 0x4D26F7: perf_header__adds_write (header.c:2335)
  ==2633==    by 0x4D26F7: perf_session__write_header (header.c:2414)
  ==2633==    by 0x43B324: __cmd_record (builtin-record.c:1154)
  ==2633==    by 0x43B324: cmd_record (builtin-record.c:1839)
  ==2633==    by 0x455A07: __cmd_record (builtin-kmem.c:1868)
  ==2633==    by 0x455A07: cmd_kmem (builtin-kmem.c:1944)
  ==2633==    by 0x497150: run_builtin (perf.c:359)
  ==2633==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==2633==    by 0x428CE0: run_argv (perf.c:467)
  ==2633==    by 0x428CE0: main (perf.c:614)
  ==2633==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lkml.kernel.org/r/20170322130624.21881-2-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf annotate: Fix a bug of division by zero when calculating percent | Taeung Song | 1 | -3/+7
Currently perf-annotate with --print-line can print -nan(0x8000000000000) because of division by zero when calculating percent.

The division by zero happens when a sum of samples is zero in symbol__get_source_line(), so fix it.

For example:

After running 'perf record' like below,

  $ perf record -e "{cycles,page-faults,branch-misses}" ./a.out

Before:

  $ perf annotate --stdio -l

  Sorted summary for file /home/taeung/workspace/a.out
  ----------------------------------------------

     32.89 -nan  7.04 a.c:38
     25.14 -nan  0.00 a.c:34
     16.26 -nan 56.34 a.c:31
     15.88 -nan  1.41 a.c:37
      5.67 -nan  0.00 a.c:39
      1.13 -nan 35.21 a.c:26
      0.95 -nan  0.00 a.c:44
      0.57 -nan  0.00 a.c:32

   Percent | Source code & Disassembly of a.out for cycles (529 samples)
  -----------------------------------------------------------------------------------------
           :
  ...
   a.c:26  0.57 -nan  4.23 : 40081a: mov %edi,-0x24(%rbp)
   a.c:26  0.00 -nan  9.86 : 40081d: mov %rsi,-0x30(%rbp)
  ...

However, if a sum of samples is zero (e.g. 'page-faults'), skip calculating percent.

After:

  $ perf annotate --stdio -l

  Sorted summary for file /home/taeung/workspace/a.out
  ----------------------------------------------

     32.89 0.00  7.04 a.c:38
     25.14 0.00  0.00 a.c:34
     16.26 0.00 56.34 a.c:31
     15.88 0.00  1.41 a.c:37
      5.67 0.00  0.00 a.c:39
      1.13 0.00 35.21 a.c:26
      0.95 0.00  0.00 a.c:44
      0.57 0.00  0.00 a.c:32

   Percent | Source code & Disassembly of old for cycles (529 samples)
  -----------------------------------------------------------------------------------------
           :
  ...
   a.c:26  0.57 0.00  4.23 : 40081a: mov %edi,-0x24(%rbp)
   a.c:26  0.00 0.00  9.86 : 40081d: mov %rsi,-0x30(%rbp)
  ...

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
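The guard described in this commit can be sketched as a small helper. `percent_of()` is a hypothetical name, not the code in symbol__get_source_line(); it only shows how skipping the division when the sample sum is zero yields 0.00 rather than -nan.

```c
#include <stdint.h>

/*
 * Hypothetical helper: share of `hits` in `total`, as a percentage.
 * In floating point 0.0 / 0.0 is NaN, which printf shows as "-nan" or
 * "nan", so the division is skipped when no samples were recorded.
 */
static double percent_of(uint64_t hits, uint64_t total)
{
    if (total == 0)
        return 0.0;

    return 100.0 * (double)hits / (double)total;
}
```

For an event with no samples at all, such as 'page-faults' in the example above, every line then reports 0.00.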
2017-03-27 | perf annotate: Fix a bug following symbolic link of a build-id file | Taeung Song | 1 | -1/+9
Reading the link name from a build-id file is the wrong approach: the build-id file is no longer a symbolic link, its build-id directory is. So fix it.

For example, if the build-id file name obtained from dso__build_id_filename() is:

  /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1/elf

then, to read the link name of the build-id correctly, use the build-id directory path, which is a symbolic link, instead of the file name above:

  /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-2-git-send-email-treeze.taeung@gmail.com
Fixes: 01412261d994 ("perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
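Deriving the directory path from the file path comes down to dropping the last path component. The sketch below shows one way to do that; `parent_dir()` is a hypothetical helper, and the actual patch may well do this differently (for instance with dirname()).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed copy of `path` without its
 * last component, e.g. ".../<build-id>/elf" becomes ".../<build-id>".
 * Returns NULL if there is no parent component or on allocation failure.
 */
static char *parent_dir(const char *path)
{
    const char *slash = strrchr(path, '/');
    size_t len;
    char *dir;

    if (slash == NULL || slash == path)
        return NULL;

    len = (size_t)(slash - path);
    dir = malloc(len + 1);
    if (dir == NULL)
        return NULL;

    memcpy(dir, path, len);
    dir[len] = '\0';
    return dir;
}
```

The returned directory path is what would then be handed to readlink(), since that directory, not the `elf` file inside it, is the symbolic link.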
2017-03-27 | perf report: Enable sorting by srcline as key | Milian Wolff | 10 | -21/+78
Often it is interesting to know how costly a given source line is in total. Previously, one had to build these sums manually based on all addresses that pointed to the same source line. This patch introduces srcline as a sort key, which will do the aggregation for us.

Paired with the recent addition of showing inline frames, this makes perf report much more useful for many C++ work loads.

The following shows the new feature in action. First, let's show the status quo output when we sort by address. The result contains many hist entries that generate the same output:

~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g address
# Children      Self  Command       Shared Object        Symbol
# ........  ........  ............  ...................  .........................................
#
    99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
            |
            |--64.55%--main complex:655
            |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
            |          /usr/include/c++/6.3.1/complex:664 (inline)
            |          |
            |          |--60.31%--hypot +20
            |          |          |
            |          |          |--8.52%--__hypot_finite +273
            |          |          |
            |          |          |--7.32%--__hypot_finite +411
...
             --35.34%--_start +4194346
                       __libc_start_main +241
                       |
                       |--6.65%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |
                       |--2.70%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |
                       |--1.69%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~

With this patch and `-g srcline` we instead get the following output:

~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g srcline
# Children      Self  Command       Shared Object        Symbol
# ........  ........  ............  ...................  .........................................
#
    99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
            |
            |--64.55%--main complex:655
            |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
            |          /usr/include/c++/6.3.1/complex:664 (inline)
            |          |
            |          |--64.02%--hypot
            |          |          |
            |          |           --59.81%--__hypot_finite
            |          |
            |           --0.53%--cabs
            |
             --35.34%--_start
                       __libc_start_main
                       |
                       |--12.48%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf report: Show inline stack for browser mode | Jin Yao | 3 | -8/+178
If the address belongs to an inlined function, the source information back to the first non-inlined function will be printed.

For example:

1. Show inlined function name

  perf report -g function --inline

  - 0.69% 0.00% inline ld-2.23.so [.] dl_main
     - dl_main
          0.56% _dl_relocate_object
             _dl_relocate_object (inline)
             elf_dynamic_do_Rela (inline)

2. Show the file/line information

  perf report -g address --inline

  - 0.69% 0.00% inline ld-2.23.so [.] _dl_start
       _dl_start rtld.c:307
       /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
     + _dl_sysdep_start dl-sysdep.c:250

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf report: Show inline stack for stdio mode | Jin Yao | 1 | -1/+84
If the address belongs to an inlined function, the source information back to the first non-inlined function will be printed.

For example:

1. Show inlined function name

  perf report --stdio -g function --inline

  0.69% 0.00% inline ld-2.23.so [.] dl_main
          |
          ---dl_main
             |
              --0.56%--_dl_relocate_object
                        _dl_relocate_object (inline)
                        elf_dynamic_do_Rela (inline)

2. Show the file/line information

  perf report --stdio -g address --inline

  0.69% 0.00% inline ld-2.23.so [.] _dl_start_user
          |
          ---_dl_start_user .:0
             _dl_start rtld.c:307
             /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
             _dl_sysdep_start dl-sysdep.c:250
             |
              --0.56%--dl_main rtld.c:2076

Committer tests:

  # perf record --call-graph dwarf ~/bin/perf stat usleep 1

   Performance counter stats for 'usleep 1':

          0.443020      task-clock (msec)         #    0.449 CPUs utilized
                 1      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                52      page-faults               #    0.117 M/sec
         1,049,423      cycles                    #    2.369 GHz
           801,456      instructions              #    0.76  insn per cycle
           155,609      branches                  #  351.246 M/sec
             7,026      branch-misses             #    4.52% of all branches

       0.000987570 seconds time elapsed

  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.553 MB perf.data (66 samples) ]
  # perf report --stdio --inline fs__get_mountpoint
  <SNIP>
     1.73%     0.00%  perf     perf               [.] fs__get_mountpoint
            |
            ---fs__get_mountpoint
               fs__get_mountpoint (inline)
               fs__check_mounts (inline)
               __statfs
               entry_SYSCALL_64
               sys_statfs
               SYSC_statfs
               user_statfs
               user_path_at_empty
               filename_lookup
               path_lookupat
               link_path_walk
               inode_permission
               __inode_permission
               kernfs_iop_permission
               kernfs_refresh_inode
               security_inode_notifysecctx
               selinux_inode_notifysecctx
               selinux_inode_setsecurity
               security_context_to_sid
               security_context_to_sid_core
               string_to_context_struct
               symcmp

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf report: Introduce --inline option | Jin Yao | 3 | -1/+8
Looking up the inline stack for callgraph addresses takes some time, so provide a new option, "--inline", that lets the user decide whether to enable this feature.

--inline:
  If a callgraph address belongs to an inlined function, the inline stack will be printed. Each entry is the inline function name or file/line.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf report: Find the inline stack for a given address | Jin Yao | 5 | -5/+192
It would be useful for perf to support a mode to query the inline stack for a given callgraph address. This would simplify finding the right code in code that does a lot of inlining.

srcline.c already contains the code that translates an address to filename:line_nr. This patch just extends that function so that it can also return the inline stacks.

It introduces the inline_list, which stores the inline function result (filename:line_nr and funcname).

If the BFD lib is not supported, the result is only filename:line_nr.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf report: Refactor common code in srcline.c | Jin Yao | 1 | -23/+45
Introduce dso__name() and filename_split() out of existing code, because this code will be used in several places in the next patch.

filename_split() may also fix a potential memory leak in the existing code. In the existing addr2line():

    sep = strchr(filename, ':');
    if (sep) {
        *sep++ = '\0';
        *file = filename;
        *line_nr = strtoul(sep, NULL, 0);
        ret = 1;
    }
  out:
    pclose(fp);
    return ret;

If sep is NULL, filename is neither freed nor returned via file.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
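The leak described above can be avoided by making ownership of `filename` explicit: either it is handed to the caller via `file`, or it is freed. The following is a minimal sketch under that assumption; `split_file_line()` and `take_file_line()` are hypothetical names, not the helpers introduced by the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical splitter: cut "file:line" in place at the first ':'.
 * Returns 1 and sets *line_nr on success, 0 when there is no ':'.
 */
static int split_file_line(char *filename, unsigned int *line_nr)
{
    char *sep = strchr(filename, ':');

    if (sep == NULL)
        return 0;

    *sep++ = '\0';
    *line_nr = (unsigned int)strtoul(sep, NULL, 0);
    return 1;
}

/*
 * Hypothetical caller: takes ownership of the malloc'ed `filename`.
 * On success the buffer is handed to the caller through *file; on
 * failure it is freed here, which is the path that used to leak.
 */
static int take_file_line(char *filename, char **file, unsigned int *line_nr)
{
    if (filename == NULL)
        return 0;

    if (!split_file_line(filename, line_nr)) {
        free(filename);
        return 0;
    }

    *file = filename;
    return 1;
}
```

With this shape, every exit path either transfers or releases the buffer, so the caller never has to guess who owns it.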
2017-03-27 | perf tools: Remove unused 'prefix' from builtin functions | Arnaldo Carvalho de Melo | 41 | -126/+110
We got it from the git sources but never used it for anything, with the place where this would be somehow used remaining:

  static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
  {
    prefix = NULL;
    if (p->option & RUN_SETUP)
        prefix = NULL; /* setup_perf_directory(); */

Ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf list sdt: Show option in man page | Ravi Bangoria | 1 | -1/+3
Commit 40218daea1db ("perf list: Show SDT and pre-cached events") added sdt support to perf list, but did not update the documentation. Show the sdt option in man perf-list.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20170327025538.1753-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 | perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms() | Adrian Hunter | 1 | -2/+2
Address filtering with kernel symbols incorrectly resulted in the error "Cannot determine size of symbol" because the no_size logic was the wrong way around.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-24 | perf trace: Fixup thread refcounting | Arnaldo Carvalho de Melo | 1 | -9/+12
In trace__vfs_getname() and when checking if a thread is filtered in trace__process_sample() we were not dropping the reference obtained via machine__findnew_thread(), fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9gc470phavxwxv5d9w7ck8ev@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
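The class of bug fixed here is an early return that skips the matching "put" for a reference taken by a lookup. The sketch below illustrates the usual remedy, a single exit label that always drops the reference. All names (`struct obj`, `obj_get()`, `obj_put()`, `sample_is_filtered()`) are hypothetical and the counter is not atomic; this is not perf's thread API.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; a plain int is enough for a sketch. */
struct obj {
    int refcnt;
};

static struct obj *obj_new(void)
{
    struct obj *o = malloc(sizeof(*o));

    if (o != NULL)
        o->refcnt = 1;
    return o;
}

/* Take an extra reference, as a "find or create" lookup would. */
static struct obj *obj_get(struct obj *o)
{
    if (o != NULL)
        o->refcnt++;
    return o;
}

/* Drop a reference; returns 1 if this was the last one and o was freed. */
static int obj_put(struct obj *o)
{
    if (o == NULL || --o->refcnt > 0)
        return 0;
    free(o);
    return 1;
}

/* Every exit path, including the early "filtered" one, goes through out_put. */
static int sample_is_filtered(struct obj *entry, int filtered)
{
    struct obj *o = obj_get(entry);
    int ret = 0;

    if (o == NULL)
        return 0;

    if (filtered) {
        ret = 1;
        goto out_put;   /* returning directly here would leak a reference */
    }

    /* ... normal processing of the sample would go here ... */

out_put:
    obj_put(o);
    return ret;
}
```

After any call to `sample_is_filtered()` the reference count is back to its value before the call, regardless of which branch was taken.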
2017-03-24 | perf trace: Fix up error path indentation | Arnaldo Carvalho de Melo | 1 | -1/+1
Trivial fix removing a tab in an error path.

Link: http://lkml.kernel.org/n/tip-c14mk6cqaiby8gf5rpft3d9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-24 | perf trace: Check for vfs_getname.pathname length | Arnaldo Carvalho de Melo | 1 | -0/+2
It shouldn't be zero, but if the 'perf probe' on getname_flags() (or elsewhere in the future we need to probe to catch the pathname for syscalls like 'open' being copied from userspace to the kernel) is misplaced somehow, then we will end up not allocating space and trying to copy the "" empty string to ttrace->filename.name, causing a segfault, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-c4f1t6sx1nczuzop19r5si5s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
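The failure mode is a copy into a buffer that was never allocated because the source string was empty. A minimal sketch of the guard follows; `struct name_buf` and `name_buf_set()` are hypothetical stand-ins for illustration, not the fields or code in builtin-trace.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable name buffer. */
struct name_buf {
    char *name;
    size_t size;
};

/*
 * Store `pathname` in `nb`, growing the buffer as needed.
 * Returns 1 if stored, 0 if the pathname was empty (nothing is touched,
 * so a NULL nb->name is never written to), -1 on allocation failure.
 */
static int name_buf_set(struct name_buf *nb, const char *pathname)
{
    size_t len = strlen(pathname);

    if (len == 0)
        return 0;   /* empty pathname: skip instead of copying "" */

    if (len + 1 > nb->size) {
        char *p = realloc(nb->name, len + 1);

        if (p == NULL)
            return -1;
        nb->name = p;
        nb->size = len + 1;
    }

    memcpy(nb->name, pathname, len + 1);    /* includes the '\0' */
    return 1;
}
```

The early return on a zero length is the whole point: without it, the allocation branch is skipped and the copy lands on a NULL pointer.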
2017-03-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 2 | -12/+36
Pull networking fixes from David Miller:

 1) Several netfilter fixes from Pablo and the crew:
     - Handle fragmented packets properly in netfilter conntrack, from Florian Westphal.
     - Fix SCTP ICMP packet handling, from Ying Xue.
     - Fix big-endian bug in nftables, from Liping Zhang.
     - Fix alignment of fake conntrack entry, from Steven Rostedt.
 2) Fix feature flags setting in fjes driver, from Taku Izumi.
 3) Openvswitch ipv6 tunnel source address not set properly, from Or Gerlitz.
 4) Fix jumbo MTU handling in amd-xgbe driver, from Thomas Lendacky.
 5) sk->sk_frag.page not released properly in some cases, from Eric Dumazet.
 6) Fix RTNL deadlocks in nl80211, from Johannes Berg.
 7) Fix erroneous RTNL lockdep splat in crypto, from Herbert Xu.
 8) Cure improper inflight handling during AF_UNIX GC, from Andrey Ulanov.
 9) sch_dsmark doesn't write to packet headers properly, from Eric Dumazet.
10) Fix SCM_TIMESTAMPING_OPT_STATS handling in TCP, from Soheil Hassas Yeganeh.
11) Add some IDs for Motorola qmi_wwan chips, from Tony Lindgren.
12) Fix nametbl deadlock in tipc, from Ying Xue.
13) GRO and LRO packets not counted correctly in mlx5 driver, from Gal Pressman.
14) Fix reset of internal PHYs in bcmgenet, from Doug Berger.
15) Fix hashmap allocation handling, from Alexei Starovoitov.
16) nl_fib_input() needs stronger netlink message length checking, from Eric Dumazet.
17) Fix double-free of sk->sk_filter during sock clone, from Daniel Borkmann.
18) Fix RX checksum offloading in aquantia driver, from Pavel Belous.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
  net:ethernet:aquantia: Fix for RX checksum offload.
  amd-xgbe: Fix the ECC-related bit position definitions
  sfc: cleanup a condition in efx_udp_tunnel_del()
  Bluetooth: btqcomsmd: fix compile-test dependency
  inet: frag: release spinlock before calling icmp_send()
  tcp: initialize icsk_ack.lrcvtime at session start time
  genetlink: fix counting regression on ctrl_dumpfamily()
  socket, bpf: fix sk_filter use after free in sk_clone_lock
  ipv4: provide stronger user input validation in nl_fib_input()
  bpf: fix hashmap extra_elems logic
  enic: update enic maintainers
  net: bcmgenet: remove bcmgenet_internal_phy_setup()
  ipv6: make sure to initialize sockc.tsflags before first use
  fjes: Do not load fjes driver if extended socket device is not power on.
  fjes: Do not load fjes driver if system does not have extended socket device.
  net/mlx5e: Count LRO packets correctly
  net/mlx5e: Count GSO packets correctly
  net/mlx5: Increase number of max QPs in default profile
  net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps
  net/mlx5e: Use the proper UAPI values when offloading TC vlan actions
  ...
2017-03-23 | perf list: Move extra details printing to new option | Andi Kleen | 6 | -10/+21
Move the printing of perf expressions and internal events to a new clearer --details flag, instead of lumping it together with other debug options in --debug. This makes it clearer to use.

Before

  perf list --debug
  ...
  unc_m_power_critical_throttle_cycles
       [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
       uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.

after

  perf list --details
  ...
  unc_m_power_critical_throttle_cycles
       [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
       uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23 | perf pmu: Add support for MetricName JSON attribute | Andi Kleen | 9 | -8/+34
Add support for a new JSON event attribute to name MetricExpr for better output in perf stat.

If the event has no MetricName it uses the normal event name instead to describe the metric.

Before

  % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
       time unc_p_freq_max_os_cycles
       1.000149775     15.7
       2.000344807     19.3
       3.000502544     16.7
       4.000640656      6.6
       5.000779955      9.9

After

  % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
       time freq_max_os_cycles %
       1.000149775     15.7
       2.000344807     19.3
       3.000502544     16.7
       4.000640656      6.6
       5.000779955      9.9

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23 | perf list: Support printing MetricExpr with --debug | Andi Kleen | 1 | -2/+8
Output the metric expr in perf list when --debug is specified, so that the user can check the formula.

Before:

  % perf list
  ...
  unc_m_power_channel_ppd
       [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc]
       uncore_imc_2/event=0x85/

After:

  % perf list --debug
  ...
  unc_m_power_channel_ppd
       [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc]
       Perf: uncore_imc_2/event=0x85/ MetricExpr: (unc_m_power_channel_ppd / unc_m_clockticks) * 100.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-12-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf stat: Output JSON MetricExpr metricAndi Kleen8-0/+210
Add generic infrastructure to perf stat to output ratios for "MetricExpr" entries in the event lists. Many events are more useful as ratios than in raw form, typically some count in relation to total ticks. Transfer the MetricExpr information from the alias to the evsel. We mark the events that need to be collected for MetricExpr, and also link the events using them with a pointer. The code is careful to always prefer the right event in the same group to minimize multiplexing errors. At the moment only a single relation is supported. Then add a rblist to the stat shadow code that remembers stats based on the cpu and context. Then finally update and retrieve and print these values similarly to the existing hardcoded perf metrics. We use the simple expression parser added earlier to evaluate the expression. Normally we just output the result without further commentary, but for --metric-only this would lead to empty columns. So for this case use the original event as description. There is no attempt to automatically add the MetricExpr event, if it is missing, however we suggest it to the user, because the user tool doesn't have enough information to reliably construct a group that is guaranteed to schedule. So we leave that to the user. % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' 1.000147889 800,085,181 unc_p_clockticks 1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6 2.000448381 800,218,217 unc_p_clockticks 2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8 3.000639852 800,243,057 unc_p_clockticks 3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3 % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only # time freq_max_os_cycles % 1.000127077 0.9 2.000301436 0.7 3.000456379 0.0 v2: Change from DivideBy to MetricExpr v3: Use expr__ prefix. Support more than one other event. v4: Update description v5: Only print warning message once for multiple PMUs. 
Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
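As a rough illustration of the ratio printed after the `#` in the first example above, the sketch below divides the event count by the clock-tick count from the same interval. The expression `100 * count / clockticks` is an assumption that happens to reproduce the sample output; the actual MetricExpr string lives in the JSON event file and is not shown in this log. The function name `metric_percent` is hypothetical.

```python
def metric_percent(event_count: int, clockticks: int) -> float:
    """Return event_count as a percentage of clockticks (assumed expression)."""
    if clockticks == 0:
        raise ValueError("clockticks must be non-zero")
    return 100.0 * event_count / clockticks


# Counts copied from the interval output in the commit message:
# (unc_p_freq_max_os_cycles, unc_p_clockticks) per one-second interval.
samples = [
    (93_126_241, 800_085_181),
    (142_516_095, 800_218_217),
    (162_292_689, 800_243_057),
]

for count, ticks in samples:
    print(f"{metric_percent(count, ticks):.1f}")  # 11.6, 17.8, 20.3
```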
2017-03-23perf pmu: Support MetricExpr header in JSON event listAndi Kleen5-8/+23
Add support for parsing the MetricExpr header in the JSON event lists and storing them in the alias structure. Used in the next patch. v2: Change DividedBy to MetricExpr v3: Really catch all uses of DividedBy Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-10-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf vendor events intel: Update Intel uncore JSON event filesAndi Kleen19-180/+267
- Add MetricName to describe Metric - Remove redundant "derived from" in descriptions - Rename UNC_M_CAS_COUNT to LLC_MISSES.READ Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-9-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf tools: Add a simple expression parser for JSONAndi Kleen7-0/+266
Add a simple expression parser good enough to parse JSON relation expressions. The parser is implemented using bison. This is just intended as a simple parser for internal usage in the event lists, not the beginning of a "perf scripting language". v2: Use expr__ prefix instead of expr_ Support multiple free variables for parser Committer note: The v2 patch had: %define api.pure full In expr.y, that is a feature introduced in bison 2.7, to have reentrant parsers, not using global variables, which would make tools/perf stop building with the bison version shipped in older distros, so Andi realised that the other parsers (e.g. parse-events.y) were using: %pure-parser Which is present in older versions of bison and fits the bill. I added: CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w To finally make it build, copying what was there for pmu-bison.o, another parser. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org [ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
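The commit implements its parser in C with bison; the sketch below is not that code. It only illustrates what "an expression with multiple free variables" means, by reusing Python's own `ast` module as the parser and walking the tree by hand. The supported operator set (`+`, `-`, `*`, `/`, parentheses) and the function name `evaluate` are assumptions made for the example.

```python
import ast
import operator

# Binary operators the sketch accepts; anything else is rejected.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str, variables: dict) -> float:
    """Evaluate an arithmetic expression over numbers and named variables."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]  # KeyError if the variable is unbound
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return walk(ast.parse(expr, mode="eval"))


print(evaluate("(a / b) * 100", {"a": 1, "b": 4}))  # 25.0
```

Walking the tree explicitly, instead of calling `eval()`, keeps the accepted grammar as small as the one described in the commit message.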
2017-03-23bpf: fix hashmap extra_elems logicAlexei Starovoitov1-3/+26
In both kmalloc and prealloc mode the bpf_map_update_elem() is using per-cpu extra_elems to do atomic update when the map is full. There are two issues with it. The logic can be misused, since it allows max_entries+num_cpus elements to be present in the map. And alloc_extra_elems() at map creation time can fail percpu alloc for large map values with a warn: WARNING: CPU: 3 PID: 2752 at ../mm/percpu.c:892 pcpu_alloc+0x119/0xa60 illegal size (32824) or align (8) for percpu allocation The fixes for both of these issues are different for kmalloc and prealloc modes. For prealloc mode allocate extra num_possible_cpus elements and store their pointers into extra_elems array instead of actual elements. Hence we can use these hidden(spare) elements not only when the map is full but during bpf_map_update_elem() that replaces existing element too. That also improves performance, since pcpu_freelist_pop/push is avoided. Unfortunately this approach cannot be used for kmalloc mode which needs to kfree elements after rcu grace period. Therefore switch it back to normal kmalloc even when full and old element exists like it was prior to commit 6c9059817432 ("bpf: pre-allocate hash map elements"). Add tests to check for over max_entries and large map values. Reported-by: Dave Jones <davej@codemonkey.org.uk> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22selftests/bpf: fix broken build, take 2Zi Shen Lim1-9/+10
Merge of 'linux-kselftest-4.11-rc1': 1. Partially removed use of 'test_objs' target, breaking force rebuild of BPFOBJ, introduced in commit d498f8719a09 ("bpf: Rebuild bpf.o for any dependency update"). Update target so dependency on BPFOBJ is restored. 2. Introduced commit 2047f1d8ba28 ("selftests: Fix the .c linking rule") which fixes order of LDLIBS. Commit d02d8986a768 ("bpf: Always test unprivileged programs") added libcap dependency into CFLAGS. Use LDLIBS instead to fix linking of test_verifier. 3. Introduced commit d83c3ba0b926 ("selftests: Fix selftests build to just build, not run tests"). Reordering the Makefile allows us to remove the 'all' target. Tested both: selftests/bpf$ make and selftests$ make TARGETS=bpf on Ubuntu 16.04.2. Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-21perf pmu: Special case uncore_ prefixAndi Kleen1-0/+3
Special case uncore_ prefix in PMU match, to allow for shorter uncore event specifications. Before: perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After: perf stat -a -e cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 Committer tests: # perf list uncore List of pre-defined events (to be used in -e): uncore_cbox_0/clockticks/ [Kernel PMU event] uncore_cbox_1/clockticks/ [Kernel PMU event] uncore_imc/data_reads/ [Kernel PMU event] uncore_imc/data_writes/ [Kernel PMU event] # perf stat -a -e cbox_0/clockticks/ sleep 1 Performance counter stats for 'system wide': 281,474,976,653,084 cbox_0/clockticks/ 1.000870129 seconds time elapsed # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170320201711.14142-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
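The matching rule described above can be sketched as follows. This is an illustration of the behaviour shown in the committer tests, not the C code from the patch; the function name `pmu_name_matches` is hypothetical, and the PMU names are the ones listed by `perf list uncore` in the message.

```python
UNCORE_PREFIX = "uncore_"


def pmu_name_matches(pmu_name: str, token: str) -> bool:
    """True if token is a prefix of pmu_name, with or without "uncore_"."""
    if pmu_name.startswith(token):
        return True
    # Special case: also try the name with the leading "uncore_" removed.
    return (pmu_name.startswith(UNCORE_PREFIX)
            and pmu_name[len(UNCORE_PREFIX):].startswith(token))


pmus = ["uncore_cbox_0", "uncore_cbox_1", "uncore_imc"]
print([p for p in pmus if pmu_name_matches(p, "cbox_0")])  # ['uncore_cbox_0']
```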
2017-03-21perf pmu: Expand PMU events by prefix matchAndi Kleen3-14/+54
When the user specifies a pmu directly, expand it automatically with a prefix match for all available PMUs, similar to what we do for the normal aliases now. This allows specifying attributes for duplicated boxes quickly. For example uncore_cbox_{0,6}/.../ can be now specified as uncore_cbox/.../ and it gets automatically expanded for all boxes. This generally makes it more concise to write uncore specifications, and also avoids the need to know the exact topology of the system. Before: % perf stat -a -e uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_1/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_2/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_3/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_4/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_5/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After: % perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 v2: Handle all bison rules. Move multi add code to separate function. Handle uncore_ prefix correctly. v3: Move parse_events_multi_pmu_add to separate patch. Move uncore prefix check to separate patch. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-6-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
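To make the expansion concrete, here is a minimal sketch that turns one `pmu/terms/` specification into one event per matching PMU. It models only the plain prefix match from this commit; the function name `expand_pmu_event` and the list of available PMUs are assumptions for the example, not perf's internal API.

```python
from typing import List


def expand_pmu_event(spec: str, available_pmus: List[str]) -> List[str]:
    """Expand "pmu/terms/" into one event per PMU whose name starts with pmu."""
    pmu, sep, rest = spec.partition("/")
    if not sep:
        raise ValueError("expected a spec of the form pmu/terms/")
    return [f"{name}/{rest}" for name in available_pmus if name.startswith(pmu)]


# Hypothetical system with six cbox PMUs and one unrelated PMU.
available = [f"uncore_cbox_{i}" for i in range(6)] + ["uncore_imc"]
events = expand_pmu_event("uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/",
                          available)
print(len(events))   # 6
print(events[0])     # uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/
```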
2017-03-21perf tools: Factor out PMU matching in parserAndi Kleen3-29/+52
Factor out the PMU name matching in the event parser into a separate function, to use the same code for other grammar rules later. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf stat: Handle partially bad results with mergingAndi Kleen1-0/+10
When any result that is being merged is bad, mark them all bad to give consistent output in interval mode. No before/after, because the issue was only found in theoretical review and it is hard to reproduce. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf stat: Collapse identically named eventsAndi Kleen3-4/+38
The uncore PMU has a lot of duplicated PMUs for different subsystems. When expanding an uncore alias we usually end up with a large number of identically named aliases, which makes perf stat output difficult to read. Automatically sum them up in perf stat, unless --no-merge is specified. This can be default because only the uncores generally have duplicated aliases. Other PMUs have unique names. Before: % perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 694,976 Bytes unc_c_llc_lookup.any 706,304 Bytes unc_c_llc_lookup.any 956,608 Bytes unc_c_llc_lookup.any 782,720 Bytes unc_c_llc_lookup.any 605,696 Bytes unc_c_llc_lookup.any 442,816 Bytes unc_c_llc_lookup.any 659,328 Bytes unc_c_llc_lookup.any 509,312 Bytes unc_c_llc_lookup.any 263,936 Bytes unc_c_llc_lookup.any 592,448 Bytes unc_c_llc_lookup.any 672,448 Bytes unc_c_llc_lookup.any 608,640 Bytes unc_c_llc_lookup.any 641,024 Bytes unc_c_llc_lookup.any 856,896 Bytes unc_c_llc_lookup.any 808,832 Bytes unc_c_llc_lookup.any 684,864 Bytes unc_c_llc_lookup.any 710,464 Bytes unc_c_llc_lookup.any 538,304 Bytes unc_c_llc_lookup.any 1.002577660 seconds time elapsed After: % perf stat -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 2,685,120 Bytes unc_c_llc_lookup.any 1.002648032 seconds time elapsed v2: Split collect_aliases. Rename alias flag. v3: Make sure unsupported/not counted is always printed. v4: Factor out callback change into separate patch. v5: Move check for bad results here Move merged check into collect_data Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
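The merging behaviour can be sketched as a sum keyed by event name. The sketch also models the "bad results" check mentioned in the v5 note by treating `None` as a bad (not counted) value and letting it poison the whole sum; that representation and the name `merge_counts` are assumptions for illustration, not the data structures perf stat uses.

```python
from typing import Dict, Iterable, Optional, Tuple


def merge_counts(
    results: Iterable[Tuple[str, Optional[int]]]
) -> Dict[str, Optional[int]]:
    """Sum counts per event name; None marks a bad (not counted) result."""
    merged: Dict[str, Optional[int]] = {}
    for name, count in results:
        if name not in merged:
            merged[name] = count
        elif merged[name] is None or count is None:
            merged[name] = None  # one bad input makes the whole sum bad
        else:
            merged[name] += count
    return merged


# First three values from the "Before" output in the commit message.
good = [("unc_c_llc_lookup.any", 694_976),
        ("unc_c_llc_lookup.any", 706_304),
        ("unc_c_llc_lookup.any", 956_608)]
print(merge_counts(good))  # {'unc_c_llc_lookup.any': 2357888}
```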
2017-03-21perf stat: Factor out callback for collecting event valuesAndi Kleen1-23/+80
To be used in next patch to support automatic summing of alias events. v2: Move check for bad results to next patch v3: Remove trivial addition. v4: Use perf_evsel__cpus instead of evsel->cpus Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf annotate: Add comment clarifying how the source code line is parsedArnaldo Carvalho de Melo1-0/+6
The source code line number (lineno) needs to be kept across calls to symbol__parse_objdump_line() when parsing the output of 'objdump -l -dS', so that it can associate it with the instructions till the next line. See disasm_line__new() and struct disasm_line::line_nr. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7hpx8f8ybdpiujceysaj229w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
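The idea of carrying the line number across parsed lines can be sketched as below. This is not symbol__parse_objdump_line(); it is a simplified model in which a `file:line` marker updates a piece of state and every following instruction line inherits it. The regular expressions and the function name `attach_line_numbers` are assumptions chosen to fit the objdump excerpt used in the sample.

```python
import re
from typing import Iterable, List, Optional, Tuple

# "/path/to/file.c:4" style markers emitted by `objdump -l`.
SRC_LINE = re.compile(r"^(?P<file>/\S+):(?P<line>\d+)$")
# "  400526: 55 push %rbp" style instruction lines.
INSN = re.compile(r"^\s*(?P<addr>[0-9a-f]+):\s+(?P<rest>.+)$")


def attach_line_numbers(lines: Iterable[str]) -> List[Tuple[int, Optional[int]]]:
    """Pair each instruction address with the most recent source line number."""
    lineno: Optional[int] = None  # state kept from one parsed line to the next
    out: List[Tuple[int, Optional[int]]] = []
    for text in lines:
        m = SRC_LINE.match(text.strip())
        if m:
            lineno = int(m.group("line"))
            continue
        m = INSN.match(text)
        if m:
            out.append((int(m.group("addr"), 16), lineno))
    return out


sample = [
    "0000000000400526 <main>:",
    "main():",
    "/home/taeung/hello.c:4",
    "void main()",
    "{",
    "  400526: 55 push %rbp",
    "  400527: 48 89 e5 mov %rsp,%rbp",
    "/home/taeung/hello.c:5",
]
print(attach_line_numbers(sample))  # [(4195622, 4), (4195623, 4)]
```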
2017-03-21perf annotate: More exactly grep -v of the objdump commandTaeung Song1-1/+1
The 'grep -v "filename"' applied to the objdump command output causes a side effect, eliminating the filename:linenr lines from the output of 'objdump -l' if the object file name and source file name are the same; fix it. E.g. the output of the following objdump command in symbol__disassemble(): $ objdump -l -d -S -C /home/taeung/hello --start-address=... /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:4 void main() { 400526: 55 push %rbp 400527: 48 89 e5 mov %rsp,%rbp /home/taeung/hello.c:5 ... But it uses grep -v "filename" e.g. "/home/taeung/hello" in the objdump command to remove the first line containing file name and file format ("/home/taeung/hello: file format elf64-x86-64"): Before: $ objdump -l -d -S -C /home/taeung/hello | grep -v /home/taeung/hello But this causes a side effect, removing filename:linenr too, because the object file and source file have the same name e.g. "/home/taeung/hello", "/home/taeung/hello.c". So do a better match by using grep -v as below to correctly remove only that first line: "/home/taeung/hello: file format elf64-x86-64" After: $ objdump -l -d -S -C /home/taeung/hello | grep -v /home/taeung/hello: Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1489978617-31396-5-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
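The side effect is easy to reproduce in miniature. The sketch below mimics `grep -v` with a fixed-string substring test (an assumption that holds for these patterns, which contain no regex metacharacters) and applies both the old and the new pattern to a few lines taken from the objdump excerpt in the message. The helper name `grep_v` is hypothetical.

```python
def grep_v(lines, pattern):
    """Mimic `grep -v pattern` for a fixed string: drop lines containing it."""
    return [line for line in lines if pattern not in line]


objdump_output = [
    "/home/taeung/hello: file format elf64-x86-64",
    "main():",
    "/home/taeung/hello.c:4",
    "400526: 55 push %rbp",
]

# Old pattern: also removes the filename:linenr line.
before = grep_v(objdump_output, "/home/taeung/hello")
# New pattern with the trailing colon: removes only the header line.
after = grep_v(objdump_output, "/home/taeung/hello:")

print(before)  # ['main():', '400526: 55 push %rbp']
print(after)   # ['main():', '/home/taeung/hello.c:4', '400526: 55 push %rbp']
```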
2017-03-21perf sdt x86: Add renaming logic for rNN and other registersRavi Bangoria1-12/+32
'perf probe' is failing for sdt markers whose arguments have rNN (with postfix b/w/d), %rsp, %esp, %sil etc. registers. Add renaming logic for these registers. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170202111143.14319-3-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
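A minimal sketch of what such renaming might look like is given below. The commit message does not list the target names, so the mapping here is an assumption: it supposes the uprobe interface wants the width-neutral x86 names (`%sp`, `%si`, `%r8` and so on). Only the registers named in the message are covered, anything else is returned unchanged, and `rename_sdt_register` is a hypothetical name.

```python
import re

# Illustrative subset only; the right-hand sides are assumed target names,
# not copied from the patch.
_FIXED = {"%rsp": "%sp", "%esp": "%sp", "%sil": "%si"}

# %r8 .. %r15 with an optional b/w/d width suffix.
_RNN = re.compile(r"^%(r(?:[89]|1[0-5]))[bwd]?$")


def rename_sdt_register(reg: str) -> str:
    """Map an SDT register operand to the (assumed) uprobe register name."""
    if reg in _FIXED:
        return _FIXED[reg]
    m = _RNN.match(reg)
    if m:
        return "%" + m.group(1)  # drop the width suffix
    return reg  # not handled by this sketch


for reg in ("%r8d", "%r15b", "%rsp", "%sil"):
    print(reg, "->", rename_sdt_register(reg))
```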
2017-03-21perf probe: Add sdt probes arguments into the uprobe cmd stringAlexis Berlemont4-4/+261
An sdt probe can be associated with arguments but they were not passed to the user probe tracing interface (uprobe_events); this patch adapts the sdt argument descriptors according to the uprobe input format. As the uprobe parser does not support scaled address mode, perf will skip arguments which cannot be adapted to the uprobe format. Here are the results: $ perf buildid-cache -v --add test_sdt $ perf probe -x test_sdt sdt_libfoo:table_frob $ perf probe -x test_sdt sdt_libfoo:table_diddle $ perf record -e sdt_libfoo:table_frob -e sdt_libfoo:table_diddle test_sdt $ perf script test_sdt ... 666.255678: sdt_libfoo:table_frob: (4004d7) arg0=0 arg1=0 test_sdt ... 666.255683: sdt_libfoo:table_diddle: (40051a) arg0=0 arg1=0 test_sdt ... 666.255686: sdt_libfoo:table_frob: (4004d7) arg0=1 arg1=2 test_sdt ... 666.255689: sdt_libfoo:table_diddle: (40051a) arg0=3 arg1=4 test_sdt ... 666.255692: sdt_libfoo:table_frob: (4004d7) arg0=2 arg1=4 test_sdt ... 666.255694: sdt_libfoo:table_diddle: (40051a) arg0=6 arg1=8 Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-3-alexis.berlemont@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf sdt: Add scanning of sdt probes argumentsAlexis Berlemont2-2/+24
During a "perf buildid-cache --add" command, the section ".note.stapsdt" of the "added" binary is scanned in order to list the SDT markers available in the binary. The parts containing the probes arguments were left unscanned. The whole section is now parsed; the probe arguments are extracted for later use. Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-2-alexis.berlemont@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf probe: Return errno when not hitting any eventKefeng Wang1-3/+3
On old perf, when using 'perf probe -d' to delete a nonexistent event, it returns errno, e.g., -bash-4.3# perf probe -d xxx || echo $? Info: Event "*:xxx" does not exist. Error: Failed to delete events. 255 But now perf_del_probe_events() will always set ret = 0, different from previous del_perf_probe_events(). After this, it returns errno again, e.g., -bash-4.3# ./perf probe -d xxx || echo $? "xxx" does not hit any event. Error: Failed to delete events. 254 And it is more appropriate to return -ENOENT instead of -EPERM. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: dddc7ee32fa1 ("perf probe: Fix an error when deleting probes successfully") Link: http://lkml.kernel.org/r/1489738592-61011-1-git-send-email-wangkefeng.wang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
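The two exit statuses in the transcripts are consistent with a negative errno value being truncated to the 8 bits a shell reports in `$?`. The sketch below shows that arithmetic; it assumes the tool's return value reaches the shell unchanged apart from that truncation, and the helper name `shell_exit_status` is hypothetical.

```python
import errno


def shell_exit_status(ret: int) -> int:
    """Map a C-style return value to the 8-bit status the shell sees."""
    return ret & 0xFF


print(shell_exit_status(-errno.EPERM))   # 255, the old behaviour
print(shell_exit_status(-errno.ENOENT))  # 254, after this change
```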
2017-03-21perf probe: Change MAX_CMDLENRavi Bangoria2-2/+2
There are many SDT markers in powerpc whose uprobe definition goes beyond the current MAX_CMDLEN, especially when the target filename is long and the sdt marker has a long list of arguments. For example, definition of sdt marker method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28) from file /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so is p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\ 1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\ arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\ arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\ arg9=+37(%gpr28):u8 'perf probe' fails with segfault for such markers. As the uprobe_events file accepts definitions up to 4094 characters (4096 - 2 (\n\0)), increase the value of MAX_CMDLEN to match that. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170207054547.3690-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
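The example above pairs each SDT argument with its uprobe form, which is enough to sketch the conversion and the length check. The rules below (sign of the size selects `s` or `u`, size in bytes times 8 gives the width, a bare number is a `%gprN` register, `off(base)` becomes `+off(%gprbase)`) are inferred from that one powerpc example and are assumptions, not the code in the patch. The function names are hypothetical.

```python
import re

MAX_CMDLEN_CHARS = 4094  # 4096 - 2 for "\n\0", per the commit message

# "<size>@<reg>" or "<size>@<offset>(<base>)", powerpc style.
_ARG = re.compile(
    r"^(?P<size>-?\d+)@(?:(?P<off>-?\d+)\((?P<base>\d+)\)|(?P<reg>\d+))$"
)


def sdt_arg_to_uprobe(index: int, arg: str) -> str:
    """Convert one powerpc SDT argument such as "-4@8" or "1@37(28)"."""
    m = _ARG.match(arg)
    if m is None:
        raise ValueError(f"unsupported SDT argument: {arg!r}")
    size = int(m.group("size"))
    ctype = f"{'s' if size < 0 else 'u'}{abs(size) * 8}"
    if m.group("reg") is not None:
        operand = f"%gpr{m.group('reg')}"
    else:
        operand = f"{int(m.group('off')):+d}(%gpr{m.group('base')})"
    return f"arg{index}={operand}:{ctype}"


def build_uprobe_cmd(group_event: str, target: str, args) -> str:
    """Assemble the definition string and enforce the length limit."""
    converted = " ".join(
        sdt_arg_to_uprobe(i, a) for i, a in enumerate(args, start=1)
    )
    cmd = f"p:{group_event} {target} {converted}"
    if len(cmd) > MAX_CMDLEN_CHARS:
        raise ValueError("definition exceeds the uprobe_events limit")
    return cmd


sdt_args = "8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28)".split()
print(" ".join(sdt_arg_to_uprobe(i, a) for i, a in enumerate(sdt_args, 1)))
```

Raising an error when the string is too long, rather than writing past a fixed buffer, is the design point the commit is addressing on the C side.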
2017-03-20tools headers: Sync {tools/,}arch/powerpc/include/uapi/asm/kvm.hArnaldo Carvalho de Melo1-0/+22
The changes in the following csets are not relevant for what is used in tools/perf/arch/powerpc/util/kvm-stat.c, but let's sync it to silence the diff detector in the tools build system: c92701322711 ("KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMU") 17d48610ae0f ("KVM: PPC: Book 3S: XICS: Implement ICS P/Q states") Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20perf probe: Fix concat_probe_trace_eventsRavi Bangoria1-1/+1
'*ntevs' contains number of elements present in 'tevs' array. If there are no elements in array, 'tevs2' can be directly assigned to 'tevs' without allocating more space. So the condition should be '*ntevs == 0' not 'ntevs == 0'. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 42bba263eb58 ("perf probe: Allow wildcard for cached events") Link: http://lkml.kernel.org/r/20170308065908.4128-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20perf stat: Correct --no-aggr descriptionRavi Bangoria1-2/+1
Description of --no-aggr in perf-stat man page is outdated. --no-aggr can also be used while profiling specific set of cpus. For ex, $ perf stat -e cycles,instructions -C 1-2 --no-aggr -- sleep 1 Performance counter stats for 'CPU(s) 1-2': CPU1 5,94,92,795 cycles CPU2 2,69,72,403 cycles CPU1 2,02,08,327 instructions # 0.34 insn per cycle CPU2 73,17,123 instructions # 0.12 insn per cycle 1.000989132 seconds time elapsed Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1490013438-5713-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20tools headers: Sync {tools/,}arch/arm{64}/include/uapi/asm/kvm.hArnaldo Carvalho de Melo2-0/+26
The changes in the following csets are not relevant for 'perf kvm' usage, but let's sync it to silence the diff detector in the tools build system: e96a006cb066 ("KVM: arm/arm64: vgic: Implement KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO ioctl") d017d7b0bd7a ("KVM: arm/arm64: vgic: Implement VGICv3 CPU interface access") 94574c9488e2 ("KVM: arm/arm64: vgic: Add distributor and redistributor access") Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Vijaya Kumar K <Vijaya.Kumar@cavium.com> Cc: Yunlong Song <yunlong.song@huawei.com> Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>