summaryrefslogtreecommitdiff
path: root/tools/perf/util/annotate.h
AgeCommit message (Collapse)AuthorFilesLines
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21perf annotate: Store the sample period in each histogram bucketTaeung Song1-0/+1
We'll use it soon, when fixing --show-total-period. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com [ split from a larger patch, do the math in __symbol__inc_addr_samples() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf hists: Pass perf_sample to __symbol__inc_addr_samples()Taeung Song1-2/+4
To pave the way to use perf_sample fields in the annotate code, storing sample->period in sym_hist->addr->period and its sum in sym_hist->period. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com [ split and adjusted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf annotate: Rename 'sum' to 'nr_samples' in struct sym_histTaeung Song1-1/+1
To make it more clear that it is the sum of all the nr_samples fields in the addr[] entries, i.e.: sym_hist->nr_samples = sum(sym_hist->addr[0 .. symbol__size(sym)]->nr_samples) Committer notes: Taeung had renamed it to total_samples, but using nr_samples, as in the added explanation above, looks clearer and establishes the direct connection, making clear it is about the _number_ of samples. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500211-16599-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf annotate: Introduce struct sym_hist_entryTaeung Song1-2/+7
struct sym_hist has addr[] but it should have not only number of samples but also the sample period. So use new struct symhist_entry to pave the way to have that. Committer notes: This initial patch will only introduce the struct sym_hist_entry and use only the nr_samples member, which makes the code clearer and paves the way to save the period as well. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500205-16553-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-19perf annotate: Implement visual marker for macro fusionJin Yao1-0/+1
For marking fused instructions clearly this patch adds a line before the first instruction of pair and joins it with the arrow of the jump to its target. For example, when "je" is selected in annotate view, the line before cmpl is displayed and joins the arrow of "je". │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne 29 │ │↓ jmp 43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) That means the cmpl+je is a fused instruction pair and they should be considered together. Changelog: v3: Use Arnaldo's fix to improve the arrow origin rendering. To get the evsel->evlist->env->cpuid, save the evsel in annotate_browser. v2: new function "ins__is_fused" to check if the instructions are fused. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1499403995-19857-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-19perf annotate: Check for fused instructionsJin Yao1-1/+2
Macro fusion merges two instructions to a single micro-op. Intel core platform performs this hardware optimization under limited circumstances. For example, CMP + JCC can be "fused" and executed /retired together. While with sampling this can result in the sample sometimes being on the JCC and sometimes on the CMP. So for the fused instruction pair, they could be considered together. On Nehalem, fused instruction pairs: cmp/test + jcc. On other new CPU: cmp/test/add/sub/and/inc/dec + jcc. This patch adds an x86-specific function which checks if 2 instructions are in a "fused" pair. For non-x86 arch, the function is just NULL. Changelog: v4: Move the CPU model checking to symbol__disassemble and save the CPU family/model in arch structure. It avoids checking every time when jump arrow printed. v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC). v2: Remove the original weak function. Arnaldo points out that doing it as a weak function that will be overridden by the host arch doesn't work. So now it's implemented as an arch-specific function. Committer fix: Do not access evsel->evlist->env->cpuid, ->env can be null, introduce perf_evsel__env_cpuid(), just like perf_evsel__env_arch(), also used in this function call. The original patch was segfaulting 'perf top' + annotation. But this essentially disables this fused instructions augmentation in 'perf top', the right thing is to get the cpuid from the running kernel, left for a later patch tho. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1499403995-19857-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19perf annotate: Return arch from symbol__disassemble() and save it in browserJin Yao1-1/+3
In annotate browser, we will add support to check fused instructions. While this is x86-specific feature so we need the annotate browser to know what the arch it runs on. symbol__disassemble() has figured out the arch. This patch just lets the arch return from symbol__disassemble and save the arch in annotate browser. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-05perf annotate: Fix missing number of samples for source_line_samplesTaeung Song1-1/+1
The option 'show-total-period' works fine without a option '-l'. But if running 'perf annotate --stdio -l --show-total-period', you can see a problem showing only zero '0' for number of samples. Before: $ perf annotate --stdio -l --show-total-period ... 0 : 400816: push %rbp 0 : 400817: mov %rsp,%rbp 0 : 40081a: mov %edi,-0x24(%rbp) 0 : 40081d: mov %rsi,-0x30(%rbp) 0 : 400821: mov -0x24(%rbp),%eax 0 : 400824: mov -0x30(%rbp),%rdx 0 : 400828: mov (%rdx),%esi 0 : 40082a: mov $0x0,%edx ... The reason is it was missed to set number of samples of source_line_samples, so set it ordinarily. After: $ perf annotate --stdio -l --show-total-period ... 3 : 400816: push %rbp 4 : 400817: mov %rsp,%rbp 0 : 40081a: mov %edi,-0x24(%rbp) 0 : 40081d: mov %rsi,-0x30(%rbp) 1 : 400821: mov -0x24(%rbp),%eax 2 : 400824: mov -0x30(%rbp),%rdx 0 : 400828: mov (%rdx),%esi 1 : 40082a: mov $0x0,%edx ... Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liska <mliska@suse.cz> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") Link: http://lkml.kernel.org/r/1490703125-13643-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf annotate: Fix jump target outside of function address rangeRavi Bangoria1-2/+3
If jump target is outside of function range, perf is not handling it correctly. Especially when target address is lesser than function start address, target offset will be negative. But, target address declared to be unsigned, converts negative number into 2's complement. See below example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is lesser than function start address(34cf0). 34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0 Objdump output: 0000000000034cf0 <__sigaction>: __GI___sigaction(): 34cf0: lea -0x20(%rdi),%eax 34cf3: cmp -bashx1,%eax 34cf6: jbe 34d00 <__sigaction+0x10> 34cf8: jmpq 34ac0 <__GI___libc_sigaction> 34cfd: nopl (%rax) 34d00: mov 0x386161(%rip),%rax # 3bae68 <_DYNAMIC+0x2e8> 34d07: movl -bashx16,%fs:(%rax) 34d0e: mov -bashxffffffff,%eax 34d13: retq perf annotate before applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 v jmpq fffffffffffffdd0 nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq perf annotate after applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 ^ jmpq 34ac0 <__GI___libc_sigaction> nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25perf annotate: Remove duplicate 'name' field from disasm_lineArnaldo Carvalho de Melo1-9/+8
The disasm_line::name field is always equal to ins::name, being used just to locate the instruction's ins_ops from the per-arch instructions table. Eliminate this duplication, nuking that field and instead make ins__find() return an ins_ops, store it in disasm_line::ins.ops, and keep just in disasm_line::ins.name what was in disasm_line::name, this way we end up not keeping a reference to entries in the per-arch instructions table. This in turn will help supporting multiple ways to manage the per-arch instructions table, allowing resorting that array, for instance, when the entries will move after references to its addresses were made. The same problem is avoided when one grows the array with realloc. So architectures simply keeping a constant array will work as well as architectures building the table using regular expressions or other logic that involves resorting the table. Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vr899azvabnw9gtuepuqfd9t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-17perf annotate: Start supporting cross arch annotationArnaldo Carvalho de Melo1-2/+4
Introduce a 'struct arch', where arch specific stuff will live, starting with objdump's choice of comment delimitation character, that is '#' in x86 while a ';' in arm. This has some bits and pieces from a patch submitted by Ravi. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-f337tzjjcl8vtapgvjxmhrbx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20perf annotate: Pass the symbol's map/dso to the instruction parsersArnaldo Carvalho de Melo1-1/+1
So that things like: → callq 0xffffffff993e3230 found while disassembling /proc/kcore can be beautified by later patches, that will resolve that address to a function, looking it up in /proc/kallsyms. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p76myuke4j7gplg54amaklxk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-08perf annotate: Add branch stack / basic blockPeter Zijlstra1-0/+1
I wanted to know the hottest path through a function and figured the branch-stack (LBR) information should be able to help out with that. The below uses the branch-stack to create basic blocks and generate statistics from them. from to branch_i * ----> * | | block v * ----> * from to branch_i+1 The blocks are broken down into non-overlapping ranges, while tracking if the start of each range is an entry point and/or the end of a range is a branch. Each block iterates all ranges it covers (while splitting where required to exactly match the block) and increments the 'coverage' count. For the range including the branch we increment the taken counter, as well as the pred counter if flags.predicted. Using these number we can find if an instruction: - had coverage; given by: br->coverage / br->sym->max_coverage This metric ensures each symbol has a 100% spot, which reflects the observation that each symbol must have a most covered/hottest block. - is a branch target: br->is_target && br->start == add - for targets, how much of a branch's coverages comes from it: target->entry / branch->coverage - is a branch: br->is_branch && br->end == addr - for branches, how often it was taken: br->taken / br->coverage after all, all execution that didn't take the branch would have incremented the coverage and continued onward to a later branch. - for branches, how often it was predicted: br->pred / br->taken The coverage percentage is used to color the address and asm sections; for low (<1%) coverage we use NORMAL (uncolored), indicating that these instructions are not 'important'. For high coverage (>75%) we color the address RED. For each branch, we add an asm comment after the instruction with information on how often it was taken and predicted. Output looks like (sans color, which does loose a lot of the information :/) $ perf record --branch-filter u,any -e cycles:p ./branches 27 $ perf annotate branches Percent | Source code & Disassembly of branches for cycles:pu (217 samples) --------------------------------------------------------------------------------- : branches(): 0.00 : 40057a: push %rbp 0.00 : 40057b: mov %rsp,%rbp 0.00 : 40057e: sub $0x20,%rsp 0.00 : 400582: mov %rdi,-0x18(%rbp) 0.00 : 400586: mov %rsi,-0x20(%rbp) 0.00 : 40058a: mov -0x18(%rbp),%rax 0.00 : 40058e: mov %rax,-0x10(%rbp) 0.00 : 400592: movq $0x0,-0x8(%rbp) 0.00 : 40059a: jmpq 400656 <branches+0xdc> 1.84 : 40059f: mov -0x10(%rbp),%rax # +100.00% 3.23 : 4005a3: and $0x1,%eax 1.84 : 4005a6: test %rax,%rax 0.00 : 4005a9: je 4005bf <branches+0x45> # -54.50% (p:42.00%) 0.46 : 4005ab: mov 0x200bbe(%rip),%rax # 601170 <acc> 12.90 : 4005b2: add $0x1,%rax 2.30 : 4005b6: mov %rax,0x200bb3(%rip) # 601170 <acc> 0.46 : 4005bd: jmp 4005d1 <branches+0x57> # -100.00% (p:100.00%) 0.92 : 4005bf: mov 0x200baa(%rip),%rax # 601170 <acc> # +49.54% 13.82 : 4005c6: sub $0x1,%rax 0.46 : 4005ca: mov %rax,0x200b9f(%rip) # 601170 <acc> 2.30 : 4005d1: mov -0x10(%rbp),%rax # +50.46% 0.46 : 4005d5: mov %rax,%rdi 0.46 : 4005d8: callq 400526 <lfsr> # -100.00% (p:100.00%) 0.00 : 4005dd: mov %rax,-0x10(%rbp) # +100.00% 0.92 : 4005e1: mov -0x18(%rbp),%rax 0.00 : 4005e5: and $0x1,%eax 0.00 : 4005e8: test %rax,%rax 0.00 : 4005eb: je 4005ff <branches+0x85> # -100.00% (p:100.00%) 0.00 : 4005ed: mov 0x200b7c(%rip),%rax # 601170 <acc> 0.00 : 4005f4: shr $0x2,%rax 0.00 : 4005f8: mov %rax,0x200b71(%rip) # 601170 <acc> 0.00 : 4005ff: mov -0x10(%rbp),%rax # +100.00% 7.37 : 400603: and $0x1,%eax 3.69 : 400606: test %rax,%rax 0.00 : 400609: jne 400612 <branches+0x98> # -59.25% (p:42.99%) 1.84 : 40060b: mov $0x1,%eax 14.29 : 400610: jmp 400617 <branches+0x9d> # -100.00% (p:100.00%) 1.38 : 400612: mov $0x0,%eax # +57.65% 10.14 : 400617: test %al,%al # +42.35% 0.00 : 400619: je 40062f <branches+0xb5> # -57.65% (p:100.00%) 0.46 : 40061b: mov 0x200b4e(%rip),%rax # 601170 <acc> 2.76 : 400622: sub $0x1,%rax 0.00 : 400626: mov %rax,0x200b43(%rip) # 601170 <acc> 0.46 : 40062d: jmp 400641 <branches+0xc7> # -100.00% (p:100.00%) 0.92 : 40062f: mov 0x200b3a(%rip),%rax # 601170 <acc> # +56.13% 2.30 : 400636: add $0x1,%rax 0.92 : 40063a: mov %rax,0x200b2f(%rip) # 601170 <acc> 0.92 : 400641: mov -0x10(%rbp),%rax # +43.87% 2.30 : 400645: mov %rax,%rdi 0.00 : 400648: callq 400526 <lfsr> # -100.00% (p:100.00%) 0.00 : 40064d: mov %rax,-0x10(%rbp) # +100.00% 1.84 : 400651: addq $0x1,-0x8(%rbp) 0.92 : 400656: mov -0x8(%rbp),%rax 5.07 : 40065a: cmp -0x20(%rbp),%rax 0.00 : 40065e: jb 40059f <branches+0x25> # -100.00% (p:100.00%) 0.00 : 400664: nop 0.00 : 400665: leaveq 0.00 : 400666: retq (Note: the --branch-filter u,any was used to avoid spurious target and branch points due to interrupts/faults, they show up as very small -/+ annotations on 'weird' locations) Committer note: Please take a look at: http://vger.kernel.org/~acme/perf/annotate_basic_blocks.png To see the colors. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> [ Moved sym->max_coverage to 'struct annotate', aka symbol__annotate(sym) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-30perf annotate: Initialize the priv are in symbol__new()Arnaldo Carvalho de Melo1-1/+0
We need to initializa some fields (right now just a mutex) when we allocate the per symbol annotation struct, so do it at the symbol constructor instead of (ab)using the filter mechanism for that. This way we remove one of the few cases we have for that symbol filter, which will eventually led to removing it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-cvz34avlz1lez888lob95390@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-02perf annotate: Introduce strerror for handling symbol__disassemble() errorsArnaldo Carvalho de Melo1-0/+20
We were just using pr_error() which makes it difficult for non stdio UIs to provide errors using its widgets, as they need to somehow catch what was passed to pr_error(). Fix it by introducing a __strerror() interface like the ones used elsewhere, for instance target__strerror(). This is just the initial step, more work will be done, but first some error handling bugs noticed while working on this need to be dealt with. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dgd22zl2xg7x4vcnoa83jxfb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-01perf annotate: Rename symbol__annotate() to symbol__disassemble()Arnaldo Carvalho de Melo1-1/+1
This function will not annotate anything, it will just disassembly the given map->dso and symbol. It currently does this by parsing the output of 'objdump --disassemble', but this could conceivably be done using a library or an offshot of the kernel's instruction decoder (arch/x86/lib/inat.c), etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2xpfl4bfnrd6x584b390qok7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-27perf annotate: Generalize handling of 'ret' instructionsNaveen N. Rao1-0/+1
Introduce helper to detect 'ret' instructions and use the same in the TUI. A helper is needed since some architectures such as powerpc have more than one return instruction. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Daniel Axtens <dja@axtens.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/1466769240-12376-5-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-27perf annotate: Remove unused hist_entry__annotate functionRavi Bangoria1-2/+0
hist_entry__annotate looks part of API but I don't find any caller of this function. Removing it. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Daniel Axtens <dja@axtens.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/1466769240-12376-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23perf tools: Remove misplaced __maybe_unusedArnaldo Carvalho de Melo1-1/+1
All over the tree. Cc: David Ahern <dsahern@gmail.com> cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-05perf annotate: Fix sizeof_sym_hist overflow issueJiri Olsa1-1/+1
The annotated_source::sizeof_sym_hist could easily overflow int size, resulting in crash in __symbol__inc_addr_samples. Changing its type int size_t as was probably intended from beginning based on the initialization code in symbol__alloc_hist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444068369-20978-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-06perf annotate: Compute IPC and basic block cyclesAndi Kleen1-0/+2
Compute the IPC and the basic block cycles for the annotate display. IPC is computed by counting the instructions, and then dividing the accounted cycles by that count. The actual IPC computation can only be done at annotate time, because we need to parse the objdump output first to know the number of instructions in the basic block. The cycles/IPC are also put into the perf function annotation so that the display code can show them. Again basic block overlaps are not handled, with the longest winning, but there are some heuristics to hide the IPC when the longest is not the most common. v2: Compute IPC correctly. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-6-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-06perf report: Add infrastructure for a cycles histogramAndi Kleen1-0/+17
This adds the basic infrastructure to keep track of cycle counts per basic block for annotate. We allocate an array similar to the normal accounting, and then account branch cycles there. We handle two cases: cycles per basic block with start and cycles per branch (these are later used for either IPC or just cycles per BB) In the start case we cannot handle overlaps, so always the longest basic block wins. For the cycles per branch case everything is accurately accounted. v2: Remove unnecessary checks. Slight restructure. Move symbol__get_annotation to another patch. Move histogram allocation. v3: Merged with current tree Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-4-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19perf annotate: Rename source_line_percent to source_line_samplesArnaldo Carvalho de Melo1-3/+3
To better reflect the purpose of this struct, that is to hold info about samples, its total number and is percentage. Cc: Martin Liska <mliska@suse.cz> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/n/tip-6bf8gwcl975uurl0ttpvtk69@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19perf annotate: Display total number of samples with --show-total-periodMartin Liška1-1/+2
To compare two records on an instruction base, with --show-total-period option provided, display total number of samples that belong to a line in assembly language. New hot key 't' is introduced for 'perf annotate' TUI. Signed-off-by: Martin Liska <mliska@suse.cz> Cc: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/5583E26D.1040407@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16perf tools: Fix segfault for symbol annotation on TUINamhyung Kim1-7/+1
Currently the symbol structure is allocated with symbol_conf.priv_size to carry sideband information like annotation, map browser on TUI and sort-by-name tree node. So retrieving these information from symbol needs to care about the details of such placement. However the annotation code just assumes that the symbol is placed after the struct annotation. But actually there's other info between them. So accessing those struct will lead to an undefined behavior (usually a crash) after they write their info to the same location. To reproduce the problem, please follow the steps below: 1. run perf report (TUI of course) with -v option 2. open map browser (by pressing right arrow key for any entry) 3. search any function (by pressing '/' key and input whatever..) 4. return to the hist browser (by pressing 'q' or left arrow key) 5. open annotation window for the same entry (by pressing 'a' key) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1421234288-22758-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-11-19perf annotate: Support source line numbers in annotateAndi Kleen1-0/+1
With srcline key/sort'ing it's useful to have line numbers in the annotate window. This patch implements this. Use objdump -l to request the line numbers and save them in the line structure. Then the browser displays them for source lines. The line numbers are not displayed by default, but can be toggled on with 'k' There is one unfortunate problem with this setup. For lines not containing source and which are outside functions objdump -l reports line numbers off by a few: it always reports the first line number in the next function even for lines that are outside the function. I haven't found a nice way to detect/correct this. Probably objdump has to be fixed. See https://sourceware.org/bugzilla/show_bug.cgi?id=16433 The line numbers are still useful even with these problems, as most are correct and the ones which are not are nearby. v2: Fix help text. Handle (discriminator...) output in objdump. Left align the line numbers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-9-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-05-01tools: Consolidate types.hBorislav Petkov1-1/+1
Combine all definitions into a common tools/include/linux/types.h and kill the wild growth elsewhere. Move DECLARE_BITMAP to its proper bitmap.h header. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Link: http://lkml.kernel.org/n/tip-azczs7qcv6h9xek9od10hiv2@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-02-24perf annotate: Check availability of annotate when processing samplesNamhyung Kim1-0/+2
The TUI of perf report and top support annotation, but stdio and GTK don't. So it should be checked before calling hist_entry__inc_addr_ samples() to avoid wasting resources that will never be used. perf annotate need it regardless of UI and sort keys, so the check of whether to allocate resources should be on the tools that have annotate as an option in the TUI, 'report' and 'top', not on the function called by all of them. It caused perf annotate on ppc64 to produce zero output, since the buckets were not being allocated. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Anton Blanchard <anton@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1392859976-32760-1-git-send-email-namhyung@kernel.org [ Renamed (report,top)__needs_annotate() to ui__has_annotation() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-19perf annotate: Make symbol__inc_addr_samples privateArnaldo Carvalho de Melo1-3/+0
Since it is now accessed just thru addr_map_symbol and hist_entry wrappers. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-gjoam7wcfrb03sp753gk1nfk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-19perf annotate: Adopt methods from histsArnaldo Carvalho de Melo1-0/+5
Those are just wrappers to annotation methods, so move them to annotate.c Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-336h7z0bi2k51cbfi6mkpo5k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-19perf annotate: Add inc_samples method to addr_map_symbolArnaldo Carvalho de Melo1-0/+3
Since there are three calls that could receive just the struct addr_map_symbol pointer and call the symbol method. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-d728gz1orgkaknac9ppnzd9e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09perf tools: Separate out GTK codes to libperf-gtk.soNamhyung Kim1-24/+0
Separate out GTK codes to a shared object called libperf-gtk.so. This time only GTK codes are built with -fPIC and libperf remains as is. Now run GTK hist and annotation browser using libdl. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1379053663-13706-1-git-send-email-namhyung@kernel.org [ Fix it up wrt Ingo's tools/perf build speedups ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09tools/perf: Standardize feature support define names to: HAVE_{FEATURE}_SUPPORTIngo Molnar1-2/+2
Standardize all the feature flags based on the HAVE_{FEATURE}_SUPPORT naming convention: HAVE_ARCH_X86_64_SUPPORT HAVE_BACKTRACE_SUPPORT HAVE_CPLUS_DEMANGLE_SUPPORT HAVE_DWARF_SUPPORT HAVE_ELF_GETPHDRNUM_SUPPORT HAVE_GTK2_SUPPORT HAVE_GTK_INFO_BAR_SUPPORT HAVE_LIBAUDIT_SUPPORT HAVE_LIBELF_MMAP_SUPPORT HAVE_LIBELF_SUPPORT HAVE_LIBNUMA_SUPPORT HAVE_LIBUNWIND_SUPPORT HAVE_ON_EXIT_SUPPORT HAVE_PERF_REGS_SUPPORT HAVE_SLANG_SUPPORT HAVE_STRLCPY_SUPPORT Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/n/tip-u3zvqejddfZhtrbYbfhi3spa@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-01perf tools: Remove dependency on libnewtArnaldo Carvalho de Melo1-1/+1
Now that the map browser shares the input routine with the hists browser, there is no need for using any libnewt routine, so remove all traces except for honouring NO_NEWT=1 on the makefile command line as an indication that TUI support is not needed, in fact it just sets NO_SLANG=1. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wae5o7xca9m52bj1re28jc5j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-15perf annotate browser: Use disasm__calc_percent()Namhyung Kim1-0/+4
The disasm_line__calc_percent() which was used by annotate browser code almost duplicates disasm__calc_percent. Let's get rid of the code duplication. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-11-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-15perf annotate: Support event group view for --print-lineNamhyung Kim1-0/+1
Dynamically allocate source_line_percent according to a number of group members and save nr_pcnt to the struct source_line. This way we can handle multiple events in a general manner. However since the size of struct source_line is not fixed anymore, iterating whole source_line should care about its size. $ perf annotate --group --stdio --print-line Sorted summary for file /lib/ld-2.11.1.so ---------------------------------------------- 33.33 0.00 /build/buildd/eglibc-2.11.1/elf/rtld.c:381 33.33 0.00 /build/buildd/eglibc-2.11.1/elf/dynamic-link.h:128 33.33 0.00 /build/buildd/eglibc-2.11.1/elf/do-rel.h:105 0.00 75.00 /build/buildd/eglibc-2.11.1/elf/dynamic-link.h:137 0.00 25.00 /build/buildd/eglibc-2.11.1/elf/dynamic-link.h:187 ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-9-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-15perf annotate: Factor out struct source_line_percentNamhyung Kim1-2/+6
The source_line_percent struct contains percentage value of the symbol histogram. This is a preparation of event group view change. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-15perf annotate: Pass evsel instead of evidx on annotation functionsNamhyung Kim1-17/+19
Pass evsel instead of evidx. This is a preparation for supporting event group view in annotation and no functional change is intended. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362462812-30885-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-02-14perf gtk/annotate: Support multiple event annotationNamhyung Kim1-0/+4
Show multiple annotation result for each evsel. Each result represents the most frquently sampled symbol/function for the evsel and it will be shown in a tab window. For this add a reference to main container (notebook) to the pgctx. At the first call to annotate browser, hist_entry__find_annotations() will setup a new browser, and next calls will add new tabs to the browser. But it requires final perf_gtk__show_annotations() to start processing GUI events. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-02-14perf ui/gtk: Implement basic GTK2 annotation browserNamhyung Kim1-0/+20
Basic implementation of perf annotate on GTK2. Currently only shows first symbol. Add a new --gtk option to use it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360227734-375-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-09perf annotate: Merge same lines in summary viewNamhyung Kim1-0/+1
The --print-line option of perf annotate command shows summary for each source line. But it didn't merge same lines so that it can appear multiple times. * before: Sorted summary for file /home/namhyung/bin/mcol ---------------------------------------------- 21.71 /home/namhyung/tmp/mcol.c:26 20.66 /home/namhyung/tmp/mcol.c:25 9.53 /home/namhyung/tmp/mcol.c:24 7.68 /home/namhyung/tmp/mcol.c:25 7.67 /home/namhyung/tmp/mcol.c:25 7.66 /home/namhyung/tmp/mcol.c:26 7.49 /home/namhyung/tmp/mcol.c:26 6.92 /home/namhyung/tmp/mcol.c:25 6.81 /home/namhyung/tmp/mcol.c:25 1.07 /home/namhyung/tmp/mcol.c:26 0.52 /home/namhyung/tmp/mcol.c:25 0.51 /home/namhyung/tmp/mcol.c:25 0.51 /home/namhyung/tmp/mcol.c:24 * after: Sorted summary for file /home/namhyung/bin/mcol ---------------------------------------------- 50.77 /home/namhyung/tmp/mcol.c:25 37.94 /home/namhyung/tmp/mcol.c:26 10.04 /home/namhyung/tmp/mcol.c:24 To do that, introduce percent_sum field so that the normal line-by-line output doesn't get changed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1352440729-21848-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-05perf tools: Introduce struct hist_browser_timerNamhyung Kim1-4/+4
Currently various hist browser functions receive 3 arguments for refreshing histogram but only used from a few places. Also it's only for perf top command so that it can be NULL for other (and probably most) cases. Pack them into a struct in order to reduce number of those unused arguments. This is a mechanical change and does not intend a functional change. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Irina Tirdea <irina.tirdea@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1351835406-15208-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-24perf tools: Try to find cross-built objdump pathIrina Tirdea1-1/+0
As we have architecture information of saved perf.data file, we can try to find cross-built objdump path. The triplets include support for Android (arm, x86 and mips architectures). Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Originally-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1350344020-8071-5-git-send-email-irina.tirdea@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-03perf tools: Convert to NEWT_SUPPORTNamhyung Kim1-4/+4
For building perf without libnewt, we can set NO_NEWT=1 as a argument of make. It then defines NO_NEWT_SUPPORT macro for C code to do the proper handling. However it usually used in a negative semantics - e.g. #ifndef - so we saw double negations which can be misleading. Convert it to a positive form to make it more readable. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1348824728-14025-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11perf tools: Use __maybe_used for unused variablesIrina Tirdea1-6/+7
perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-08perf tools: include missing pthread.h headerIrina Tirdea1-0/+1
pthread variables are used in some files without explicitely including pthread.h. This leads to compile errors on Android. e.g.: in annotate.h, error: unknown type name 'pthread_mutex_t' Including pthread.h explicitely in files that use it to have all definitions included. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347065004-15306-8-git-send-email-irina.tirdea@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-06perf tools: Allow user to indicate path to objdump in command lineMaciek Borzecki1-0/+1
When analyzing perf data from hosts of other architecture than one of the local host it's useful to call objdump that is part of a toolchain for that architecture. Instead of calling regular objdump, call one that user specified in command line. Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1346754750.16299.3.camel@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate: Introduce ->free() method in ins_opsArnaldo Carvalho de Melo1-0/+1
So that we don't special case disasm_line__free, allowing each instruction class to provide an specialized destructor, like is needed for 'lock'. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xxw4vs5n077tf35jsvjzylhb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate: Augment lock instruction outputArnaldo Carvalho de Melo1-5/+11
It just chops off the 'lock' and uses the ins__find, etc machinery to call instruction specific parsers/beautifiers. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4913ba2dzakz5rivgumosqbh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>