author: Linus Torvalds <torvalds@linux-foundation.org> 2023-06-30 20:33:17 +0300
committer: Linus Torvalds <torvalds@linux-foundation.org> 2023-06-30 20:33:17 +0300
commit: cccf0c2ee52d3bd710be3a3f865df1b869a68f11
tree: 561792a7a9bb45057f172b2d8fb07ee65dbc75ac /Documentation/trace/ftrace.rst
parent: 533925cb760431cb496a8c965cfd765a1a21d37e
parent: fc30ace06f250f79381a8e3f6ed92dd68e25a9f5
Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:

 - Add new feature to have function graph tracer record the return
   value. Adds a new option: funcgraph-retval ; when set, will show the
   return value of a function in the function graph tracer.

 - Also add the option: funcgraph-retval-hex where if it is not set, and
   the return value is an error code, then it will return the decimal of
   the error code, otherwise it still reports the hex value.

 - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd
   That when a application opens it, it becomes the task that the timer
   lat tracer traces. The application can also read this file to find
   out how it's being interrupted.

 - Add the file /sys/kernel/tracing/available_filter_functions_addrs
   that works just the same as available_filter_functions but also shows
   the addresses of the functions like kallsyms, except that it gives
   the address of where the fentry/mcount jump/nop is. This is used by
   BPF to make it easier to attach BPF programs to ftrace hooks.

 - Replace strlcpy with strscpy in the tracing boot code.

* tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix warnings when building htmldocs for function graph retval
  riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  tracing/boot: Replace strlcpy with strscpy
  tracing/timerlat: Add user-space interface
  tracing/osnoise: Skip running osnoise if all instances are off
  tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable
  ftrace: Show all functions with addresses in available_filter_functions_addrs
  selftests/ftrace: Add funcgraph-retval test case
  LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex
  function_graph: Support recording and printing the return value of function
  fgraph: Add declaration of "struct fgraph_ret_regs"
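
The display rule described for funcgraph-retval-hex (signed decimal for error codes when the option is off, hexadecimal otherwise) can be illustrated with a small user-space sketch. This is not the kernel's code: the names sketch_is_err_value and sketch_format_retval are hypothetical, the 4095 bound is assumed to match the kernel's MAX_ERRNO, and the widening of a 32-bit value that was not sign-extended is an assumption made so that 0xffffffea is treated as -22, as in the trace examples in this patch.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: same bound as the kernel's MAX_ERRNO, so errors are -1..-4095. */
#define SKETCH_MAX_ERRNO 4095

/* Hypothetical helper: true if v, read as a signed 64-bit value, is in [-4095, -1]. */
static bool sketch_is_err_value(uint64_t v)
{
	return v >= (uint64_t)-SKETCH_MAX_ERRNO;
}

/*
 * Hypothetical formatter following the option text:
 *   hex == true  -> always hexadecimal
 *   hex == false -> signed decimal for error codes, hexadecimal otherwise
 * Returns what snprintf() returns.
 */
static int sketch_format_retval(char *buf, size_t len, uint64_t retval, bool hex)
{
	uint64_t candidate = retval;

	/*
	 * Assumption: an int returned in a 64-bit register may arrive without
	 * sign extension (0xffffffea for -22), so widen such values before
	 * testing whether they look like an error code.
	 */
	if ((retval >> 32) == 0 && (retval & 0x80000000u))
		candidate = (uint64_t)(int64_t)(int32_t)(uint32_t)retval;

	if (!hex && sketch_is_err_value(candidate))
		return snprintf(buf, len, "%" PRId64, (int64_t)candidate);

	return snprintf(buf, len, "0x%" PRIx64, retval);
}
```

Under these assumptions, the value 0xffffffea formats as -22 with hex set to false and as 0xffffffea with hex set to true, while a pointer-like value such as 0xffff93fc8fb20000 stays hexadecimal in both modes.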
Diffstat (limited to 'Documentation/trace/ftrace.rst')
-rw-r--r-- Documentation/trace/ftrace.rst | 132
1 files changed, 132 insertions, 0 deletions
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index 027437b745a0..f606c5bd1c0d 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -324,6 +324,12 @@ of ftrace. Here is a list of some of the key files:
"set_graph_function", or "set_graph_notrace".
(See the section "dynamic ftrace" below for more details.)
+ available_filter_functions_addrs:
+
+ Similar to available_filter_functions, but with address displayed
+ for each function. The displayed address is the patch-site address
+ and can differ from /proc/kallsyms address.
+
dyn_ftrace_total_info:
This file is for debugging purposes. The number of functions that
@@ -1359,6 +1365,19 @@ Options for function_graph tracer:
only a closing curly bracket "}" is displayed for
the return of a function.
+ funcgraph-retval
+ When set, the return value of each traced function
+ will be printed after an equal sign "=". By default
+ this is off.
+
+ funcgraph-retval-hex
+ When set, the return value will always be printed
+ in hexadecimal format. If the option is not set and
+ the return value is an error code, it will be printed
+ in signed decimal format; otherwise it will also be
+ printed in hexadecimal format. By default, this option
+ is off.
+
sleep-time
When running function graph tracer, to include
the time a task schedules out in its function.
@@ -2704,6 +2723,119 @@ It is default disabled.
0) 1.757 us | } /* kmem_cache_free() */
0) 2.861 us | } /* putname() */
+The return value of each traced function can be displayed after
+an equal sign "=". When encountering system call failures, it
+can be very helpful to quickly locate the function that first
+returns an error code.
+
+ - hide: echo nofuncgraph-retval > trace_options
+ - show: echo funcgraph-retval > trace_options
+
+ Example with funcgraph-retval::
+
+    1)               |  cgroup_migrate() {
+    1)   0.651 us    |    cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
+    1)               |    cgroup_migrate_execute() {
+    1)               |      cpu_cgroup_can_attach() {
+    1)               |        cgroup_taskset_first() {
+    1)   0.732 us    |          cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
+    1)   1.232 us    |        } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
+    1)   0.380 us    |        sched_rt_can_attach(); /* = 0x0 */
+    1)   2.335 us    |      } /* cpu_cgroup_can_attach = -22 */
+    1)   4.369 us    |    } /* cgroup_migrate_execute = -22 */
+    1)   7.143 us    |  } /* cgroup_migrate = -22 */
+
+The above example shows that cpu_cgroup_can_attach was the first
+function to return the error code -22; reading the code of this
+function then reveals the root cause.
+
+When the option funcgraph-retval-hex is not set, the return value can
+be displayed in a smart way. Specifically, if it is an error code,
+it will be printed in signed decimal format, otherwise it will
+be printed in hexadecimal format.
+
+ - smart: echo nofuncgraph-retval-hex > trace_options
+ - hexadecimal: echo funcgraph-retval-hex > trace_options
+
+ Example with funcgraph-retval-hex::
+
+    1)               |  cgroup_migrate() {
+    1)   0.651 us    |    cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
+    1)               |    cgroup_migrate_execute() {
+    1)               |      cpu_cgroup_can_attach() {
+    1)               |        cgroup_taskset_first() {
+    1)   0.732 us    |          cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
+    1)   1.232 us    |        } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
+    1)   0.380 us    |        sched_rt_can_attach(); /* = 0x0 */
+    1)   2.335 us    |      } /* cpu_cgroup_can_attach = 0xffffffea */
+    1)   4.369 us    |    } /* cgroup_migrate_execute = 0xffffffea */
+    1)   7.143 us    |  } /* cgroup_migrate = 0xffffffea */
+
+At present, there are some limitations when using the funcgraph-retval
+option, and these limitations will be eliminated in the future:
+
+- Even if the function return type is void, a return value will still
+ be printed, and you can just ignore it.
+
+- Even if return values are stored in multiple registers, only the
+ value contained in the first register will be recorded and printed.
+ To illustrate, in the x86 architecture, eax and edx are used to store
+ a 64-bit return value, with the lower 32 bits saved in eax and the
+ upper 32 bits saved in edx. However, only the value stored in eax
+ will be recorded and printed.
+
+- In certain procedure call standards, such as arm64's AAPCS64, when a
+ type is smaller than a GPR, it is the responsibility of the consumer
+ to perform the narrowing, and the upper bits may contain UNKNOWN values.
+ Therefore, it is advisable to check the code for such cases. For instance,
+ when using a u8 in a 64-bit GPR, bits [63:8] may contain arbitrary values,
+ especially when larger types are truncated, whether explicitly or implicitly.
+ Here are some specific cases to illustrate this point:
+
+ **Case One**:
+
+ The function narrow_to_u8 is defined as follows::
+
+        u8 narrow_to_u8(u64 val)
+        {
+                // implicitly truncated
+                return val;
+        }
+
+ It may be compiled to::
+
+        narrow_to_u8:
+                < ... ftrace instrumentation ... >
+                RET
+
+ If you pass 0x123456789abcdef to this function and want to narrow it,
+ it may be recorded as 0x123456789abcdef instead of 0xef.
+
+ **Case Two**:
+
+ The function error_if_not_4g_aligned is defined as follows::
+
+        int error_if_not_4g_aligned(u64 val)
+        {
+                if (val & GENMASK(31, 0))
+                        return -EINVAL;
+
+                return 0;
+        }
+
+ It could be compiled to::
+
+        error_if_not_4g_aligned:
+                CBNZ    w0, .Lnot_aligned
+                RET                     // bits [31:0] are zero, bits
+                                        // [63:32] are UNKNOWN
+        .Lnot_aligned:
+                MOV     x0, #-EINVAL
+                RET
+
+ When passing 0x2_0000_0000 to it, the return value may be recorded as
+ 0x2_0000_0000 instead of 0.
+
You can put some comments on specific functions by using
trace_printk() For example, if you want to put a comment inside
the __might_sleep() function, you just have to include