Skip to content

Commit cccf0c2

Browse files
committed
Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt: - Add new feature to have function graph tracer record the return value. Adds a new option: funcgraph-retval ; when set, will show the return value of a function in the function graph tracer. - Also add the option: funcgraph-retval-hex where if it is not set, and the return value is an error code, then it will return the decimal of the error code, otherwise it still reports the hex value. - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd That when a application opens it, it becomes the task that the timer lat tracer traces. The application can also read this file to find out how it's being interrupted. - Add the file /sys/kernel/tracing/available_filter_functions_addrs that works just the same as available_filter_functions but also shows the addresses of the functions like kallsyms, except that it gives the address of where the fentry/mcount jump/nop is. This is used by BPF to make it easier to attach BPF programs to ftrace hooks. - Replace strlcpy with strscpy in the tracing boot code. * tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix warnings when building htmldocs for function graph retval riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing/boot: Replace strlcpy with strscpy tracing/timerlat: Add user-space interface tracing/osnoise: Skip running osnoise if all instances are off tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable ftrace: Show all functions with addresses in available_filter_functions_addrs selftests/ftrace: Add funcgraph-retval test case LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex function_graph: Support recording and printing the return value of function fgraph: Add declaration of "struct fgraph_ret_regs"
2 parents 533925c + fc30ace commit cccf0c2

File tree

29 files changed

+1079
-62
lines changed

29 files changed

+1079
-62
lines changed

Documentation/trace/ftrace.rst

Lines changed: 132 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -324,6 +324,12 @@ of ftrace. Here is a list of some of the key files:
324324
"set_graph_function", or "set_graph_notrace".
325325
(See the section "dynamic ftrace" below for more details.)
326326

327+
available_filter_functions_addrs:
328+
329+
Similar to available_filter_functions, but with address displayed
330+
for each function. The displayed address is the patch-site address
331+
and can differ from /proc/kallsyms address.
332+
327333
dyn_ftrace_total_info:
328334

329335
This file is for debugging purposes. The number of functions that
@@ -1359,6 +1365,19 @@ Options for function_graph tracer:
13591365
only a closing curly bracket "}" is displayed for
13601366
the return of a function.
13611367

1368+
funcgraph-retval
1369+
When set, the return value of each traced function
1370+
will be printed after an equal sign "=". By default
1371+
this is off.
1372+
1373+
funcgraph-retval-hex
1374+
When set, the return value will always be printed
1375+
in hexadecimal format. If the option is not set and
1376+
the return value is an error code, it will be printed
1377+
in signed decimal format; otherwise it will also be
1378+
printed in hexadecimal format. By default, this option
1379+
is off.
1380+
13621381
sleep-time
13631382
When running function graph tracer, to include
13641383
the time a task schedules out in its function.
@@ -2704,6 +2723,119 @@ It is default disabled.
27042723
0) 1.757 us | } /* kmem_cache_free() */
27052724
0) 2.861 us | } /* putname() */
27062725

2726+
The return value of each traced function can be displayed after
2727+
an equal sign "=". When encountering system call failures, it
2728+
can be verfy helpful to quickly locate the function that first
2729+
returns an error code.
2730+
2731+
- hide: echo nofuncgraph-retval > trace_options
2732+
- show: echo funcgraph-retval > trace_options
2733+
2734+
Example with funcgraph-retval::
2735+
2736+
1) | cgroup_migrate() {
2737+
1) 0.651 us | cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
2738+
1) | cgroup_migrate_execute() {
2739+
1) | cpu_cgroup_can_attach() {
2740+
1) | cgroup_taskset_first() {
2741+
1) 0.732 us | cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
2742+
1) 1.232 us | } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
2743+
1) 0.380 us | sched_rt_can_attach(); /* = 0x0 */
2744+
1) 2.335 us | } /* cpu_cgroup_can_attach = -22 */
2745+
1) 4.369 us | } /* cgroup_migrate_execute = -22 */
2746+
1) 7.143 us | } /* cgroup_migrate = -22 */
2747+
2748+
The above example shows that the function cpu_cgroup_can_attach
2749+
returned the error code -22 firstly, then we can read the code
2750+
of this function to get the root cause.
2751+
2752+
When the option funcgraph-retval-hex is not set, the return value can
2753+
be displayed in a smart way. Specifically, if it is an error code,
2754+
it will be printed in signed decimal format, otherwise it will
2755+
printed in hexadecimal format.
2756+
2757+
- smart: echo nofuncgraph-retval-hex > trace_options
2758+
- hexadecimal: echo funcgraph-retval-hex > trace_options
2759+
2760+
Example with funcgraph-retval-hex::
2761+
2762+
1) | cgroup_migrate() {
2763+
1) 0.651 us | cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
2764+
1) | cgroup_migrate_execute() {
2765+
1) | cpu_cgroup_can_attach() {
2766+
1) | cgroup_taskset_first() {
2767+
1) 0.732 us | cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
2768+
1) 1.232 us | } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
2769+
1) 0.380 us | sched_rt_can_attach(); /* = 0x0 */
2770+
1) 2.335 us | } /* cpu_cgroup_can_attach = 0xffffffea */
2771+
1) 4.369 us | } /* cgroup_migrate_execute = 0xffffffea */
2772+
1) 7.143 us | } /* cgroup_migrate = 0xffffffea */
2773+
2774+
At present, there are some limitations when using the funcgraph-retval
2775+
option, and these limitations will be eliminated in the future:
2776+
2777+
- Even if the function return type is void, a return value will still
2778+
be printed, and you can just ignore it.
2779+
2780+
- Even if return values are stored in multiple registers, only the
2781+
value contained in the first register will be recorded and printed.
2782+
To illustrate, in the x86 architecture, eax and edx are used to store
2783+
a 64-bit return value, with the lower 32 bits saved in eax and the
2784+
upper 32 bits saved in edx. However, only the value stored in eax
2785+
will be recorded and printed.
2786+
2787+
- In certain procedure call standards, such as arm64's AAPCS64, when a
2788+
type is smaller than a GPR, it is the responsibility of the consumer
2789+
to perform the narrowing, and the upper bits may contain UNKNOWN values.
2790+
Therefore, it is advisable to check the code for such cases. For instance,
2791+
when using a u8 in a 64-bit GPR, bits [63:8] may contain arbitrary values,
2792+
especially when larger types are truncated, whether explicitly or implicitly.
2793+
Here are some specific cases to illustrate this point:
2794+
2795+
**Case One**:
2796+
2797+
The function narrow_to_u8 is defined as follows::
2798+
2799+
u8 narrow_to_u8(u64 val)
2800+
{
2801+
// implicitly truncated
2802+
return val;
2803+
}
2804+
2805+
It may be compiled to::
2806+
2807+
narrow_to_u8:
2808+
< ... ftrace instrumentation ... >
2809+
RET
2810+
2811+
If you pass 0x123456789abcdef to this function and want to narrow it,
2812+
it may be recorded as 0x123456789abcdef instead of 0xef.
2813+
2814+
**Case Two**:
2815+
2816+
The function error_if_not_4g_aligned is defined as follows::
2817+
2818+
int error_if_not_4g_aligned(u64 val)
2819+
{
2820+
if (val & GENMASK(31, 0))
2821+
return -EINVAL;
2822+
2823+
return 0;
2824+
}
2825+
2826+
It could be compiled to::
2827+
2828+
error_if_not_4g_aligned:
2829+
CBNZ w0, .Lnot_aligned
2830+
RET // bits [31:0] are zero, bits
2831+
// [63:32] are UNKNOWN
2832+
.Lnot_aligned:
2833+
MOV x0, #-EINVAL
2834+
RET
2835+
2836+
When passing 0x2_0000_0000 to it, the return value may be recorded as
2837+
0x2_0000_0000 instead of 0.
2838+
27072839
You can put some comments on specific functions by using
27082840
trace_printk() For example, if you want to put a comment inside
27092841
the __might_sleep() function, you just have to include

Documentation/trace/timerlat-tracer.rst

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -180,3 +180,81 @@ dummy_load_1ms_pd_init, which had the following code (on purpose)::
180180
return 0;
181181

182182
}
183+
184+
User-space interface
185+
---------------------------
186+
187+
Timerlat allows user-space threads to use timerlat infra-structure to
188+
measure scheduling latency. This interface is accessible via a per-CPU
189+
file descriptor inside $tracing_dir/osnoise/per_cpu/cpu$ID/timerlat_fd.
190+
191+
This interface is accessible under the following conditions:
192+
193+
- timerlat tracer is enable
194+
- osnoise workload option is set to NO_OSNOISE_WORKLOAD
195+
- The user-space thread is affined to a single processor
196+
- The thread opens the file associated with its single processor
197+
- Only one thread can access the file at a time
198+
199+
The open() syscall will fail if any of these conditions are not met.
200+
After opening the file descriptor, the user space can read from it.
201+
202+
The read() system call will run a timerlat code that will arm the
203+
timer in the future and wait for it as the regular kernel thread does.
204+
205+
When the timer IRQ fires, the timerlat IRQ will execute, report the
206+
IRQ latency and wake up the thread waiting in the read. The thread will be
207+
scheduled and report the thread latency via tracer - as for the kernel
208+
thread.
209+
210+
The difference from the in-kernel timerlat is that, instead of re-arming
211+
the timer, timerlat will return to the read() system call. At this point,
212+
the user can run any code.
213+
214+
If the application rereads the file timerlat file descriptor, the tracer
215+
will report the return from user-space latency, which is the total
216+
latency. If this is the end of the work, it can be interpreted as the
217+
response time for the request.
218+
219+
After reporting the total latency, timerlat will restart the cycle, arm
220+
a timer, and go to sleep for the following activation.
221+
222+
If at any time one of the conditions is broken, e.g., the thread migrates
223+
while in user space, or the timerlat tracer is disabled, the SIG_KILL
224+
signal will be sent to the user-space thread.
225+
226+
Here is an basic example of user-space code for timerlat::
227+
228+
int main(void)
229+
{
230+
char buffer[1024];
231+
int timerlat_fd;
232+
int retval;
233+
long cpu = 0; /* place in CPU 0 */
234+
cpu_set_t set;
235+
236+
CPU_ZERO(&set);
237+
CPU_SET(cpu, &set);
238+
239+
if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
240+
return 1;
241+
242+
snprintf(buffer, sizeof(buffer),
243+
"/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
244+
cpu);
245+
246+
timerlat_fd = open(buffer, O_RDONLY);
247+
if (timerlat_fd < 0) {
248+
printf("error opening %s: %s\n", buffer, strerror(errno));
249+
exit(1);
250+
}
251+
252+
for (;;) {
253+
retval = read(timerlat_fd, buffer, 1024);
254+
if (retval < 0)
255+
break;
256+
}
257+
258+
close(timerlat_fd);
259+
exit(0);
260+
}

arch/arm64/Kconfig

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -202,6 +202,7 @@ config ARM64
202202
select HAVE_FTRACE_MCOUNT_RECORD
203203
select HAVE_FUNCTION_TRACER
204204
select HAVE_FUNCTION_ERROR_INJECTION
205+
select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
205206
select HAVE_FUNCTION_GRAPH_TRACER
206207
select HAVE_GCC_PLUGINS
207208
select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && \

arch/arm64/include/asm/ftrace.h

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -192,4 +192,26 @@ static inline bool arch_syscall_match_sym_name(const char *sym,
192192
}
193193
#endif /* ifndef __ASSEMBLY__ */
194194

195+
#ifndef __ASSEMBLY__
196+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
197+
struct fgraph_ret_regs {
198+
/* x0 - x7 */
199+
unsigned long regs[8];
200+
201+
unsigned long fp;
202+
unsigned long __unused;
203+
};
204+
205+
static inline unsigned long fgraph_ret_regs_return_value(struct fgraph_ret_regs *ret_regs)
206+
{
207+
return ret_regs->regs[0];
208+
}
209+
210+
static inline unsigned long fgraph_ret_regs_frame_pointer(struct fgraph_ret_regs *ret_regs)
211+
{
212+
return ret_regs->fp;
213+
}
214+
#endif /* ifdef CONFIG_FUNCTION_GRAPH_TRACER */
215+
#endif
216+
195217
#endif /* __ASM_FTRACE_H */

arch/arm64/kernel/asm-offsets.c

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -200,6 +200,19 @@ int main(void)
200200
#endif
201201
#ifdef CONFIG_FUNCTION_TRACER
202202
DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
203+
#endif
204+
BLANK();
205+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
206+
DEFINE(FGRET_REGS_X0, offsetof(struct fgraph_ret_regs, regs[0]));
207+
DEFINE(FGRET_REGS_X1, offsetof(struct fgraph_ret_regs, regs[1]));
208+
DEFINE(FGRET_REGS_X2, offsetof(struct fgraph_ret_regs, regs[2]));
209+
DEFINE(FGRET_REGS_X3, offsetof(struct fgraph_ret_regs, regs[3]));
210+
DEFINE(FGRET_REGS_X4, offsetof(struct fgraph_ret_regs, regs[4]));
211+
DEFINE(FGRET_REGS_X5, offsetof(struct fgraph_ret_regs, regs[5]));
212+
DEFINE(FGRET_REGS_X6, offsetof(struct fgraph_ret_regs, regs[6]));
213+
DEFINE(FGRET_REGS_X7, offsetof(struct fgraph_ret_regs, regs[7]));
214+
DEFINE(FGRET_REGS_FP, offsetof(struct fgraph_ret_regs, fp));
215+
DEFINE(FGRET_REGS_SIZE, sizeof(struct fgraph_ret_regs));
203216
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
204217
DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
205218
#endif

arch/arm64/kernel/entry-ftrace.S

Lines changed: 14 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -330,22 +330,23 @@ SYM_FUNC_END(ftrace_stub_graph)
330330
*/
331331
SYM_CODE_START(return_to_handler)
332332
/* save return value regs */
333-
sub sp, sp, #64
334-
stp x0, x1, [sp]
335-
stp x2, x3, [sp, #16]
336-
stp x4, x5, [sp, #32]
337-
stp x6, x7, [sp, #48]
333+
sub sp, sp, #FGRET_REGS_SIZE
334+
stp x0, x1, [sp, #FGRET_REGS_X0]
335+
stp x2, x3, [sp, #FGRET_REGS_X2]
336+
stp x4, x5, [sp, #FGRET_REGS_X4]
337+
stp x6, x7, [sp, #FGRET_REGS_X6]
338+
str x29, [sp, #FGRET_REGS_FP] // parent's fp
338339

339-
mov x0, x29 // parent's fp
340-
bl ftrace_return_to_handler// addr = ftrace_return_to_hander(fp);
341-
mov x30, x0 // restore the original return address
340+
mov x0, sp
341+
bl ftrace_return_to_handler // addr = ftrace_return_to_hander(regs);
342+
mov x30, x0 // restore the original return address
342343

343344
/* restore return value regs */
344-
ldp x0, x1, [sp]
345-
ldp x2, x3, [sp, #16]
346-
ldp x4, x5, [sp, #32]
347-
ldp x6, x7, [sp, #48]
348-
add sp, sp, #64
345+
ldp x0, x1, [sp, #FGRET_REGS_X0]
346+
ldp x2, x3, [sp, #FGRET_REGS_X2]
347+
ldp x4, x5, [sp, #FGRET_REGS_X4]
348+
ldp x6, x7, [sp, #FGRET_REGS_X6]
349+
add sp, sp, #FGRET_REGS_SIZE
349350

350351
ret
351352
SYM_CODE_END(return_to_handler)

arch/loongarch/Kconfig

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,7 @@ config LOONGARCH
111111
select HAVE_FTRACE_MCOUNT_RECORD
112112
select HAVE_FUNCTION_ARG_ACCESS_API
113113
select HAVE_FUNCTION_ERROR_INJECTION
114+
select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
114115
select HAVE_FUNCTION_GRAPH_TRACER
115116
select HAVE_FUNCTION_TRACER
116117
select HAVE_GENERIC_VDSO

arch/loongarch/include/asm/ftrace.h

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,4 +100,26 @@ __arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
100100

101101
#endif /* CONFIG_FUNCTION_TRACER */
102102

103+
#ifndef __ASSEMBLY__
104+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
105+
struct fgraph_ret_regs {
106+
/* a0 - a1 */
107+
unsigned long regs[2];
108+
109+
unsigned long fp;
110+
unsigned long __unused;
111+
};
112+
113+
static inline unsigned long fgraph_ret_regs_return_value(struct fgraph_ret_regs *ret_regs)
114+
{
115+
return ret_regs->regs[0];
116+
}
117+
118+
static inline unsigned long fgraph_ret_regs_frame_pointer(struct fgraph_ret_regs *ret_regs)
119+
{
120+
return ret_regs->fp;
121+
}
122+
#endif /* ifdef CONFIG_FUNCTION_GRAPH_TRACER */
123+
#endif
124+
103125
#endif /* _ASM_LOONGARCH_FTRACE_H */

arch/loongarch/kernel/asm-offsets.c

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@
1212
#include <asm/cpu-info.h>
1313
#include <asm/ptrace.h>
1414
#include <asm/processor.h>
15+
#include <asm/ftrace.h>
1516

1617
void output_ptreg_defines(void)
1718
{
@@ -264,11 +265,23 @@ void output_smpboot_defines(void)
264265
#ifdef CONFIG_HIBERNATION
265266
void output_pbe_defines(void)
266267
{
267-
COMMENT(" Linux struct pbe offsets. ");
268+
COMMENT("Linux struct pbe offsets.");
268269
OFFSET(PBE_ADDRESS, pbe, address);
269270
OFFSET(PBE_ORIG_ADDRESS, pbe, orig_address);
270271
OFFSET(PBE_NEXT, pbe, next);
271272
DEFINE(PBE_SIZE, sizeof(struct pbe));
272273
BLANK();
273274
}
274275
#endif
276+
277+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
278+
void output_fgraph_ret_regs_defines(void)
279+
{
280+
COMMENT("LoongArch fgraph_ret_regs offsets.");
281+
OFFSET(FGRET_REGS_A0, fgraph_ret_regs, regs[0]);
282+
OFFSET(FGRET_REGS_A1, fgraph_ret_regs, regs[1]);
283+
OFFSET(FGRET_REGS_FP, fgraph_ret_regs, fp);
284+
DEFINE(FGRET_REGS_SIZE, sizeof(struct fgraph_ret_regs));
285+
BLANK();
286+
}
287+
#endif

0 commit comments

Comments
 (0)