Skip to content

Commit b72b5fe

Browse files
committed
Merge tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt: - Add function names as a way to filter function addresses - Add sample module to test ftrace ops and dynamic trampolines - Allow stack traces to be passed from beginning event to end event for synthetic events. This will allow seeing the stack trace of when a task is scheduled out and recorded when it gets scheduled back in. - Add trace event helper __get_buf() to use as a temporary buffer when printing out trace event output. - Add kernel command line to create trace instances on boot up. - Add enabling of events to instances created at boot up. - Add trace_array_puts() to write into instances. - Allow boot instances to take a snapshot at the end of boot up. - Allow live patch modules to include trace events - Minor fixes and clean ups * tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (31 commits) tracing: Remove unnecessary NULL assignment tracepoint: Allow livepatch module add trace event tracing: Always use canonical ftrace path tracing/histogram: Fix stacktrace histogram Documententation tracing/histogram: Fix stacktrace key tracing/histogram: Fix a few problems with stacktrace variable printing tracing: Add BUILD_BUG() to make sure stacktrace fits in strings tracing/histogram: Don't use strlen to find length of stacktrace variables tracing: Allow boot instances to have snapshot buffers tracing: Add trace_array_puts() to write into instance tracing: Add enabling of events to boot instances tracing: Add creation of instances at boot command line tracing: Fix trace_event_raw_event_synth() if else statement samples: ftrace: Make some global variables static ftrace: sample: avoid open-coded 64-bit division samples: ftrace: Include the nospec-branch.h only for x86 tracing: Acquire buffer from temparary trace sequence tracing/histogram: Wrap remaining shell snippets in code blocks tracing/osnoise: No need for schedule_hrtimeout range bpf/tracing: Use stage6 of tracing to not duplicate macros ...
2 parents 9191423 + 7568a21 commit b72b5fe

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

42 files changed

+1113
-221
lines changed

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1509,6 +1509,15 @@
15091509
boot up that is likely to be overridden by user space
15101510
start up functionality.
15111511

1512+
Optionally, the snapshot can also be defined for a tracing
1513+
instance that was created by the trace_instance= command
1514+
line parameter.
1515+
1516+
trace_instance=foo,sched_switch ftrace_boot_snapshot=foo
1517+
1518+
The above will cause the "foo" tracing instance to trigger
1519+
a snapshot at the end of boot up.
1520+
15121521
ftrace_dump_on_oops[=orig_cpu]
15131522
[FTRACE] will dump the trace buffers on oops.
15141523
If no parameter is passed, ftrace will dump
@@ -6283,6 +6292,26 @@
62836292
comma-separated list of trace events to enable. See
62846293
also Documentation/trace/events.rst
62856294

6295+
trace_instance=[instance-info]
6296+
[FTRACE] Create a ring buffer instance early in boot up.
6297+
This will be listed in:
6298+
6299+
/sys/kernel/tracing/instances
6300+
6301+
Events can be enabled at the time the instance is created
6302+
via:
6303+
6304+
trace_instance=<name>,<system1>:<event1>,<system2>:<event2>
6305+
6306+
Note, the "<system*>:" portion is optional if the event is
6307+
unique.
6308+
6309+
trace_instance=foo,sched:sched_switch,irq_handler_entry,initcall
6310+
6311+
will enable the "sched_switch" event (note, the "sched:" is optional, and
6312+
the same thing would happen if it was left off). The irq_handler_entry
6313+
event, and all events under the "initcall" system.
6314+
62866315
trace_options=[option-list]
62876316
[FTRACE] Enable or disable tracer options at boot.
62886317
The option-list is a comma delimited list of options

Documentation/trace/events.rst

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -207,6 +207,18 @@ field name::
207207
As the kernel will have to know how to retrieve the memory that the pointer
208208
is at from user space.
209209

210+
You can convert any long type to a function address and search by function name::
211+
212+
call_site.function == security_prepare_creds
213+
214+
The above will filter when the field "call_site" falls on the address within
215+
"security_prepare_creds". That is, it will compare the value of "call_site" and
216+
the filter will return true if it is greater than or equal to the start of
217+
the function "security_prepare_creds" and less than the end of that function.
218+
219+
The ".function" postfix can only be attached to values of size long, and can only
220+
be compared with "==" or "!=".
221+
210222
5.2 Setting filters
211223
-------------------
212224

Documentation/trace/histogram.rst

Lines changed: 198 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -81,6 +81,7 @@ Documentation written by Tom Zanussi
8181
.usecs display a common_timestamp in microseconds
8282
.percent display a number of percentage value
8383
.graph display a bar-graph of a value
84+
.stacktrace display as a stacktrace (must by a long[] type)
8485
============= =================================================
8586

8687
Note that in general the semantics of a given field aren't
@@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression::
17861787
# echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger
17871788
# echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger
17881789

1790+
Variables can even hold stacktraces, which are useful with synthetic events.
1791+
17891792
2.2.2 Synthetic Events
17901793
----------------------
17911794

@@ -1861,7 +1864,7 @@ A histogram can now be defined for the new synthetic event::
18611864
The above shows the latency "lat" in a power of 2 grouping.
18621865

18631866
Like any other event, once a histogram is enabled for the event, the
1864-
output can be displayed by reading the event's 'hist' file.
1867+
output can be displayed by reading the event's 'hist' file::
18651868

18661869
# cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist
18671870

@@ -1908,7 +1911,7 @@ output can be displayed by reading the event's 'hist' file.
19081911

19091912

19101913
The latency values can also be grouped linearly by a given size with
1911-
the ".buckets" modifier and specify a size (in this case groups of 10).
1914+
the ".buckets" modifier and specify a size (in this case groups of 10)::
19121915

19131916
# echo 'hist:keys=pid,prio,lat.buckets=10:sort=lat' >> \
19141917
/sys/kernel/tracing/events/synthetic/wakeup_latency/trigger
@@ -1940,6 +1943,157 @@ the ".buckets" modifier and specify a size (in this case groups of 10).
19401943
Entries: 16
19411944
Dropped: 0
19421945

1946+
To save stacktraces, create a synthetic event with a field of type "unsigned long[]"
1947+
or even just "long[]". For example, to see how long a task is blocked in an
1948+
uninterruptible state::
1949+
1950+
# cd /sys/kernel/tracing
1951+
# echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
1952+
# echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger
1953+
# echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
1954+
# echo 1 > events/synthetic/block_lat/enable
1955+
# cat trace
1956+
1957+
# tracer: nop
1958+
#
1959+
# entries-in-buffer/entries-written: 2/2 #P:8
1960+
#
1961+
# _-----=> irqs-off/BH-disabled
1962+
# / _----=> need-resched
1963+
# | / _---=> hardirq/softirq
1964+
# || / _--=> preempt-depth
1965+
# ||| / _-=> migrate-disable
1966+
# |||| / delay
1967+
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
1968+
# | | | ||||| | |
1969+
<idle>-0 [005] d..4. 521.164922: block_lat: pid=0 delta=8322 stack=STACK:
1970+
=> __schedule+0x448/0x7b0
1971+
=> schedule+0x5a/0xb0
1972+
=> io_schedule+0x42/0x70
1973+
=> bit_wait_io+0xd/0x60
1974+
=> __wait_on_bit+0x4b/0x140
1975+
=> out_of_line_wait_on_bit+0x91/0xb0
1976+
=> jbd2_journal_commit_transaction+0x1679/0x1a70
1977+
=> kjournald2+0xa9/0x280
1978+
=> kthread+0xe9/0x110
1979+
=> ret_from_fork+0x2c/0x50
1980+
1981+
<...>-2 [004] d..4. 525.184257: block_lat: pid=2 delta=76 stack=STACK:
1982+
=> __schedule+0x448/0x7b0
1983+
=> schedule+0x5a/0xb0
1984+
=> schedule_timeout+0x11a/0x150
1985+
=> wait_for_completion_killable+0x144/0x1f0
1986+
=> __kthread_create_on_node+0xe7/0x1e0
1987+
=> kthread_create_on_node+0x51/0x70
1988+
=> create_worker+0xcc/0x1a0
1989+
=> worker_thread+0x2ad/0x380
1990+
=> kthread+0xe9/0x110
1991+
=> ret_from_fork+0x2c/0x50
1992+
1993+
A synthetic event that has a stacktrace field may use it as a key in
1994+
histogram::
1995+
1996+
# echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
1997+
# cat events/synthetic/block_lat/hist
1998+
1999+
# event histogram
2000+
#
2001+
# trigger info: hist:keys=delta.buckets=100,stack.stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
2002+
#
2003+
{ delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2004+
schedule+0x6b/0x110
2005+
io_schedule+0x46/0x80
2006+
bit_wait_io+0x11/0x80
2007+
__wait_on_bit+0x4e/0x120
2008+
out_of_line_wait_on_bit+0x8d/0xb0
2009+
__wait_on_buffer+0x33/0x40
2010+
jbd2_journal_commit_transaction+0x155a/0x19b0
2011+
kjournald2+0xab/0x270
2012+
kthread+0xfa/0x130
2013+
ret_from_fork+0x29/0x50
2014+
} hitcount: 1
2015+
{ delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2016+
schedule+0x6b/0x110
2017+
io_schedule+0x46/0x80
2018+
rq_qos_wait+0xd0/0x170
2019+
wbt_wait+0x9e/0xf0
2020+
__rq_qos_throttle+0x25/0x40
2021+
blk_mq_submit_bio+0x2c3/0x5b0
2022+
__submit_bio+0xff/0x190
2023+
submit_bio_noacct_nocheck+0x25b/0x2b0
2024+
submit_bio_noacct+0x20b/0x600
2025+
submit_bio+0x28/0x90
2026+
ext4_bio_write_page+0x1e0/0x8c0
2027+
mpage_submit_page+0x60/0x80
2028+
mpage_process_page_bufs+0x16c/0x180
2029+
mpage_prepare_extent_to_map+0x23f/0x530
2030+
} hitcount: 1
2031+
{ delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2032+
schedule+0x6b/0x110
2033+
schedule_hrtimeout_range_clock+0x97/0x110
2034+
schedule_hrtimeout_range+0x13/0x20
2035+
usleep_range_state+0x65/0x90
2036+
__intel_wait_for_register+0x1c1/0x230 [i915]
2037+
intel_psr_wait_for_idle_locked+0x171/0x2a0 [i915]
2038+
intel_pipe_update_start+0x169/0x360 [i915]
2039+
intel_update_crtc+0x112/0x490 [i915]
2040+
skl_commit_modeset_enables+0x199/0x600 [i915]
2041+
intel_atomic_commit_tail+0x7c4/0x1080 [i915]
2042+
intel_atomic_commit_work+0x12/0x20 [i915]
2043+
process_one_work+0x21c/0x3f0
2044+
worker_thread+0x50/0x3e0
2045+
kthread+0xfa/0x130
2046+
} hitcount: 3
2047+
{ delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2048+
schedule+0x6b/0x110
2049+
schedule_timeout+0x11e/0x160
2050+
__wait_for_common+0x8f/0x190
2051+
wait_for_completion+0x24/0x30
2052+
__flush_work.isra.0+0x1cc/0x360
2053+
flush_work+0xe/0x20
2054+
drm_mode_rmfb+0x18b/0x1d0 [drm]
2055+
drm_mode_rmfb_ioctl+0x10/0x20 [drm]
2056+
drm_ioctl_kernel+0xb8/0x150 [drm]
2057+
drm_ioctl+0x243/0x560 [drm]
2058+
__x64_sys_ioctl+0x92/0xd0
2059+
do_syscall_64+0x59/0x90
2060+
entry_SYSCALL_64_after_hwframe+0x72/0xdc
2061+
} hitcount: 1
2062+
{ delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
2063+
schedule+0x6b/0x110
2064+
schedule_timeout+0x87/0x160
2065+
__wait_for_common+0x8f/0x190
2066+
wait_for_completion_timeout+0x1d/0x30
2067+
drm_atomic_helper_wait_for_flip_done+0x57/0x90 [drm_kms_helper]
2068+
intel_atomic_commit_tail+0x8ce/0x1080 [i915]
2069+
intel_atomic_commit_work+0x12/0x20 [i915]
2070+
process_one_work+0x21c/0x3f0
2071+
worker_thread+0x50/0x3e0
2072+
kthread+0xfa/0x130
2073+
ret_from_fork+0x29/0x50
2074+
} hitcount: 1
2075+
{ delta: ~ 100-199, stack.stacktrace __schedule+0xa19/0x1520
2076+
schedule+0x6b/0x110
2077+
schedule_hrtimeout_range_clock+0x97/0x110
2078+
schedule_hrtimeout_range+0x13/0x20
2079+
usleep_range_state+0x65/0x90
2080+
pci_set_low_power_state+0x17f/0x1f0
2081+
pci_set_power_state+0x49/0x250
2082+
pci_finish_runtime_suspend+0x4a/0x90
2083+
pci_pm_runtime_suspend+0xcb/0x1b0
2084+
__rpm_callback+0x48/0x120
2085+
rpm_callback+0x67/0x70
2086+
rpm_suspend+0x167/0x780
2087+
rpm_idle+0x25a/0x380
2088+
pm_runtime_work+0x93/0xc0
2089+
process_one_work+0x21c/0x3f0
2090+
} hitcount: 1
2091+
2092+
Totals:
2093+
Hits: 10
2094+
Entries: 7
2095+
Dropped: 0
2096+
19432097
2.2.3 Hist trigger 'handlers' and 'actions'
19442098
-------------------------------------------
19452099

@@ -2054,11 +2208,11 @@ The following commonly-used handler.action pairs are available:
20542208
wakeup_new_test($testpid) if comm=="cyclictest"' >> \
20552209
/sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
20562210

2057-
Or, equivalently, using the 'trace' keyword syntax:
2211+
Or, equivalently, using the 'trace' keyword syntax::
20582212

2059-
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
2060-
trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
2061-
/sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
2213+
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
2214+
trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
2215+
/sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
20622216

20632217
Creating and displaying a histogram based on those events is now
20642218
just a matter of using the fields and new synthetic event in the
@@ -2191,48 +2345,48 @@ The following commonly-used handler.action pairs are available:
21912345
resulting latency, stored in wakeup_lat, exceeds the current
21922346
maximum latency, a snapshot is taken. As part of the setup, all
21932347
the scheduler events are also enabled, which are the events that
2194-
will show up in the snapshot when it is taken at some point:
2348+
will show up in the snapshot when it is taken at some point::
21952349

2196-
# echo 1 > /sys/kernel/tracing/events/sched/enable
2350+
# echo 1 > /sys/kernel/tracing/events/sched/enable
21972351

2198-
# echo 'hist:keys=pid:ts0=common_timestamp.usecs \
2199-
if comm=="cyclictest"' >> \
2200-
/sys/kernel/tracing/events/sched/sched_waking/trigger
2352+
# echo 'hist:keys=pid:ts0=common_timestamp.usecs \
2353+
if comm=="cyclictest"' >> \
2354+
/sys/kernel/tracing/events/sched/sched_waking/trigger
22012355

2202-
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
2203-
onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
2204-
prev_comm):onmax($wakeup_lat).snapshot() \
2205-
if next_comm=="cyclictest"' >> \
2206-
/sys/kernel/tracing/events/sched/sched_switch/trigger
2356+
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
2357+
onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
2358+
prev_comm):onmax($wakeup_lat).snapshot() \
2359+
if next_comm=="cyclictest"' >> \
2360+
/sys/kernel/tracing/events/sched/sched_switch/trigger
22072361

22082362
When the histogram is displayed, for each bucket the max value
22092363
and the saved values corresponding to the max are displayed
22102364
following the rest of the fields.
22112365

22122366
If a snapshot was taken, there is also a message indicating that,
2213-
along with the value and event that triggered the global maximum:
2367+
along with the value and event that triggered the global maximum::
22142368

2215-
# cat /sys/kernel/tracing/events/sched/sched_switch/hist
2216-
{ next_pid: 2101 } hitcount: 200
2217-
max: 52 next_prio: 120 next_comm: cyclictest \
2218-
prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
2369+
# cat /sys/kernel/tracing/events/sched/sched_switch/hist
2370+
{ next_pid: 2101 } hitcount: 200
2371+
max: 52 next_prio: 120 next_comm: cyclictest \
2372+
prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
22192373

2220-
{ next_pid: 2103 } hitcount: 1326
2221-
max: 572 next_prio: 19 next_comm: cyclictest \
2222-
prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
2374+
{ next_pid: 2103 } hitcount: 1326
2375+
max: 572 next_prio: 19 next_comm: cyclictest \
2376+
prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
22232377

2224-
{ next_pid: 2102 } hitcount: 1982 \
2225-
max: 74 next_prio: 19 next_comm: cyclictest \
2226-
prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
2378+
{ next_pid: 2102 } hitcount: 1982 \
2379+
max: 74 next_prio: 19 next_comm: cyclictest \
2380+
prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
22272381

2228-
Snapshot taken (see tracing/snapshot). Details:
2229-
triggering value { onmax($wakeup_lat) }: 572 \
2230-
triggered by event with key: { next_pid: 2103 }
2382+
Snapshot taken (see tracing/snapshot). Details:
2383+
triggering value { onmax($wakeup_lat) }: 572 \
2384+
triggered by event with key: { next_pid: 2103 }
22312385

2232-
Totals:
2233-
Hits: 3508
2234-
Entries: 3
2235-
Dropped: 0
2386+
Totals:
2387+
Hits: 3508
2388+
Entries: 3
2389+
Dropped: 0
22362390

22372391
In the above case, the event that triggered the global maximum has
22382392
the key with next_pid == 2103. If you look at the bucket that has
@@ -2310,15 +2464,15 @@ The following commonly-used handler.action pairs are available:
23102464
$cwnd variable. If the value has changed, a snapshot is taken.
23112465
As part of the setup, all the scheduler and tcp events are also
23122466
enabled, which are the events that will show up in the snapshot
2313-
when it is taken at some point:
2467+
when it is taken at some point::
23142468

2315-
# echo 1 > /sys/kernel/tracing/events/sched/enable
2316-
# echo 1 > /sys/kernel/tracing/events/tcp/enable
2469+
# echo 1 > /sys/kernel/tracing/events/sched/enable
2470+
# echo 1 > /sys/kernel/tracing/events/tcp/enable
23172471

2318-
# echo 'hist:keys=dport:cwnd=snd_cwnd: \
2319-
onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
2320-
onchange($cwnd).snapshot()' >> \
2321-
/sys/kernel/tracing/events/tcp/tcp_probe/trigger
2472+
# echo 'hist:keys=dport:cwnd=snd_cwnd: \
2473+
onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
2474+
onchange($cwnd).snapshot()' >> \
2475+
/sys/kernel/tracing/events/tcp/tcp_probe/trigger
23222476

23232477
When the histogram is displayed, for each bucket the tracked value
23242478
and the saved values corresponding to that value are displayed
@@ -2341,10 +2495,10 @@ The following commonly-used handler.action pairs are available:
23412495
{ dport: 443 } hitcount: 211
23422496
changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800
23432497

2344-
Snapshot taken (see tracing/snapshot). Details::
2498+
Snapshot taken (see tracing/snapshot). Details:
23452499

2346-
triggering value { onchange($cwnd) }: 10
2347-
triggered by event with key: { dport: 80 }
2500+
triggering value { onchange($cwnd) }: 10
2501+
triggered by event with key: { dport: 80 }
23482502

23492503
Totals:
23502504
Hits: 414

include/linux/kernel.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -297,7 +297,7 @@ bool mac_pton(const char *s, u8 *mac);
297297
*
298298
* Use tracing_on/tracing_off when you want to quickly turn on or off
299299
* tracing. It simply enables or disables the recording of the trace events.
300-
* This also corresponds to the user space /sys/kernel/debug/tracing/tracing_on
300+
* This also corresponds to the user space /sys/kernel/tracing/tracing_on
301301
* file, which gives a means for the kernel and userspace to interact.
302302
* Place a tracing_off() in the kernel where you want tracing to end.
303303
* From user space, examine the trace, and then echo 1 > tracing_on

0 commit comments

Comments
 (0)