Snapshot size caveats
-snapshot-size _ looks like a simple argument at first glance, but it is not. This page explains what is actually going on when you set it.
perf has two overlapping (non-orthogonal) ways to configure snapshot size: "changing the snapshot size" with -S{int}, and "changing the auxtrace mmap size" with -m,{int}. The docs for the former say that its default is the latter, so, to keep things simple, magic-trace only provides a way to configure the latter.
Changing the "auxtrace mmap size" changes the number of memory pages that perf mlocks (i.e. prevents from swapping to disk) to store Intel PT trace data. That data is much more compact than Fuchsia trace files, so even a small value here can still decode into a large Fuchsia trace file.
The auxtrace mmap size comes with some restrictions that magic-trace tries to fix for you automatically:
- The size must be an integer number of pages.
- The number of pages must be a power of 2.
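The two restrictions above can be sketched as a small rounding helper. This is only an illustration, not magic-trace's actual code: the function name is hypothetical, and rounding up (rather than down) is an assumption, since this page does not say which direction magic-trace rounds.

```python
import mmap


def round_up_to_pow2_pages(size_bytes: int, page_size: int = mmap.PAGESIZE) -> int:
    """Return the smallest size >= size_bytes that is a power-of-two number of pages.

    Hypothetical helper: rounding *up* is an assumption made for this sketch.
    """
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    # Restriction 1: a whole number of pages (ceiling division).
    pages = -(-size_bytes // page_size)
    # Restriction 2: the page count must be a power of two.
    pow2_pages = 1 << (pages - 1).bit_length()
    return pow2_pages * page_size


if __name__ == "__main__":
    # With 4 KiB pages, a request for 3 pages becomes 4 pages.
    print(round_up_to_pow2_pages(3 * 4096, page_size=4096))
```

A request that already satisfies both restrictions, such as 1 MiB with 4 KiB pages, comes back unchanged.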
Another caveat is that the "snapshot size" refers to the size of the compressed Intel PT stream, not to the size of the Fuchsia trace file, yet it is large Fuchsia trace files that the trace viewer struggles with. In general, those two sizes are correlated, but not always: if your program is just a single for loop in a function, the Intel PT trace will contain lots of jumps (and therefore be large) but the Fuchsia trace will only see one function call (and therefore be small).
The perf docs for configuring this are pretty hard to follow:
new snapshot option
The difference between full trace and snapshot from the kernel’s
perspective is that in full trace we don’t overwrite trace data
that the user hasn’t collected yet (and indicated that by
advancing aux_tail), whereas in snapshot mode we let the trace
run and overwrite older data in the buffer so that whenever
something interesting happens, we can stop it and grab a snapshot
of what was going on around that interesting moment.
To select snapshot mode a new option has been added:
-S
Optionally it can be followed by the snapshot size e.g.
-S0x100000
The default snapshot size is the auxtrace mmap size. If neither
auxtrace mmap size nor snapshot size is specified, then the
default is 4MiB for privileged users (or if
/proc/sys/kernel/perf_event_paranoid < 0), 128KiB for
unprivileged users. If an unprivileged user does not specify mmap
pages, the mmap pages will be reduced as described in the new
auxtrace mmap size option section below.
The snapshot size is displayed if the option -vv is used e.g.
Intel PT snapshot size: %zu
new auxtrace mmap size option
Intel PT buffer size is specified by an addition to the -m option
e.g.
-m,16
selects a buffer size of 16 pages i.e. 64KiB.
Note that the existing functionality of -m is unchanged. The
auxtrace mmap size is specified by the optional addition of a
comma and the value.
The default auxtrace mmap size for Intel PT is 4MiB/page_size for
privileged users (or if /proc/sys/kernel/perf_event_paranoid <
0), 128KiB for unprivileged users. If an unprivileged user does
not specify mmap pages, the mmap pages will be reduced from the
default 512KiB/page_size to 256KiB/page_size, otherwise the user
is likely to get an error as they exceed their mlock limit (Max
locked memory as shown in /proc/self/limits). Note that perf does
not count the first 512KiB (actually
/proc/sys/kernel/perf_event_mlock_kb minus 1 page) per cpu
against the mlock limit so an unprivileged user is allowed 512KiB
per cpu plus their mlock limit (which defaults to 64KiB but is
not multiplied by the number of cpus).
In full-trace mode, powers of two are allowed for buffer size,
with a minimum size of 2 pages. In snapshot mode or sampling
mode, it is the same but the minimum size is 1 page.
The mmap size and auxtrace mmap size are displayed if the -vv
option is used e.g.
mmap length 528384
auxtrace mmap length 4198400
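To make the numbers in the quoted docs easier to check, here is a minimal sketch that converts between page counts and byte sizes. It assumes 4 KiB pages, which is what the docs' own "-m,16 ... 64KiB" example implies, and the helper names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as implied by "-m,16 ... 64KiB"


def pages_to_bytes(pages: int, page_size: int = PAGE_SIZE) -> int:
    """Byte size of a buffer given as a page count, e.g. the N in -m,N."""
    return pages * page_size


def length_to_pages(length_bytes: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Split a byte length (as printed by -vv) into (whole pages, leftover bytes)."""
    return divmod(length_bytes, page_size)


if __name__ == "__main__":
    print(pages_to_bytes(16))        # size selected by -m,16
    print(length_to_pages(528384))   # "mmap length" from the example
    print(length_to_pages(4198400))  # "auxtrace mmap length" from the example
```

Under that page-size assumption, the two example lengths come out to 129 and 1025 whole pages, i.e. 512KiB and 4MiB plus one page each. The quoted docs do not explain the extra page, so take that as an observation about the example output, not as a documented rule.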