Sampling profiler #29304

@Jongy

I wrote a (very) initial PoC for a sampling profiler (tested on qemu_x86_64). A friend of mine, @JonBruchim, uses Zephyr and has had some performance issues. I saw that while Zephyr has some tracing/profiling tools, it doesn't have a sampling profiler. As a frequent user of Linux's perf myself, I wanted to toy with creating a sampling profiler from scratch for an existing system. Figured it's a good opportunity :).

So this PoC is:

  • a sampling profiler based on existing clock updates that generates the call stack via fp-based unwinding.
  • the "profiler" can be enabled for a specified duration using the shell command perf record <duration ms>.
  • the shell command perf printbuf dumps the recorded samples.
  • the samples are then converted with a Python script to the "collapsed" stacks format as used by flamegraph.pl; you can then use flamegraph.pl to get a flamegraph.
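To make the last step concrete, here is a minimal sketch of the "collapsed" format that flamegraph.pl consumes: one line per unique stack, frames joined by `;` from root to leaf, followed by a sample count. This is not the script from the branch. It assumes the samples have already been parsed and symbolized into lists of function names ordered leaf first (the order an unwinder naturally produces), and the function name `collapse` is made up for illustration.

```python
from collections import Counter


def collapse(samples):
    """Fold symbolized stacks into flamegraph.pl's collapsed format.

    samples: iterable of stacks, each a list of function names ordered
    leaf first (hypothetical input shape, not the PoC's dump format).
    Returns a sorted list of "root;...;leaf count" lines.
    """
    counts = Counter()
    for stack in samples:
        if not stack:
            continue  # skip empty samples
        # flamegraph.pl expects root first, so reverse the leaf-first stack
        counts[";".join(reversed(stack))] += 1
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


if __name__ == "__main__":
    demo = [
        ["k_mem_slab_free", "net_pkt_unref", "main"],
        ["k_mem_slab_free", "net_pkt_unref", "main"],
        ["idle", "main"],
    ]
    print("\n".join(collapse(demo)))
```

The printed lines can be written to a file and passed to flamegraph.pl, which produces the SVG.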

I believe it's a useful addition to Zephyr's profiling capabilities, hence I'm sharing it here as a feature idea.
For example, here are some graphs generated from the net/sockets/echo_server sample:

Under heavy load:
[Image: zephyr flamegraph]

More lightweight:
[Image: zephyr flamegraph 2]

(Not sharing the .svgs because GitHub doesn't support uploading these 😒 )

Not sure what k_mem_slab_free is doing there so long, perhaps the simplistic stack unwinding is lying? Anyway, other stacks make sense to me.

It currently requires the following configs:

  • CONFIG_SMP=n, since it's simpler. SMP can be supported by employing a per-CPU sampling buffer, or by guarding concurrent access to the single buffer with locks.
  • CONFIG_TICKLESS_KERNEL=n so z_clock_announce is called at set intervals.
  • The system must be built with frame pointers (used for unwinding).
  • CONFIG_THREAD_STACK_INFO=y so I can access current's registers easily.
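The frame-pointer requirement exists because the unwinder follows the chain of saved frame pointers. On x86_64 with frame pointers enabled, the word at `[rbp]` holds the caller's saved `rbp` and the word at `[rbp + 8]` holds the return address. Below is a simulation of that walk in Python, not the C code from the branch: target memory is modeled as a dict, and the stack-bounds check, the depth limit, and the name `unwind` are assumptions added for illustration.

```python
WORD = 8  # bytes per stack slot on x86_64


def unwind(memory, fp, pc, stack_lo, stack_hi, max_depth=32):
    """Walk saved frame pointers and return code addresses, leaf first.

    memory: dict mapping address -> 64-bit word (stand-in for target RAM).
    fp, pc: frame pointer and program counter of the interrupted context.
    stack_lo, stack_hi: assumed bounds of the current thread's stack.
    """
    trace = [pc]
    while len(trace) < max_depth:
        # The frame pointer must be aligned and both slots must lie in the stack.
        if fp % WORD or not (stack_lo <= fp and fp + 2 * WORD <= stack_hi):
            break
        saved_fp = memory.get(fp)
        ret_addr = memory.get(fp + WORD)
        if saved_fp is None or not ret_addr:
            break
        trace.append(ret_addr)
        # The stack grows down, so each caller frame sits at a higher address.
        # Anything else means the chain ended or is corrupt.
        if saved_fp <= fp:
            break
        fp = saved_fp
    return trace
```

The sanity checks matter because a sample can land in code that has not set up its frame yet, in which case the chain may point at garbage; stopping early gives a truncated stack rather than a fault inside the timer interrupt.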

You can see the current work in the perf branch of my Zephyr fork.
If you think such a profiler fits in Zephyr, I'd be happy to continue working on it towards mainlining :)

Here are the next steps as I see them:

  • Incorporate it into the tracing subsystem of Zephyr.
  • Support SMP by creating a per-CPU sampling buffer (then allow profiling only a set of the CPUs, ...)
  • When enabling perf, reconfigure timer interrupts to get any sampling frequency we wish (currently it just uses the existing tick frequency, 100 Hz, which is the default CONFIG_SYS_CLOCK_TICKS_PER_SEC). This would also allow it to work in tandem with CONFIG_TICKLESS_KERNEL: for the duration of perf, we would enable timer interrupts at the set interval, then disable them and move back to "tickless".
  • Think of a way to stream samples (from a background thread) so the samples buffer doesn't fill up. Linux's perf periodically reads the buffer and appends to a file; I think for Zephyr it would be better to stream samples over a UART/network connection, but writing to a file might also do.
  • Test on some real hardware.
  • Fix existing crashes / hangs 😅 I'm a Zephyr novice so I don't even know if these are related to my changes or not, but I did get a few hangs while operating this feature.
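As a rough guide to why the sampling frequency and buffer streaming items matter, the arithmetic below estimates how many samples a recording produces and how large a fixed buffer would need to be. It is only a back-of-the-envelope sketch: the per-sample layout (one length word plus up to `max_depth` addresses) is an assumption, not the layout used in the branch, and both function names are made up.

```python
def expected_samples(duration_ms, ticks_per_sec=100):
    """Samples taken when one sample is recorded per clock tick."""
    return duration_ms * ticks_per_sec // 1000


def buffer_words(duration_ms, max_depth, ticks_per_sec=100):
    """Worst-case buffer size in machine words.

    Assumes each sample stores one length word followed by up to
    max_depth return addresses (hypothetical layout).
    """
    return expected_samples(duration_ms, ticks_per_sec) * (1 + max_depth)


if __name__ == "__main__":
    # perf record 5000 at 100 ticks per second, stacks capped at 16 frames
    print(expected_samples(5000), buffer_words(5000, 16))
```

Raising the rate from 100 Hz to 1000 Hz multiplies both numbers by ten, which is where streaming samples out of the buffer becomes necessary on memory-constrained targets.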

P.S. It won't let me add labels; I guess the appropriate ones are Feature and area: Profiling.
