@@ -254,13 +254,28 @@ files in the current directory which are ELF images for all the JIT trampolines
 that were created by Python.

 .. warning::
-   Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+   When using ``--call-graph dwarf``, the ``perf`` tool will take
    snapshots of the stack of the process being profiled and save the
-   information in the ``perf.data`` file. By default the size of the stack dump
-   is 8192 bytes but the user can change the size by passing the size after
-   comma like ``--call-graph dwarf,4096``. The size of the stack dump is
-   important because if the size is too small ``perf`` will not be able to
-   unwind the stack and the output will be incomplete. On the other hand, if
-   the size is too big, then ``perf`` won't be able to sample the process as
-   frequently as it would like as the overhead will be higher.
+   information in the ``perf.data`` file. By default, the size of the stack dump
+   is 8192 bytes, but you can change the size by passing it after
+   a comma, as in ``--call-graph dwarf,16384``.

+   The size of the stack dump is important. If the size is too small,
+   ``perf`` will not be able to unwind the stack and the output will be
+   incomplete. If the size is too big, the overhead of each sample is
+   higher, so ``perf`` won't be able to sample the process as frequently
+   as it would like.
+
+   The stack dump size is particularly important when profiling a Python build
+   compiled with low optimization levels (like ``-O0``), as these builds tend to
+   have larger stack frames. If you are compiling Python with ``-O0`` and not
+   seeing Python functions in your profiling output, try increasing the stack
+   dump size to 65528 bytes (the maximum)::
+
+      $ perf record -F 9999 -g -k 1 --call-graph dwarf,65528 -o perf.data python -Xperf_jit my_script.py
+
+   Different compilation flags can significantly impact stack sizes:
+
+   - Builds with ``-O0`` typically have much larger stack frames than those with ``-O1`` or higher
+   - Adding optimizations (``-O1``, ``-O2``, etc.) typically reduces stack size
+   - Frame pointers (``-fno-omit-frame-pointer``) generally provide more reliable stack unwinding
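
For anyone scripting the invocation discussed in this change, the following is a minimal sketch of a wrapper that assembles the same ``perf record`` command with a configurable stack dump size. The helper name ``build_perf_command`` and the two constants are hypothetical and not part of CPython or ``perf``. The limits are taken only from the values stated in the warning text (8192 bytes by default, 65528 bytes at most), and the flags are copied from the example command in the diff.

```python
import shlex

# Values taken from the warning text in the diff; treat them as assumptions
# about the perf version in use rather than guarantees.
DEFAULT_STACK_DUMP = 8192
MAX_STACK_DUMP = 65528


def build_perf_command(script, stack_size=DEFAULT_STACK_DUMP, output="perf.data"):
    """Return the argument list for a DWARF call-graph recording.

    Hypothetical helper: it only assembles the command shown in the
    documentation and rejects sizes outside the documented range.
    """
    if not 0 < stack_size <= MAX_STACK_DUMP:
        raise ValueError(
            f"stack dump size must be between 1 and {MAX_STACK_DUMP} bytes, "
            f"got {stack_size}"
        )
    return [
        "perf", "record", "-F", "9999", "-g", "-k", "1",
        "--call-graph", f"dwarf,{stack_size}",
        "-o", output,
        "python", "-Xperf_jit", script,
    ]


if __name__ == "__main__":
    # Print the command for an -O0 build, using the maximum dump size.
    print(shlex.join(build_perf_command("my_script.py", MAX_STACK_DUMP)))
```

Returning a list rather than a string lets the result be passed directly to ``subprocess.run`` without shell quoting concerns.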