1313
1414**Source code: ** :source: `Lib/profiling/sampling/ `
1515
16+ .. program :: profiling.sampling
17+
1618--------------
1719
1820.. image :: tachyon-logo.png
@@ -146,6 +148,10 @@ Generate a line-by-line heatmap::
146148
147149 python -m profiling.sampling run --heatmap script.py
148150
151+ Enable opcode-level profiling to see which bytecode instructions are executing::
152+
153+ python -m profiling.sampling run --opcodes --flamegraph script.py
154+
149155
150156Commands
151157========
@@ -308,7 +314,7 @@ The two most fundamental parameters are the sampling interval and duration.
308314Together, these determine how many samples will be collected during a profiling
309315session.
310316
311- The `` --interval `` option (`` -i ` `) sets the time between samples in
317+ The :option: ` --interval ` option (:option: ` -i `) sets the time between samples in
312318microseconds. The default is 100 microseconds, which produces approximately
31331910,000 samples per second::
314320
@@ -319,7 +325,7 @@ cost of slightly higher profiler CPU usage. Higher intervals reduce profiler
319325overhead but may miss short-lived functions. For most applications, the
320326default interval provides a good balance between accuracy and overhead.
321327
322- The `` --duration `` option (`` -d ` `) sets how long to profile in seconds. The
328+ The :option: ` --duration ` option (:option: ` -d `) sets how long to profile in seconds. The
323329default is 10 seconds::
324330
325331 python -m profiling.sampling run -d 60 script.py
@@ -337,8 +343,8 @@ Python programs often use multiple threads, whether explicitly through the
337343:mod: `threading ` module or implicitly through libraries that manage thread
338344pools.
339345
340- By default, the profiler samples only the main thread. The `` --all-threads ` `
341- option (`` -a ` `) enables sampling of all threads in the process::
346+ By default, the profiler samples only the main thread. The :option: ` --all-threads `
347+ option (:option: ` -a `) enables sampling of all threads in the process::
342348
343349 python -m profiling.sampling run -a script.py
344350
@@ -357,7 +363,7 @@ additional context about what the interpreter is doing at the moment each
357363sample is taken. These synthetic frames help distinguish different types of
358364execution that would otherwise be invisible.
359365
360- The `` --native ` ` option adds ``<native> `` frames to indicate when Python has
366+ The :option: ` --native ` option adds ``<native> `` frames to indicate when Python has
361367called into C code (extension modules, built-in functions, or the interpreter
362368itself)::
363369
@@ -369,7 +375,7 @@ in the Python function that made the call. This is useful when optimizing
369375code that makes heavy use of C extensions like NumPy or database drivers.
370376
371377By default, the profiler includes ``<GC> `` frames when garbage collection is
372- active. The `` --no-gc ` ` option suppresses these frames::
378+ active. The :option: ` --no-gc ` option suppresses these frames::
373379
374380 python -m profiling.sampling run --no-gc script.py
375381
@@ -379,10 +385,48 @@ see substantial time in ``<GC>`` frames, consider investigating object
379385allocation rates or using object pooling.
380386
381387
388+ Opcode-aware profiling
389+ ----------------------
390+
391+ The :option: `--opcodes ` option enables instruction-level profiling that captures
392+ which Python bytecode instructions are executing at each sample::
393+
394+ python -m profiling.sampling run --opcodes --flamegraph script.py
395+
396+ This feature provides visibility into Python's bytecode execution, including
397+ adaptive specialization optimizations. When a generic instruction like
398+ ``LOAD_ATTR `` is specialized at runtime into a more efficient variant like
399+ ``LOAD_ATTR_INSTANCE_VALUE ``, the profiler shows both the specialized name
400+ and the base instruction.
401+
402+ Opcode information appears in several output formats:
403+
404+ - **Live mode **: An opcode panel shows instruction-level statistics for the
405+ selected function, accessible via keyboard navigation
406+ - **Flame graphs **: Nodes display opcode information when available, helping
407+ identify which instructions consume the most time
408+ - **Heatmap **: Expandable bytecode panels per source line show instruction
409+ breakdown with specialization percentages
410+ - **Gecko format **: Opcode transitions are emitted as interval markers in the
411+ Firefox Profiler timeline
412+
413+ This level of detail is particularly useful for:
414+
415+ - Understanding the performance impact of Python's adaptive specialization
416+ - Identifying hot bytecode instructions that might benefit from optimization
417+ - Analyzing the effectiveness of different code patterns at the instruction level
418+ - Debugging performance issues that occur at the bytecode level
419+
420+ The :option: `--opcodes ` option is compatible with :option: `--live `, :option: `--flamegraph `,
421+ :option: `--heatmap `, and :option: `--gecko ` formats. It requires additional memory to store
422+ opcode information and may slightly reduce sampling performance, but provides
423+ unprecedented visibility into Python's execution model.
424+
425+
382426Real-time statistics
383427--------------------
384428
385- The `` --realtime-stats ` ` option displays sampling rate statistics during
429+ The :option: ` --realtime-stats ` option displays sampling rate statistics during
386430profiling::
387431
388432 python -m profiling.sampling run --realtime-stats script.py
@@ -434,7 +478,7 @@ CPU execution time, or time spent holding the global interpreter lock.
434478Wall-clock mode
435479---------------
436480
437- Wall-clock mode (`` --mode=wall ``) captures all samples regardless of what the
481+ Wall-clock mode (:option: ` --mode ` \ `` =wall ``) captures all samples regardless of what the
438482thread is doing. This is the default mode and provides a complete picture of
439483where time passes during program execution::
440484
@@ -454,7 +498,7 @@ latency.
454498CPU mode
455499--------
456500
457- CPU mode (`` --mode=cpu ``) records samples only when the thread is actually
501+ CPU mode (:option: ` --mode ` \ `` =cpu ``) records samples only when the thread is actually
458502executing on a CPU core::
459503
460504 python -m profiling.sampling run --mode=cpu script.py
@@ -488,7 +532,7 @@ connection pooling, or reducing wait time instead.
488532GIL mode
489533--------
490534
491- GIL mode (`` --mode=gil ``) records samples only when the thread holds Python's
535+ GIL mode (:option: ` --mode ` \ `` =gil ``) records samples only when the thread holds Python's
492536global interpreter lock::
493537
494538 python -m profiling.sampling run --mode=gil script.py
@@ -520,7 +564,7 @@ output goes to stdout, a file, or a directory depending on the format.
520564pstats format
521565-------------
522566
523- The pstats format (`` --pstats ` `) produces a text table similar to what
567+ The pstats format (:option: ` --pstats `) produces a text table similar to what
524568deterministic profilers generate. This is the default output format::
525569
526570 python -m profiling.sampling run script.py
@@ -567,31 +611,31 @@ interesting functions that highlights:
567611 samples (high cumulative/direct multiplier). These are frequently-nested
568612 functions that appear deep in many call chains.
569613
570- Use `` --no-summary ` ` to suppress both the legend and summary sections.
614+ Use :option: ` --no-summary ` to suppress both the legend and summary sections.
571615
572616To save pstats output to a file instead of stdout::
573617
574618 python -m profiling.sampling run -o profile.txt script.py
575619
576620The pstats format supports several options for controlling the display.
577- The `` --sort ` ` option determines the column used for ordering results::
621+ The :option: ` --sort ` option determines the column used for ordering results::
578622
579623 python -m profiling.sampling run --sort=tottime script.py
580624 python -m profiling.sampling run --sort=cumtime script.py
581625 python -m profiling.sampling run --sort=nsamples script.py
582626
583- The `` --limit ` ` option restricts output to the top N entries::
627+ The :option: ` --limit ` option restricts output to the top N entries::
584628
585629 python -m profiling.sampling run --limit=30 script.py
586630
587- The `` --no-summary ` ` option suppresses the header summary that precedes the
631+ The :option: ` --no-summary ` option suppresses the header summary that precedes the
588632statistics table.
589633
590634
591635Collapsed stacks format
592636-----------------------
593637
594- Collapsed stacks format (`` --collapsed ` `) produces one line per unique call
638+ Collapsed stacks format (:option: ` --collapsed `) produces one line per unique call
595639stack, with a count of how many times that stack was sampled::
596640
597641 python -m profiling.sampling run --collapsed script.py
@@ -621,7 +665,7 @@ visualization where you can click to zoom into specific call paths.
621665Flame graph format
622666------------------
623667
624- Flame graph format (`` --flamegraph ` `) produces a self-contained HTML file with
668+ Flame graph format (:option: ` --flamegraph `) produces a self-contained HTML file with
625669an interactive flame graph visualization::
626670
627671 python -m profiling.sampling run --flamegraph script.py
@@ -667,7 +711,7 @@ or through their callees.
667711Gecko format
668712------------
669713
670- Gecko format (`` --gecko ` `) produces JSON output compatible with the Firefox
714+ Gecko format (:option: ` --gecko `) produces JSON output compatible with the Firefox
671715Profiler::
672716
673717 python -m profiling.sampling run --gecko script.py
@@ -694,14 +738,14 @@ Firefox Profiler timeline:
694738- **Code type markers **: distinguish Python code from native (C extension) code
695739- **GC markers **: indicate garbage collection activity
696740
697- For this reason, the `` --mode ` ` option is not available with Gecko format;
741+ For this reason, the :option: ` --mode ` option is not available with Gecko format;
698742all relevant data is captured automatically.
699743
700744
701745Heatmap format
702746--------------
703747
704- Heatmap format (`` --heatmap ` `) generates an interactive HTML visualization
748+ Heatmap format (:option: ` --heatmap `) generates an interactive HTML visualization
705749showing sample counts at the source line level::
706750
707751 python -m profiling.sampling run --heatmap script.py
@@ -744,7 +788,7 @@ interpretation of hierarchical visualizations.
744788Live mode
745789=========
746790
747- Live mode (`` --live ` `) provides a terminal-based real-time view of profiling
791+ Live mode (:option: ` --live `) provides a terminal-based real-time view of profiling
748792data, similar to the ``top `` command for system processes::
749793
750794 python -m profiling.sampling run --live script.py
@@ -760,6 +804,11 @@ and thread status statistics (GIL held percentage, CPU usage, GC time). The
760804main table shows function statistics with the currently sorted column indicated
761805by an arrow (▼).
762806
807+ When :option: `--opcodes ` is enabled, an additional opcode panel appears below the
808+ main table, showing instruction-level statistics for the currently selected
809+ function. This panel displays which bytecode instructions are executing most
810+ frequently, including specialized variants and their base opcodes.
811+
763812
764813Keyboard commands
765814-----------------
@@ -813,20 +862,25 @@ Within live mode, keyboard commands control the display:
813862:kbd: `h ` or :kbd: `? `
814863 Show the help screen with all available commands.
815864
865+ :kbd: `j ` / :kbd: `k ` (or :kbd: `Up ` / :kbd: `Down `)
866+ Navigate through opcode entries in the opcode panel (when ``--opcodes `` is
867+ enabled). These keys scroll through the instruction-level statistics for the
868+ currently selected function.
869+
816870When profiling finishes (duration expires or target process exits), the display
817871shows a "PROFILING COMPLETE" banner and freezes the final results. You can
818872still navigate, sort, and filter the results before pressing :kbd: `q ` to exit.
819873
820- Live mode is incompatible with output format options (`` --collapsed ` `,
821- `` --flamegraph ` `, and so on) because it uses an interactive terminal
874+ Live mode is incompatible with output format options (:option: ` --collapsed `,
875+ :option: ` --flamegraph `, and so on) because it uses an interactive terminal
822876interface rather than producing file output.
823877
824878
825879Async-aware profiling
826880=====================
827881
828882For programs using :mod: `asyncio `, the profiler offers async-aware mode
829- (`` --async-aware ` `) that reconstructs call stacks based on the task structure
883+ (:option: ` --async-aware `) that reconstructs call stacks based on the task structure
830884rather than the raw Python frames::
831885
832886 python -m profiling.sampling run --async-aware async_script.py
@@ -846,16 +900,16 @@ and presenting stacks that reflect the ``await`` chain.
846900Async modes
847901-----------
848902
849- The `` --async-mode ` ` option controls which tasks appear in the profile::
903+ The :option: ` --async-mode ` option controls which tasks appear in the profile::
850904
851905 python -m profiling.sampling run --async-aware --async-mode=running async_script.py
852906 python -m profiling.sampling run --async-aware --async-mode=all async_script.py
853907
854- With `` --async-mode=running `` (the default), only the task currently executing
908+ With :option: ` --async-mode ` \ `` =running `` (the default), only the task currently executing
855909on the CPU is profiled. This shows where your program is actively spending time
856910and is the typical choice for performance analysis.
857911
858- With `` --async-mode=all ``, tasks that are suspended (awaiting I/O, locks, or
912+ With :option: ` --async-mode ` \ `` =all ``, tasks that are suspended (awaiting I/O, locks, or
859913other tasks) are also included. This mode is useful for understanding what your
860914program is waiting on, but produces larger profiles since every suspended task
861915appears in each sample.
@@ -884,8 +938,8 @@ Option restrictions
884938-------------------
885939
886940Async-aware mode uses a different stack reconstruction mechanism and is
887- incompatible with: `` --native ``, `` --no-gc ``, `` --all-threads ` `, and
888- `` --mode=cpu `` or `` --mode=gil ``.
941+ incompatible with: :option: ` --native `, :option: ` --no-gc `, :option: ` --all-threads `, and
942+ :option: ` --mode ` \ `` =cpu `` or :option: ` --mode ` \ `` =gil ``.
889943
890944
891945Command-line interface
@@ -939,6 +993,13 @@ Sampling options
939993
940994 Enable async-aware profiling for asyncio programs.
941995
996+ .. option :: --opcodes
997+
998+ Gather bytecode opcode information for instruction-level profiling. Shows
999+ which bytecode instructions are executing, including specializations.
1000+ Compatible with ``--live ``, ``--flamegraph ``, ``--heatmap ``, and ``--gecko ``
1001+ formats only.
1002+
9421003
9431004Mode options
9441005------------
0 commit comments