@@ -70,6 +70,99 @@ Summary --- release highlights
70 70 New features
71 71 ============
72 72
73+ .. _whatsnew314-sampling-profiler:
74+
75+ Statistical sampling profiler for production debugging
76+ ------------------------------------------------------
77+
78+ A new statistical sampling profiler has been added to the :mod:`profile` module as
79+ :mod:`profile.sample`. This profiler enables low-overhead performance analysis of
80+ running Python processes without requiring code modification or process restart.
81+
82+ Unlike the deterministic profilers (:mod:`cProfile` and :mod:`profile`), which instrument
83+ every function call, the sampling profiler periodically captures stack traces from
84+ running processes. This approach keeps overhead on the target very low while achieving
85+ sampling rates of **up to 200,000 Hz**, making it the fastest sampling profiler
86+ available for Python (at the time of its contribution) and well suited to debugging
87+ performance issues in production environments.
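
To make the sampling idea concrete, here is a minimal in-process sketch. It is not how :mod:`profile.sample` is implemented (that tool attaches to a separate process from outside); it only illustrates periodic stack capture. It uses the CPython-specific ``sys._current_frames()``, and the names ``sample_stacks``, ``busy_work`` and ``demo`` are hypothetical.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, interval_s, duration_s):
    """Periodically record one thread's call stack and count identical stacks."""
    counts = collections.Counter()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Snapshot of the topmost frame of every thread (CPython-specific).
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        if stack:
            # Store outermost-first so the last entry is the running function.
            counts[tuple(reversed(stack))] += 1
        time.sleep(interval_s)
    return counts


def busy_work(stop_event):
    """Hypothetical workload: spin until asked to stop."""
    total = 0
    while not stop_event.is_set():
        total += sum(range(100))
    return total


def demo():
    """Run the workload in a thread and sample it for a short while."""
    stop = threading.Event()
    worker = threading.Thread(target=busy_work, args=(stop,))
    worker.start()
    try:
        return sample_stacks(worker.ident, interval_s=0.001, duration_s=0.2)
    finally:
        stop.set()
        worker.join()
```

The workload itself is never modified or instrumented; the sampler only looks at it from time to time, which is why the cost does not grow with the number of function calls.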
88+
89+ Key features include:
90+
91+ * **Low-overhead profiling**: Attach to any running Python process with
92+   minimal impact on its performance
93+ * **No code modification required**: Profile existing applications without restart
94+ * **Real-time statistics**: Monitor sampling quality during data collection
95+ * **Multiple output formats**: Generate both detailed statistics and flamegraph data
96+ * **Thread-aware profiling**: Option to profile all threads or just the main thread
97+
98+ Profile an existing process (PID 1234 here) for the default 10 seconds::
99+
100+ python -m profile.sample 1234
101+
102+ Profile for 30 seconds at a 50 µs sampling interval, with real-time statistics::
103+
104+ python -m profile.sample --realtime-stats -i 50 -d 30 1234
105+
106+ Generate flamegraph data::
107+
108+ python -m profile.sample --collapsed -o stacks.txt 1234
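
The text does not specify the layout of ``stacks.txt``. Assuming it follows the folded-stack convention commonly consumed by flamegraph tools (one line per unique stack, frames joined by ``;``, followed by a space and a sample count), a post-processing sketch might look like this. The helper names and the sample lines in the usage note are hypothetical.

```python
from collections import Counter


def parse_collapsed(lines):
    """Parse 'frame;frame;... count' lines into a Counter keyed by stack tuple."""
    stacks = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The count is the text after the last space; frames may contain spaces.
        stack_text, _, count_text = line.rpartition(" ")
        stacks[tuple(stack_text.split(";"))] += int(count_text)
    return stacks


def self_samples(stacks):
    """Attribute each stack's samples to its leaf (innermost) frame."""
    leaf_counts = Counter()
    for stack, count in stacks.items():
        leaf_counts[stack[-1]] += count
    return leaf_counts
```

For example, feeding it the made-up lines ``main;parse 3`` and ``main;compile;parse 2`` attributes five samples to ``parse`` as the innermost frame.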
109+
110+ The profiler generates statistical estimates of where time is spent::
111+
112+ Real-time sampling stats: Mean: 100261.5Hz (9.97µs) Min: 86333.4Hz (11.58µs) Max: 118807.2Hz (8.42µs) Samples: 400001
113+ Captured 498841 samples in 5.00 seconds
114+ Sample rate: 99768.04 samples/sec
115+ Error rate: 0.72%
116+ Profile Stats:
117+      nsamples  sample%  tottime (s)  cumul%  cumtime (s)  filename:lineno(function)
118+     43/418858      0.0        0.000    87.9        4.189  case.py:667(TestCase.run)
119+   3293/418812      0.7        0.033    87.9        4.188  case.py:613(TestCase._callTestMethod)
120+ 158562/158562     33.3        1.586    33.3        1.586  test_compile.py:725(TestSpecifics.test_compiler_recursion_limit.<locals>.check_limit)
121+ 129553/129553     27.2        1.296    27.2        1.296  ast.py:46(parse)
122+      0/128129      0.0        0.000    26.9        1.281  test_ast.py:884(AST_Tests.test_ast_recursion_limit.<locals>.check_limit)
123+       7/67446      0.0        0.000    14.2        0.674  test_compile.py:729(TestSpecifics.test_compiler_recursion_limit)
124+       6/60380      0.0        0.000    12.7        0.604  test_ast.py:888(AST_Tests.test_ast_recursion_limit)
125+       3/50020      0.0        0.000    10.5        0.500  test_compile.py:727(TestSpecifics.test_compiler_recursion_limit)
126+       1/38011      0.0        0.000     8.0        0.380  test_ast.py:886(AST_Tests.test_ast_recursion_limit)
127+       1/25076      0.0        0.000     5.3        0.251  test_compile.py:728(TestSpecifics.test_compiler_recursion_limit)
128+   22361/22362      4.7        0.224     4.7        0.224  test_compile.py:1368(TestSpecifics.test_big_dict_literal)
129+       4/18008      0.0        0.000     3.8        0.180  test_ast.py:889(AST_Tests.test_ast_recursion_limit)
130+      11/17696      0.0        0.000     3.7        0.177  subprocess.py:1038(Popen.__init__)
131+   16968/16968      3.6        0.170     3.6        0.170  subprocess.py:1900(Popen._execute_child)
132+       2/16941      0.0        0.000     3.6        0.169  test_compile.py:730(TestSpecifics.test_compiler_recursion_limit)
133+
134+ Legend:
135+ nsamples: Direct/Cumulative samples (direct executing / on call stack)
136+ sample%: Percentage of total samples this function was directly executing
137+ tottime: Estimated total time spent directly in this function
138+ cumul%: Percentage of total samples when this function was on the call stack
139+ cumtime: Estimated cumulative time (including time in called functions)
140+ filename:lineno(function): Function location and name
141+
142+ Summary of Interesting Functions:
143+
144+ Functions with Highest Direct/Cumulative Ratio (Hot Spots):
145+ 1.000 direct/cumulative ratio, 33.3% direct samples: test_compile.py:(TestSpecifics.test_compiler_recursion_limit.<locals>.check_limit)
146+ 1.000 direct/cumulative ratio, 27.2% direct samples: ast.py:(parse)
147+ 1.000 direct/cumulative ratio, 3.6% direct samples: subprocess.py:(Popen._execute_child)
148+
149+ Functions with Highest Call Frequency (Indirect Calls):
150+ 418815 indirect calls, 87.9% total stack presence: case.py:(TestCase.run)
151+ 415519 indirect calls, 87.9% total stack presence: case.py:(TestCase._callTestMethod)
152+ 159470 indirect calls, 33.5% total stack presence: test_compile.py:(TestSpecifics.test_compiler_recursion_limit)
153+
154+ Functions with Highest Call Magnification (Cumulative/Direct):
155+ 12267.9x call magnification, 159470 indirect calls from 13 direct: test_compile.py:(TestSpecifics.test_compiler_recursion_limit)
156+ 10581.7x call magnification, 116388 indirect calls from 11 direct: test_ast.py:(AST_Tests.test_ast_recursion_limit)
157+ 9740.9x call magnification, 418815 indirect calls from 43 direct: case.py:(TestCase.run)
158+
159+ The profiler automatically identifies performance bottlenecks through statistical
160+ analysis, highlighting functions with high CPU usage and call frequency patterns.
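
The summary figures can be reproduced from the direct and cumulative sample counts in the table. The sketch below uses formulas inferred from the numbers in the sample output, not taken from the profiler's source: indirect calls as cumulative minus direct, call magnification as cumulative divided by direct, and estimated time as sample count times the sampling interval. The 10 µs interval is an assumption that matches the table (158562 samples against 1.586 s), and ``derived_metrics`` is a hypothetical helper.

```python
# Assumption: 10 microseconds per sample, inferred from the table above.
SAMPLING_INTERVAL_S = 10e-6


def derived_metrics(direct, cumulative):
    """Derive summary figures from direct and cumulative sample counts."""
    return {
        "indirect": cumulative - direct,
        "direct_ratio": direct / cumulative,
        "magnification": cumulative / direct if direct else float("inf"),
        "tottime_s": direct * SAMPLING_INTERVAL_S,
        "cumtime_s": cumulative * SAMPLING_INTERVAL_S,
    }


# Per-line (direct, cumulative) counts for
# TestSpecifics.test_compiler_recursion_limit (lines 729, 727, 728, 730).
recursion_limit_lines = [(7, 67446), (3, 50020), (1, 25076), (2, 16941)]
direct = sum(d for d, _ in recursion_limit_lines)
cumulative = sum(c for _, c in recursion_limit_lines)

summary = derived_metrics(direct, cumulative)  # aggregated over the function
run = derived_metrics(43, 418858)              # case.py:667(TestCase.run)
hot = derived_metrics(158562, 158562)          # check_limit hot spot
```

Summing the four per-line rows gives 13 direct samples and 159470 indirect calls, matching the "Call Magnification" entry, which suggests the summary aggregates samples per function rather than per line.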
161+
162+ This capability is particularly valuable for debugging performance issues in
163+ production systems where traditional profiling approaches would be too intrusive.
164+
165+ (Contributed by Pablo Galindo and László Kiss Kollár in :gh:`135953`.)
73 166
74 167
75 168 Other language changes