@@ -141,8 +141,10 @@ A new structure is added to PyThreadState to support remote debugging:
 
 This structure is appended to ``PyThreadState``, adding only a few fields that
 are **never accessed during normal execution**. The ``debugger_pending_call`` field
-indicates when a debugger has requested execution, while ``debugger_script``
-provides Python code to be executed when the interpreter reaches a safe point.
+indicates when a debugger has requested execution, while ``debugger_script_path``
+provides a filesystem path to a Python source file (.py) that will be executed when
+the interpreter reaches a safe point. The path must point to a Python source file,
+not compiled Python code (.pyc) or any other format.
 
 The value for ``MAX_SCRIPT_PATH_SIZE`` will be a trade-off between binary size
 and how big debugging scripts' paths can be. To limit the memory overhead per
@@ -177,7 +179,7 @@ debugger support:
 These offsets allow debuggers to locate critical debugging control structures in
 the target process's memory space. The ``eval_breaker`` and ``remote_debugger_support``
 offsets are relative to each ``PyThreadState``, while the ``debugger_pending_call``
-and ``debugger_script`` offsets are relative to each ``_PyRemoteDebuggerSupport``
+and ``debugger_script_path`` offsets are relative to each ``_PyRemoteDebuggerSupport``
 structure, allowing the new structure and its fields to be found regardless of
 where they are in memory. ``debugger_script_path_size`` informs the attaching
 tool of the size of the buffer.
@@ -200,13 +202,19 @@ When a debugger wants to attach to a Python process, it follows these steps:
 
 5. Write control information:
 
-   - Write a filename containing Python code to be executed into the
-     ``debugger_script`` field in ``_PyRemoteDebuggerSupport``.
+   - Most debuggers will pause the process before writing to its memory. This is
+     standard practice for tools like GDB, which use SIGSTOP or ptrace to pause the process.
+     This approach prevents races when writing to process memory. Profilers and other tools
+     that don't wish to stop the process can still use this interface, but they need to
+     handle possible races. This is a normal consideration for profilers.
+
+   - Write a file path to a Python source file (.py) into the
+     ``debugger_script_path`` field in ``_PyRemoteDebuggerSupport``.
    - Set ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport`` to 1
    - Set ``_PY_EVAL_PLEASE_STOP_BIT`` in the ``eval_breaker`` field
 
-Once the interpreter reaches the next safe point, it will execute the script
-provided by the debugger.
+Once the interpreter reaches the next safe point, it will execute the Python code
+contained in the file specified by the debugger.
 
 Interpreter Integration
 -----------------------
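
The write order in step 5 of the hunk above can be sketched against a simulated memory buffer. This is only an illustration: the ``Layout`` class, the field widths (a 32-bit flag and a 64-bit ``eval_breaker``), the little-endian encoding, and the ``PLEASE_STOP_BIT`` value are assumptions made for the example, not values taken from the PEP. A real tool would read the offsets from the target and write through its platform's process-memory API.

```python
import os
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical offsets; a real tool reads these from the target process."""
    eval_breaker: int               # relative to PyThreadState
    remote_debugger_support: int    # relative to PyThreadState
    debugger_pending_call: int      # relative to _PyRemoteDebuggerSupport
    debugger_script_path: int       # relative to _PyRemoteDebuggerSupport
    debugger_script_path_size: int  # size of the path buffer in bytes


PLEASE_STOP_BIT = 1 << 5  # placeholder value, assumed for this sketch only


def write_control_info(mem: bytearray, tstate: int, layout: Layout, script: str) -> None:
    """Write the script path, then the pending-call flag, then the stop bit."""
    raw = os.fsencode(script) + b"\0"  # NUL-terminated path
    if len(raw) > layout.debugger_script_path_size:
        raise ValueError("script path does not fit in the target buffer")

    support = tstate + layout.remote_debugger_support
    path_at = support + layout.debugger_script_path
    mem[path_at:path_at + len(raw)] = raw

    # Assumed width: 32-bit signed flag.
    struct.pack_into("<i", mem, support + layout.debugger_pending_call, 1)

    # Assumed width: 64-bit eval_breaker; keep any bits that are already set.
    eb_at = tstate + layout.eval_breaker
    (current,) = struct.unpack_from("<Q", mem, eb_at)
    struct.pack_into("<Q", mem, eb_at, current | PLEASE_STOP_BIT)
```

Writing the path before raising the flag means the interpreter never sees a pending call with a half-written path, which matters most for tools that do not pause the target.
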
@@ -237,7 +245,7 @@ to be audited or disabled if desired by a system's administrator.
     if (tstate->eval_breaker) {
         if (tstate->remote_debugger_support.debugger_pending_call) {
             tstate->remote_debugger_support.debugger_pending_call = 0;
-            const char *path = tstate->remote_debugger_support.debugger_script;
+            const char *path = tstate->remote_debugger_support.debugger_script_path;
             if (*path) {
                 if (0 != PySys_Audit("debugger_script", "%s", path)) {
                     PyErr_Clear();
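
Because the hunk above raises an audit event before running the script, an administrator could veto scripts from Python with an audit hook. The sketch below is one possible policy, not part of the PEP: the ``ALLOWED_DIR`` location and the hook name are hypothetical, and the event name ``debugger_script`` is copied from the C snippet in this diff, so a released interpreter may use a different name.

```python
import sys

ALLOWED_DIR = "/opt/debug-scripts/"  # hypothetical trusted location


def debugger_script_hook(event: str, args: tuple) -> None:
    """Reject debugger scripts that live outside ALLOWED_DIR."""
    if event != "debugger_script":  # event name as it appears in the hunk
        return
    (path,) = args
    if not str(path).startswith(ALLOWED_DIR):
        # Raising makes the audit call fail; in the C snippet a failed audit
        # clears the error and skips the script.
        raise PermissionError(f"refusing debugger script: {path}")


# Audit hooks cannot be removed once they are installed.
sys.addaudithook(debugger_script_hook)
```

The hook can be exercised without a debugger by raising the event manually, for example with ``sys.audit("debugger_script", "/tmp/evil.py")``, which should raise ``PermissionError`` under this policy.
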
@@ -273,28 +281,35 @@ arbitrary Python code within the context of a specified Python process:
 
 .. code-block:: python
 
-    def remote_exec(pid: int, code: str, timeout: int = 0) -> None:
+    def remote_exec(pid: int, script: str|bytes|PathLike) -> None:
         """
-        Executes a block of Python code in a given remote Python process.
+        Executes a file containing Python code in a given remote Python process.
+
+        This function returns immediately, and the code will be executed by the
+        target process's main thread at the next available opportunity, similarly
+        to how signals are handled. There is no interface to determine when the
+        code has been executed. The caller is responsible for making sure that
+        the file still exists whenever the remote process tries to read it and that
+        it hasn't been overwritten.
 
         Args:
             pid (int): The process ID of the target Python process.
-            code (str): A string containing the Python code to be executed.
-            timeout (int): An optional timeout for waiting for the remote
-                process to execute the code. If the timeout is exceeded a
-                ``TimeoutError`` will be raised.
+            script (str|bytes|PathLike): The path to a file containing
+                the Python code to be executed.
         """
 
 An example usage of the API would look like:
 
 .. code-block:: python
 
     import sys
+    import uuid
     # Execute a print statement in a remote Python process with PID 12345
+    script = f"/tmp/{uuid.uuid4()}.py"
+    with open(script, "w") as f:
+        f.write("print('Hello from remote execution!')")
     try:
-        sys.remote_exec(12345, "print('Hello from remote execution!')", timeout=3)
-    except TimeoutError:
-        print(f"The remote process took too long to execute the code")
+        sys.remote_exec(12345, script)
     except Exception as e:
         print(f"Failed to execute code: {e}")
 
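
A caller of the new path-based API might wrap script creation in a helper. The sketch below is a variation on the example in the hunk above, using ``tempfile.mkstemp`` rather than a UUID under ``/tmp``. The helper names ``write_script`` and ``inject`` are hypothetical, and ``sys.remote_exec`` is looked up defensively because only interpreters that implement this PEP provide it.

```python
import os
import sys
import tempfile


def write_script(code: str) -> str:
    """Write code to a fresh .py file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".py", prefix="remote_exec_")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(code)
    return path


def inject(pid: int, code: str) -> str:
    """Ask the process `pid` to run `code`; return the script path."""
    remote_exec = getattr(sys, "remote_exec", None)
    if remote_exec is None:
        raise RuntimeError("this interpreter does not provide sys.remote_exec")
    path = write_script(code)
    # The file is deliberately not deleted here: the call returns before
    # the target has read the file, so cleanup must happen later.
    remote_exec(pid, path)
    return path
```

``mkstemp`` creates the file readable only by its owner, so this sketch assumes the caller and the target process run as the same user.
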
@@ -322,6 +337,36 @@ feature. This way, tools can offer a useful error message explaining why they
 won't work, instead of believing that they have attached and then never having
 their script run.
 
+Multi-threading Considerations
+------------------------------
+
+The overall execution pattern resembles how Python handles signals internally.
+The interpreter guarantees that injected code only runs at safe points, never
+interrupting atomic operations within the interpreter itself. This approach
+ensures that debugging operations cannot corrupt the interpreter state while
+still providing timely execution in most real-world scenarios.
+
+However, debugging code injected through this interface can execute in any
+thread. This behavior is different than how Python handles signals, since
+signal handlers can only run in the main thread. If a debugger wants to inject
+code into every running thread, it must inject it into every ``PyThreadState``.
+If a debugger wants to run code in the first available thread, it needs to
+inject it into every ``PyThreadState``, and that injected code must check
+whether it has already been run by another thread (likely by setting some flag
+in the globals of some module).
+
+Note that the Global Interpreter Lock (GIL) continues to govern execution as
+normal when the injected code runs. This means if a target thread is currently
+executing a C extension that holds the GIL continuously, the injected code
+won't be able to run until that operation completes and the GIL becomes
+available. However, the interface introduces no additional GIL contention
+beyond what the injected code itself requires. Importantly, the interface
+remains fully compatible with Python's free-threaded mode.
+
+It may be useful for a debugger that injected some code to be run to follow
+that up by sending some pre-registered signal to the process, which can
+interrupt any blocking I/O or sleep states waiting for external resources, and
+allow a safe opportunity to run the injected code.
 
 Backwards Compatibility
 =======================
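
The "first available thread" pattern from the Multi-threading Considerations text above needs the injected script to detect that another thread already ran it. One way to do that, sketched below, is a flag stored in the globals of the ``sys`` module. The attribute names ``_remote_debug_ran_once`` and ``_remote_debug_hits`` are hypothetical, and the sketch assumes that ``dict.setdefault`` with a string key behaves atomically in CPython. The threads here merely simulate the same script being executed once per ``PyThreadState``.

```python
import sys
import threading

FLAG = "_remote_debug_ran_once"  # hypothetical flag name

# Body of the injected script. Every thread executes it, but only the
# thread whose token is stored by setdefault runs the payload.
SCRIPT = f"""
import sys
token = object()
if vars(sys).setdefault({FLAG!r}, token) is token:
    sys._remote_debug_hits.append("payload ran")
"""


def simulate(threads: int = 8) -> list[str]:
    """Execute SCRIPT from several threads and return the payload log."""
    vars(sys).pop(FLAG, None)    # reset state from any earlier run
    sys._remote_debug_hits = []  # hypothetical log used only by this demo
    workers = [
        threading.Thread(target=exec, args=(SCRIPT, {}))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sys._remote_debug_hits
```

A debugger that wants to inject again later would need to clear the flag first, for example from a follow-up script.
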
@@ -454,8 +499,8 @@ Rejected Ideas
 Writing Python code into the buffer
 -----------------------------------
 
-We have chosen to have debuggers write the code to be executed into a file
-whose path is written into a buffer in the remote process. This has been deemed
+We have chosen to have debuggers write the path to a file containing Python code
+into a buffer in the remote process. This has been deemed
 more secure than writing the Python code to be executed itself into a buffer in
 the remote process, because it means that an attacker who has gained arbitrary
 writes in a process but not arbitrary code execution or file system