Commit 7f219a0

PEP 768: Add some clarifications and minor edits (#4284)
1 parent 5d27124 commit 7f219a0

1 file changed: +64 -19 lines changed

peps/pep-0768.rst

Lines changed: 64 additions & 19 deletions
@@ -141,8 +141,10 @@ A new structure is added to PyThreadState to support remote debugging:
 
 This structure is appended to ``PyThreadState``, adding only a few fields that
 are **never accessed during normal execution**. The ``debugger_pending_call`` field
-indicates when a debugger has requested execution, while ``debugger_script``
-provides Python code to be executed when the interpreter reaches a safe point.
+indicates when a debugger has requested execution, while ``debugger_script_path``
+provides a filesystem path to a Python source file (.py) that will be executed when
+the interpreter reaches a safe point. The path must point to a Python source file,
+not compiled Python code (.pyc) or any other format.
 
 The value for ``MAX_SCRIPT_PATH_SIZE`` will be a trade-off between binary size
 and how big debugging scripts' paths can be. To limit the memory overhead per
@@ -177,7 +179,7 @@ debugger support:
 These offsets allow debuggers to locate critical debugging control structures in
 the target process's memory space. The ``eval_breaker`` and ``remote_debugger_support``
 offsets are relative to each ``PyThreadState``, while the ``debugger_pending_call``
-and ``debugger_script`` offsets are relative to each ``_PyRemoteDebuggerSupport``
+and ``debugger_script_path`` offsets are relative to each ``_PyRemoteDebuggerSupport``
 structure, allowing the new structure and its fields to be found regardless of
 where they are in memory. ``debugger_script_path_size`` informs the attaching
 tool of the size of the buffer.
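
The offset arithmetic described in the hunk above can be illustrated with a short Python sketch. Everything numeric here is an assumption made for illustration: the ``DebugOffsets`` class, the offset values, and the example thread-state address are hypothetical and do not come from the PEP or from any real interpreter build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebugOffsets:
    """Hypothetical container for the offsets named in the text."""
    eval_breaker: int               # relative to PyThreadState
    remote_debugger_support: int    # relative to PyThreadState
    debugger_pending_call: int      # relative to _PyRemoteDebuggerSupport
    debugger_script_path: int       # relative to _PyRemoteDebuggerSupport
    debugger_script_path_size: int  # size of the path buffer in bytes


def field_addresses(tstate_addr: int, off: DebugOffsets) -> dict[str, int]:
    """Resolve absolute addresses for one thread state."""
    # The support structure is located relative to the thread state ...
    support = tstate_addr + off.remote_debugger_support
    return {
        "eval_breaker": tstate_addr + off.eval_breaker,
        # ... and its fields are located relative to the support structure.
        "debugger_pending_call": support + off.debugger_pending_call,
        "debugger_script_path": support + off.debugger_script_path,
    }


# Made-up values, chosen only to show the arithmetic.
offsets = DebugOffsets(24, 200, 0, 4, 512)
addrs = field_addresses(0x7F0000001000, offsets)
```

Because the two levels of offsets are resolved separately, the same calculation applies to every ``PyThreadState`` a tool finds, whatever its address.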
@@ -200,13 +202,19 @@ When a debugger wants to attach to a Python process, it follows these steps:
 
 5. Write control information:
 
-   - Write a filename containing Python code to be executed into the
-     ``debugger_script`` field in ``_PyRemoteDebuggerSupport``.
+   - Most debuggers will pause the process before writing to its memory. This is
+     standard practice for tools like GDB, which use SIGSTOP or ptrace to pause the process.
+     This approach prevents races when writing to process memory. Profilers and other tools
+     that don't wish to stop the process can still use this interface, but they need to
+     handle possible races. This is a normal consideration for profilers.
+
+   - Write a file path to a Python source file (.py) into the
+     ``debugger_script_path`` field in ``_PyRemoteDebuggerSupport``.
    - Set ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport`` to 1
    - Set ``_PY_EVAL_PLEASE_STOP_BIT`` in the ``eval_breaker`` field
 
-Once the interpreter reaches the next safe point, it will execute the script
-provided by the debugger.
+Once the interpreter reaches the next safe point, it will execute the Python code
+contained in the file specified by the debugger.
 
 Interpreter Integration
 -----------------------
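
The three writes listed under step 5 in the hunk above can be sketched against a simulated memory buffer. This is a minimal illustration under stated assumptions, not how any real debugger performs the writes: the offsets, field widths, bit position, and the ``request_script`` helper are all hypothetical, and a real tool would write into the paused target process rather than a local ``bytearray``.

```python
import struct

# All layout values are hypothetical; a real tool reads them from the
# debug offsets exported by the target interpreter.
EVAL_BREAKER_OFFSET = 8    # eval_breaker within the fake thread state
SUPPORT_OFFSET = 64        # remote_debugger_support within the fake thread state
PENDING_CALL_OFFSET = 0    # debugger_pending_call within the support struct
SCRIPT_PATH_OFFSET = 4     # debugger_script_path within the support struct
SCRIPT_PATH_SIZE = 32      # stand-in for debugger_script_path_size
PLEASE_STOP_BIT = 1 << 5   # placeholder for _PY_EVAL_PLEASE_STOP_BIT


def request_script(memory: bytearray, tstate: int, path: str) -> None:
    """Perform the step 5 writes on a simulated address space."""
    encoded = path.encode() + b"\0"  # assumes a NUL-terminated UTF-8 path
    if len(encoded) > SCRIPT_PATH_SIZE:
        raise ValueError("script path does not fit in the buffer")
    support = tstate + SUPPORT_OFFSET

    # 1. Write the path of the .py source file.
    start = support + SCRIPT_PATH_OFFSET
    memory[start:start + len(encoded)] = encoded

    # 2. Set debugger_pending_call to 1 (assumed here to be a 32-bit int).
    struct.pack_into("<i", memory, support + PENDING_CALL_OFFSET, 1)

    # 3. Set the stop bit in eval_breaker (assumed here to be 64 bits wide).
    addr = tstate + EVAL_BREAKER_OFFSET
    (breaker,) = struct.unpack_from("<Q", memory, addr)
    struct.pack_into("<Q", memory, addr, breaker | PLEASE_STOP_BIT)


memory = bytearray(256)
request_script(memory, 0, "/tmp/debug.py")
```

The sketch writes the path before raising the flag and the stop bit, following the order in which the list gives the steps, so that the path is already in place when the interpreter checks ``debugger_pending_call``.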
@@ -237,7 +245,7 @@ to be audited or disabled if desired by a system's administrator.
     if (tstate->eval_breaker) {
         if (tstate->remote_debugger_support.debugger_pending_call) {
             tstate->remote_debugger_support.debugger_pending_call = 0;
-            const char *path = tstate->remote_debugger_support.debugger_script;
+            const char *path = tstate->remote_debugger_support.debugger_script_path;
             if (*path) {
                 if (0 != PySys_Audit("debugger_script", "%s", path)) {
                     PyErr_Clear();
@@ -273,28 +281,35 @@ arbitrary Python code within the context of a specified Python process:
 
 .. code-block:: python
 
-    def remote_exec(pid: int, code: str, timeout: int = 0) -> None:
+    def remote_exec(pid: int, script: str|bytes|PathLike) -> None:
         """
-        Executes a block of Python code in a given remote Python process.
+        Executes a file containing Python code in a given remote Python process.
+
+        This function returns immediately, and the code will be executed by the
+        target process's main thread at the next available opportunity, similarly
+        to how signals are handled. There is no interface to determine when the
+        code has been executed. The caller is responsible for making sure that
+        the file still exists whenever the remote process tries to read it and that
+        it hasn't been overwritten.
 
         Args:
             pid (int): The process ID of the target Python process.
-            code (str): A string containing the Python code to be executed.
-            timeout (int): An optional timeout for waiting for the remote
-                process to execute the code. If the timeout is exceeded a
-                ``TimeoutError`` will be raised.
+            script (str|bytes|PathLike): The path to a file containing
+                the Python code to be executed.
         """
 
 An example usage of the API would look like:
 
 .. code-block:: python
 
     import sys
+    import uuid
     # Execute a print statement in a remote Python process with PID 12345
+    script = f"/tmp/{uuid.uuid4()}.py"
+    with open(script, "w") as f:
+        f.write("print('Hello from remote execution!')")
     try:
-        sys.remote_exec(12345, "print('Hello from remote execution!')", timeout=3)
-    except TimeoutError:
-        print(f"The remote process took too long to execute the code")
+        sys.remote_exec(12345, script)
     except Exception as e:
         print(f"Failed to execute code: {e}")
 
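
As a possible variation on the usage example in the hunk above, the script file could be created with the standard ``tempfile`` module instead of a hand-built ``/tmp`` path. This is only a sketch: the ``write_debug_script`` helper is hypothetical and not part of the PEP, and because ``mkstemp`` creates the file readable only by the creating user, it assumes the target process runs as that same user.

```python
import os
import tempfile


def write_debug_script(code: str) -> str:
    """Write `code` to a newly created private .py file and return its path."""
    # mkstemp creates the file atomically with a unique name.
    fd, path = tempfile.mkstemp(suffix=".py", text=True)
    with os.fdopen(fd, "w") as f:
        f.write(code)
    return path


script = write_debug_script("print('Hello from remote execution!')")
# A caller would then pass `script` to sys.remote_exec(pid, script) and remove
# the file only after the target has had a chance to read it.
```

Since the updated docstring says there is no interface to determine when the code has run, deciding when the file can safely be deleted is left to the caller.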
@@ -322,6 +337,36 @@ feature. This way, tools can offer a useful error message explaining why they
 won't work, instead of believing that they have attached and then never having
 their script run.
 
+Multi-threading Considerations
+------------------------------
+
+The overall execution pattern resembles how Python handles signals internally.
+The interpreter guarantees that injected code only runs at safe points, never
+interrupting atomic operations within the interpreter itself. This approach
+ensures that debugging operations cannot corrupt the interpreter state while
+still providing timely execution in most real-world scenarios.
+
+However, debugging code injected through this interface can execute in any
+thread. This behavior is different than how Python handles signals, since
+signal handlers can only run in the main thread. If a debugger wants to inject
+code into every running thread, it must inject it into every ``PyThreadState``.
+If a debugger wants to run code in the first available thread, it needs to
+inject it into every ``PyThreadState``, and that injected code must check
+whether it has already been run by another thread (likely by setting some flag
+in the globals of some module).
+
+Note that the Global Interpreter Lock (GIL) continues to govern execution as
+normal when the injected code runs. This means if a target thread is currently
+executing a C extension that holds the GIL continuously, the injected code
+won't be able to run until that operation completes and the GIL becomes
+available. However, the interface introduces no additional GIL contention
+beyond what the injected code itself requires. Importantly, the interface
+remains fully compatible with Python's free-threaded mode.
+
+It may be useful for a debugger that injected some code to be run to follow
+that up by sending some pre-registered signal to the process, which can
+interrupt any blocking I/O or sleep states waiting for external resources, and
+allow a safe opportunity to run the injected code.
 
 Backwards Compatibility
 =======================
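
The "first available thread" pattern added in the hunk above, where injected code checks a flag in some module's globals, might look roughly like the sketch below. The module name ``_injected_debug_state``, the ``claimed_by`` key, and the ``claim_first_run`` helper are hypothetical. The sketch relies on ``dict.setdefault`` behaving as an atomic test-and-set in CPython, which is an assumption about the implementation rather than something the PEP states.

```python
import sys
import threading
import types

# Hypothetical module name, used only to hold the shared flag.
_FLAG_MODULE = "_injected_debug_state"


def claim_first_run() -> bool:
    """Return True in exactly one caller; later callers get False."""
    # Every executing thread finds (or creates) the same module object.
    state = sys.modules.setdefault(_FLAG_MODULE, types.ModuleType(_FLAG_MODULE))
    # Each caller offers its own unique token; only the first one is stored.
    token = object()
    winner = state.__dict__.setdefault("claimed_by", token)
    return winner is token


# Simulate the same script body running in several threads.
results = []
threads = [
    threading.Thread(target=lambda: results.append(claim_first_run()))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

An injected script would call something like ``claim_first_run()`` at its top and return early when the result is ``False``, so the debugging payload runs in only one of the threads it was injected into.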
@@ -454,8 +499,8 @@ Rejected Ideas
 Writing Python code into the buffer
 -----------------------------------
 
-We have chosen to have debuggers write the code to be executed into a file
-whose path is written into a buffer in the remote process. This has been deemed
+We have chosen to have debuggers write the path to a file containing Python code
+into a buffer in the remote process. This has been deemed
 more secure than writing the Python code to be executed itself into a buffer in
 the remote process, because it means that an attacker who has gained arbitrary
 writes in a process but not arbitrary code execution or file system
