| 1 | +# Python sys.monitoring Tracer Design |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document outlines the design for integrating Python's `sys.monitoring` API with the `runtime_tracing` format. The goal is to produce CodeTracer-compatible traces for Python programs without modifying the interpreter. |
| 6 | + |
| 7 | +The tracer collects `sys.monitoring` events, converts them to `runtime_tracing` events, and streams them to `trace.json`/`trace.bin` along with metadata and source snapshots. |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +### Tool Initialization |
| 12 | +- Claim a tool identifier (an integer from 0 to 5) with `sys.monitoring.use_tool_id`; store it for the lifetime of the tracer. |
| 13 | +- Register one callback per event using `sys.monitoring.register_callback`. |
| 14 | +- Enable all desired events by bitmask with `sys.monitoring.set_events`. |
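The three initialization steps above can be sketched as follows. This is a minimal illustration rather than the tracer's actual code: it assumes Python 3.12 or later (where `sys.monitoring` exists), and `TOOL_ID`, `TOOL_NAME`, and `start_monitoring` are hypothetical names chosen for the example.

```python
import sys

# Assumption: ID 3 is free. IDs 0, 1, 2 and 5 are predefined by sys.monitoring
# for debuggers, coverage tools, profilers and optimizers respectively.
TOOL_ID = 3
TOOL_NAME = "codetracer-sketch"  # hypothetical name


def start_monitoring(callbacks):
    """Claim TOOL_ID, register one callback per event, and enable the events.

    `callbacks` maps sys.monitoring.events constants to callables.
    Returns the combined event bitmask that was enabled.
    """
    mon = sys.monitoring
    mon.use_tool_id(TOOL_ID, TOOL_NAME)  # raises ValueError if the ID is taken
    mask = mon.events.NO_EVENTS
    for event, callback in callbacks.items():
        mon.register_callback(TOOL_ID, event, callback)
        mask |= event  # events are bit flags, so they combine with |
    mon.set_events(TOOL_ID, mask)
    return mask
```

A caller might pass `{sys.monitoring.events.PY_START: on_start, sys.monitoring.events.LINE: on_line}`, where both handlers are the caller's own functions.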
| 15 | + |
| 16 | +### Writer Management |
| 17 | +- Open a `runtime_tracing` writer (`trace.json` or `trace.bin`) during `start_tracing`. |
| 18 | +- Expose methods to append metadata and file copies using existing `runtime_tracing` helpers. |
| 19 | +- Flush and close the writer when tracing stops. |
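To make the writer lifecycle concrete, here is a small stand-in that streams events into a `trace.json` array and finalizes it on close. `JsonTraceWriter` and the `{kind: payload}` event shape are assumptions made for illustration; the real `runtime_tracing` writer and its schema may differ.

```python
import json
from pathlib import Path


class JsonTraceWriter:
    """Hypothetical stand-in for a runtime_tracing writer (JSON output only)."""

    def __init__(self, trace_dir):
        self._file = open(Path(trace_dir) / "trace.json", "w", encoding="utf-8")
        self._file.write("[")  # events are streamed into one JSON array
        self._count = 0

    def append(self, kind, payload):
        """Write one event immediately instead of buffering the whole trace."""
        if self._count:
            self._file.write(",\n")
        json.dump({kind: payload}, self._file)
        self._count += 1

    def close(self):
        """Terminate the array, flush, and close. Safe to call twice."""
        if self._file.closed:
            return
        self._file.write("]")
        self._file.flush()
        self._file.close()
```

Writing the opening bracket up front and the closing bracket in `close` keeps the file streamable while still producing valid JSON once tracing stops.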
| 20 | + |
| 21 | +### Frame and Thread Tracking |
| 22 | +- Maintain a per-thread stack of frame identifiers to correlate `CALL`, `PY_START`, and returns. |
| 23 | +- Map `frame` objects to internal IDs for cross-referencing events. |
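| 24 | +- Record a thread start event the first time a callback fires on a previously unseen thread, and a thread end event when that thread exits; callbacks are registered once per tool, not per thread. |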
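One way to keep a per-thread stack of frame identifiers is `threading.local`. The sketch below is an assumption about how this could be structured; `FrameTracker` and its methods are hypothetical names, and mapping actual `frame` objects to IDs is left out for brevity.

```python
import itertools
import threading


class FrameTracker:
    """Per-thread stacks of frame IDs, with IDs unique across all threads."""

    def __init__(self):
        self._local = threading.local()
        self._ids = itertools.count(1)
        self._lock = threading.Lock()
        self.threads_seen = []  # thread idents in first-seen order

    def _stack(self):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            # First use on this thread: a natural place to emit "thread start".
            stack = self._local.stack = []
            with self._lock:
                self.threads_seen.append(threading.get_ident())
        return stack

    def push(self):
        """Allocate a new frame ID and push it on the calling thread's stack."""
        with self._lock:
            frame_id = next(self._ids)
        self._stack().append(frame_id)
        return frame_id

    def pop(self):
        """Pop the calling thread's innermost frame ID, or None if empty."""
        stack = self._stack()
        return stack.pop() if stack else None

    def current(self):
        """Return the calling thread's innermost frame ID, or None if empty."""
        stack = self._stack()
        return stack[-1] if stack else None
```

Because each thread sees only its own stack, a return on one thread can never pop a frame that another thread pushed.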
| 25 | + |
| 26 | +## Event Handling |
| 27 | + |
| 28 | +Each bullet below represents a low-level operation translating a single `sys.monitoring` event into the `runtime_tracing` stream. |
| 29 | + |
| 30 | +### Control Flow |
| 31 | +- **PY_START** – Create a `Function` event for the code object and push a new frame ID onto the thread's stack. |
| 32 | +- **PY_RESUME** – Emit an `Event` log noting resumption and update the current frame's state. |
| 33 | +- **PY_RETURN** – Pop the frame ID, write a `Return` event with the value (if retrievable), and link to the caller. |
| 34 | +- **PY_YIELD** – Record a `Return` event flagged as a yield and keep the frame on the stack for later resumes. |
| 35 | +- **STOP_ITERATION** – Emit an `Event` indicating iteration exhaustion for the current frame. |
| 36 | +- **PY_UNWIND** – Pop the frame ID when a function exits during exception unwinding and record the propagating exception in an `Event`. |
| 37 | +- **PY_THROW** – Emit an `Event` describing the thrown value and the target generator/coroutine. |
| 38 | +- **RERAISE** – Log a re-raise event referencing the original exception. |
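The push/pop discipline for `PY_START`, `PY_RETURN`, and `PY_UNWIND` can be sketched as below. The callback signatures follow the `sys.monitoring` documentation; the tuple-based event records, the tool ID, and `trace_control_flow` are illustrative assumptions, and the example is single-threaded for brevity.

```python
import sys

TOOL_ID = 3       # assumption: this tool ID is free
events = []       # stand-in for the runtime_tracing stream
frame_stack = []  # single-threaded for brevity


def on_py_start(code, instruction_offset):
    frame_stack.append(code.co_name)
    events.append(("Function", code.co_name))


def on_py_return(code, instruction_offset, retval):
    if frame_stack:
        frame_stack.pop()
    events.append(("Return", code.co_name, repr(retval)))


def on_py_unwind(code, instruction_offset, exception):
    # The function exits because an exception is propagating out of it.
    if frame_stack:
        frame_stack.pop()
    events.append(("Event", f"unwind {code.co_name}: {type(exception).__name__}"))


def trace_control_flow(func):
    """Run func() with the three handlers installed, then clean up."""
    mon = sys.monitoring
    E = mon.events
    handlers = {
        E.PY_START: on_py_start,
        E.PY_RETURN: on_py_return,
        E.PY_UNWIND: on_py_unwind,
    }
    mon.use_tool_id(TOOL_ID, "control-flow-sketch")
    try:
        for event, handler in handlers.items():
            mon.register_callback(TOOL_ID, event, handler)
        mon.set_events(TOOL_ID, E.PY_START | E.PY_RETURN | E.PY_UNWIND)
        try:
            func()
        except Exception:
            pass  # the unwind has already been recorded by on_py_unwind
    finally:
        mon.set_events(TOOL_ID, E.NO_EVENTS)
        for event in handlers:
            mon.register_callback(TOOL_ID, event, None)
        mon.free_tool_id(TOOL_ID)
```

Handling `PY_UNWIND` alongside `PY_RETURN` matters because a function that exits by exception never produces a `PY_RETURN`; without the extra pop the stack would grow out of sync.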
| 39 | + |
| 40 | +### Call and Line Tracking |
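| 41 | +- **CALL** – Record a `Call` event, capturing the callable, `arg0` (the only argument the callback receives; remaining argument values must be read from the callee's frame), and the callee's `Function` ID. |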
| 42 | +- **LINE** – Write a `Step` event with current path and line number; ensure the path is registered. |
| 43 | +- **INSTRUCTION** – Optionally emit a fine-grained `Event` containing the opcode name for detailed traces. |
| 44 | +- **JUMP** – Append an `Event` describing the jump target offset for control-flow visualization. |
| 45 | +- **BRANCH** – Record an `Event` with the branch destination offset, from which the outcome (taken or not) is derived, to aid coverage analysis. |
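As an illustration of the `LINE` handling, the sketch below registers each path on first use and records one `Step` per executed line. The `LINE` callback signature `(code, line_number)` comes from `sys.monitoring`; the `{"Step": {"path_id": ..., "line": ...}}` shape, `register_path`, and `record_steps` are assumptions for the example, not the confirmed `runtime_tracing` schema.

```python
import sys

TOOL_ID = 3    # assumption: this tool ID is free
path_ids = {}  # path -> small integer ID, registered on first use
steps = []     # stand-in for the runtime_tracing stream


def register_path(path):
    """Return the ID for `path`, assigning the next free one if it is new."""
    return path_ids.setdefault(path, len(path_ids))


def on_line(code, line_number):
    steps.append({"Step": {"path_id": register_path(code.co_filename),
                           "line": line_number}})


def record_steps(func):
    """Run func() with LINE monitoring enabled, then clean up."""
    mon = sys.monitoring
    mon.use_tool_id(TOOL_ID, "line-sketch")
    try:
        mon.register_callback(TOOL_ID, mon.events.LINE, on_line)
        mon.set_events(TOOL_ID, mon.events.LINE)
        func()
    finally:
        mon.set_events(TOOL_ID, mon.events.NO_EVENTS)
        mon.register_callback(TOOL_ID, mon.events.LINE, None)
        mon.free_tool_id(TOOL_ID)
```

Storing a compact path ID in each step, rather than the full path, keeps the per-line record small, which matters because `LINE` is typically the most frequent event.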
| 46 | + |
| 47 | +### Exception Lifecycle |
| 48 | +- **RAISE** – Emit an `Event` containing exception type and message when raised. |
| 49 | +- **EXCEPTION_HANDLED** – Log an `Event` marking when an exception is caught. |
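A minimal sketch of the two exception handlers follows. Both callbacks receive `(code, instruction_offset, exception)` per the `sys.monitoring` documentation; the tuple records and `trace_exceptions` are hypothetical and only stand in for real `Event` entries.

```python
import sys

TOOL_ID = 3  # assumption: this tool ID is free
events = []  # stand-in for the runtime_tracing stream


def on_raise(code, instruction_offset, exception):
    events.append(("raise", code.co_name,
                   type(exception).__name__, str(exception)))


def on_exception_handled(code, instruction_offset, exception):
    events.append(("handled", code.co_name, type(exception).__name__))


def trace_exceptions(func):
    """Run func() with RAISE and EXCEPTION_HANDLED monitoring enabled."""
    mon = sys.monitoring
    E = mon.events
    handlers = {E.RAISE: on_raise, E.EXCEPTION_HANDLED: on_exception_handled}
    mon.use_tool_id(TOOL_ID, "exception-sketch")
    try:
        for event, handler in handlers.items():
            mon.register_callback(TOOL_ID, event, handler)
        mon.set_events(TOOL_ID, E.RAISE | E.EXCEPTION_HANDLED)
        return func()
    finally:
        mon.set_events(TOOL_ID, E.NO_EVENTS)
        for event in handlers:
            mon.register_callback(TOOL_ID, event, None)
        mon.free_tool_id(TOOL_ID)
```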
| 50 | + |
| 51 | +### C API Boundary |
| 52 | +- **C_RETURN** – On returning from a C function, emit a `Return` event tagged as foreign and include result summary. |
| 53 | +- **C_RAISE** – When a C function raises, record an `Event` with the exception info and current frame ID. |
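One detail worth illustrating: per the `sys.monitoring` documentation, `C_RETURN` and `C_RAISE` are controlled by the `CALL` event and are only delivered while `CALL` is monitored. The sketch below therefore enables only `CALL` and registers callbacks for all three. The log format, `callable_name`, and `trace_c_boundary` are assumptions for the example.

```python
import sys

TOOL_ID = 3  # assumption: this tool ID is free
log = []     # stand-in for the runtime_tracing stream


def callable_name(obj):
    return getattr(obj, "__name__", repr(obj))


def on_call(code, instruction_offset, callable_, arg0):
    # arg0 is sys.monitoring.MISSING when the call has no arguments.
    first = None if arg0 is sys.monitoring.MISSING else repr(arg0)
    log.append(("CALL", callable_name(callable_), first))


def on_c_return(code, instruction_offset, callable_, arg0):
    log.append(("C_RETURN", callable_name(callable_)))


def on_c_raise(code, instruction_offset, callable_, arg0):
    log.append(("C_RAISE", callable_name(callable_)))


def trace_c_boundary(func):
    """Run func() while logging calls into non-Python callables."""
    mon = sys.monitoring
    E = mon.events
    handlers = {E.CALL: on_call, E.C_RETURN: on_c_return, E.C_RAISE: on_c_raise}
    mon.use_tool_id(TOOL_ID, "c-boundary-sketch")
    try:
        for event, handler in handlers.items():
            mon.register_callback(TOOL_ID, event, handler)
        # Enabling CALL is what switches on C_RETURN and C_RAISE delivery.
        mon.set_events(TOOL_ID, E.CALL)
        return func()
    finally:
        mon.set_events(TOOL_ID, E.NO_EVENTS)
        for event in handlers:
            mon.register_callback(TOOL_ID, event, None)
        mon.free_tool_id(TOOL_ID)
```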
| 54 | + |
| 55 | +### No Events |
| 56 | +- **NO_EVENTS** – Special constant; used only to disable monitoring. No runtime event is produced. |
| 57 | + |
| 58 | +## Metadata and File Capture |
| 59 | +- Collect the working directory, program name, and arguments and store them in `trace_metadata.json`. |
| 60 | +- Track every file path referenced; copy each into the trace directory under `files/`. |
| 61 | +- Record `VariableName`, `Type`, and `Value` entries when variables are inspected or logged. |
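The metadata and file-capture steps could look roughly like this. The key names `workdir`, `program`, and `args`, and the choice to mirror absolute paths under `files/`, are assumptions for illustration; the actual `runtime_tracing` helpers may use a different layout.

```python
import json
import shutil
from pathlib import Path


def write_metadata(trace_dir, workdir, program, args):
    """Write trace_metadata.json and return its path."""
    metadata = {"workdir": str(workdir), "program": program, "args": list(args)}
    target = Path(trace_dir) / "trace_metadata.json"
    target.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return target


def copy_source_files(trace_dir, paths):
    """Copy each distinct path into <trace_dir>/files/ and return the copies."""
    files_dir = Path(trace_dir) / "files"
    copied = []
    for path in sorted(set(paths)):
        source = Path(path).resolve()
        # Mirror the absolute path below files/ so equal basenames cannot collide.
        destination = files_dir / source.relative_to(source.anchor)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
        copied.append(destination)
    return copied
```

Deduplicating the paths first means a file referenced by thousands of `Step` events is still copied only once.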
| 62 | + |
| 63 | +## Shutdown |
| 64 | +- On `stop_tracing`, call `sys.monitoring.set_events` with `NO_EVENTS` for the tool ID. |
| 65 | +- Unregister callbacks and free the tool ID with `sys.monitoring.free_tool_id`. |
| 66 | +- Flush all buffered events to disk, then close the writer. |
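The shutdown order above can be sketched as a single function. `stop_tracing` here is a hypothetical signature (the document only names the operation), and `writer` is assumed to be any object with a `close()` method that flushes before closing.

```python
import sys


def stop_tracing(tool_id, registered_events, writer):
    """Disable events, unregister callbacks, free the tool ID, close the writer."""
    mon = sys.monitoring
    try:
        # Stop event delivery first so no callback fires during teardown.
        mon.set_events(tool_id, mon.events.NO_EVENTS)
        for event in registered_events:
            mon.register_callback(tool_id, event, None)  # None unregisters
        mon.free_tool_id(tool_id)
    finally:
        # Close the writer even if monitoring teardown raised.
        writer.close()
```

Disabling events before unregistering callbacks avoids a window in which an event is still enabled but has no handler, and the `finally` clause keeps the trace file from being left open if teardown fails.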
| 67 | + |
| 68 | +## Future Extensions |
| 69 | +- Add filtering to enable subsets of events for performance-sensitive scenarios. |
| 70 | +- Support streaming traces over a socket for live debugging. |