Commit 74798c1

design: Initial design of code and test suite
1 parent 63bf376 commit 74798c1

3 files changed

+360
-0
lines changed

design-docs/design-001.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
1+
# Python sys.monitoring Tracer Design
2+
3+
## Overview
4+
5+
This document outlines the design for integrating Python's `sys.monitoring` API with the `runtime_tracing` format. The goal is to produce CodeTracer-compatible traces for Python programs without modifying the interpreter.
6+
7+
The tracer collects `sys.monitoring` events, converts them to `runtime_tracing` events, and streams them to `trace.json`/`trace.bin` along with metadata and source snapshots.
8+
9+
## Architecture
10+
11+
### Tool Initialization
12+
- Acquire a tool identifier via `sys.monitoring.use_tool_id`; store it for the lifetime of the tracer.
13+
```rs
14+
pub const MONITORING_TOOL_NAME: &str = "codetracer";
15+
pub struct ToolId { pub id: u8 }
16+
pub fn acquire_tool_id() -> PyResult<ToolId>;
17+
```
18+
- Register one callback per event using `sys.monitoring.register_callback`.
19+
```rs
20+
pub enum MonitoringEvent { PyStart, PyResume, PyReturn, PyYield, StopIteration, PyUnwind, PyThrow, Reraise, Call, Line, Instruction, Jump, Branch, Raise, ExceptionHandled, CReturn, CRaise }
21+
pub type CallbackFn = unsafe extern "C" fn(event: MonitoringEvent, frame: *mut PyFrameObject);
22+
pub fn register_callback(tool: &ToolId, event: MonitoringEvent, cb: CallbackFn);
23+
```
24+
- Enable all desired events by bitmask with `sys.monitoring.set_events`.
25+
```rs
26+
pub const ALL_EVENTS_MASK: u64 = 0xffff;
27+
pub fn enable_events(tool: &ToolId, mask: u64);
28+
```
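The three steps above map onto Python's `sys.monitoring` module roughly as follows. This is a minimal Python-level sketch, not the Rust implementation the design describes. It assumes Python 3.12 or newer, and the helper names `acquire_tool_id` and `record_py_starts` are illustrative only. It enables a single event (`PY_START`) to keep the example small.

```python
import sys

TOOL_NAME = "codetracer"  # mirrors MONITORING_TOOL_NAME in the design


def acquire_tool_id():
    """Claim the first free tool slot (assumption: unreserved IDs 3 and 4 are tried first)."""
    mon = sys.monitoring  # available on Python 3.12+
    for tool_id in (3, 4, 0, 1, 2, 5):
        if mon.get_tool(tool_id) is None:
            mon.use_tool_id(tool_id, TOOL_NAME)
            return tool_id
    raise RuntimeError("no free sys.monitoring tool id")


def record_py_starts(func):
    """Run func() and return the names of the Python functions that started."""
    mon = sys.monitoring
    tool_id = acquire_tool_id()
    started = []

    def on_py_start(code, instruction_offset):
        # PY_START callbacks receive the code object and an instruction offset.
        started.append(code.co_name)

    mon.register_callback(tool_id, mon.events.PY_START, on_py_start)
    try:
        mon.set_events(tool_id, mon.events.PY_START)
        func()
    finally:
        # Tear down in the reverse order of setup.
        mon.set_events(tool_id, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.PY_START, None)
        mon.free_tool_id(tool_id)
    return started
```

Calling `record_py_starts(some_function)` returns the `co_name` of every Python function entered while `some_function` ran. The `finally` block mirrors the teardown order listed in the Shutdown section.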
29+
30+
### Writer Management
31+
- Open a `runtime_tracing` writer (`trace.json` or `trace.bin`) during `start_tracing`.
32+
```rs
33+
pub enum OutputFormat { Json, Binary }
34+
pub struct TraceWriter { pub format: OutputFormat }
35+
pub fn start_tracing(path: &Path, format: OutputFormat) -> io::Result<TraceWriter>;
36+
```
37+
- Expose methods to append metadata and file copies using existing `runtime_tracing` helpers.
38+
```rs
39+
pub fn append_metadata(writer: &mut TraceWriter, meta: &TraceMetadata);
40+
pub fn copy_source_file(writer: &mut TraceWriter, path: &Path) -> io::Result<()>;
41+
```
42+
- Flush and close the writer when tracing stops.
43+
```rs
44+
pub fn stop_tracing(writer: TraceWriter) -> io::Result<()>;
45+
```
46+
47+
### Frame and Thread Tracking
48+
- Maintain a per-thread stack of frame identifiers to correlate `CALL`, `PY_START`, and returns.
49+
```rs
50+
pub type FrameId = u64;
51+
pub struct ThreadState { pub stack: Vec<FrameId> }
52+
pub fn current_thread_state() -> &'static mut ThreadState;
53+
```
54+
- Map `frame` objects to internal IDs for cross-referencing events.
55+
```rs
56+
pub struct FrameRegistry { next: FrameId, map: HashMap<*mut PyFrameObject, FrameId> }
57+
pub fn intern_frame(reg: &mut FrameRegistry, frame: *mut PyFrameObject) -> FrameId;
58+
```
59+
- Record thread start/end events when a thread is first observed in a callback and when it exits.
60+
```rs
61+
pub fn on_thread_start(thread_id: u64);
62+
pub fn on_thread_stop(thread_id: u64);
63+
```
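The bookkeeping above can be illustrated in Python. This is a sketch under stated assumptions: `FrameRegistry` and `current_stack` are hypothetical names, and the registry keys on whatever value the caller passes (for example `id(frame)`). A real tracer would also need to retire entries when frames are freed, because object addresses can be reused.

```python
import itertools
import threading


class FrameRegistry:
    """Assigns a stable integer ID to each frame key on first sight."""

    def __init__(self):
        self._next = itertools.count(1)
        self._ids = {}
        self._lock = threading.Lock()  # callbacks may arrive from several threads

    def intern(self, frame_key):
        with self._lock:
            if frame_key not in self._ids:
                self._ids[frame_key] = next(self._next)
            return self._ids[frame_key]


_thread_state = threading.local()


def current_stack():
    """Return the calling thread's frame-ID stack, creating it on first use."""
    if not hasattr(_thread_state, "stack"):
        _thread_state.stack = []
    return _thread_state.stack
```

Because the stack lives in `threading.local`, each thread sees its own list, while the registry hands out IDs that are unique across threads.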
64+
65+
## Event Handling
66+
67+
Each bullet below represents a low-level operation translating a single `sys.monitoring` event into the `runtime_tracing` stream.
68+
69+
### Control Flow
70+
- **PY_START** – Create a `Function` event for the code object and push a new frame ID onto the thread's stack.
71+
```rs
72+
pub fn on_py_start(frame: *mut PyFrameObject);
73+
```
74+
- **PY_RESUME** – Emit an `Event` log noting resumption and update the current frame's state.
75+
```rs
76+
pub fn on_py_resume(frame: *mut PyFrameObject);
77+
```
78+
- **PY_RETURN** – Pop the frame ID, write a `Return` event with the value (if retrievable), and link to the caller.
79+
```rs
80+
pub struct ReturnRecord { pub frame: FrameId, pub value: Option<ValueRecord> }
81+
pub fn on_py_return(frame: *mut PyFrameObject, value: *mut PyObject);
82+
```
83+
- **PY_YIELD** – Record a `Return` event flagged as a yield and keep the frame on the stack for later resumes.
84+
```rs
85+
pub fn on_py_yield(frame: *mut PyFrameObject, value: *mut PyObject);
86+
```
87+
- **STOP_ITERATION** – Emit an `Event` indicating iteration exhaustion for the current frame.
88+
```rs
89+
pub fn on_stop_iteration(frame: *mut PyFrameObject);
90+
```
91+
- **PY_UNWIND** – Emit an `Event` noting that the frame is exiting because of exception unwinding, and pop its frame ID from the thread's stack.
92+
```rs
93+
pub fn on_py_unwind(frame: *mut PyFrameObject);
94+
```
95+
- **PY_THROW** – Emit an `Event` describing the thrown value and the target generator/coroutine.
96+
```rs
97+
pub fn on_py_throw(frame: *mut PyFrameObject, value: *mut PyObject);
98+
```
99+
- **RERAISE** – Log a re-raise event referencing the original exception.
100+
```rs
101+
pub fn on_reraise(frame: *mut PyFrameObject, exc: *mut PyObject);
102+
```
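The difference between returning and yielding is easiest to see in the stack handling. The sketch below follows the bullets above literally, including keeping a yielded frame on the stack. `ControlFlowRecorder` and its tuple-based records are hypothetical stand-ins for the `runtime_tracing` writer, and calls interleaved between a yield and its resume are not handled.

```python
class ControlFlowRecorder:
    """Collects simple tuples in place of real runtime_tracing events."""

    def __init__(self):
        self.stack = []    # frame IDs currently active on this thread
        self.records = []  # emitted records, oldest first
        self._next_id = 1

    def on_py_start(self):
        frame_id = self._next_id
        self._next_id += 1
        self.stack.append(frame_id)
        self.records.append(("Function", frame_id))

    def on_py_yield(self, value):
        # The frame is suspended, not finished, so it stays on the stack.
        self.records.append(("Return[yield]", self.stack[-1], value))

    def on_py_resume(self):
        self.records.append(("Event[resume]", self.stack[-1]))

    def on_py_return(self, value):
        # A real return ends the frame, so its ID is popped.
        frame_id = self.stack.pop()
        self.records.append(("Return", frame_id, value))
```

Driving it with the sequence start, yield, resume, return leaves the stack empty and produces one `Function` record, one yield-flagged `Return`, one resume `Event`, and one final `Return`.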
103+
104+
### Call and Line Tracking
105+
- **CALL** – Record a `Call` event, capturing argument values and the callee's `Function` ID.
106+
```rs
107+
pub fn on_call(callee: *mut PyObject, args: &PyTupleObject) -> FrameId;
108+
```
109+
- **LINE** – Write a `Step` event with current path and line number; ensure the path is registered.
110+
```rs
111+
pub fn on_line(frame: *mut PyFrameObject, lineno: u32);
112+
```
113+
- **INSTRUCTION** – Optionally emit a fine-grained `Event` containing the opcode name for detailed traces.
114+
```rs
115+
pub fn on_instruction(frame: *mut PyFrameObject, opcode: u8);
116+
```
117+
- **JUMP** – Append an `Event` describing the jump target offset for control-flow visualization.
118+
```rs
119+
pub fn on_jump(frame: *mut PyFrameObject, target: u32);
120+
```
121+
- **BRANCH** – Record an `Event` with branch outcome (taken or not) to aid coverage analysis.
122+
```rs
123+
pub fn on_branch(frame: *mut PyFrameObject, taken: bool);
124+
```
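For the `LINE` bullet, "ensure the path is registered" implies an interning table for source paths. A minimal sketch, assuming a hypothetical `PathTable` and dictionary-shaped `Step` records. The handler uses the `(code, line_number)` argument shape that `sys.monitoring` passes to `LINE` callbacks.

```python
class PathTable:
    """Append-only table mapping source paths to small integer IDs."""

    def __init__(self):
        self.paths = []
        self._index = {}

    def register(self, path):
        if path not in self._index:
            self._index[path] = len(self.paths)
            self.paths.append(path)
        return self._index[path]


def make_line_handler(paths, steps):
    """Build a LINE callback that appends Step records to `steps`."""

    def on_line(code, line_number):
        # Register the file on first sight, then reuse its ID.
        path_id = paths.register(code.co_filename)
        steps.append({"kind": "Step", "path_id": path_id, "line": line_number})

    return on_line
```

The handler can be exercised without enabling monitoring by passing it any code object, for example one produced by `compile("x = 1", "demo.py", "exec")`.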
125+
126+
### Exception Lifecycle
127+
- **RAISE** – Emit an `Event` containing exception type and message when raised.
128+
```rs
129+
pub fn on_raise(frame: *mut PyFrameObject, exc: *mut PyObject);
130+
```
131+
- **EXCEPTION_HANDLED** – Log an `Event` marking when an exception is caught.
132+
```rs
133+
pub fn on_exception_handled(frame: *mut PyFrameObject);
134+
```
135+
136+
### C API Boundary
137+
- **C_RETURN** – On returning from a C function, emit a `Return` event tagged as foreign and include result summary.
138+
```rs
139+
pub fn on_c_return(func: *mut PyObject, result: *mut PyObject);
140+
```
141+
- **C_RAISE** – When a C function raises, record an `Event` with the exception info and current frame ID.
142+
```rs
143+
pub fn on_c_raise(func: *mut PyObject, exc: *mut PyObject);
144+
```
145+
146+
### No Events
147+
- **NO_EVENTS** – Special constant; used only to disable monitoring. No runtime event is produced.
148+
```rs
149+
pub const NO_EVENTS: u64 = 0;
150+
```
151+
152+
## Metadata and File Capture
153+
- Collect the working directory, program name, and arguments and store them in `trace_metadata.json`.
154+
```rs
155+
pub struct TraceMetadata { pub cwd: PathBuf, pub program: String, pub args: Vec<String> }
156+
pub fn write_metadata(writer: &mut TraceWriter, meta: &TraceMetadata);
157+
```
158+
- Track every file path referenced; copy each into the trace directory under `files/`.
159+
```rs
160+
pub fn track_file(writer: &mut TraceWriter, path: &Path) -> io::Result<()>;
161+
```
162+
- Record `VariableName`, `Type`, and `Value` entries when variables are inspected or logged.
163+
```rs
164+
pub struct VariableRecord { pub name: String, pub ty: TypeId, pub value: ValueRecord }
165+
pub fn record_variable(writer: &mut TraceWriter, rec: VariableRecord);
166+
```
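As a rough Python illustration of the first bullet, the metadata can be gathered from the running process and written next to the trace. The function names and the flat JSON layout are assumptions; the design only fixes the file name `trace_metadata.json` and the three fields.

```python
import json
import os
import sys
from pathlib import Path


def collect_metadata(argv=None):
    """Gather cwd, program name, and arguments for the current process."""
    argv = list(sys.argv if argv is None else argv)
    return {
        "cwd": os.getcwd(),
        "program": argv[0] if argv else "",
        "args": argv[1:],
    }


def write_metadata(trace_dir, meta):
    """Write meta as trace_metadata.json inside trace_dir and return the path."""
    trace_dir = Path(trace_dir)
    trace_dir.mkdir(parents=True, exist_ok=True)
    target = trace_dir / "trace_metadata.json"
    target.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return target
```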
167+
168+
## Value Translation and Recording
169+
- Maintain a type registry that maps Python `type` objects to `runtime_tracing` `Type` entries and assigns new `type_id` values on first encounter.
170+
```rs
171+
pub type TypeId = u32;
172+
pub type ValueId = u64;
173+
pub enum ValueRecord { Int(i64), Float(f64), Bool(bool), None, Str(String), Raw(Vec<u8>), Sequence(Vec<ValueRecord>), Tuple(Vec<ValueRecord>), Struct(Vec<(String, ValueRecord)>), Reference(ValueId) }
174+
pub struct TypeRegistry { next: TypeId, map: HashMap<*mut PyTypeObject, TypeId> }
175+
pub fn intern_type(reg: &mut TypeRegistry, ty: *mut PyTypeObject) -> TypeId;
176+
```
177+
- Convert primitives (`int`, `float`, `bool`, `None`, `str`) directly to their corresponding `ValueRecord` variants.
178+
```rs
179+
pub fn encode_primitive(obj: *mut PyObject) -> Option<ValueRecord>;
180+
```
181+
- Encode `bytes` and `bytearray` as `Raw` records containing base64 text to preserve binary data.
182+
```rs
183+
pub fn encode_bytes(obj: *mut PyObject) -> ValueRecord;
184+
```
185+
- Represent lists and sets as `Sequence` records and tuples as `Tuple` records, converting each element recursively.
186+
```rs
187+
pub fn encode_sequence(iter: &PySequence) -> ValueRecord;
188+
pub fn encode_tuple(tuple: &PyTupleObject) -> ValueRecord;
189+
```
190+
- Serialize dictionaries as a `Sequence` of two-element `Tuple` records for key/value pairs to avoid fixed field layouts.
191+
```rs
192+
pub fn encode_dict(dict: &PyDictObject) -> ValueRecord;
193+
```
194+
- For objects with accessible attributes, emit a `Struct` record with sorted field names; fall back to `Raw` with `repr(obj)` when inspection is unsafe.
195+
```rs
196+
pub fn encode_object(obj: *mut PyObject) -> ValueRecord;
197+
```
198+
- Track object identities to detect cycles and reuse `Reference` records with `id(obj)` for repeated structures.
199+
```rs
200+
pub struct SeenSet { map: HashMap<usize, ValueId> }
201+
pub fn record_reference(seen: &mut SeenSet, obj: *mut PyObject) -> Option<ValueRecord>;
202+
```
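The encoding rules above can be prototyped in a few lines of Python. This sketch returns plain dictionaries loosely modelled on `ValueRecord`; the key names (`kind`, `elements`, and so on) are assumptions, and the `Struct` case for attribute-bearing objects is omitted, so unknown objects go straight to the `repr` fallback.

```python
import base64


def encode_value(obj, seen=None):
    """Encode obj as a nested dict, emitting Reference for repeated containers."""
    seen = set() if seen is None else seen
    if obj is None:
        return {"kind": "None"}
    if isinstance(obj, bool):  # bool is a subclass of int, so test it first
        return {"kind": "Bool", "b": obj}
    if isinstance(obj, int):
        return {"kind": "Int", "i": obj}
    if isinstance(obj, float):
        return {"kind": "Float", "f": obj}
    if isinstance(obj, str):
        return {"kind": "Str", "text": obj}
    if isinstance(obj, (bytes, bytearray)):
        return {"kind": "Raw", "r": base64.b64encode(bytes(obj)).decode("ascii")}
    if id(obj) in seen:  # cycle or shared structure: point back instead of recursing
        return {"kind": "Reference", "id": id(obj)}
    seen.add(id(obj))
    if isinstance(obj, tuple):
        return {"kind": "Tuple", "elements": [encode_value(e, seen) for e in obj]}
    if isinstance(obj, (list, set, frozenset)):
        return {"kind": "Sequence", "elements": [encode_value(e, seen) for e in obj]}
    if isinstance(obj, dict):
        pairs = [
            {"kind": "Tuple", "elements": [encode_value(k, seen), encode_value(v, seen)]}
            for k, v in obj.items()
        ]
        return {"kind": "Sequence", "elements": pairs}
    return {"kind": "Raw", "r": repr(obj)}  # fallback when inspection is not attempted
```

A self-referential list such as `cyc = [1]; cyc.append(cyc)` encodes as a two-element `Sequence` whose second element is a `Reference` carrying `id(cyc)`, so the recursion terminates.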
203+
204+
## Shutdown
205+
- On `stop_tracing`, call `sys.monitoring.set_events` with `NO_EVENTS` for the tool ID.
206+
```rs
207+
pub fn disable_events(tool: &ToolId);
208+
```
209+
- Unregister callbacks and free the tool ID with `sys.monitoring.free_tool_id`.
210+
```rs
211+
pub fn unregister_callbacks(tool: ToolId);
212+
pub fn free_tool_id(tool: ToolId);
213+
```
214+
- Close the writer and ensure all buffered events are flushed to disk.
215+
```rs
216+
pub fn finalize(writer: TraceWriter) -> io::Result<()>;
217+
```
218+
219+
## Current Limitations
220+
- **No structured support for threads or async tasks** – the trace format lacks explicit identifiers for concurrent execution.
221+
Distinguishing events emitted by different Python threads or `asyncio` tasks requires ad hoc `Event` entries, complicating
222+
analysis and preventing downstream tools from reasoning about scheduling.
223+
- **Generic `Event` log** – several `sys.monitoring` notifications like resume, unwind, and branch outcomes have no dedicated
224+
`runtime_tracing` variant. They must be encoded as free‑form `Event` logs, which reduces machine readability and hinders
225+
automation.
226+
- **Heavy value snapshots** – arguments and returns expect full `ValueRecord` structures. Serializing arbitrary Python objects is
227+
expensive and often degrades to lossy string dumps, limiting the visibility of rich runtime state.
228+
- **Append‑only path and function tables** – `runtime_tracing` assumes files and functions are discovered once and never change.
229+
Dynamically generated code (`eval`, REPL snippets) forces extra bookkeeping and cannot update earlier entries, making
230+
dynamic features awkward to trace.
231+
- **No built‑in compression or streaming** – traces are written as monolithic JSON or binary files. Long sessions quickly grow in
232+
size and cannot be streamed to remote consumers without additional tooling.
233+
234+
## Future Extensions
235+
- Add filtering to enable subsets of events for performance-sensitive scenarios.
236+
- Support streaming traces over a socket for live debugging.

design-docs/py-api-001.md

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
# Python sys.monitoring Tracer API
2+
3+
## Overview
4+
This document describes the user-facing Python API for the `codetracer` module built on top of `runtime_tracing` and `sys.monitoring`. The API exposes a minimal surface for starting and stopping traces, managing trace sessions, and integrating tracing into scripts or test suites.
5+
6+
## Module `codetracer`
7+
8+
### Constants
9+
- `DEFAULT_FORMAT: str = "binary"`
10+
- `TRACE_BINARY: str = "binary"`
11+
- `TRACE_JSON: str = "json"`
12+
13+
### Session Management
14+
- Start a global trace; returns a `TraceSession`.
15+
```py
16+
def start(path: str | os.PathLike, *, format: str = DEFAULT_FORMAT,
17+
capture_values: bool = True, source_roots: Iterable[str | os.PathLike] | None = None) -> TraceSession
18+
```
19+
- Stop the active trace if any.
20+
```py
21+
def stop() -> None
22+
```
23+
- Query whether tracing is active.
24+
```py
25+
def is_tracing() -> bool
26+
```
27+
- Context manager helper for scoped tracing.
28+
```py
29+
@contextlib.contextmanager
30+
def trace(path: str | os.PathLike, *, format: str = DEFAULT_FORMAT,
31+
capture_values: bool = True, source_roots: Iterable[str | os.PathLike] | None = None):
32+
...
33+
```
34+
- Flush buffered data to disk without ending the session.
35+
```py
36+
def flush() -> None
37+
```
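To show how the context manager can be layered on `start()` and `stop()`, here is an in-memory sketch with no real writer behind it. The single-active-session rule and the `RuntimeError` on a second `start()` are assumptions, not behavior stated in this document, and `capture_values` and `source_roots` are left out for brevity.

```python
import contextlib
import pathlib

_state = {"session": None}  # the single global session, or None when idle


class TraceSession:
    def __init__(self, path, format):
        self.path = pathlib.Path(path)
        self.format = format


def start(path, *, format="binary"):
    if _state["session"] is not None:
        raise RuntimeError("a trace session is already active")  # assumed policy
    _state["session"] = TraceSession(path, format)
    return _state["session"]


def stop():
    # A real implementation would flush and close the writer here.
    _state["session"] = None


def is_tracing():
    return _state["session"] is not None


@contextlib.contextmanager
def trace(path, *, format="binary"):
    session = start(path, format=format)
    try:
        yield session
    finally:
        stop()  # runs even if the traced block raises
```

Routing the context manager through `start()` and `stop()` keeps one code path for session setup and teardown, so `is_tracing()` reports the same state however the session was opened.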
38+
39+
## Class `TraceSession`
40+
Represents a live tracing session returned by `start()` and used by the context manager.
41+
42+
```py
43+
class TraceSession:
44+
path: pathlib.Path
45+
format: str
46+
47+
def stop(self) -> None: ...
48+
def flush(self) -> None: ...
49+
def __enter__(self) -> TraceSession: ...
50+
def __exit__(self, exc_type, exc, tb) -> None: ...
51+
```
52+
53+
## Environment Integration
54+
- Auto-start tracing when `CODETRACER_TRACE` is set; the value is interpreted as the output path.
55+
- When `CODETRACER_FORMAT` is provided, it overrides the default output format.
56+
- `CODETRACER_CAPTURE_VALUES` toggles value recording.
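A small sketch of how these variables could be turned into `start()` arguments. The helper name `config_from_env` and the set of spellings treated as "off" for `CODETRACER_CAPTURE_VALUES` are assumptions; the document does not specify accepted values.

```python
import os

# Assumed spellings that switch value capture off.
_FALSE_VALUES = {"0", "false", "no", "off", ""}


def config_from_env(environ=None):
    """Return start() keyword arguments from CODETRACER_* variables, or None."""
    environ = os.environ if environ is None else environ
    path = environ.get("CODETRACER_TRACE")
    if not path:
        return None  # variable unset or empty: no auto-start
    capture = environ.get("CODETRACER_CAPTURE_VALUES", "1").strip().lower()
    return {
        "path": path,
        "format": environ.get("CODETRACER_FORMAT", "binary"),
        "capture_values": capture not in _FALSE_VALUES,
    }
```

Accepting an explicit mapping makes the function testable without touching the real process environment.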
57+
58+
## Usage Example
59+
```py
60+
import codetracer
61+
62+
with codetracer.trace("trace.bin"):
63+
run_application()
64+
```

design-docs/test-design-001.md

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
# Python sys.monitoring Tracer Test Design
2+
3+
## Overview
4+
This document outlines a test suite for validating the Python tracer built on `sys.monitoring` and `runtime_tracing`. Each test item corresponds to roughly 1–10 lines of implementation and exercises tracer behavior under typical and edge conditions.
5+
6+
## Setup
7+
- Establish a temporary directory for trace output and source snapshots.
8+
- Install the tracer module and import helper utilities for running traced Python snippets.
9+
- Provide fixtures that clear the trace buffer and reset global state between tests.
10+
11+
## Tool Initialization
12+
- Acquire a monitoring tool ID and ensure subsequent calls reuse the same identifier.
13+
- Register callbacks for all enabled events and verify the resulting mask matches the design.
14+
- Unregister callbacks on shutdown and confirm no events fire afterward.
15+
16+
## Event Recording
17+
### Control Flow Events
18+
- Capture `PY_START` and `PY_RETURN` for a simple script and assert a start/stop pair is recorded.
19+
- Resume and yield events within a generator function produce matching `PY_RESUME`/`PY_YIELD` entries.
20+
- A `PY_THROW` followed by `RERAISE` generates the expected unwind and rethrow sequence.
21+
22+
### Call Tracking
23+
- Direct function calls record `CALL` and `PY_RETURN` with correct frame identifiers.
24+
- Recursive calls nest frames correctly and unwind in LIFO order.
25+
- Calls to decorated functions record the wrapper frame separately from the wrapped function's frame.
26+
27+
### Line and Branch Coverage
28+
- A loop with conditional branches emits `LINE` events for each executed line and `BRANCH` for each branch taken or skipped.
29+
- Jump statements such as `continue` and `break` produce `JUMP` events carrying source and destination instruction offsets that map back to the expected line numbers.
30+
31+
### Exception Handling
32+
- Raising and catching an exception emits `RAISE` and `EXCEPTION_HANDLED` events with matching exception IDs.
33+
- An uncaught exception records `RAISE` followed by a `PY_UNWIND` for each frame it propagates out of, after which the trace ends.
34+
35+
### C API Boundary
36+
- Calling a built-in like `len` results in `CALL` and `C_RETURN` events linked to the Python frame.
37+
- A built-in that raises, such as `int("a")`, generates `C_RAISE` with the translated exception value.
38+
39+
## Value Translation
40+
- Primitive values (ints, floats, strings, bytes) round-trip through the value registry and appear in the trace as expected.
41+
- Complex collections like lists of dicts are serialized recursively with cycle detection preventing infinite loops.
42+
- Object references without safe representations fall back to `repr` with a stable identifier.
43+
44+
## Metadata and Source Capture
45+
- The trace writer copies the executing script into the output directory and records its SHA-256 hash.
46+
- Traces include `ProcessMetadata` fields for Python version and platform.
47+
48+
## Shutdown Behavior
49+
- Normal interpreter exit flushes the trace and closes files without losing events.
50+
- An abrupt shutdown via `os._exit` truncates the trace file but leaves previous events intact.
51+
52+
## Error and Edge Cases
53+
- Invalid event names in manual callback registration raise a clear `ValueError`.
54+
- Attempting to trace after the writer is closed results in a no-op without raising.
55+
- Large string values exceeding the configured limit are truncated with an explicit marker.
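The truncation case could be covered by a test shaped like the one below. The helper `truncate_text`, the marker text, and the limit of 8 are all hypothetical; the document specifies neither the configured limit nor the marker.

```python
TRUNCATION_MARKER = "...<truncated>"  # hypothetical marker text


def truncate_text(text, limit):
    """Return text unchanged if it fits, else its first `limit` characters plus a marker."""
    if len(text) <= limit:
        return text
    return text[:limit] + TRUNCATION_MARKER


def test_large_strings_are_truncated():
    limit = 8  # assumed limit for the test
    short = "x" * limit
    long = "x" * (limit + 1)
    assert truncate_text(short, limit) == short  # values at the limit are untouched
    result = truncate_text(long, limit)
    assert result.endswith(TRUNCATION_MARKER)
    assert result.startswith("x" * limit)
```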
56+
57+
## Performance and Stress
58+
- Tracing a tight loop of 10⁶ iterations completes within an acceptable time budget.
59+
- Concurrent threads each produce isolated traces with no frame ID collisions.
60+
