Skip to content

Commit 3073265

Browse files
committed
Exception and async handling
1 parent 1ff52ea commit 3073265

File tree

2 files changed

+206
-0
lines changed

2 files changed

+206
-0
lines changed
Lines changed: 131 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,131 @@
1+
# ADR 0012: Balanced sys.monitoring Call Stack Events
2+
3+
- **Status:** Proposed
4+
- **Date:** 2025-10-26
5+
- **Deciders:** codetracer recorder maintainers
6+
- **Consulted:** Runtime tracing stakeholders, Replay consumers
7+
- **Informed:** Support engineering, DX tooling crew
8+
9+
## Context
10+
- The Rust-backed recorder currently subscribes to `PY_START`, `PY_RETURN`, and `LINE` events from `sys.monitoring`.
11+
- `RuntimeTracer` only emits two structural trace records—`TraceWriter::register_call` and `TraceWriter::register_return`—because the trace file format has no explicit notion of yields, resumptions, or exception unwinding.
12+
- Generators, coroutines, and exception paths trigger additional `sys.monitoring` events (`PY_YIELD`, `PY_RESUME`, `PY_THROW`, `PY_UNWIND`). When we ignore them the call stack in the trace becomes unbalanced, causing downstream tooling to miscompute nesting depth, duration, and attribution.
13+
- CPython already exposes these events with complete callback metadata. We simply never hook them, so resumptions and unwinds silently skip our writer.
14+
15+
## Problem
16+
- Trace consumers require balanced call/return pairs to reconstruct execution trees and propagate per-activation metadata (filters, IO capture, telemetry).
17+
- When a generator yields, we never emit a `register_return`, so the activation remains "open" forever even if the generator is never resumed.
18+
- When the interpreter unwinds a frame because of an exception, we neither emit a `register_return` nor mark the activation inactive, so the lifecycle bookkeeping leaks and `TraceWriter::finish_writing_trace_events` ends with dangling activations.
19+
- Conversely, when a generator/coroutine resumes—either normally (`PY_RESUME`) or via `throw()` (`PY_THROW`)—we fail to emit the "call" edge that would push it back on the logical stack.
20+
- Without these edges, the runtime cannot guarantee `TraceWriter` invariants or present accurate trace metadata. Adding synthetic bookkeeping in consumers is not possible because the events are already lost.
21+
22+
## Decision
23+
1. **Treat additional monitoring events as structural aliases.**
24+
- Map `PY_YIELD` and `PY_UNWIND` callbacks to the same flow as `on_py_return`, ultimately calling `TraceWriter::register_return`.
25+
- Map `PY_RESUME` callbacks to the same flow as `on_py_start`, emitting a call edge with an empty argument vector because CPython does not provide the `send()` value (`https://docs.python.org/3/library/sys.monitoring.html#monitoring-event-PY_RESUME`).
26+
- Map `PY_THROW` callbacks to the call flow but propagate the exception object as the payload recorded for the resumed activation so downstream tools can correlate the injected error; encode it as a single argument named `exception` using the existing value encoder (`https://docs.python.org/3/library/sys.monitoring.html#monitoring-event-PY_THROW`).
27+
2. **Subscribe to the four events in `RuntimeTracer::interest`.** The tracer will request `{PY_START, PY_RETURN, PY_YIELD, PY_UNWIND, PY_RESUME, PY_THROW}` plus `LINE` to preserve current behaviour.
28+
3. **Unify lifecycle hooks.** Extend the activation manager so that yield/unwind events deactivate the frame and resumption events reactivate or spawn a continuation while preserving filter decisions, telemetry handles, and IO capture state.
29+
4. **Preserve file-format semantics.** We will not add new record types; instead we ensure every control-flow boundary ultimately produces the same call/return records the file already understands.
30+
5. **Defensive guards.** Log-and-disable behaviour stays unchanged: any callback failure still honours policy (`OnRecorderError`). The new events use the same `should_trace_code` and activation gates so filters can skip generators consistently.
31+
32+
## Consequences
33+
- **Benefits:**
34+
- Balanced call stacks for generators, coroutines, and exception unwinds without touching the trace schema.
35+
- Replay and analysis tools stop seeing "dangling activation" warnings, improving trust in exported traces.
36+
- The recorder can later add richer semantics (e.g., value capture on resume) because the structural foundation is sound.
37+
- **Costs:**
38+
- Slightly higher callback volume, especially in generator-heavy workloads (two extra events per yield/resume pair).
39+
- Additional complexity inside `RuntimeTracer` to differentiate return-like vs call-like flows while sharing writer helpers.
40+
- **Risks:**
41+
- Incorrect mapping could double-emit calls or returns, corrupting the trace. We mitigate this with targeted tests covering yields, exceptions, and `throw()`-driven resumes.
42+
- Performance regressions if the new paths capture values unnecessarily; we will keep value capture opt-in via filter policies.
43+
44+
## Alternatives
45+
- **Introduce new trace record kinds for each event.** Rejected because consumers, storage, and analytics would all need format upgrades, and the existing stack-only writer already conveys the necessary structure.
46+
- **Approximate via Python-side bookkeeping.** Rejected: the Python helper cannot observe generator unwinds once the Rust tracer suppresses the events.
47+
- **Ignore stack balancing and patch consumers.** Rejected because it hides the source of truth and still leaves us without activation lifecycle signals during recording (IO capture, telemetry).
48+
49+
## Key Examples
50+
51+
### 1. Ordinary Function Call
52+
```python
53+
def add(a, b):
54+
return a + b
55+
56+
result = add(4, 5)
57+
```
58+
- `PY_START` fires when `add` begins. We capture the two arguments via `capture_call_arguments` and call `TraceWriter::register_call(function_id=add, args=[("a", 4), ("b", 5)])`.
59+
- `PY_RETURN` fires just before the return. We record the value `9` through `record_return_value`, which invokes `TraceWriter::register_return(9)`.
60+
- The trace shows a single balanced call/return pair; no other structural events are emitted.
61+
62+
### 2. Generator Yield + Resume
63+
```python
64+
def ticker():
65+
yield "ready"
66+
yield "again"
67+
68+
g = ticker()
69+
first = next(g)
70+
second = next(g)
71+
```
72+
- First `next(g)`:
73+
- `PY_START``register_call(ticker, args=[])`.
74+
- `PY_YIELD``register_return("ready")`. The activation is now suspended but the trace stack is balanced.
75+
- Second `next(g)`:
76+
- `PY_RESUME``register_call(ticker, args=[])` (empty vector because CPython does not expose the send value).
77+
- `PY_YIELD``register_return("again")`.
78+
- When the generator exhausts, CPython emits `PY_RETURN`, so we `register_return(None)` (or whatever value was returned). Every suspension/resumption pair corresponds to alternating `register_return`/`register_call`, keeping the call stack consistent.
79+
80+
### 3. Generator Throw
81+
```python
82+
def worker():
83+
try:
84+
yield "ready"
85+
except RuntimeError as err:
86+
return f"caught {err}"
87+
88+
g = worker()
89+
next(g)
90+
g.throw(RuntimeError("boom"))
91+
```
92+
- Initial `next(g)` behaves like Example 2.
93+
- `g.throw(...)` triggers:
94+
- `PY_THROW` with the exception object. We emit `register_call(worker, args=[("exception", RuntimeError("boom"))])`, encoding the exception with the existing value encoder so it appears in the trace payload.
95+
- If the generator handles the exception and returns, `PY_RETURN` follows and we write `register_return("caught boom")`. If it re-raises, `PY_UNWIND` fires instead and we encode the exception value in `register_return`.
96+
97+
### 4. Exception Unwind Without Yield
98+
```python
99+
def explode():
100+
raise ValueError("bad news")
101+
102+
def run():
103+
return explode()
104+
105+
run()
106+
```
107+
- `explode()` starts: `PY_START``register_call(explode, args=[])`.
108+
- The function raises before returning, so CPython skips `PY_RETURN` and emits `PY_UNWIND` with the `ValueError`.
109+
- We treat `PY_UNWIND` like `PY_RETURN`: flush pending IO, encode the exception via `record_return_value`, and call `register_return(ValueError("bad news"))`. The activation controller marks the frame inactive, preventing dangling stack entries when tracing finishes.
110+
111+
### 5. Coroutine Await / Resume
112+
```python
113+
import asyncio
114+
115+
async def worker():
116+
await asyncio.sleep(0)
117+
return "done"
118+
119+
asyncio.run(worker())
120+
```
121+
- Entry: `PY_START``register_call(worker, args=[])`.
122+
- When the coroutine awaits `sleep(0)`, CPython emits `PY_YIELD` with no explicit value (await results are delivered later). We encode the pending await result (typically `None`) via `register_return`.
123+
- When the event loop resumes `worker`, `PY_RESUME` fires and we record another `register_call(worker, args=[])`. No payload is available because the resume value is implicit in the await machinery.
124+
- Final completion triggers `PY_RETURN` so we write `register_return("done")`.
125+
- The trace therefore shows multiple call/return pairs for the same coroutine activation, mirroring each suspend/resume cycle.
126+
127+
## Rollout
128+
1. Update the design docs with this ADR and the implementation plan.
129+
2. Implement the runtime changes behind standard CI, landing tests that prove stack balance for yields, unwinds, and resumes.
130+
3. Notify downstream consumers that generator traces now appear balanced without requiring schema or API changes.
131+
4. Monitor regression dashboards for callback volume and latency after enabling the new events by default.
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
# Balanced Call Stack Events – Implementation Plan
2+
3+
Plan owners: codetracer recorder maintainers
4+
Target ADR: 0012 – Balanced sys.monitoring Call Stack Events
5+
Impacted components: `codetracer-python-recorder/src/monitoring`, `src/runtime/tracer`, `src/runtime/value_capture`, Python integration tests
6+
7+
## Goals
8+
- Subscribe to `PY_UNWIND`, `PY_YIELD`, `PY_RESUME`, and `PY_THROW` so the recorder observes every frame transition that affects the logical call stack.
9+
- Emit `TraceWriter::register_call` for resume/throw events and `TraceWriter::register_return` for yield/unwind events, keeping the on-disk trace strictly balanced without changing the format.
10+
- Preserve existing policies (filters, activation gating, IO capture, error handling) for the newly supported events.
11+
- Ship regression tests that prove generators, coroutines, and exception unwinds no longer leave dangling activations.
12+
13+
## Non-Goals
14+
- No new trace record kinds or schema updates—the file format will continue to expose only call/return/line records.
15+
- No STOP_ITERATION or async-step support in this iteration; we only handle the four events required for stack balancing.
16+
- No new Python-facing APIs or CLI flags; behaviour changes remain internal to the recorder.
17+
18+
## Current Gaps
19+
- `RuntimeTracer::interest` (`src/runtime/tracer/events.rs`) unions only `PY_START`, `LINE`, and `PY_RETURN`, so CPython never calls our callbacks for yield/resume/unwind/throw events.
20+
- Callback implementations for `on_py_resume`, `on_py_yield`, `on_py_throw`, and `on_py_unwind` defer to the default trait methods, meaning no trace entries are emitted when those events fire.
21+
- Activation gating (`ActivationController`) relies exclusively on `PY_RETURN` notifications to detect when the activation function exits and therefore never observes unwinds, while yields would prematurely end tracing if we naïvely treated them as "final" returns.
22+
- Tests (`tests/python/test_monitoring_events.py`) only assert start/return pairs for direct calls, leaving generators and exception flows unverified.
23+
24+
## Workstreams
25+
26+
### WS1 – Monitoring Mask & Callback Wiring
27+
**Scope:** Ensure the tracer actually receives the four additional events.
28+
- Update `RuntimeTracer::interest` to include `PY_YIELD`, `PY_UNWIND`, `PY_RESUME`, and `PY_THROW` alongside the existing mask.
29+
- Add documentation in `design-docs/design-001.md` detailing which `sys.monitoring` events map to `TraceWriter::register_call` vs `register_return`.
30+
- Confirm `install_tracer` reuses the updated `EventSet`, and add assertions in the Rust unit tests that `interest` toggles the correct bits.
31+
- Exit criteria: installing the tracer registers six Python events (start, resume, throw, return, yield, unwind) plus line.
32+
33+
### WS2 – Call/Return Edge Helpers
34+
**Scope:** Share logic for emitting structural trace events across multiple callbacks.
35+
- Introduce helper methods inside `RuntimeTracer` (e.g., `emit_call_edge(kind, code)` and `emit_return_edge(kind, code, payload)`) that encapsulate:
36+
- Activation gating (`should_process_event` + `should_trace_code`).
37+
- Filter lookups, telemetry handles, and IO flushes.
38+
- `TraceWriter::register_call`/`register_return` invocations, including empty argument lists for resume/throw, and descriptive labels (`"<yield>"`, `"<unwind>"`) for value redaction bookkeeping.
39+
- Reuse existing `capture_call_arguments` only for `PY_START`. `PY_RESUME` will emit an empty argument vector because the callback does not expose the `send()` value, while `PY_THROW` should wrap the provided exception object into a synthetic argument named `exception` and encode it with the existing value encoder (no new types) so the resumed activation records what was injected (per <https://docs.python.org/3/library/sys.monitoring.html#monitoring-event-PY_THROW>).
40+
- For `PY_THROW`, treat the incoming exception object as both (a) the return payload of the previous activation (because the generator exits up-stack) and (b) metadata recorded during the resumed activation? To keep semantics simple, treat `PY_THROW` strictly as a resume-side call edge with no immediate value capture.
41+
- Ensure all return-like paths (`PY_RETURN`, `PY_UNWIND`, `PY_YIELD`) call `flush_pending_io()` before writing the record to keep streamed IO aligned with frame exits.
42+
- Track event source in debug logs (`log_event`) so we can distinguish `on_py_return` vs `on_py_yield`, aiding support diagnostics.
43+
44+
### WS3 – Activation & Lifecycle Behaviour
45+
**Scope:** Maintain activation gating correctness while adding new structural events.
46+
- Extend `ActivationController` with a notion of "suspension" so `PY_YIELD` does **not** mark the activation as completed, while `PY_RETURN` and `PY_UNWIND` still shut it down. A simple approach is to thread an `ActivationExitKind` enum through `handle_return_event`.
47+
- When a generator resumes (`PY_RESUME`/`PY_THROW`), ensure `should_process_event` continues to consult the activation state so suspended activations can keep tracing once re-entered but completed activations remain off.
48+
- Confirm lifecycle flags (`mark_event`, `mark_failure`) are triggered for every emitted call/return and that `TraceWriter::finish_writing_trace_events` runs with a clean stack even when recording stops during an unwind.
49+
- Update any telemetry counters or metadata summaries that assumed a 1:1 relationship between `PY_START` / `PY_RETURN`.
50+
51+
### WS4 – Testing & Validation
52+
**Scope:** Prove generators, coroutines, and exceptions emit balanced traces.
53+
- Python tests:
54+
- Add generator that yields twice and is resumed via `send()` and `throw()`. Assert the low-level trace (via `codetracer_python_recorder.trace`) contains matching call/return counts and that resumption after `throw()` still emits a call edge.
55+
- Add coroutine/async def test that awaits, ensuring `PY_YIELD` semantics also cover `await`.
56+
- Add exception unwinding test where a function raises; assert the trace file closes the activation (no dangling frames).
57+
- Rust tests:
58+
- Enhance `tests/rust/print_tracer.rs` to count the new events when the integration feature is enabled, validating callback registration and error handling.
59+
- Unit-test the new helper functions with synthetic `CodeObjectWrapper` fixtures to ensure `emit_call_edge` emits empty argument vectors for resume events.
60+
- Manual verification:
61+
- Run `just test` plus a targeted script that dumps the generated `.trace` file to check for alternating call/return records even when generators are partially consumed.
62+
- Capture performance samples on generator-heavy code to ensure callback overhead stays within existing thresholds (<10% regression).
63+
64+
## Testing & Rollout Checklist
65+
- [ ] `cargo test` (workspace)
66+
- [ ] `just test` to exercise Python + Rust suites
67+
- [ ] Ensure new Python tests run under CPython 3.12 and 3.13 in CI
68+
- [ ] Validate that traces recorded from `examples/generator.py` (add if needed) now contain balanced call counts (e.g., via a small verification script)
69+
- [ ] Update release notes / changelog entry describing the new event coverage
70+
71+
## Risks & Mitigations
72+
- **Double-emission of call/return pairs:** Mitigate via targeted tests that assert the writer stack depth never drops below zero and ends at zero after sample programs.
73+
- **Activation path regressions:** `PY_YIELD` handling must not prematurely deactivate tracing; add regression tests that set `CODETRACER_ACTIVATION_PATH` to a generator and ensure tracing continues across resumes.
74+
- **Unhandled payload types:** `PY_UNWIND` carries an exception object that might not be JSON serialisable when value capture is disabled. Guard `record_return_value` with the existing redaction policy and log errors before surfacing them to Python.
75+
- **Performance overhead:** Monitor benchmarks once the feature lands. We expect only a handful of extra callbacks; if regression >10%, consider making resume/throw value capture lazier or batching events.

0 commit comments

Comments
 (0)