@tzanko-matev
Contributor

  • We hook to the PY_START, PY_RETURN and LINE events and can record a simple trace.
  • We can run the module on a script like so:
python -m codetracer_python_recorder script.py

Summary of changes

- Implemented a Rust-backed tracer using runtime_tracing.
- Added a concrete implementation of the Tracer trait from codetracer-python-recorder/src/tracer.rs.
- Wired the runtime tracer into the public Python API (start/stop/flush).

Key details

- New tracer implementation: src/runtime_tracer.rs
    - Struct RuntimeTracer backed by runtime_tracing::NonStreamingTraceWriter.
    - Maps selected sys.monitoring events (CALL, LINE, PY_RETURN) to runtime_tracing:
        - CALL: registers function and call (without capturing full arg lists yet).
        - LINE: registers a Step.
        - PY_RETURN: registers return value (optionally captured as ValueRecord).
    - Minimal value encoder for None, bool, int, str; falls back to Raw string for others.
    - Begins writing metadata, paths, and events on start and flushes them on finish.
    - Exposes helper derive_sidecar_paths(events_path) -> (metadata.json, paths.json).
- Extended Tracer trait: src/tracer.rs
    - Now Tracer: Send + Any (to safely store in static Mutex).
    - Added downcasting support: fn as_any(&mut self) for optional future use.
    - Added default lifecycle hooks:
        - fn flush(&mut self, _py) -> PyResult<()> { Ok(()) }
        - fn finish(&mut self, _py) -> PyResult<()> { Ok(()) }
    - Added flush_installed_tracer(py) to flush current tracer without uninstalling.
    - uninstall_tracer(py) now calls tracer.finish(py) before unhooking callbacks.
- Python API integration: src/lib.rs
    - start_tracing(path, format, capture_values, source_roots)
        - Prevents double-start via ACTIVE flag.
        - Creates RuntimeTracer, derives output file names:
            - events: path (user-provided)
            - metadata: path with extension metadata.json
            - paths: path with extension paths.json
        - For simplicity and to remain object-safe in this environment, “binary” maps to the non-streaming BinaryV0 writer. JSON is supported as json.
        - Installs tracer via sys.monitoring and flips ACTIVE to true.
    - stop_tracing()
        - Uninstalls the tracer (which calls finish on the tracer) and sets ACTIVE false.
    - flush_tracing()
        - Calls flush_installed_tracer(py). For non-streaming formats this writes events to the file; for streaming formats (not used here) this would be a no-op by design.
    - is_tracing(): returns ACTIVE.
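
For reviewers who want the lifecycle in one place, here is a rough sketch of how the pieces above could fit together. It is deliberately simplified and is not the code in this PR: the real hooks take a PyO3 `py` token and return PyResult<()>, while this sketch uses plain Result<(), String> so it compiles without PyO3. INSTALLED, NoopTracer, RecordingTracer, and the return type of as_any are assumptions made for illustration.

```rust
use std::any::Any;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Simplified stand-in for the Tracer trait (no PyO3 types).
/// `Send + Any` lets a boxed tracer live in a static Mutex and be downcast.
pub trait Tracer: Send + Any {
    /// Return type is an assumption; the PR only names `fn as_any(&mut self)`.
    fn as_any(&mut self) -> &mut dyn Any;
    /// Default no-op hooks, so existing tracers need no changes.
    fn flush(&mut self) -> Result<(), String> { Ok(()) }
    fn finish(&mut self) -> Result<(), String> { Ok(()) }
}

// Hypothetical global slot for the installed tracer, plus the ACTIVE flag.
static INSTALLED: Mutex<Option<Box<dyn Tracer>>> = Mutex::new(None);
static ACTIVE: AtomicBool = AtomicBool::new(false);

/// Tracer that relies entirely on the default hooks.
pub struct NoopTracer;
impl Tracer for NoopTracer {
    fn as_any(&mut self) -> &mut dyn Any { self }
}

/// Tracer that records which lifecycle hooks were called.
pub struct RecordingTracer { pub log: Arc<Mutex<Vec<&'static str>>> }
impl Tracer for RecordingTracer {
    fn as_any(&mut self) -> &mut dyn Any { self }
    fn flush(&mut self) -> Result<(), String> {
        self.log.lock().unwrap().push("flush");
        Ok(())
    }
    fn finish(&mut self) -> Result<(), String> {
        self.log.lock().unwrap().push("finish");
        Ok(())
    }
}

pub fn start_tracing(tracer: Box<dyn Tracer>) -> Result<(), String> {
    // swap returns the previous value: `true` means tracing was already on.
    if ACTIVE.swap(true, Ordering::SeqCst) {
        return Err("tracing is already active".to_string());
    }
    *INSTALLED.lock().unwrap() = Some(tracer);
    Ok(())
}

pub fn flush_tracing() -> Result<(), String> {
    // Flush without uninstalling; a no-op when nothing is installed.
    let mut guard = INSTALLED.lock().unwrap();
    if let Some(tracer) = guard.as_mut() {
        tracer.flush()?;
    }
    Ok(())
}

pub fn stop_tracing() -> Result<(), String> {
    let mut guard = INSTALLED.lock().unwrap();
    // Call finish before the tracer is removed, mirroring uninstall_tracer.
    if let Some(tracer) = guard.as_mut() {
        tracer.finish()?;
    }
    *guard = None;
    ACTIVE.store(false, Ordering::SeqCst);
    Ok(())
}

pub fn is_tracing() -> bool {
    ACTIVE.load(Ordering::SeqCst)
}
```

Because flush and finish have default bodies, a tracer such as NoopTracer only needs to supply as_any, which is the same reason the existing test tracers keep compiling.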

Behavioral notes

- Formats:
    - json: uses JSON non-streaming writer.
    - binary: mapped to BinaryV0 non-streaming writer in this implementation to avoid relying on the private streaming writer module and to keep the tracer object Send-safe.
- Sidecar files:
    - metadata: .metadata.json
    - paths: .paths.json
    - events: the path passed to start_tracing (user-provided)
- Values capturing:
    - Optional via capture_values flag. Basic types (None, bool, int, str) are handled; all others fall back to Raw.
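
To make the sidecar naming and the value fallback concrete, here is a small sketch under stated assumptions. derive_sidecar_paths is the helper named above, but its body here is a guess based on "path with extension metadata.json"; PyValue and Encoded are hypothetical stand-ins for a Python object and for the recorded value type, and are not the runtime_tracing API. Python ints are arbitrary precision; i64 is used only to keep the sketch short.

```rust
use std::path::{Path, PathBuf};

/// Derive the metadata and paths sidecar names from the events path,
/// e.g. trace.json -> (trace.metadata.json, trace.paths.json).
pub fn derive_sidecar_paths(events_path: &Path) -> (PathBuf, PathBuf) {
    (
        events_path.with_extension("metadata.json"),
        events_path.with_extension("paths.json"),
    )
}

/// Hypothetical stand-in for a Python object seen by the recorder.
pub enum PyValue {
    None,
    Bool(bool),
    Int(i64),
    Str(String),
    /// Any other type, carried here as its textual representation.
    Other { repr: String },
}

/// Hypothetical stand-in for the recorded value.
#[derive(Debug, PartialEq)]
pub enum Encoded {
    None,
    Bool(bool),
    Int(i64),
    Str(String),
    Raw(String),
}

/// Encode the basic types directly; everything else falls back to Raw.
pub fn encode_value(value: &PyValue) -> Encoded {
    match value {
        PyValue::None => Encoded::None,
        PyValue::Bool(b) => Encoded::Bool(*b),
        PyValue::Int(i) => Encoded::Int(*i),
        PyValue::Str(s) => Encoded::Str(s.clone()),
        PyValue::Other { repr } => Encoded::Raw(repr.clone()),
    }
}
```

With this reading, an events path of trace.json yields trace.metadata.json and trace.paths.json, which matches the file names listed in the commits below.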

What I didn’t change

- Existing tests and their tracer implementations (PrintTracer, CountingTracer) remain compatible because all new Tracer methods are default no-ops.

Next steps (optional)

- Expand event interest set (e.g., exceptions, C_RETURN/C_RAISE) to record richer traces.
- Enhance value capture and variable bindings for arguments and locals (requires more Python-level context).
- Consider supporting streaming binary output once a public API for the streaming writer is exposed (or if constraints allow depending on runtime_tracing crate updates).

Commands to run

- Building and testing rely on your environment’s Python/PyO3 toolchain. The repo’s recommended way:
    - just venv 3.13 dev
    - just test

This implementation hooks into sys.monitoring, records with runtime_tracing, and exposes start/stop/flush in the Python module, keeping the code defensive, testable, and focused on the requested trait implementation.
Signed-off-by: Tzanko Matev <[email protected]>

Activate tracing on script entry
codetracer-python-recorder/codetracer_python_recorder/__main__.py: 
codetracer-python-recorder/codetracer_python_recorder/api.py: 
codetracer-python-recorder/src/lib.rs: 
codetracer-python-recorder/src/runtime_tracer.rs: 
trace.json: 
trace.paths.json: 

Signed-off-by: Tzanko Matev <[email protected]>

Only trace files in a whitelist (experiment)
codetracer-python-recorder/codetracer_python_recorder/__main__.py: 
codetracer-python-recorder/src/lib.rs: 
codetracer-python-recorder/src/runtime_tracer.rs: 

Signed-off-by: Tzanko Matev <[email protected]>
Base automatically changed from codetype-interface to main September 15, 2025 13:11
…rgs] script.py [script args]`

codetracer-python-recorder/codetracer_python_recorder/__init__.py: 
codetracer-python-recorder/codetracer_python_recorder/__main__.py: 
hello.py: 
trace.json: 
trace.metadata.json: 
trace.paths.json: 

Signed-off-by: Tzanko Matev <[email protected]>
@tzanko-matev
Contributor Author

Fixed the issues. Will merge.

@tzanko-matev tzanko-matev merged commit 1cceecc into main Sep 15, 2025
2 checks passed
@tzanko-matev tzanko-matev deleted the basic-tracer branch September 15, 2025 13:28

3 participants