| 1 | +# File-Level Single Responsibility Refactor Plan |
| 2 | + |
| 3 | +## Goals |
| 4 | +- Reshape the Rust crate and Python support package so that every source file encapsulates a single cohesive topic. |
| 5 | +- Reduce the amount of ad-hoc cross-module knowledge currently required to understand tracing start-up, event handling, and encoding logic. |
| 6 | +- Preserve the public Python API and Rust crate interfaces during the refactor to avoid disruptions for downstream tooling. |
| 7 | + |
| 8 | +## Current State Observations |
| 9 | +- `src/lib.rs` is responsible for PyO3 module registration, lifecycle management for tracing sessions, global logging initialisation, and runtime format selection, which mixes unrelated concerns in one file. |
| 10 | +- `src/runtime_tracer.rs` couples trace lifecycle control, activation toggling, and Python value encoding in a single module, making it difficult to unit test or substitute individual pieces. |
| 11 | +- `src/tracer.rs` combines the `Tracer` trait definition, sys.monitoring shims, callback registration utilities, and thread-safe storage, meaning small changes can ripple through unrelated logic. |
| 12 | +- `codetracer_python_recorder/api.py` interleaves environment-based auto-start, context-manager ergonomics, backend state management, and format constants, leaving no clearly isolated entry point for CLI or library callers. |
| 13 | + |
| 14 | +## Target Rust Module Layout |
| 15 | +| Topic | Target file | Notes | |
| 16 | +| --- | --- | --- | |
| 17 | +| PyO3 module definition & re-exports | `src/lib.rs` | Limit to module wiring plus `pub use` statements. |
| 18 | +| Global logging defaults | `src/logging.rs` | Provide a helper that configures `env_logger` defaults, reused by both `lib.rs` and tests. |
| 19 | +| Tracing session lifecycle (`start_tracing`, `stop_tracing`, `flush_tracing`, `is_tracing`) | `src/session.rs` | Own global `ACTIVE` flag and filesystem validation. |
| 20 | +| Runtime tracer orchestration (activation gating, writer plumbing) | `src/runtime/mod.rs` | Public `RuntimeTracer` facade constructed by session. |
| 21 | +| Value encoding helpers | `src/runtime/value_encoder.rs` | Convert Python objects into `runtime_tracing::ValueRecord` values; unit test in isolation. |
| 22 | +| Activation management (start-on-enter logic) | `src/runtime/activation.rs` | Encapsulate `activation_path`, `activation_code_id`, and toggling state. |
| 23 | +| Writer initialisation and file path selection | `src/runtime/output_paths.rs` | Determine file names for the JSON and binary formats and wrap `TraceWriter` begin/finish. |
| 24 | +| sys.monitoring integration utilities | `src/monitoring/mod.rs` | Provide `ToolId`, `EventId`, `MonitoringEvents`, `set_events`, etc. |
| 25 | +| Tracer trait & callback dispatch | `src/monitoring/tracer.rs` | Define `Tracer` trait and per-event callbacks; depend on `monitoring::events`. |
| 26 | +| Code object caching | `src/code_object.rs` | Remains focused on caching; consider relocating question comments to doc tests. |
| 27 | + |
| 28 | +The `runtime` and `monitoring` modules become directories with focused submodules, while `session.rs` consumes them via narrow interfaces. Any PyO3 FFI helper functions should live close to their domain (e.g., frame locals helpers inside `runtime/mod.rs`). |
| 29 | + |
| 30 | +## Target Python Package Layout |
| 31 | +| Topic | Target file | Notes | |
| 32 | +| --- | --- | --- | |
| 33 | +| Public API surface (`start`, `stop`, `is_tracing`, constants) | `codetracer_python_recorder/api.py` | Keep the public signatures unchanged; delegate to new helpers. |
| 34 | +| Session handle implementation | `codetracer_python_recorder/session.py` | Own `TraceSession` class and backend delegation logic. |
| 35 | +| Auto-start via environment variables | `codetracer_python_recorder/auto_start.py` | Move `_auto_start_from_env` and constants needed only for boot-time configuration. |
| 36 | +| Format constants & validation | `codetracer_python_recorder/formats.py` | Define `TRACE_BINARY`, `TRACE_JSON`, `DEFAULT_FORMAT`, and any helpers to negotiate format strings. |
| 37 | +| Module-level `__init__` exports | `codetracer_python_recorder/__init__.py` | Re-export the API and trigger optional auto-start. |
| 38 | + |
| 39 | +Splitting the Python helper package along these lines isolates side-effectful auto-start logic from the plain API and simplifies targeted testing. |
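
As a rough illustration of the `formats.py` row above, the sketch below shows one way the constants and a negotiation helper could sit together. The string values, the choice of default, and the helper name `normalize_format` are assumptions made for this example and are not taken from the existing package.

```python
# Hypothetical sketch of codetracer_python_recorder/formats.py.
# The string values and the default below are assumptions, not the real ones.
TRACE_BINARY = "binary"
TRACE_JSON = "json"
DEFAULT_FORMAT = TRACE_BINARY

SUPPORTED_FORMATS = frozenset({TRACE_BINARY, TRACE_JSON})


def normalize_format(value=None):
    """Return a supported format string; None falls back to DEFAULT_FORMAT."""
    if value is None:
        return DEFAULT_FORMAT
    # Accept loosely formatted input such as " JSON ".
    candidate = str(value).strip().lower()
    if candidate not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported trace format: {value!r}")
    return candidate
```

Keeping the validation in one pure function means `api.py` and `auto_start.py` could both call it without importing each other.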
| 40 | + |
| 41 | +## Implementation Roadmap |
| 42 | + |
| 43 | +1. **Stabilise tests and build scripts** |
| 44 | + - Ensure `just test` passes to establish a green baseline. |
| 45 | + - Capture benchmarks or representative trace outputs to validate parity later. |
| 46 | + |
| 47 | +2. **Introduce foundational Rust modules (serial)** |
| 48 | + - Extract logging initialisation into `logging.rs` and update `lib.rs` to call the helper. |
| 49 | + - Move session lifecycle logic from `lib.rs` into a new `session.rs`, keeping function signatures untouched and re-exporting via `lib.rs`. |
| 50 | + - Update module declarations and adjust imports; verify tests. |
| 51 | + |
| 52 | +3. **Restructure runtime tracer internals (can parallelise subtasks)** |
| 53 | + - Create `src/runtime/mod.rs` as a façade exposing `RuntimeTracer`. |
| 54 | + - **Task 3A (Team A)**: Extract activation control into `runtime/activation.rs`, exposing a small struct consumed by the tracer. |
| 55 | + - **Task 3B (Team B)**: Extract value encoding routines into `runtime/value_encoder.rs`, providing unit tests and benchmarks. |
| 56 | + - **Task 3C (Team C)**: Introduce `runtime/output_paths.rs` to encapsulate format-to-filename mapping and writer initialisation. |
| 57 | + - Integrate submodules back into `runtime/mod.rs` sequentially once individual tasks are complete; resolve merge conflicts around struct fields. |
| 58 | + |
| 59 | +4. **Modularise sys.monitoring glue (partially parallel)** |
| 60 | + - Add `monitoring/mod.rs` hosting shared types (`EventId`, `EventSet`, `ToolId`). |
| 61 | + - Split trait and dispatcher logic into `monitoring/tracer.rs`; keep callback registration helpers near the sys.monitoring bindings. |
| 62 | + - **Task 4A (Team A)**: Port OnceLock caches and registration helpers. |
| 63 | + - **Task 4B (Team B)**: Move `Tracer` trait definition and default implementations, updating call sites in runtime tracer and tests. |
| 64 | + |
| 65 | +5. **Python package decomposition (parallel with Step 4 once Step 2 is merged)** |
| 66 | + - Create `session.py`, `formats.py`, and `auto_start.py` with extracted logic. |
| 67 | + - Update `api.py` to delegate to the new modules but maintain backward-compatible imports. |
| 68 | + - Adjust `__init__.py` to import from `api` and trigger optional auto-start via the new helper. |
| 69 | + - Update Python tests and examples to use the reorganised structure. |
| 70 | + |
| 71 | +6. **Clean-up and follow-up tasks** |
| 72 | + - Remove obsolete comments (e.g., `//TODO AI!` placeholders) or move them into GitHub issues. |
| 73 | + - Update documentation and diagrams to reflect the new module tree. |
| 74 | + - Re-run `just test` and linting for both Rust and Python components; capture trace artifacts to confirm unchanged output format. |
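
For Step 5, the auto-start helper could be written so that it never imports the backend directly. The following is a minimal sketch under stated assumptions: the environment variable names, the `format` keyword, and the `"binary"` fallback are hypothetical, and the real `start` signature may differ.

```python
import os

# Hypothetical variable names; the real package may use different ones.
ENV_TRACE_PATH = "CODETRACER_TRACE"
ENV_TRACE_FORMAT = "CODETRACER_FORMAT"


def auto_start_from_env(start, environ=None):
    """Call start(path, format=...) when the environment requests tracing.

    `start` is injected so the helper can be exercised without the Rust
    backend. Returns whatever `start` returns, or None when no trace path
    is configured.
    """
    env = os.environ if environ is None else environ
    path = env.get(ENV_TRACE_PATH)
    if not path:
        return None
    # Assumed fallback format; the real default lives with the format constants.
    fmt = env.get(ENV_TRACE_FORMAT, "binary")
    return start(path, format=fmt)
```

With this shape, `__init__.py` would only need to pass the public `start` function in, and tests can pass a stub plus a plain dictionary instead of mutating `os.environ`.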
| 75 | + |
| 76 | +## Parallelisation Notes |
| 77 | +- Step 2 touches the global entry points and should complete before deeper refactors to minimise rebasing pain. |
| 78 | +- Step 3 subtasks (activation, value encoding, output paths) operate on distinct sections of the existing `RuntimeTracer`; they can be implemented in parallel once `runtime/mod.rs` scaffolding exists. |
| 79 | +- Step 4's subtasks can proceed concurrently with Step 3 once the new `monitoring` module is introduced; teams should coordinate on shared types but work on separate files. |
| 80 | +- Step 5 (Python package) depends on Step 2 so that backend entry points remain stable; it can overlap with late Step 3/4 work because it touches only the Python tree. |
| 81 | +- Documentation updates and clean-up in Step 6 can be distributed among contributors after core refactors merge. |
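
The ordering constraints in these notes can be checked mechanically. The sketch below encodes one reading of them as a dependency graph and groups the steps into waves that could run concurrently. The step labels are made up for the example, and the graph is a simplification: it ignores the scaffolding prerequisites inside Steps 3 and 4.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps that must merge before it can start.
# This is one interpretation of the notes, not an authoritative schedule.
DEPENDS_ON = {
    "step1": set(),
    "step2": {"step1"},
    "step3": {"step2"},
    "step4": {"step2"},
    "step5": {"step2"},
    "step6": {"step3", "step4", "step5"},
}


def parallel_waves(depends_on):
    """Group steps into waves; steps within one wave have no mutual dependency."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the plan is circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Under this reading, Steps 3, 4, and 5 land in the same wave, which matches the overlap described above.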
| 82 | + |
| 83 | +## Testing & Verification Strategy |
| 84 | +- Maintain existing integration and unit tests; add focused tests for newly separated modules (e.g., pure Rust tests for `value_encoder` conversions). |
| 85 | +- Extend Python tests to cover environment auto-start logic now that it lives in its own module. |
| 86 | +- For each phase, compare generated trace files against baseline fixtures to guarantee no behavioural regressions. |
| 87 | +- Require code review sign-off from domain owners for each phase to ensure the single-responsibility intent is preserved. |
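
The baseline comparison mentioned above could start from a small helper like the one below. It is a sketch, not the project's test harness: the function name is hypothetical, and a byte-for-byte comparison assumes trace output is deterministic. If traces embed timestamps or absolute paths, a normalisation pass would be needed first.

```python
from pathlib import Path


def diff_trace_dirs(baseline, candidate):
    """Return sorted relative paths that differ or exist on only one side."""
    baseline, candidate = Path(baseline), Path(candidate)

    def snapshot(root):
        # Map each file's path (relative to root) to its raw bytes.
        return {
            p.relative_to(root).as_posix(): p.read_bytes()
            for p in root.rglob("*")
            if p.is_file()
        }

    old, new = snapshot(baseline), snapshot(candidate)
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

An empty result means the candidate directory matches the baseline fixture exactly, so a per-phase check can assert `diff_trace_dirs(fixture_dir, output_dir) == []`.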