Commit d079719

Enterprise-grade error handling and logging (#45)
Tried to centralize and manage error handling and logging. Here's the status of the task. See the README.md for a better description of the new features.

## WS1 – Foundations & Inventory

State: In progress

Tooling: `just errors-audit` (finds `PyRuntimeError::new_err`, `unwrap`/`expect`/`panic!`, Python `RuntimeError`/`ValueError`).

What we saw:
- Rust modules now emit `RecorderError`; raw Python exceptions survive in `codetracer_python_recorder/session.py` and tests (ISSUE-014).
- `src/monitoring/tracer.rs` still uses `lock().unwrap()` and lacks error reporting for callback failures (ISSUE-013).
- Python glue keeps legacy assertions/unwraps (ISSUE-012).

Next moves:
- Land ISSUE-013 to sort the locking story.
- Plan the Python facade cleanup (ISSUE-014) once WS4 is steady.

## WS2 – `recorder-errors` Crate

State: Done (2025-10-02)

Highlights:
- Added `crates/recorder-errors` with `RecorderError`, enums, context helpers, macros (`usage!`, `enverr!`, `target!`, `bug!`, `ensure_*`), plus tests and optional serde support.
- `cargo test -p recorder-errors` + workspace `cargo check` stay green.

Next moves: Use this crate everywhere in WS3/WS4 work.

## WS3 – Retrofit Rust Modules

State: Done (2025-10-02)

Highlights:
- `session/*`, `runtime/*`, and `monitoring/tracer.rs` now return `RecorderError` via the shared macros.
- Python exposure happens through one `errors` mapper; IO errors now carry context.
- No stray `PyRuntimeError::new_err` left outside that mapper.

Next moves: Feed findings into WS4 and loop back to WS1 issues.

## WS4 – FFI Wrapper & Python Exception Hierarchy

State: Done (2025-10-02)

Highlights:
- Added `ffi` guard around each PyO3 entry point to map `RecorderError` plus panic safety.
- Exposed Python classes `RecorderError`, `UsageError`, `EnvironmentError`, `TargetError`, `InternalError`.
- Rust and Python tests cover the new flow (`uv run cargo nextest run ...`; `.venv/bin/python -m pytest ...`).

Next moves: Hold for WS5 until ISSUES 013/014 close.
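To make the WS4 hierarchy concrete, here is a minimal pure-Python stand-in for the five exposed class names and one way a caller might branch on them. It is a sketch only: the real classes come from the native module, and the `error_code` attribute, its default value, and the `describe`/`run_with_handling` helpers are assumptions made for illustration.

```python
"""Illustrative stand-in for the exception classes named in WS4."""


class RecorderError(Exception):
    """Base class. `error_code` is an assumed attribute for this sketch."""

    def __init__(self, message: str, error_code: str = "ERR_UNKNOWN") -> None:
        super().__init__(message)
        self.error_code = error_code


class UsageError(RecorderError):
    pass


# Note: inside this module the name shadows Python's builtin
# `EnvironmentError`, which is an alias of `OSError`.
class EnvironmentError(RecorderError):
    pass


class TargetError(RecorderError):
    pass


class InternalError(RecorderError):
    pass


def describe(exc: RecorderError) -> str:
    """Format an error with its class name and (assumed) error code."""
    return f"{type(exc).__name__}[{exc.error_code}]: {exc}"


def run_with_handling(action) -> str:
    """Call `action` and classify the outcome by exception class."""
    try:
        action()
    except UsageError as exc:
        # Most specific handler first; the base class catches the rest.
        return "fix-invocation: " + describe(exc)
    except RecorderError as exc:
        return "recorder-failure: " + describe(exc)
    return "ok"
```

Catching the base class last lets callers treat every recorder failure uniformly while still special-casing the subclasses they care about.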
## WS5 – Policy Switches & Runtime Configuration

State: Done (2025-10-03)

Highlights:
- `TraceSession.start()` and `trace()` now refresh policy from env vars and accept override mappings so embeds wire recorder switches without manual plumbing.
- Rust exports expose `configure_policy`/`configure_policy_from_env` under the expected Python names; unit tests cover env-driven and explicit override flows.
- Runtime tracer finish path honours `RecorderPolicy`: callback errors respect `on_recorder_error` (disable detaches without surfacing exceptions), `require_trace` now fails cleanly when no events land, and partial traces are deleted or retained based on `keep_partial_trace`.
- Python CLI integration tests exercise disable vs abort paths and require-trace enforcement using the new failure-injection toggles; CLI now propagates runtime shutdown errors so exit codes reflect policy outcomes while partial traces are cleaned per configuration.

Next moves: Kick off WS6 once upstream WS1 cleanups land.

## WS6 – Logging, Metrics, and Diagnostics

State: Done (2025-10-03)

Highlights:
- Replaced the `env_logger` helper with a structured JSON logger that always emits `run_id`, active `trace_id`, and `error_code` fields while honouring policy-driven log level and log file overrides.
- Introduced a pluggable `RecorderMetrics` sink and instrumented dropped locations, policy-triggered detachments, and caught panics across the monitoring/runtime paths; Rust unit tests exercise the metrics capture.
- Enabled the `--json-errors` policy path so runtime shutdown emits a single-line JSON trailer on stderr; CLI integration tests now assert the abort flow surfaces the trailer alongside existing stack traces.

Next moves: Wire the metrics sink into the chosen exporter and align the log schema with Observability consumption before rolling out to downstream tooling.
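The WS5 flow of "refresh policy from env vars, then apply explicit overrides" can be sketched roughly as follows. This is not the recorder's implementation: the `EXAMPLE_RECORDER_` variable prefix, the default values, the accepted boolean spellings, and the helper names are all assumptions. Only the switch names `on_recorder_error`, `require_trace`, and `keep_partial_trace` come from the notes above.

```python
from dataclasses import dataclass, replace
from typing import Mapping, Optional

# Accepted spellings are an assumption of this sketch.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}

# Hypothetical prefix; the real environment variable names are not shown here.
ENV_PREFIX = "EXAMPLE_RECORDER_"


@dataclass(frozen=True)
class Policy:
    on_recorder_error: str = "abort"  # assumed default; "disable" is the alternative
    require_trace: bool = False       # assumed default
    keep_partial_trace: bool = False  # assumed default


def parse_bool(name: str, raw: str) -> bool:
    """Parse a boolean switch, rejecting anything unrecognised."""
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name}: invalid boolean {raw!r}")


def policy_from_env(
    env: Mapping[str, str],
    overrides: Optional[Mapping[str, object]] = None,
) -> Policy:
    """Build a policy from environment values, then apply explicit overrides."""
    policy = Policy()
    mode_key = ENV_PREFIX + "ON_RECORDER_ERROR"
    if mode_key in env:
        mode = env[mode_key].strip().lower()
        if mode not in ("abort", "disable"):
            raise ValueError(f"{mode_key}: expected 'abort' or 'disable'")
        policy = replace(policy, on_recorder_error=mode)
    for field in ("require_trace", "keep_partial_trace"):
        key = ENV_PREFIX + field.upper()
        if key in env:
            policy = replace(policy, **{field: parse_bool(key, env[key])})
    if overrides:
        # Explicit overrides win over environment values.
        policy = replace(policy, **overrides)
    return policy


def partial_trace_action(policy: Policy) -> str:
    """Decide what happens to a partial trace after a failed run."""
    return "retain" if policy.keep_partial_trace else "delete"
```

Passing the environment in as a mapping, instead of reading `os.environ` directly, keeps the function easy to unit test for both env-driven and override-driven flows.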
## WS7 – Test Coverage & Tooling Enforcement

State: Done (2025-10-04)

Highlights:
- Expanded `recorder-errors` and policy unit tests covering every macro (usage/target/internal ensures) plus invalid boolean parsing.
- Added FFI unit tests for `dispatch`/`wrap_pyfunction`, panic containment, and Python exception attribute propagation.
- Introduced integration coverage for environment permission failures, injected target argument capture errors, and synthetic callback panics (verifying JSON trailers and error classes).
- Implemented `just lint` orchestration running `cargo clippy -D clippy::panic` and a repository script that blocks unchecked `.unwrap(` usage outside the legacy allowlist.

Next moves: Monitor unwrap allowlist shrinkage once WS1 follow-ups land; evaluate extending the lint to `.expect(` once monitoring refactor closes.

## WS8 – Documentation & Rollout

State: Done (2025-10-05)

Highlights:
- README now covers the recorder error policy, JSON trailers, exit codes, and a short Python `RecorderError` catch example.
- Added `docs/onboarding/error-handling.md` with migration steps, policy wiring tips, and assertion rules for contributors.
- Started `codetracer-python-recorder/CHANGELOG.md` to brief downstream tools on consuming structured errors.

Next moves:
- Share the onboarding doc with downstream maintainers and collect gaps before promoting ADR 0004 to **Accepted**.
- Fold feedback into the change log before the next release tag.
2 parents 9db7e54 + f4f5981 commit d079719

48 files changed: +4390 -832 lines changed

.coverage

-52 KB
Binary file not shown.

.envrc

Lines changed: 3 additions & 1 deletion
@@ -1 +1,3 @@
-use flake
+watch_file nix/flake.nix
+watch_file nix/flake.lock
+use flake ./nix

.github/workflows/ci.yml

Lines changed: 7 additions & 7 deletions
@@ -23,16 +23,16 @@ jobs:
           extra_nix_config: |
             experimental-features = nix-command flakes
       - name: Prepare dev environment
-        run: nix develop --command bash -lc 'just venv ${{matrix.python-version}} dev'
+        run: nix develop ./nix --command bash -lc 'just venv ${{matrix.python-version}} dev'

       - name: Verify recorder version metadata
-        run: nix develop --command bash -lc 'python3 scripts/check_recorder_version.py'
+        run: nix develop ./nix --command bash -lc 'python3 scripts/check_recorder_version.py'

       - name: Rust tests
-        run: nix develop --command bash -lc 'just cargo-test'
+        run: nix develop ./nix --command bash -lc 'just cargo-test'

       - name: Python tests
-        run: nix develop --command bash -lc 'just py-test'
+        run: nix develop ./nix --command bash -lc 'just py-test'

   coverage:
     name: Coverage (Python 3.12)
@@ -52,17 +52,17 @@ jobs:
             experimental-features = nix-command flakes

       - name: Prepare dev environment (Python 3.12)
-        run: nix develop --command bash -lc 'just venv 3.12 dev'
+        run: nix develop ./nix --command bash -lc 'just venv 3.12 dev'

       - name: Collect coverage
         id: coverage-run
-        run: nix develop --command bash -lc 'just coverage'
+        run: nix develop ./nix --command bash -lc 'just coverage'

       - name: Generate coverage comment
         if: steps.coverage-run.outcome == 'success'
         run: |
           ROOT="$(pwd)"
-          nix develop --command bash -lc "python3 codetracer-python-recorder/scripts/generate_coverage_comment.py \
+          nix develop ./nix --command bash -lc "python3 codetracer-python-recorder/scripts/generate_coverage_comment.py \
             --rust-summary codetracer-python-recorder/target/coverage/rust/summary.json \
             --python-json codetracer-python-recorder/target/coverage/python/coverage.json \
             --output codetracer-python-recorder/target/coverage/coverage-comment.md \

.github/workflows/maturin-generated.yaml

Lines changed: 0 additions & 224 deletions
This file was deleted.

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -7,6 +7,6 @@ build
 *~
 .idea/
 .cargo/
-
+.coverage
 **/*.egg-info/
 codetracer-python-recorder/codetracer_python_recorder/*.so

Justfile

Lines changed: 24 additions & 2 deletions
@@ -31,22 +31,44 @@ venv version=PYTHON_DEFAULT_VERSION:

 # Build the module in dev mode
 dev:
-    uv run --directory codetracer-python-recorder maturin develop --uv
+    uv run --directory codetracer-python-recorder maturin develop --uv --features integration-test

 # Run unit tests of dev build
 test: cargo-test py-test

 # Run Rust unit tests without default features to link Python C library
 cargo-test:
-    uv run cargo nextest run --manifest-path codetracer-python-recorder/Cargo.toml --no-default-features
+    uv run cargo nextest run --manifest-path codetracer-python-recorder/Cargo.toml --workspace --no-default-features

 py-test:
     uv run --group dev --group test pytest codetracer-python-recorder/tests/python codetracer-pure-python-recorder
+
+lint: lint-rust lint-errors
+
+lint-rust:
+    uv run cargo clippy --manifest-path codetracer-python-recorder/Cargo.toml --workspace --no-default-features -- -D clippy::panic
+
+lint-errors:
+    uv run python3 codetracer-python-recorder/scripts/lint_no_unwraps.py

 # Run tests only on the pure recorder
 test-pure:
     uv run --group dev --group test pytest codetracer-pure-python-recorder

+# Inspect ad-hoc error handling patterns across the Rust/Python recorder
+errors-audit:
+    @echo "== PyRuntimeError construction =="
+    @rg --color=never --no-heading -n "PyRuntimeError::new_err" codetracer-python-recorder/src codetracer-python-recorder/tests codetracer-python-recorder/codetracer_python_recorder || true
+    @echo
+    @echo "== unwrap()/expect()/panic! usage =="
+    @rg --color=never --no-heading -n "\\.unwrap\\(" codetracer-python-recorder/src || true
+    @rg --color=never --no-heading -n "\\.expect\\(" codetracer-python-recorder/src || true
+    @rg --color=never --no-heading -n "panic!" codetracer-python-recorder/src || true
+    @echo
+    @echo "== Python-side bare RuntimeError/ValueError =="
+    @rg --color=never --no-heading -n "raise RuntimeError" codetracer-python-recorder/codetracer_python_recorder || true
+    @rg --color=never --no-heading -n "raise ValueError" codetracer-python-recorder/codetracer_python_recorder || true
+
 # Generate combined coverage artefacts for both crates
 coverage:
     just coverage-rust
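The `lint-errors` recipe calls `scripts/lint_no_unwraps.py`, whose contents are not shown in this view. As a rough idea of what an allowlist-based `.unwrap(` check could look like, here is a minimal sketch. The function names, the allowlist format (a set of paths relative to the scanned root), and the decision to skip `//` comment lines are all assumptions, not the script's actual behaviour.

```python
import re
from pathlib import Path
from typing import List, Set, Tuple

# Matches `.unwrap(` but not `.unwrap_or(` or `.unwrap_or_default(`.
UNWRAP = re.compile(r"\.unwrap\(")


def find_unwraps(source: str) -> List[int]:
    """Return 1-based line numbers that contain `.unwrap(`.

    Lines that start with `//` are skipped; this is a simplification
    and does not handle block comments or string literals.
    """
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("//"):
            continue
        if UNWRAP.search(line):
            hits.append(number)
    return hits


def lint_tree(root: Path, allowlist: Set[str]) -> List[Tuple[str, int]]:
    """Scan every `*.rs` file under `root`, ignoring allowlisted paths."""
    violations = []
    for path in sorted(root.rglob("*.rs")):
        rel = path.relative_to(root).as_posix()
        if rel in allowlist:
            continue
        for number in find_unwraps(path.read_text(encoding="utf-8")):
            violations.append((rel, number))
    return violations


def main(root: Path, allowlist: Set[str]) -> int:
    """Print violations and return a process-style exit code."""
    violations = lint_tree(root, allowlist)
    for rel, number in violations:
        print(f"{rel}:{number}: unchecked .unwrap( outside allowlist")
    return 1 if violations else 0
```

Returning a non-zero code on any violation is what would let a `just lint` recipe fail the build, and shrinking the allowlist over time tightens the check without a flag day.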
