Commit 00f83f9

Trace filters (#53)
# Configurable Trace Filters

## Overview
- Implements user story **US0028 – Configurable Python trace filters** (see `design-docs/US0028 - Configurable Python trace filters.md`).
- Trace filters let callers decide which modules execute under tracing and which values are redacted before the recorder writes events.
- Each filter file is TOML. Files can be chained to layer product defaults with per-project overrides. The runtime records the active filter summary in `trace_metadata.json`.
- The recorder always prepends a built-in **builtin-default** filter that (a) skips CPython standard-library frames (including `asyncio`/concurrency internals) while still allowing third-party packages under `site-packages` (except helper shims like `_virtualenv.py`) and (b) redacts common sensitive identifiers (passwords, tokens, API keys, etc.) across locals/globals/args/returns/attributes. Project filters and explicit overrides append after this baseline and can relax rules where needed.

## Filter Files
- Filters live alongside the project (default: `.codetracer/trace-filter.toml`). Any other file can be supplied via CLI, environment variable, or Python API.
- Required sections:
  - `[meta]` – `name`, `version` (integer), optional `description`.
  - `[scope]` – `default_exec` (`"trace"`/`"skip"`), `default_value_action` (`"allow"`/`"redact"`/`"drop"`).
- Rules appear under `[[scope.rules]]` in declaration order. Each rule has:
  - `selector` – matches a package, file, or object (see selector syntax).
  - Optional `exec` override (`"trace"`/`"skip"`).
  - Optional `value_default` override (`"allow"`/`"redact"`/`"drop"`).
  - Optional `reason` string stored in telemetry.
  - `[[scope.rules.value_patterns]]` entries that refine value capture by selector.
- Example:
  ```toml
  [meta]
  name = "example-filter"
  version = 1
  description = "Protect secrets while allowing metrics."

  [scope]
  default_exec = "trace"
  default_value_action = "allow"

  [[scope.rules]]
  selector = "pkg:my_app.services.*"
  value_default = "redact"

  [[scope.rules.value_patterns]]
  selector = "local:glob:public_*"
  action = "allow"

  [[scope.rules.value_patterns]]
  selector = 'local:regex:^(metric|masked)_\w+$'
  action = "allow"

  [[scope.rules.value_patterns]]
  selector = "local:glob:secret_*"
  action = "redact"

  [[scope.rules.value_patterns]]
  selector = "arg:literal:debug_payload"
  action = "drop"
  ```

## Selector Syntax
- Domains (`selector` prefix before the first colon):
  - `pkg` – fully-qualified module name (`package.module`).
  - `file` – source path relative to the project root (POSIX separators).
  - `obj` – module-qualified object (`package.module.func`).
  - `local`, `global`, `arg`, `ret`, `attr` – value-level selectors.
- Match types (second segment in `kind:match:pattern`):
  - `glob` *(default)* – wildcard matching with `/` treated as a separator.
  - `regex` – Rust/RE2-style regular expressions; invalid patterns log a single warning and fall back to configuration errors.
  - `literal` – exact string match.
- Value selectors inherit the match type when omitted (e.g., `local:token_*` uses glob). Declare the match type explicitly when combining separators or anchors.

## Loading and Chaining Filters
- Default discovery: `RuntimeTracer` searches for `.codetracer/trace-filter.toml` near the target script.
- CLI: `--trace-filter path/to/filter.toml`. Provide multiple times or use `::` within one argument to append more files.
- Environment: `CODETRACER_TRACE_FILTER=filters/prod.toml::filters/hotfix.toml`. Respected by the auto-start hook and the CLI.
- Python API: `trace(..., trace_filter=[path1, path2])` or pass a `::`-delimited string. Paths are expanded to absolute locations and must exist.
- The recorder loads filters in the order discovered: the built-in `builtin-default` filter first, then project defaults, CLI/env entries, and explicit Python API arguments. Later rules override earlier ones when selectors overlap.

## Runtime Metadata
- `trace_metadata.json` now exposes a `trace_filter` object containing:
  - `filters` – ordered list of filter summaries (`name`, `version`, SHA-256 digest, absolute path).
  - `stats.scopes_skipped` – total number of code objects blocked by `exec = "skip"`.
  - `stats.value_redactions` – per-kind counts for redacted values (`argument`, `local`, `global`, `return`, `attribute`).
  - `stats.value_drops` – per-kind counts for values removed entirely from the trace.
- These counters help CI/quality tooling detect unexpectedly aggressive filters.

## Benchmarks and Guard Rails
- Rust microbench: `cargo bench --bench trace_filter --no-default-features` exercises baseline vs glob/regex-heavy rule sets.
- Python smoke benchmark: `pytest codetracer-python-recorder/tests/python/perf/test_trace_filter_perf.py` runs end-to-end tracing with synthetic workloads when `CODETRACER_TRACE_FILTER_PERF=1`.
- `just bench` orchestrates both:
  1. Ensures the development virtualenv exists (`just venv`).
  2. Runs the Criterion bench with `PYO3_PYTHON` pinned to the virtualenv interpreter.
  3. Executes the Python smoke benchmark, writing `codetracer-python-recorder/target/perf/trace_filter_py.json` (durations plus redaction/drop stats per scenario).
- Use the JSON artefact to feed dashboards or simple regression checks while longer-term gating thresholds are defined.
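
To make the selector grammar and the `::` chaining convention described above concrete, here is a minimal Python sketch. It is not the recorder's implementation (that lives in Rust). `Selector`, `parse_selector`, and `split_filter_chain` are hypothetical names introduced for illustration. Python's `fnmatch` and `re` stand in for the real matchers, so `*` crosses `/` here (the commit message says the real glob treats `/` as a separator) and the regex dialect is Python's rather than Rust's.

```python
import fnmatch
import re
from dataclasses import dataclass

# Domains and match types as listed in the commit message.
SCOPE_KINDS = {"pkg", "file", "obj"}
VALUE_KINDS = {"local", "global", "arg", "ret", "attr"}
MATCH_TYPES = {"glob", "regex", "literal"}


@dataclass(frozen=True)
class Selector:
    """Hypothetical parsed form of a `kind:match:pattern` selector."""
    kind: str
    match_type: str
    pattern: str

    def matches(self, name: str) -> bool:
        if self.match_type == "literal":
            return name == self.pattern
        if self.match_type == "regex":
            # Unanchored search; anchor with ^...$ in the pattern if needed.
            return re.search(self.pattern, name) is not None
        # Stand-in for the real glob matcher (assumption: fnmatch semantics).
        return fnmatch.fnmatchcase(name, self.pattern)


def parse_selector(text: str) -> Selector:
    """Split a selector into domain, match type (default glob), and pattern."""
    kind, sep, rest = text.partition(":")
    if not sep or kind not in SCOPE_KINDS | VALUE_KINDS:
        raise ValueError(f"unknown selector domain in {text!r}")
    head, sep, tail = rest.partition(":")
    if sep and head in MATCH_TYPES:
        return Selector(kind, head, tail)
    # No explicit match type: the whole remainder is a glob pattern.
    return Selector(kind, "glob", rest)


def split_filter_chain(spec: str) -> list[str]:
    """Split a `::`-delimited filter chain, keeping declaration order."""
    return [part for part in spec.split("::") if part]
```

For example, `parse_selector("local:token_*")` yields a glob selector, while `split_filter_chain("filters/prod.toml::filters/hotfix.toml")` returns the two paths in the order they would be layered.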
2 parents a57d961 + 5c4fef6 commit 00f83f9

36 files changed, +5961 -46 lines changed
Justfile

Lines changed: 18 additions & 0 deletions
@@ -40,6 +40,24 @@ test: cargo-test py-test
 cargo-test:
     uv run cargo nextest run --manifest-path codetracer-python-recorder/Cargo.toml --workspace --no-default-features
 
+bench:
+    just venv
+    ROOT="$(pwd)"; \
+    PYTHON_BIN="$ROOT/.venv/bin/python"; \
+    if [ ! -x "$PYTHON_BIN" ]; then \
+        PYTHON_BIN="$ROOT/.venv/Scripts/python.exe"; \
+    fi; \
+    if [ ! -x "$PYTHON_BIN" ]; then \
+        echo "Python interpreter not found. Run 'just venv <version>' first."; \
+        exit 1; \
+    fi; \
+    PERF_DIR="$ROOT/codetracer-python-recorder/target/perf"; \
+    mkdir -p "$PERF_DIR"; \
+    PYO3_PYTHON="$PYTHON_BIN" uv run cargo bench --manifest-path codetracer-python-recorder/Cargo.toml --no-default-features --bench trace_filter && \
+    CODETRACER_TRACE_FILTER_PERF=1 \
+    CODETRACER_TRACE_FILTER_PERF_OUTPUT="$PERF_DIR/trace_filter_py.json" \
+    uv run --group dev --group test pytest codetracer-python-recorder/tests/python/perf/test_trace_filter_perf.py -q
+
 py-test:
     uv run --group dev --group test pytest codetracer-python-recorder/tests/python codetracer-pure-python-recorder
 

README.md

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ Basic workflow:
 - Run the full split test suite (Rust nextest + Python pytest): `just test`
 - Run only Rust integration/unit tests: `just cargo-test`
 - Run only Python tests (including the pure-Python recorder to guard regressions): `just py-test`
+- Exercise the trace-filter benchmarks (Rust Criterion + Python smoke, JSON output under `codetracer-python-recorder/target/perf`): `just bench`
 - Collect coverage artefacts locally (LCOV + Cobertura/JSON): `just coverage`
 
 The CI workflow mirrors these commands. Pull requests get an automated comment with the latest Rust/Python coverage tables and downloadable artefacts (`lcov.info`, `coverage.xml`, `coverage.json`).

codetracer-python-recorder/CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -5,7 +5,13 @@ All notable changes to `codetracer-python-recorder` will be documented in this f
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
+
+## [0.2.0] - 2025-10-17
 ### Added
+- Added configurable trace filters backed by layered TOML files with glob/regex/literal selectors for packages, files, objects, and value domains, strict schema validation via `TraceFilterConfig::from_paths`, and explicit `allow`/`redact`/`drop` value policies summarised with SHA-256 digests.
+- Added `TraceFilterEngine` and runtime wiring that cache scope resolutions, gate tracing, substitute `<redacted>` for filtered payloads, drop suppressed variables entirely, and emit per-kind redaction/drop counters alongside filter summaries in `trace_metadata.json`.
+- Exposed configurable filters through the Python API, auto-start hook, CLI (`--trace-filter`), and `CODETRACER_TRACE_FILTER` environment variable while always prepending the built-in default filter that skips stdlib noise and redacts common secrets before layering project overrides.
+- Added filter-focused documentation and benchmarking coverage, including onboarding and README guides plus Criterion + Python smoke benchmarks orchestrated by `just bench`.
 - Introduced a line-aware IO capture pipeline that records stdout/stderr chunks with `{path_id, line, frame_id}` attribution via the shared `LineSnapshotStore` and multi-threaded `IoEventSink`.
 - Added `LineAwareStdout`, `LineAwareStderr`, and `LineAwareStdin` proxies that forward to the original streams while batching writes on newline, explicit `flush()`, 5 ms idle gaps, and step boundaries.
 - Added policy, CLI, and environment toggles for IO capture (`--io-capture`, `configure_policy(io_capture_line_proxies=..., io_capture_fd_fallback=...)`, `CODETRACER_CAPTURE_IO`) alongside the `ScopedMuteIoCapture` guard that suppresses recursive recorder logging.
@@ -22,5 +28,6 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 - Support for generating `trace_metadata.json` and `trace_paths.json` artefacts compatible with the Codetracer db-backend importer.
 - Cross-platform packaging targeting CPython 3.12 and 3.13 on Linux (manylinux2014 `x86_64`/`aarch64`), macOS universal2, and Windows `amd64`.
 
-[Unreleased]: https://github.com/metacraft-labs/cpr-main/compare/recorder-v0.1.0...HEAD
+[Unreleased]: https://github.com/metacraft-labs/cpr-main/compare/recorder-v0.2.0...HEAD
+[0.2.0]: https://github.com/metacraft-labs/cpr-main/compare/recorder-v0.1.0...recorder-v0.2.0
 [0.1.0]: https://github.com/metacraft-labs/cpr-main/releases/tag/recorder-v0.1.0
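
The `TraceFilterEngine` changelog entry above describes three value outcomes: allowed values pass through, redacted values are replaced by `<redacted>`, and dropped values disappear from the trace, with per-kind counters kept for each. A rough Python sketch of that behaviour follows. It is an illustration only, not the engine's code: `make_decider` and `apply_value_policy` are hypothetical helpers, the patterns are plain `fnmatch` globs, and "first matching pattern wins" is an assumption the commit does not confirm.

```python
import fnmatch
from collections import Counter

REDACTED = "<redacted>"  # placeholder named in the changelog entry


def make_decider(patterns, default):
    """Build a name -> action function from (glob, action) pairs.

    Assumption: the first matching pattern wins; otherwise use `default`.
    """
    def decide(name):
        for glob, action in patterns:
            if fnmatch.fnmatchcase(name, glob):
                return action
        return default
    return decide


def apply_value_policy(kind, values, decide, redactions, drops):
    """Return the values to record, updating the per-kind counters."""
    recorded = {}
    for name, value in values.items():
        action = decide(name)
        if action == "drop":
            drops[kind] += 1          # value is omitted from the trace
        elif action == "redact":
            redactions[kind] += 1     # name is kept, payload is masked
            recorded[name] = REDACTED
        else:
            recorded[name] = value    # "allow": record unchanged
    return recorded


# Usage: a rule that redacts by default but allows public_* locals.
redactions, drops = Counter(), Counter()
decide = make_decider(
    [("public_*", "allow"), ("secret_*", "redact"), ("debug_payload", "drop")],
    default="redact",
)
frame_locals = {"public_id": 7, "secret_key": "abc", "debug_payload": b"x", "other": 1}
recorded = apply_value_policy("local", frame_locals, decide, redactions, drops)
```

After the call, `redactions` and `drops` hold counts keyed by kind, which is the shape the `stats.value_redactions` and `stats.value_drops` fields in `trace_metadata.json` are described as having.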
