Merged
2 changes: 1 addition & 1 deletion codetracer-python-recorder/CHANGELOG.md
@@ -9,7 +9,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
- Balanced call-stack handling for generators, coroutines, and unwinding frames by subscribing to `PY_YIELD`, `PY_UNWIND`, `PY_RESUME`, and `PY_THROW`, mapping resume/throw events to `TraceWriter::register_call`, yield/unwind to `register_return`, and capturing `PY_THROW` arguments as `exception` using the existing value encoder. Added Python + Rust integration tests that drive `.send()`/`.throw()` on coroutines and generators to guarantee the trace stays balanced and that exception payloads are recorded.

### Changed
-- Module-level call events now use the actual dotted module name (e.g., `<my_pkg.mod>` or `<boto3.session>`) instead of the generic `<module>` label. `RuntimeTracer` derives the name via the shared module-identity helper, caches the result per code object, and falls back to `<module>` only for synthetic or nameless frames. Added Rust + Python tests plus README documentation covering the new semantics.
+- Module-level call events now prefer the frame's `__name__`, fall back to filter hints, `sys.path`, and package markers, and no longer depend on the legacy resolver/cache. The globals-derived naming flag now defaults to enabled so direct scripts record `<__main__>` while package imports emit `<pkg.mod>`, with CLI and environment overrides available for the legacy resolver.

## [0.2.0] - 2025-10-17
### Added
3 changes: 2 additions & 1 deletion codetracer-python-recorder/README.md
@@ -84,7 +84,8 @@ action = "drop"

## Trace naming semantics

-- Module-level activations no longer appear as the ambiguous `<module>` label. When the recorder sees `co_qualname == "<module>"`, it derives the actual dotted package name (e.g., `<my_pkg.mod>` or `<boto3.session>`) using project roots, `sys.modules`, and frame metadata.
+- Module-level activations no longer appear as the ambiguous `<module>` label. When the recorder sees `co_qualname == "<module>"`, it first reuses the frame's `__name__`, then falls back to trace-filter hints, `sys.path` roots, and package markers so scripts report `<__main__>` while real modules keep their dotted names (e.g., `<my_pkg.mod>` or `<boto3.session>`).
+- The globals-derived naming flow ships enabled by default; disable it temporarily with `--no-module-name-from-globals`, `codetracer.configure_policy(module_name_from_globals=False)`, or `CODETRACER_MODULE_NAME_FROM_GLOBALS=0` if you need to compare against the legacy resolver.
- The angle-bracket convention remains for module entries so downstream tooling can distinguish top-level activations at a glance.
- Traces will still emit `<module>` for synthetic filenames (`<stdin>`, `<string>`), frozen/importlib bootstrap frames, or exotic loaders that omit filenames entirely. This preserves previous behaviour when no reliable name exists.
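
The naming order described in the bullets above can be sketched roughly as follows. This is an illustrative approximation only, not the recorder's implementation: `module_label` and its `fallback_name` parameter are hypothetical names, and the real fallback chain (trace-filter hints, `sys.path` roots, package markers) is collapsed here into a single optional string.

```python
def module_label(frame_globals, fallback_name=None):
    """Return an angle-bracket label for a module-level frame (hypothetical helper)."""
    # Step 1: prefer the frame's own __name__, as the README describes.
    name = frame_globals.get("__name__")
    if isinstance(name, str) and name:
        return f"<{name}>"
    # Step 2: stand-in for the filter-hint / sys.path / package-marker fallbacks.
    if fallback_name:
        return f"<{fallback_name}>"
    # Step 3: no reliable name exists, so keep the generic label.
    return "<module>"
```

Under these assumptions a direct script yields `<__main__>`, an imported module yields its dotted name, and a nameless frame keeps `<module>`.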

4 changes: 2 additions & 2 deletions codetracer-python-recorder/benches/trace_filter.rs
@@ -52,7 +52,7 @@ fn run_workload(engine: &TraceFilterEngine, dataset: &WorkloadDataset) {
for &index in &dataset.event_indices {
let code = dataset.codes[index].as_ref();
let resolution = engine
-.resolve(py, code)
+.resolve(py, code, None)
.expect("trace filter resolution should succeed during benchmarking");
let policy = resolution.value_policy();
for name in dataset.locals.iter() {
@@ -66,7 +66,7 @@ fn prewarm_engine(engine: &TraceFilterEngine, dataset: &WorkloadDataset) {
Python::with_gil(|py| {
for code in &dataset.codes {
let _ = engine
-.resolve(py, code.as_ref())
+.resolve(py, code.as_ref(), None)
.expect("prewarm resolution failed");
}
});
11 changes: 11 additions & 0 deletions codetracer-python-recorder/codetracer_python_recorder/cli.py
@@ -120,6 +120,15 @@ def _parse_args(argv: Sequence[str]) -> RecorderCLIConfig:
"'proxies+fd' also mirrors raw file-descriptor writes."
),
)
+    parser.add_argument(
+        "--module-name-from-globals",
+        action=argparse.BooleanOptionalAction,
+        default=None,
+        help=(
+            "Derive module names from the Python frame's __name__ attribute (default: enabled). "
+            "Use '--no-module-name-from-globals' to fall back to the legacy resolver."
+        ),
+    )

known, remainder = parser.parse_known_args(argv)
pending: list[str] = list(remainder)
@@ -181,6 +190,8 @@ def _parse_args(argv: Sequence[str]) -> RecorderCLIConfig:
policy["io_capture_fd_fallback"] = True
case other: # pragma: no cover - argparse choices block this
parser.error(f"unsupported io-capture mode '{other}'")
+    if known.module_name_from_globals is not None:
+        policy["module_name_from_globals"] = known.module_name_from_globals

return RecorderCLIConfig(
trace_dir=trace_dir,
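
The tri-state behaviour of the new flag (unset, enabled, disabled) can be reproduced in isolation. The sketch below is a standalone approximation rather than the project's `_parse_args`: `build_parser` and `policy_from_args` are hypothetical names, and only the `--module-name-from-globals` option from the diff is modelled. `argparse.BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse


def build_parser():
    """Build a minimal parser carrying only the flag under discussion (hypothetical)."""
    parser = argparse.ArgumentParser(prog="recorder-sketch")
    parser.add_argument(
        "--module-name-from-globals",
        # Generates both --module-name-from-globals and --no-module-name-from-globals.
        action=argparse.BooleanOptionalAction,
        # None means "not specified", so the runtime default stays in charge.
        default=None,
    )
    return parser


def policy_from_args(argv):
    """Return policy overrides, adding the key only when the flag was given."""
    known, _remainder = build_parser().parse_known_args(argv)
    policy = {}
    if known.module_name_from_globals is not None:
        policy["module_name_from_globals"] = known.module_name_from_globals
    return policy
```

Leaving the default at `None` is what lets the CLI distinguish "user said nothing" from an explicit `False`, so an omitted flag does not override whatever default the policy layer applies.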
@@ -34,7 +34,7 @@ exec = "skip"
reason = "Skip builtins module instrumentation"

[[scope.rules]]
-selector = 'pkg:glob:*_distutils_hack*'
+selector = 'pkg:literal:_distutils_hack'
exec = "skip"
reason = "Skip setuptools shim module"
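
The practical difference between the two selector forms can be approximated as below. This is a sketch under stated assumptions: it treats `glob` as shell-style wildcard matching (via Python's `fnmatch`) and `literal` as exact string equality. The filter engine's real matching rules are not shown in this diff, and `matches_glob` / `matches_literal` are hypothetical helpers.

```python
from fnmatch import fnmatchcase


def matches_glob(package, pattern):
    """Assumed glob semantics: case-sensitive shell-style wildcard match."""
    return fnmatchcase(package, pattern)


def matches_literal(package, expected):
    """Assumed literal semantics: exact, whole-string equality."""
    return package == expected


OLD_PATTERN = "*_distutils_hack*"
NEW_LITERAL = "_distutils_hack"
```

Under these assumptions both forms match `_distutils_hack` itself, but only the old glob would also match an unrelated package such as `my_distutils_hack_tools`, which is presumably why the rule was narrowed.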
