Commit 77d25a3

Recorder exit code policy (#66)
## Context

`ct record` invokes the Rust-backed `codetracer_python_recorder` CLI when capturing Python traces. The CLI currently returns the traced script's process exit code (`codetracer_python_recorder/cli.py:165`). When the target program exits with a non-zero status (whether via `SystemExit`, a failed assertion, or an explicit `sys.exit()`), the recorder propagates that status. The desktop CLI treats any non-zero exit as a fatal recording failure, so trace uploads and follow-on automation abort even though the trace artefacts are valid and the recorder itself completed successfully.

Our recorder already captures the script's exit status in session metadata (`runtime/tracer/lifecycle.rs:143`) and exposes it through trace viewers. Downstream consumers that need to assert on the original program outcome can read that field. However, other integrations (CI pipelines, `ct record` automations, scripted data collection) rely on the CLI process exit code to decide whether to continue, and they expect Codetracer to return `0` when recording succeeded.

We must let callers control whether the recorder propagates the script's exit status or reports recorder success independently. The default should favour Codetracer success (exit `0`) to preserve `ct record` expectations, while still allowing advanced users and direct CLI invocations to opt back into passthrough semantics.

## Decision

Introduce a recorder exit-code policy with the following behaviour:

1. **Default:** When tracing completes without recorder errors (start, flush, stop, and write phases succeed and `require_trace` did not trigger), the CLI exits with status `0` regardless of the traced script's exit code. The recorder still records the script's status in trace metadata.
2. **Opt-in passthrough:** Expose a CLI flag `--propagate-script-exit` and environment override `CODETRACER_PROPAGATE_SCRIPT_EXIT`. When enabled, the CLI mirrors the traced script's exit code (the current behaviour). Both configuration surfaces resolve through the recorder policy layer so other entry points (e.g., embedded integrations) can opt in.
3. **User feedback:** If passthrough is disabled and the script exits non-zero, emit a one-line warning on stderr indicating the script's exit status and how to re-enable propagation.
4. **Recorder failure precedence:** Recorder failures (startup errors, policy violations such as `--require-trace`, flush/stop exceptions) continue to exit non-zero irrespective of the propagation setting to ensure automation can detect recorder malfunction.

This policy applies uniformly to `python -m codetracer_python_recorder`, `ct record`, and any embedding that drives the same CLI module.
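
The four rules above can be condensed into one pure function. The following is a minimal sketch rather than the recorder's actual code: the function name `resolve_exit_code` and its tuple return value are hypothetical, chosen only to make the precedence order easy to test.

```python
from typing import Optional, Tuple


def resolve_exit_code(
    script_exit_code: Optional[int],
    recorder_failed: bool,
    propagate_script_exit: bool,
) -> Tuple[int, Optional[str]]:
    """Return (cli_exit_code, warning) following the decision rules.

    Hypothetical helper: a missing script exit code is treated as 0.
    """
    status = script_exit_code if script_exit_code is not None else 0

    # Rule 4: recorder failures win regardless of the propagation setting.
    if recorder_failed:
        return 1, None

    # Rule 2: opt-in passthrough mirrors the script's status.
    if propagate_script_exit:
        return status, None

    # Rules 1 and 3: report recorder success, but warn about the
    # suppressed non-zero script status.
    if status != 0:
        return 0, f"Script exited with status {status}; returning 0."
    return 0, None
```

With this shape, a script that exits with status 3 yields `(0, warning)` by default, `(3, None)` with passthrough enabled, and `(1, None)` whenever the recorder itself failed.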
2 parents 8400d79 + 24e166c commit 77d25a3


15 files changed: +267 -16 lines changed


README.md

Lines changed: 6 additions & 5 deletions
@@ -41,12 +41,13 @@ All subclasses carry the same attributes, so existing handlers can migrate by ca

 `python -m codetracer_python_recorder` returns:

-- `0` when tracing and the target script succeed.
-- The script's own exit code when it calls `sys.exit()`.
-- `1` when a `RecorderError` bubbles out of startup or shutdown.
-- `2` when the CLI arguments are incomplete.
+- `0` when the recorder finishes cleanly, even if the traced script exits non-zero. The script's status is still recorded in `trace_metadata.json`, and a warning on stderr highlights the suppressed status.
+- `1` when a `RecorderError` bubbles out of startup or shutdown (policy failures, `require_trace`, flush/stop issues).
+- `2` when the CLI arguments are incomplete or invalid.

-Pass `--codetracer-json-errors` (or set the policy via `configure_policy(json_errors=True)`) to stream a one-line JSON trailer on stderr. The payload includes `run_id`, `trace_id`, `error_code`, `error_kind`, `message`, and the `context` map so downstream tooling can log failures without scraping text.
+Opt into mirroring the script's exit code with `--propagate-script-exit` (or `CODETRACER_PROPAGATE_SCRIPT_EXIT=true`). Use `--no-propagate-script-exit` to force suppression, even if the environment enables mirroring.
+
+Pass `--json-errors` (or set the policy via `configure_policy(json_errors=True)`) to stream a one-line JSON trailer on stderr. The payload includes `run_id`, `trace_id`, `error_code`, `error_kind`, `message`, and the `context` map so downstream tooling can log failures without scraping text.

 ### IO capture configuration
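
The README change above implies a precedence order: an explicit `--propagate-script-exit` or `--no-propagate-script-exit` beats the environment variable, which beats the default of `False`. The sketch below illustrates that order in plain Python under stated assumptions. `resolve_propagation` and `parse_bool` are hypothetical helpers, and the accepted boolean spellings are a guess. In the recorder itself, the environment value is parsed by the Rust policy layer.

```python
import argparse
from typing import Mapping, Sequence

# Assumed spellings; the real parser may accept a different set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def parse_bool(value: str) -> bool:
    """Parse a boolean-like string, raising ValueError on anything else."""
    lowered = value.strip().lower()
    if lowered in _TRUE:
        return True
    if lowered in _FALSE:
        return False
    raise ValueError(f"invalid boolean value: {value!r}")


def resolve_propagation(argv: Sequence[str], environ: Mapping[str, str]) -> bool:
    """Resolve the setting: CLI flag, then environment, then default False."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument(
        "--propagate-script-exit",
        action=argparse.BooleanOptionalAction,  # also adds --no-propagate-script-exit
        default=None,  # None means "flag not given", keeping the option tri-state
    )
    known, _remainder = parser.parse_known_args(list(argv))

    if known.propagate_script_exit is not None:
        return known.propagate_script_exit

    raw = environ.get("CODETRACER_PROPAGATE_SCRIPT_EXIT")
    if raw is not None:
        return parse_bool(raw)
    return False
```

Defaulting the flag to `None` rather than `False` is what lets `--no-propagate-script-exit` override an environment that enables mirroring: only an explicitly given flag takes precedence.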

codetracer-python-recorder/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)

 ### Changed
 - Module-level call events now prefer the frame's `__name__`, fall back to filter hints, `sys.path`, and package markers, and no longer depend on the legacy resolver/cache. The globals-derived naming flag now defaults to enabled so direct scripts record `<__main__>` while package imports emit `<pkg.mod>`, with CLI and environment overrides available for the legacy resolver.
+- The CLI now exits with `0` when recording succeeds regardless of the traced script’s status, records a warning when suppressing non-zero script exits, and exposes `--propagate-script-exit` / `CODETRACER_PROPAGATE_SCRIPT_EXIT` / `configure_policy(propagate_script_exit=True)` to restore passthrough semantics.

 ## [0.2.0] - 2025-10-17
 ### Added

codetracer-python-recorder/codetracer_python_recorder/cli.py

Lines changed: 41 additions & 4 deletions
@@ -11,7 +11,7 @@
 from pathlib import Path
 from typing import Iterable, Sequence

-from . import flush, start, stop
+from . import flush, policy_snapshot, start, stop
 from .auto_start import ENV_TRACE_FILTER
 from .formats import DEFAULT_FORMAT, SUPPORTED_FORMATS, normalize_format

@@ -129,6 +129,15 @@ def _parse_args(argv: Sequence[str]) -> RecorderCLIConfig:
             "Use '--no-module-name-from-globals' to fall back to the legacy resolver."
         ),
     )
+    parser.add_argument(
+        "--propagate-script-exit",
+        action=argparse.BooleanOptionalAction,
+        default=None,
+        help=(
+            "Mirror the traced script's exit status when the recorder succeeds (default: disabled). "
+            "Use '--no-propagate-script-exit' to force a zero exit status."
+        ),
+    )

     known, remainder = parser.parse_known_args(argv)
     pending: list[str] = list(remainder)
@@ -192,6 +201,8 @@ def _parse_args(argv: Sequence[str]) -> RecorderCLIConfig:
                 parser.error(f"unsupported io-capture mode '{other}'")
     if known.module_name_from_globals is not None:
         policy["module_name_from_globals"] = known.module_name_from_globals
+    if known.propagate_script_exit is not None:
+        policy["propagate_script_exit"] = known.propagate_script_exit

     return RecorderCLIConfig(
         trace_dir=trace_dir,
@@ -286,7 +297,11 @@ def main(argv: Iterable[str] | None = None) -> int:
         sys.argv = old_argv
         return 1

+    snapshot = policy_snapshot()
+    propagate_script_exit = bool(snapshot.get("propagate_script_exit"))
+
     exit_code: int | None = None
+    recorder_failed = False
     try:
         try:
             runpy.run_path(str(script_path), run_name="__main__")
@@ -297,13 +312,35 @@ def main(argv: Iterable[str] | None = None) -> int:
     finally:
         try:
             flush()
+        except Exception as exc:
+            recorder_failed = True
+            sys.stderr.write(f"Failed to flush Codetracer session: {exc}\n")
         finally:
-            stop(exit_code=exit_code)
-            sys.argv = old_argv
+            try:
+                stop(exit_code=exit_code)
+            except Exception as exc:
+                recorder_failed = True
+                sys.stderr.write(f"Failed to stop Codetracer session: {exc}\n")
+            finally:
+                sys.argv = old_argv

     _serialise_metadata(trace_dir, script=script_path)

-    return exit_code if exit_code is not None else 0
+    script_exit_code = exit_code if exit_code is not None else 0
+
+    if recorder_failed:
+        return 1
+
+    if propagate_script_exit:
+        return script_exit_code
+
+    if script_exit_code != 0:
+        sys.stderr.write(
+            f"Script exited with status {script_exit_code}; returning 0. "
+            "Use '--propagate-script-exit' to mirror the script exit code.\n"
+        )
+
+    return 0


 __all__ = ("main", "RecorderCLIConfig")
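
The nested `try`/`except`/`finally` structure in `main` ensures that a failing `flush()` does not prevent `stop()` from running, and that neither failure prevents `sys.argv` from being restored. The sketch below isolates that pattern with injected callables. `run_with_cleanup` and its parameters are hypothetical. The `SystemExit` handling is also an assumption, because the diff elides the lines that capture the script's exit code.

```python
import sys
from typing import Callable, Optional, Tuple


def run_with_cleanup(
    run_script: Callable[[], None],
    flush: Callable[[], None],
    stop: Callable[[Optional[int]], None],
    restore: Callable[[], None],
) -> Tuple[Optional[int], bool]:
    """Run the script, then always attempt flush, stop, and restore in order.

    Returns (script_exit_code, recorder_failed).
    """
    exit_code: Optional[int] = None
    recorder_failed = False
    try:
        try:
            run_script()
        except SystemExit as exc:
            # Assumed mapping: None -> 0, int -> itself, anything else -> 1.
            code = exc.code
            exit_code = code if isinstance(code, int) else (0 if code is None else 1)
    finally:
        try:
            flush()
        except Exception as exc:
            recorder_failed = True
            sys.stderr.write(f"flush failed: {exc}\n")
        finally:
            try:
                stop(exit_code)
            except Exception as exc:
                recorder_failed = True
                sys.stderr.write(f"stop failed: {exc}\n")
            finally:
                # Runs even when both flush and stop raised.
                restore()
    return exit_code, recorder_failed
```

Recording failures in a flag instead of re-raising lets the caller decide the final exit status after all cleanup steps have been attempted.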

codetracer-python-recorder/codetracer_python_recorder/session.py

Lines changed: 2 additions & 1 deletion
@@ -89,7 +89,8 @@ def start(
     policy:
         Optional mapping of runtime policy overrides forwarded to
         :func:`configure_policy` before tracing begins. Keys match the policy
-        keyword arguments (``on_recorder_error``, ``require_trace``, etc.).
+        keyword arguments (``on_recorder_error``, ``require_trace``,
+        ``propagate_script_exit``, etc.).
     apply_env_policy:
         When ``True`` (default), refresh policy settings from environment
         variables via :func:`configure_policy_from_env` prior to applying

codetracer-python-recorder/src/policy.rs

Lines changed: 7 additions & 1 deletion
@@ -8,7 +8,7 @@ mod model;
 pub use env::{
     configure_policy_from_env, ENV_CAPTURE_IO, ENV_JSON_ERRORS, ENV_KEEP_PARTIAL_TRACE,
     ENV_LOG_FILE, ENV_LOG_LEVEL, ENV_MODULE_NAME_FROM_GLOBALS, ENV_ON_RECORDER_ERROR,
-    ENV_REQUIRE_TRACE,
+    ENV_PROPAGATE_SCRIPT_EXIT, ENV_REQUIRE_TRACE,
 };
 #[allow(unused_imports)]
 pub use ffi::{configure_policy_py, py_configure_policy_from_env, py_policy_snapshot};
@@ -43,6 +43,7 @@ mod tests {
         assert!(snap.io_capture.line_proxies);
         assert!(!snap.io_capture.fd_fallback);
         assert!(snap.module_name_from_globals);
+        assert!(!snap.propagate_script_exit);
     }

     #[test]
@@ -58,6 +59,7 @@ mod tests {
         update.io_capture_line_proxies = Some(true);
         update.io_capture_fd_fallback = Some(true);
         update.module_name_from_globals = Some(true);
+        update.propagate_script_exit = Some(true);

         apply_policy_update(update);

@@ -71,6 +73,7 @@ mod tests {
         assert!(snap.io_capture.line_proxies);
         assert!(snap.io_capture.fd_fallback);
         assert!(snap.module_name_from_globals);
+        assert!(snap.propagate_script_exit);
         reset_policy();
     }

@@ -86,6 +89,7 @@ mod tests {
         std::env::set_var(ENV_JSON_ERRORS, "yes");
         std::env::set_var(ENV_CAPTURE_IO, "proxies,fd");
         std::env::set_var(ENV_MODULE_NAME_FROM_GLOBALS, "true");
+        std::env::set_var(ENV_PROPAGATE_SCRIPT_EXIT, "true");

         configure_policy_from_env().expect("configure from env");

@@ -101,6 +105,7 @@ mod tests {
         assert!(snap.io_capture.line_proxies);
         assert!(snap.io_capture.fd_fallback);
         assert!(snap.module_name_from_globals);
+        assert!(snap.propagate_script_exit);
         reset_policy();
     }

@@ -163,6 +168,7 @@ mod tests {
                 ENV_JSON_ERRORS,
                 ENV_CAPTURE_IO,
                 ENV_MODULE_NAME_FROM_GLOBALS,
+                ENV_PROPAGATE_SCRIPT_EXIT,
             ] {
                 std::env::remove_var(key);
             }

codetracer-python-recorder/src/policy/env.rs

Lines changed: 11 additions & 0 deletions
@@ -21,6 +21,8 @@ pub const ENV_JSON_ERRORS: &str = "CODETRACER_JSON_ERRORS";
 pub const ENV_CAPTURE_IO: &str = "CODETRACER_CAPTURE_IO";
 /// Environment variable toggling globals-based module name resolution.
 pub const ENV_MODULE_NAME_FROM_GLOBALS: &str = "CODETRACER_MODULE_NAME_FROM_GLOBALS";
+/// Environment variable toggling whether the recorder mirrors script exit codes.
+pub const ENV_PROPAGATE_SCRIPT_EXIT: &str = "CODETRACER_PROPAGATE_SCRIPT_EXIT";

 /// Load policy overrides from environment variables.
 pub fn configure_policy_from_env() -> RecorderResult<()> {
@@ -66,6 +68,10 @@ pub fn configure_policy_from_env() -> RecorderResult<()> {
         update.module_name_from_globals = Some(parse_bool(&value)?);
     }

+    if let Ok(value) = env::var(ENV_PROPAGATE_SCRIPT_EXIT) {
+        update.propagate_script_exit = Some(parse_bool(&value)?);
+    }
+
     apply_policy_update(update);
     Ok(())
 }
@@ -148,6 +154,7 @@ mod tests {
         std::env::set_var(ENV_JSON_ERRORS, "yes");
         std::env::set_var(ENV_CAPTURE_IO, "proxies,fd");
         std::env::set_var(ENV_MODULE_NAME_FROM_GLOBALS, "true");
+        std::env::set_var(ENV_PROPAGATE_SCRIPT_EXIT, "true");

         configure_policy_from_env().expect("configure from env");
         let snap = policy_snapshot();
@@ -163,17 +170,20 @@ mod tests {
         assert!(snap.io_capture.line_proxies);
         assert!(snap.io_capture.fd_fallback);
         assert!(snap.module_name_from_globals);
+        assert!(snap.propagate_script_exit);
     }

     #[test]
     fn configure_policy_from_env_disables_module_name_from_globals() {
         let _guard = EnvGuard;
         reset_policy_for_tests();
         std::env::set_var(ENV_MODULE_NAME_FROM_GLOBALS, "false");
+        std::env::set_var(ENV_PROPAGATE_SCRIPT_EXIT, "false");

         configure_policy_from_env().expect("configure from env");
         let snap = policy_snapshot();
         assert!(!snap.module_name_from_globals);
+        assert!(!snap.propagate_script_exit);
     }

     #[test]
@@ -202,6 +212,7 @@ mod tests {
                 ENV_JSON_ERRORS,
                 ENV_CAPTURE_IO,
                 ENV_MODULE_NAME_FROM_GLOBALS,
+                ENV_PROPAGATE_SCRIPT_EXIT,
             ] {
                 std::env::remove_var(key);
             }

codetracer-python-recorder/src/policy/ffi.rs

Lines changed: 17 additions & 1 deletion
@@ -11,7 +11,7 @@ use std::path::PathBuf;
 use std::str::FromStr;

 #[pyfunction(name = "configure_policy")]
-#[pyo3(signature = (on_recorder_error=None, require_trace=None, keep_partial_trace=None, log_level=None, log_file=None, json_errors=None, io_capture_line_proxies=None, io_capture_fd_fallback=None, module_name_from_globals=None))]
+#[pyo3(signature = (on_recorder_error=None, require_trace=None, keep_partial_trace=None, log_level=None, log_file=None, json_errors=None, io_capture_line_proxies=None, io_capture_fd_fallback=None, module_name_from_globals=None, propagate_script_exit=None))]
 pub fn configure_policy_py(
     on_recorder_error: Option<&str>,
     require_trace: Option<bool>,
@@ -22,6 +22,7 @@ pub fn configure_policy_py(
     io_capture_line_proxies: Option<bool>,
     io_capture_fd_fallback: Option<bool>,
     module_name_from_globals: Option<bool>,
+    propagate_script_exit: Option<bool>,
 ) -> PyResult<()> {
     let mut update = PolicyUpdate::default();

@@ -69,6 +70,10 @@ pub fn configure_policy_py(
         update.module_name_from_globals = Some(value);
     }

+    if let Some(value) = propagate_script_exit {
+        update.propagate_script_exit = Some(value);
+    }
+
     apply_policy_update(update);
     Ok(())
 }
@@ -106,6 +111,7 @@ pub fn py_policy_snapshot(py: Python<'_>) -> PyResult<PyObject> {
         "module_name_from_globals",
         snapshot.module_name_from_globals,
     )?;
+    dict.set_item("propagate_script_exit", snapshot.propagate_script_exit)?;

     let io_dict = PyDict::new(py);
     io_dict.set_item("line_proxies", snapshot.io_capture.line_proxies)?;
@@ -133,6 +139,7 @@ mod tests {
             Some(true),
             Some(true),
             Some(true),
+            Some(true),
         )
         .expect("configure policy via PyO3 facade");

@@ -152,6 +159,7 @@ mod tests {
         assert!(snap.io_capture.line_proxies);
         assert!(snap.io_capture.fd_fallback);
         assert!(snap.module_name_from_globals);
+        assert!(snap.propagate_script_exit);
         reset_policy_for_tests();
     }

@@ -168,6 +176,7 @@ mod tests {
             None,
             None,
             None,
+            None,
         )
         .expect_err("invalid variant should error");
         // Ensure the error maps through map_recorder_error by checking the display text.
@@ -208,6 +217,7 @@ mod tests {
             Some(false),
             Some(false),
             Some(false),
+            Some(true),
         )
         .expect("configure policy");

@@ -224,6 +234,11 @@ mod tests {
                 dict.contains("io_capture").expect("check io_capture key"),
                 "expected io_capture in snapshot"
             );
+            assert!(
+                dict.contains("propagate_script_exit")
+                    .expect("check propagate_script_exit key"),
+                "expected propagate_script_exit in snapshot"
+            );
         });
         reset_policy_for_tests();
     }
@@ -241,6 +256,7 @@ mod tests {
                 super::super::env::ENV_JSON_ERRORS,
                 super::super::env::ENV_CAPTURE_IO,
                 super::super::env::ENV_MODULE_NAME_FROM_GLOBALS,
+                super::super::env::ENV_PROPAGATE_SCRIPT_EXIT,
             ] {
                 std::env::remove_var(key);
             }

codetracer-python-recorder/src/policy/model.rs

Lines changed: 6 additions & 0 deletions
@@ -72,6 +72,7 @@ pub struct RecorderPolicy {
     pub json_errors: bool,
     pub io_capture: IoCapturePolicy,
     pub module_name_from_globals: bool,
+    pub propagate_script_exit: bool,
 }

 impl Default for RecorderPolicy {
@@ -85,6 +86,7 @@ impl Default for RecorderPolicy {
             json_errors: false,
             io_capture: IoCapturePolicy::default(),
             module_name_from_globals: true,
+            propagate_script_exit: false,
         }
     }
 }
@@ -128,6 +130,9 @@ impl RecorderPolicy {
         if let Some(module_name_from_globals) = update.module_name_from_globals {
             self.module_name_from_globals = module_name_from_globals;
         }
+        if let Some(propagate_script_exit) = update.propagate_script_exit {
+            self.propagate_script_exit = propagate_script_exit;
+        }
     }
 }

@@ -150,6 +155,7 @@ pub(crate) struct PolicyUpdate {
     pub(crate) io_capture_line_proxies: Option<bool>,
     pub(crate) io_capture_fd_fallback: Option<bool>,
     pub(crate) module_name_from_globals: Option<bool>,
+    pub(crate) propagate_script_exit: Option<bool>,
 }

 /// Snapshot the current policy.
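
The `model.rs` changes follow a partial-update pattern: every field of `PolicyUpdate` is optional, and only fields that are set overwrite the stored policy. The Python sketch below mirrors that idea for illustration only. The classes `Policy` and `PolicyUpdate` and the function `apply_update` are hypothetical stand-ins with just two of the real fields; they are not part of the recorder's API.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Policy:
    # Defaults taken from the `Default` impl shown in the diff.
    module_name_from_globals: bool = True
    propagate_script_exit: bool = False


@dataclass
class PolicyUpdate:
    # None plays the role of Rust's `Option::None`: leave the field untouched.
    module_name_from_globals: Optional[bool] = None
    propagate_script_exit: Optional[bool] = None


def apply_update(policy: Policy, update: PolicyUpdate) -> Policy:
    """Copy every explicitly set field from `update` onto `policy`."""
    for field in fields(update):
        value = getattr(update, field.name)
        if value is not None:
            setattr(policy, field.name, value)
    return policy
```

Because unset fields are skipped, the environment, the CLI, and `configure_policy` can each contribute only the settings they know about without clobbering the rest.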
