|
| 1 | +In Python monitoring sometimes the co_filename of a code object |
| 2 | +doesn't point to a real file, but something else. Those filenames look |
| 3 | +like `<...>`. |
| 4 | + |
| 5 | +Lines from those files cannot be traced. For this reason, we should |
| 6 | +skip them for all monitoring events. |
| 7 | + |
| 8 | +sys.monitoring provides the capability to turn off monitoring for |
| 9 | +specific lines by having the callback return a special value |
| 10 | +`sys.monitoring.DISABLE`. We want to use this functionality to disable |
| 11 | +monitoring of those lines and improve performance. |
| 12 | + |
| 13 | +The following changes need to be made: |
| 14 | + |
| 15 | +1. Extend the `Tracer` trait so every callback can signal back a |
| 16 | + `sys.monitoring` action (continue or disable). Update all existing |
| 17 | + implementations and tests to use the new return type. |
| 18 | +2. Add reusable logic that decides whether a given code object refers |
| 19 | + to a real on-disk file and cache the decision per `co_filename` / |
| 20 | + code id. |
| 21 | +3. Invoke the new filtering logic from every `RuntimeTracer` callback |
| 22 | + before any expensive work. When a code object should be ignored, |
| 23 | + skip our bookkeeping and return the disable sentinel to CPython so |
| 24 | + further events from that location stop firing. |
| 25 | + |
| 26 | +Note: We cannot import `sys.monitoring` inside the hot callbacks, |
| 27 | +because in some embedded runtimes importing during tracing is either |
| 28 | +prohibited or will deadlock. We must therefore cache the |
| 29 | +`sys.monitoring.DISABLE` sentinel ahead of time while we are still in a |
| 30 | +safe context (e.g., during tracer installation). |
| 31 | + |
| 32 | +We need to make sure that our test suite has comprehensive tests that |
| 33 | +prove the new filtering/disable behaviour and cover regressions on the |
| 34 | +public tracer API. |
| 35 | + |
| 36 | +# Technical design solutions |
| 37 | + |
| 38 | +## Tracer callback return values |
| 39 | + |
| 40 | +- Introduce a new enum `CallbackOutcome` in `src/tracer.rs` with two |
| 41 | + variants: `Continue` (default) and `DisableLocation`. |
| 42 | +- Define a `type CallbackResult = PyResult<CallbackOutcome>` so every |
| 43 | + trait method can surface Python errors and signal whether the |
| 44 | + location must be disabled. `Continue` replaces the current implicit |
| 45 | + unit return. |
| 46 | +- Update the `Tracer` trait so all callbacks return `CallbackResult`. |
| 47 | + Default implementations continue to return `Ok(CallbackOutcome::Continue)` |
| 48 | + so existing tracers only need minimal changes. |
| 49 | +- The PyO3 callback shims (`callback_line`, `callback_py_start`, etc.) |
| 50 | + will translate `CallbackOutcome::DisableLocation` into the cached |
| 51 | + Python sentinel and otherwise return `None`. This keeps the Python |
| 52 | + side compliant with `sys.monitoring` semantics |
| 53 | + (see https://docs.python.org/3/library/sys.monitoring.html#sys.monitoring.DISABLE). |
| 54 | + |
| 55 | +## Accessing `sys.monitoring.DISABLE` |
| 56 | + |
| 57 | +- During `install_tracer`, after we obtain `monitoring_events`, load |
| 58 | + `sys.monitoring.DISABLE` once and store it in the global tracer state |
| 59 | + (`Global` struct) as a `Py<PyAny>`. Because `Py<PyAny>` is `Send` |
| 60 | + + `Sync`, it can be safely cached behind the global mutex and reused |
| 61 | + inside callbacks without re-importing modules. |
| 62 | +- Provide a helper on `Global` (e.g., `fn disable_sentinel<'py>(&self, |
| 63 | + py: Python<'py>) -> Bound<'py, PyAny>`) that returns the bound object |
| 64 | + when we need to hand the sentinel back to Python. |
| 65 | +- Make sure `uninstall_tracer` drops the sentinel alongside other |
| 66 | + state so a new install can reload it cleanly. |
| 67 | + |
| 68 | +## `RuntimeTracer` filename filtering |
| 69 | + |
| 70 | +- Add a dedicated method `fn should_trace_code(&mut self, |
| 71 | + py: Python<'_>, code: &CodeObjectWrapper) -> ShouldTrace` returning a |
| 72 | + new internal enum `{ Trace, SkipAndDisable }`. |
| 73 | + - A file is considered “real” when `co_filename` does not match the |
| 74 | + `<...>` pattern. For now we treat any filename that begins with `<` |
| 75 | + and ends with `>` (after trimming whitespace) as synthetic. This |
| 76 | + covers `<frozen importlib._bootstrap>`, `<stdin>`, `<string>`, etc. |
| 77 | + - Cache negative decisions in a `HashSet<usize>` keyed by the code |
| 78 | + object id so subsequent events avoid repeating the string checks. |
| 79 | + The set is cleared on `flush()`/`finish()` if we reset state. |
| 80 | +- Each public callback (`on_py_start`, `on_line`, `on_py_return`) will |
| 81 | + call `should_trace_code` first. When the decision is `SkipAndDisable` |
| 82 | + we: |
| 83 | + - Return `CallbackOutcome::DisableLocation` immediately so CPython |
| 84 | + stops sending events for that location. |
| 85 | + - Avoid calling any of the expensive frame/value capture paths. |
| 86 | +- When the decision allows tracing, we continue with the existing |
| 87 | + behaviour. The activation-path logic runs before the filtering so a |
| 88 | + deactivated tracer still ignores events regardless of filename. |
| 89 | + |
| 90 | +## Backwards compatibility and ergonomics |
| 91 | + |
| 92 | +- `RuntimeTracer` becomes the only tracer that returns |
| 93 | + `DisableLocation`; other tracers keep returning `Continue`. |
| 94 | +- Update the test helper tracers under `tests/` to use the new return |
| 95 | + type but still assert on event counts; their filenames will remain |
| 96 | + real so behaviour does not change. |
| 97 | +- Document the change in the crate-level docs (`src/lib.rs`) to warn |
| 98 | + downstream implementors that callbacks now return `CallbackResult`. |
| 99 | + |
| 100 | +# Test suite |
| 101 | + |
| 102 | +- Rust unit test for the pure filename predicate (e.g., |
| 103 | + `<string>`, `<frozen importlib._bootstrap>`, `script.py`) to prevent |
| 104 | + regressions in the heuristic. |
| 105 | +- Runtime tracer integration test that registers a `RuntimeTracer`, |
| 106 | + executes code with a `<string>` filename, and asserts that: |
| 107 | + - No events are written to the trace writer. |
| 108 | + - The corresponding callbacks return the disable sentinel (inspect |
| 109 | + via a lightweight shim or mock writer). |
| 110 | +- Complementary test that runs a real file (use `tempfile` to emit a |
| 111 | + small script) and ensures events are still recorded. |
| 112 | +- Regression tests for the updated trait: adjust `tests/print_tracer.rs` |
| 113 | + counting tracer to assert it still receives events and that the |
| 114 | + return value defaults to `Continue`. |
| 115 | +- Add a smoke test checking we do not attempt to import |
| 116 | + `sys.monitoring` inside callbacks by patching the module import hook |
| 117 | + during a run. |
| 118 | + |
| 119 | +# Implementation Plan |
| 120 | + |
| 121 | +1. Introduce `CallbackOutcome`/`CallbackResult` in `src/tracer.rs` and |
| 122 | + update every trait method signature plus the PyO3 callback shims. |
| 123 | + Store the `sys.monitoring.DISABLE` sentinel in the `Global` state. |
| 124 | +2. Propagate signature updates through existing tracers and tests, |
| 125 | + ensuring they all return `CallbackOutcome::Continue`. |
| 126 | +3. Extend `RuntimeTracer` with the filename filtering method, cached |
| 127 | + skip set, and early-return logic that emits `DisableLocation` when |
| 128 | + appropriate. |
| 129 | +4. Update the runtime tracer callbacks (`on_py_start`, `on_line`, |
| 130 | + `on_py_return`, and any other events we wire up later) to invoke the |
| 131 | + filtering method first. |
| 132 | +5. Expand the test suite with the new unit/integration coverage and |
| 133 | + adjust existing tests to the trait changes. |
| 134 | +6. Perform a final pass to document the new behaviour in public docs |
| 135 | + and ensure formatting/lints pass. |
0 commit comments