|
| 1 | +# Archived Issues - 2025-09-09 |
| 2 | + |
| 3 | +## ISSUE-001 |
| 4 | +### Description |
| 5 | +We need to record function arguments when calling a function |
| 6 | + |
| 7 | +We have a function `encode_value` which is used to convert Python objects to value records. We need to use this function to encode the function arguments. To do that we should modify the `on_py_start` hook to load the current frame and to read the function arguments from it. |
| 8 | + |
| 9 | +### Definition of Done |
| 10 | +- Arguments for positional and pos-or-keyword parameters are recorded on function entry using the current frame's locals. |
| 11 | +- Values are encoded via `encode_value` and attached to the `Call` event payload. |
| 12 | +- A unit test asserts that multiple positional arguments (e.g. `a`, `b`) are present with correct encoded values. |
| 13 | +- Varargs/kwargs and positional-only coverage are tracked in separate issues (see ISSUE-002, ISSUE-005). |
| 14 | + |
| 15 | +### Status |
| 16 | +Archived |
| 17 | + |
| 18 | +Implemented for positional (and pos-or-keyword) arguments on function entry |
| 19 | +using `sys._getframe(0)` and `co_varnames[:co_argcount]`, with counting fixed to |
| 20 | +use `co_argcount` directly (includes positional-only; avoids double-counting). |
| 21 | +Values are encoded via `encode_value` and attached to the `Call` event. Tests |
| 22 | +validate correct presence and values. Varargs/kwargs remain covered by |
| 23 | +ISSUE-002. |
| 24 | + |
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | +## ISSUE-002 |
| 29 | +### Description |
| 30 | +Capture all Python argument kinds on function entry: positional-only, |
| 31 | +pos-or-kw, keyword-only, plus varargs (`*args`) and kwargs (`**kwargs`). Extend |
| 32 | +the current implementation that uses `co_argcount` and `co_varnames` to also |
| 33 | +leverage `co_posonlyargcount` and `co_kwonlyargcount`, and detect varargs/kwargs |
| 34 | +via code flags. Encode `*args` as a list value and `**kwargs` as a mapping value |
| 35 | +to preserve structure. |
| 36 | + |
| 37 | +### Definition of Done |
| 38 | +- All argument kinds are captured on function entry: positional-only, pos-or-keyword, keyword-only, varargs (`*args`), and kwargs (`**kwargs`). |
| 39 | +- `*args` is encoded as a list value; `**kwargs` is encoded as a mapping value. |
| 40 | +- Positional-only and keyword-only parameters are included using `co_posonlyargcount` and `co_kwonlyargcount`. |
| 41 | +- Comprehensive tests cover each argument kind and validate the encoded structure and values. |
| 42 | + |
| 43 | +### Status |
| 44 | +Archived |
| 45 | + |
| 46 | +All argument kinds are captured on function entry, including kwargs with |
| 47 | +structured encoding. Varargs are preserved as `Tuple` (per CPython), and |
| 48 | +`**kwargs` are encoded as a `Sequence` of 2-element `Tuple`s `(key, value)` |
| 49 | +with string keys, enabling lossless downstream analysis. The updated test |
| 50 | +`test_all_argument_kinds_recorded_on_py_start` verifies the behavior. |
| 51 | + |
| 52 | +Note: While the original Definition of Done referenced a mapping value kind, |
| 53 | +the implementation follows the proposed approach in ISSUE-008 to represent |
| 54 | +kwargs as a sequence of tuples using existing value kinds. |
| 55 | + |
| 56 | + |
| 57 | + |
| 58 | +## ISSUE-005 |
| 59 | +### Description |
| 60 | +Include positional-only parameters in argument capture. The current logic uses |
| 61 | +only `co_argcount` for the positional slice, which excludes positional-only |
| 62 | +arguments (PEP 570). As a result, names before the `/` in a signature like |
| 63 | +`def f(p, /, q, *args, r, **kwargs)` are dropped. |
| 64 | + |
| 65 | +### Definition of Done |
| 66 | +- Positional-only parameters are included in the captured argument set. |
| 67 | +- The selection of positional names accounts for `co_posonlyargcount` in addition to `co_argcount`. |
| 68 | +- Tests add a function with positional-only parameters and assert their presence and correct encoding. |
| 69 | + |
| 70 | +### Status |
| 71 | +Archived |
| 72 | + |
| 73 | +Implemented by selecting positional names from `co_varnames` using |
| 74 | +`co_argcount` directly (which already includes positional-only per CPython 3.8+). |
| 75 | +This prevents double-counting and keeps indexing stable. Tests in |
| 76 | +`test_all_argument_kinds_recorded_on_py_start` assert presence of the |
| 77 | +positional-only parameter `p` and pass. |
| 78 | + |
| 79 | + |
| 80 | + |
| 81 | +## ISSUE-003 |
| 82 | +### Description |
| 83 | +Avoid defensive fallback in argument capture. The current change swallows |
| 84 | +failures to access the frame/locals and proceeds with empty `args`. Per |
| 85 | +`rules/source-code.md` ("Avoid defensive programming"), we should fail fast when |
| 86 | +encountering such edge cases. |
| 87 | + |
| 88 | +### Definition of Done |
| 89 | +- Silent fallbacks that return empty arguments on failure are removed. |
| 90 | +- The recorder raises a clear, actionable error when it cannot access frame/locals. |
| 91 | +- Tests verify the fail-fast path. |
| 92 | + |
| 93 | +### Status |
| 94 | +Archived |
| 95 | + |
| 96 | +`RuntimeTracer::on_py_start` now returns `PyResult<()>` and raises a |
| 97 | +`RuntimeError` when frame/locals access fails; `callback_py_start` propagates |
| 98 | +the error to Python. A pytest (`tests/test_fail_fast_on_py_start.py`) asserts |
| 99 | +the fail-fast behavior by monkeypatching `sys._getframe` to raise. |
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +## ISSUE-004 |
| 104 | +### Description |
| 105 | +Stabilize string value encoding for arguments and tighten tests. The new test |
| 106 | +accepts either `String` or `Raw` kinds for the `'x'` argument, which can hide |
| 107 | +regressions. We should standardize encoding of `str` as `String` (or document |
| 108 | +when `Raw` is expected) and update tests to assert the exact kind. |
| 109 | + |
| 110 | +### Definition of Done |
| 111 | +- String values are consistently encoded as `String` (or the expected canonical kind), with any exceptions explicitly documented. |
| 112 | +- Tests assert the exact kind for `str` arguments and fail if an unexpected kind (e.g., `Raw`) is produced. |
| 113 | +- Documentation clarifies encoding rules for string-like types to avoid ambiguity in future changes. |
| 114 | + |
| 115 | +### Status |
| 116 | +Archived |
| 117 | + |
| 118 | +Stricter tests now assert `str` values are encoded as `String` with the exact text payload, and runtime docs clarify canonical encoding. No runtime logic change was required since `encode_value` already produced `String` for Python `str`. |
| 119 | + |
| 120 | + |
| 121 | + |
| 122 | +## ISSUE-006 |
| 123 | +### Description |
| 124 | +Accidental check-in of Cargo cache/artifact files under `codetracer-python-recorder/.cargo/**` (e.g., `registry/CACHEDIR.TAG`, `.package-cache`). These are build/cache directories and should be excluded from version control. |
| 125 | + |
| 126 | +### Definition of Done |
| 127 | +- Add ignore rules to exclude Cargo cache directories (e.g., `.cargo/**`, `target/**`) from version control. |
| 128 | +- Remove already-checked-in cache files from the repository. |
| 129 | +- Verify the working tree is clean after a clean build; no cache artifacts appear as changes. |
| 130 | + |
| 131 | +### Status |
| 132 | +Archived |
| 133 | + |
| 134 | + |
| 135 | + |
| 136 | +## ISSUE-007 |
| 137 | +### Description |
| 138 | +Immediately stop tracing when any monitoring callback raises an error. |
| 139 | + |
| 140 | +Current behavior: `RuntimeTracer::on_py_start` intentionally fails fast when it |
| 141 | +cannot capture function arguments (e.g., when `sys._getframe` is unavailable or |
| 142 | +patched to raise). The callback error is propagated to Python via |
| 143 | +`callback_py_start` (it returns the `PyResult` from `on_py_start`). However, the |
| 144 | +tracer remains installed and active after the error. As a result, any further |
| 145 | +Python function start (even from exception-handling or printing the exception) |
| 146 | +triggers `on_py_start` again, re-raising the same error and interfering with the |
| 147 | +program’s own error handling. |
| 148 | + |
| 149 | +This is observable in `codetracer-python-recorder/tests/test_fail_fast_on_py_start.py`: |
| 150 | +the test simulates `_getframe` failure, which correctly raises in `on_py_start`, |
| 151 | +but `print(e)` inside the test’s `except` block invokes codec machinery that |
| 152 | +emits additional `PY_START` events. Those callbacks raise again, causing the test |
| 153 | +to fail before reaching its assertions. |
| 154 | + |
| 155 | +### Impact |
| 156 | +- Breaks user code paths that attempt to catch and handle exceptions while the |
| 157 | + tracer is active — routine operations like `print(e)` can cascade failures. |
| 158 | +- Hard to debug because the original error is masked by subsequent callback |
| 159 | + errors from unrelated modules (e.g., `codecs`). |
| 160 | + |
| 161 | +### Proposed Solution |
| 162 | +Fail fast and disable tracing at the first callback error. |
| 163 | + |
| 164 | +Implementation sketch: |
| 165 | +- In each callback wrapper (e.g., `callback_py_start`), if the underlying |
| 166 | + tracer method returns `Err`, immediately disable further monitoring before |
| 167 | + returning the error: |
| 168 | + - Set events to `NO_EVENTS` (via `set_events`) to prevent any more callbacks. |
| 169 | + - Unregister all previously registered callbacks for our tool id. |
| 170 | + - Optionally call `finish()` on the tracer to flush/close writers. |
| 171 | + - Option A (hard uninstall): call `uninstall_tracer(py)` to release tool id |
| 172 | + and clear the registry. This fully tears down the tracer. Note that the |
| 173 | + high-level `ACTIVE` flag in `lib.rs` is not updated by `uninstall_tracer`, |
| 174 | + so either: |
| 175 | + - expose an internal “deactivate_from_callback()” in `lib.rs` that clears |
| 176 | + `ACTIVE`, or |
| 177 | + - keep a soft-stop in `tracer.rs` by setting `NO_EVENTS` and unregistering |
| 178 | + callbacks without touching `ACTIVE`, allowing `stop_tracing()` to be a |
| 179 | + no-op later. |
| 180 | + - Ensure reentrancy safety: perform the disable sequence only once (e.g., with |
| 181 | + a guard flag) to avoid nested teardown during callback execution. |
| 182 | + |
| 183 | +Behavioral details: |
| 184 | +- The original callback error must still be propagated to Python so the user |
| 185 | + sees the true failure cause, but subsequent code should not receive further |
| 186 | + monitoring callbacks. |
| 187 | +- If error occurs before activation gating triggers, the disable sequence should |
| 188 | + still run to avoid repeated failures from unrelated modules importing. |
| 189 | + |
| 190 | +### Definition of Done |
| 191 | +- On any callback error (at minimum `on_py_start`, and future callbacks that may |
| 192 | + return `PyResult`), all further monitoring callbacks from this tool are |
| 193 | + disabled immediately within the same GIL context. |
| 194 | +- The initial error is propagated unchanged to Python. |
| 195 | +- The failing test `test_fail_fast_on_py_start.py` passes: after the first |
| 196 | + failure, `print(e)` does not trigger additional tracer errors. |
| 197 | +- Writers are flushed/closed or left in a consistent state (documented), and no |
| 198 | + additional events are recorded after disablement. |
| 199 | +- Unit/integration tests cover: error in `on_py_start`, repeated calls after |
| 200 | + disablement are no-ops, and explicit `stop_tracing()` is safe after a |
| 201 | + callback-induced shutdown. |
| 202 | + |
| 203 | +### Status |
| 204 | +Archived |
| 205 | + |
| 206 | +Implemented soft-stop on first callback error in `callback_py_start`: |
| 207 | +on error, the tracer finishes writers, unregisters callbacks for the |
| 208 | +configured mask, sets events to `NO_EVENTS`, clears the registry, and |
| 209 | +records `global.mask = NO_EVENTS`. The original error is propagated to |
| 210 | +Python, and subsequent `PY_START` events are not delivered. This keeps the |
| 211 | +module-level `ACTIVE` flag unchanged until `stop_tracing()`, making the |
| 212 | +shutdown idempotent. The test `tests/test_fail_fast_on_py_start.py` |
| 213 | +exercises the behavior by re-running the program after the initial failure. |
| 214 | + |
| 215 | + |
| 216 | + |
| 217 | +## ISSUE-008 |
| 218 | +### Description |
| 219 | +Provide structured encoding for kwargs (`**kwargs`) on function entry. The |
| 220 | +current backend encodes kwargs as `Raw` text because the `runtime_tracing` |
| 221 | +format lacks a mapping value. Introduce a mapping representation so kwargs can |
| 222 | +be recorded losslessly with key/value structure and recursively encoded values. |
| 223 | + |
| 224 | +### Definition of Done |
| 225 | +- `runtime_tracing` supports a mapping value kind (e.g., `Map` with string keys). |
| 226 | +- `RuntimeTracer::encode_value` encodes Python `dict` to the mapping kind with |
| 227 | + recursively encoded values; key type restricted to `str` (non-`str` keys may |
| 228 | + be stringified or rejected, behavior documented). |
| 229 | +- `on_py_start` records `**kwargs` using the new mapping encoding. |
| 230 | +- Tests verify kwargs structure and values; large and nested kwargs are covered. |
| 231 | + |
| 232 | +### Proposed solution |
| 233 | +- We can represent our `Map` as a sequenced of tuples. This way we can use the current value record types to encode dictionaries. |
| 234 | +- In the Python recorder, downcast to `dict` and iterate items, encoding values |
| 235 | + recursively; keep behavior minimal and fail fast on unexpected key types per |
| 236 | + repo rules (no defensive fallbacks). |
| 237 | + |
| 238 | +### Dependent issues |
| 239 | +- Blocks completion of ISSUE-002 |
| 240 | + |
| 241 | +### Status |
| 242 | +Archived |
| 243 | + |
| 244 | +Implemented structured kwargs encoding in the Rust tracer by representing |
| 245 | +Python `dict` as a `Sequence` of `(key, value)` `Tuple`s, with keys encoded as |
| 246 | +`String` when possible. Tests in |
| 247 | +`codetracer-python-recorder/test/test_monitoring_events.py` validate that |
| 248 | +kwargs are recorded structurally. This fulfills the goal without introducing a |
| 249 | +new mapping value kind, per the proposed solution. |
| 250 | + |
| 251 | + |
| 252 | + |
| 253 | + |
| 254 | +## ISSUE-011 |
| 255 | +### Description |
| 256 | +Create a concise set of small Python example scripts to exercise key code paths of the Rust‑backed recorder during development. Place all examples under `/examples` and make them easy to run with the module CLI. |
| 257 | + |
| 258 | +### Definition of Done |
| 259 | +- Create `/examples` at repo root. |
| 260 | +- Add minimal, deterministic scripts covering common scenarios: |
| 261 | + - `basic_args.py`: positional‑only, pos‑or‑kw, kw‑only, `*args`, `**kwargs`. |
| 262 | + - `exceptions.py`: raise, catch, and `print(e)` inside `except`. |
| 263 | + - `classes_methods.py`: instance, `@classmethod`, `@staticmethod`, property access. |
| 264 | + - `recursion.py`: direct and mutual recursion. |
| 265 | + - `generators_async.py`: generator, `async`/`await`, async generator. |
| 266 | + - `context_and_closures.py`: `with` (context manager) and nested closures. |
| 267 | + - `threading.py`: two threads invoking traced functions and joining. |
| 268 | + - `imports_side_effects.py`: module‑level code vs `if __name__ == "__main__"`. |
| 269 | + - `kwargs_nested.py`: nested kwargs structure to validate structured encoding. |
| 270 | +- Each script: |
| 271 | + - Has a brief module docstring stating its focus. |
| 272 | + - Defines `main()` and uses the `__name__ == "__main__"` guard. |
| 273 | + - Produces stable, minimal output without external dependencies. |
| 274 | +- Add `/examples/README.md` listing scripts, purpose, and how to run via: |
| 275 | + - `python -m codetracer_python_recorder --codetracer-format=json examples/<script>.py` |
| 276 | + |
| 277 | +### Proposed solution |
| 278 | +- Keep scripts focused and short to spotlight specific behaviors. |
| 279 | +- Prefer deterministic flows; join threads and avoid non‑deterministic timing. |
| 280 | +- Use the provided module CLI so recorder activation is consistent across runs. |
| 281 | + |
| 282 | +### Status |
| 283 | +Archived |
| 284 | + |
| 285 | +Added the examples directory and scripts: |
| 286 | +`basic_args.py`, `exceptions.py`, `classes_methods.py`, `recursion.py`, |
| 287 | +`generators_async.py`, `context_and_closures.py`, `threading.py`, |
| 288 | +`imports_side_effects.py`, `kwargs_nested.py`, plus `examples/README.md` |
| 289 | +with usage instructions for running via |
| 290 | +`python -m codetracer_python_recorder --codetracer-format=json examples/<script>.py`. |
0 commit comments