Skip to content

Commit 3c4fa04

Browse files
committed
issues(codex): Archiving done issues
1 parent f0540a8 commit 3c4fa04

File tree

3 files changed

+305
-272
lines changed

3 files changed

+305
-272
lines changed

.archive/issues-2025-09-09.md

Lines changed: 290 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,290 @@
1+
# Archived Issues - 2025-09-09
2+
3+
## ISSUE-001
4+
### Description
5+
We need to record function arguments when calling a function
6+
7+
We have a function `encode_value` which is used to convert Python objects to value records. We need to use this function to encode the function arguments. To do that we should modify the `on_py_start` hook to load the current frame and to read the function arguments from it.
8+
9+
### Definition of Done
10+
- Arguments for positional and pos-or-keyword parameters are recorded on function entry using the current frame's locals.
11+
- Values are encoded via `encode_value` and attached to the `Call` event payload.
12+
- A unit test asserts that multiple positional arguments (e.g. `a`, `b`) are present with correct encoded values.
13+
- Varargs/kwargs and positional-only coverage are tracked in separate issues (see ISSUE-002, ISSUE-005).
14+
15+
### Status
16+
Archived
17+
18+
Implemented for positional (and pos-or-keyword) arguments on function entry
19+
using `sys._getframe(0)` and `co_varnames[:co_argcount]`, with counting fixed to
20+
use `co_argcount` directly (includes positional-only; avoids double-counting).
21+
Values are encoded via `encode_value` and attached to the `Call` event. Tests
22+
validate correct presence and values. Varargs/kwargs remain covered by
23+
ISSUE-002.
24+
25+
26+
27+
28+
## ISSUE-002
29+
### Description
30+
Capture all Python argument kinds on function entry: positional-only,
31+
pos-or-kw, keyword-only, plus varargs (`*args`) and kwargs (`**kwargs`). Extend
32+
the current implementation that uses `co_argcount` and `co_varnames` to also
33+
leverage `co_posonlyargcount` and `co_kwonlyargcount`, and detect varargs/kwargs
34+
via code flags. Encode `*args` as a list value and `**kwargs` as a mapping value
35+
to preserve structure.
36+
37+
### Definition of Done
38+
- All argument kinds are captured on function entry: positional-only, pos-or-keyword, keyword-only, varargs (`*args`), and kwargs (`**kwargs`).
39+
- `*args` is encoded as a list value; `**kwargs` is encoded as a mapping value.
40+
- Positional-only and keyword-only parameters are included using `co_posonlyargcount` and `co_kwonlyargcount`.
41+
- Comprehensive tests cover each argument kind and validate the encoded structure and values.
42+
43+
### Status
44+
Archived
45+
46+
All argument kinds are captured on function entry, including kwargs with
47+
structured encoding. Varargs are preserved as `Tuple` (per CPython), and
48+
`**kwargs` are encoded as a `Sequence` of 2-element `Tuple`s `(key, value)`
49+
with string keys, enabling lossless downstream analysis. The updated test
50+
`test_all_argument_kinds_recorded_on_py_start` verifies the behavior.
51+
52+
Note: While the original Definition of Done referenced a mapping value kind,
53+
the implementation follows the proposed approach in ISSUE-008 to represent
54+
kwargs as a sequence of tuples using existing value kinds.
55+
56+
57+
58+
## ISSUE-005
59+
### Description
60+
Include positional-only parameters in argument capture. The current logic uses
61+
only `co_argcount` for the positional slice, which excludes positional-only
62+
arguments (PEP 570). As a result, names before the `/` in a signature like
63+
`def f(p, /, q, *args, r, **kwargs)` are dropped.
64+
65+
### Definition of Done
66+
- Positional-only parameters are included in the captured argument set.
67+
- The selection of positional names accounts for `co_posonlyargcount` in addition to `co_argcount`.
68+
- Tests add a function with positional-only parameters and assert their presence and correct encoding.
69+
70+
### Status
71+
Archived
72+
73+
Implemented by selecting positional names from `co_varnames` using
74+
`co_argcount` directly (which already includes positional-only per CPython 3.8+).
75+
This prevents double-counting and keeps indexing stable. Tests in
76+
`test_all_argument_kinds_recorded_on_py_start` assert presence of the
77+
positional-only parameter `p` and pass.
78+
79+
80+
81+
## ISSUE-003
82+
### Description
83+
Avoid defensive fallback in argument capture. The current change swallows
84+
failures to access the frame/locals and proceeds with empty `args`. Per
85+
`rules/source-code.md` ("Avoid defensive programming"), we should fail fast when
86+
encountering such edge cases.
87+
88+
### Definition of Done
89+
- Silent fallbacks that return empty arguments on failure are removed.
90+
- The recorder raises a clear, actionable error when it cannot access frame/locals.
91+
- Tests verify the fail-fast path.
92+
93+
### Status
94+
Archived
95+
96+
`RuntimeTracer::on_py_start` now returns `PyResult<()>` and raises a
97+
`RuntimeError` when frame/locals access fails; `callback_py_start` propagates
98+
the error to Python. A pytest (`tests/test_fail_fast_on_py_start.py`) asserts
99+
the fail-fast behavior by monkeypatching `sys._getframe` to raise.
100+
101+
102+
103+
## ISSUE-004
104+
### Description
105+
Stabilize string value encoding for arguments and tighten tests. The new test
106+
accepts either `String` or `Raw` kinds for the `'x'` argument, which can hide
107+
regressions. We should standardize encoding of `str` as `String` (or document
108+
when `Raw` is expected) and update tests to assert the exact kind.
109+
110+
### Definition of Done
111+
- String values are consistently encoded as `String` (or the expected canonical kind), with any exceptions explicitly documented.
112+
- Tests assert the exact kind for `str` arguments and fail if an unexpected kind (e.g., `Raw`) is produced.
113+
- Documentation clarifies encoding rules for string-like types to avoid ambiguity in future changes.
114+
115+
### Status
116+
Archived
117+
118+
Stricter tests now assert `str` values are encoded as `String` with the exact text payload, and runtime docs clarify canonical encoding. No runtime logic change was required since `encode_value` already produced `String` for Python `str`.
119+
120+
121+
122+
## ISSUE-006
123+
### Description
124+
Accidental check-in of Cargo cache/artifact files under `codetracer-python-recorder/.cargo/**` (e.g., `registry/CACHEDIR.TAG`, `.package-cache`). These are build/cache directories and should be excluded from version control.
125+
126+
### Definition of Done
127+
- Add ignore rules to exclude Cargo cache directories (e.g., `.cargo/**`, `target/**`) from version control.
128+
- Remove already-checked-in cache files from the repository.
129+
- Verify the working tree is clean after a clean build; no cache artifacts appear as changes.
130+
131+
### Status
132+
Archived
133+
134+
135+
136+
## ISSUE-007
137+
### Description
138+
Immediately stop tracing when any monitoring callback raises an error.
139+
140+
Current behavior: `RuntimeTracer::on_py_start` intentionally fails fast when it
141+
cannot capture function arguments (e.g., when `sys._getframe` is unavailable or
142+
patched to raise). The callback error is propagated to Python via
143+
`callback_py_start` (it returns the `PyResult` from `on_py_start`). However, the
144+
tracer remains installed and active after the error. As a result, any further
145+
Python function start (even from exception-handling or printing the exception)
146+
triggers `on_py_start` again, re-raising the same error and interfering with the
147+
program’s own error handling.
148+
149+
This is observable in `codetracer-python-recorder/tests/test_fail_fast_on_py_start.py`:
150+
the test simulates `_getframe` failure, which correctly raises in `on_py_start`,
151+
but `print(e)` inside the test’s `except` block invokes codec machinery that
152+
emits additional `PY_START` events. Those callbacks raise again, causing the test
153+
to fail before reaching its assertions.
154+
155+
### Impact
156+
- Breaks user code paths that attempt to catch and handle exceptions while the
157+
tracer is active — routine operations like `print(e)` can cascade failures.
158+
- Hard to debug because the original error is masked by subsequent callback
159+
errors from unrelated modules (e.g., `codecs`).
160+
161+
### Proposed Solution
162+
Fail fast and disable tracing at the first callback error.
163+
164+
Implementation sketch:
165+
- In each callback wrapper (e.g., `callback_py_start`), if the underlying
166+
tracer method returns `Err`, immediately disable further monitoring before
167+
returning the error:
168+
- Set events to `NO_EVENTS` (via `set_events`) to prevent any more callbacks.
169+
- Unregister all previously registered callbacks for our tool id.
170+
- Optionally call `finish()` on the tracer to flush/close writers.
171+
- Option A (hard uninstall): call `uninstall_tracer(py)` to release tool id
172+
and clear the registry. This fully tears down the tracer. Note that the
173+
high-level `ACTIVE` flag in `lib.rs` is not updated by `uninstall_tracer`,
174+
so either:
175+
- expose an internal “deactivate_from_callback()” in `lib.rs` that clears
176+
`ACTIVE`, or
177+
- keep a soft-stop in `tracer.rs` by setting `NO_EVENTS` and unregistering
178+
callbacks without touching `ACTIVE`, allowing `stop_tracing()` to be a
179+
no-op later.
180+
- Ensure reentrancy safety: perform the disable sequence only once (e.g., with
181+
a guard flag) to avoid nested teardown during callback execution.
182+
183+
Behavioral details:
184+
- The original callback error must still be propagated to Python so the user
185+
sees the true failure cause, but subsequent code should not receive further
186+
monitoring callbacks.
187+
- If error occurs before activation gating triggers, the disable sequence should
188+
still run to avoid repeated failures from unrelated modules importing.
189+
190+
### Definition of Done
191+
- On any callback error (at minimum `on_py_start`, and future callbacks that may
192+
return `PyResult`), all further monitoring callbacks from this tool are
193+
disabled immediately within the same GIL context.
194+
- The initial error is propagated unchanged to Python.
195+
- The failing test `test_fail_fast_on_py_start.py` passes: after the first
196+
failure, `print(e)` does not trigger additional tracer errors.
197+
- Writers are flushed/closed or left in a consistent state (documented), and no
198+
additional events are recorded after disablement.
199+
- Unit/integration tests cover: error in `on_py_start`, repeated calls after
200+
disablement are no-ops, and explicit `stop_tracing()` is safe after a
201+
callback-induced shutdown.
202+
203+
### Status
204+
Archived
205+
206+
Implemented soft-stop on first callback error in `callback_py_start`:
207+
on error, the tracer finishes writers, unregisters callbacks for the
208+
configured mask, sets events to `NO_EVENTS`, clears the registry, and
209+
records `global.mask = NO_EVENTS`. The original error is propagated to
210+
Python, and subsequent `PY_START` events are not delivered. This keeps the
211+
module-level `ACTIVE` flag unchanged until `stop_tracing()`, making the
212+
shutdown idempotent. The test `tests/test_fail_fast_on_py_start.py`
213+
exercises the behavior by re-running the program after the initial failure.
214+
215+
216+
217+
## ISSUE-008
218+
### Description
219+
Provide structured encoding for kwargs (`**kwargs`) on function entry. The
220+
current backend encodes kwargs as `Raw` text because the `runtime_tracing`
221+
format lacks a mapping value. Introduce a mapping representation so kwargs can
222+
be recorded losslessly with key/value structure and recursively encoded values.
223+
224+
### Definition of Done
225+
- `runtime_tracing` supports a mapping value kind (e.g., `Map` with string keys).
226+
- `RuntimeTracer::encode_value` encodes Python `dict` to the mapping kind with
227+
recursively encoded values; key type restricted to `str` (non-`str` keys may
228+
be stringified or rejected, behavior documented).
229+
- `on_py_start` records `**kwargs` using the new mapping encoding.
230+
- Tests verify kwargs structure and values; large and nested kwargs are covered.
231+
232+
### Proposed solution
233+
- We can represent our `Map` as a sequenced of tuples. This way we can use the current value record types to encode dictionaries.
234+
- In the Python recorder, downcast to `dict` and iterate items, encoding values
235+
recursively; keep behavior minimal and fail fast on unexpected key types per
236+
repo rules (no defensive fallbacks).
237+
238+
### Dependent issues
239+
- Blocks completion of ISSUE-002
240+
241+
### Status
242+
Archived
243+
244+
Implemented structured kwargs encoding in the Rust tracer by representing
245+
Python `dict` as a `Sequence` of `(key, value)` `Tuple`s, with keys encoded as
246+
`String` when possible. Tests in
247+
`codetracer-python-recorder/test/test_monitoring_events.py` validate that
248+
kwargs are recorded structurally. This fulfills the goal without introducing a
249+
new mapping value kind, per the proposed solution.
250+
251+
252+
253+
254+
## ISSUE-011
255+
### Description
256+
Create a concise set of small Python example scripts to exercise key code paths of the Rust‑backed recorder during development. Place all examples under `/examples` and make them easy to run with the module CLI.
257+
258+
### Definition of Done
259+
- Create `/examples` at repo root.
260+
- Add minimal, deterministic scripts covering common scenarios:
261+
- `basic_args.py`: positional‑only, pos‑or‑kw, kw‑only, `*args`, `**kwargs`.
262+
- `exceptions.py`: raise, catch, and `print(e)` inside `except`.
263+
- `classes_methods.py`: instance, `@classmethod`, `@staticmethod`, property access.
264+
- `recursion.py`: direct and mutual recursion.
265+
- `generators_async.py`: generator, `async`/`await`, async generator.
266+
- `context_and_closures.py`: `with` (context manager) and nested closures.
267+
- `threading.py`: two threads invoking traced functions and joining.
268+
- `imports_side_effects.py`: module‑level code vs `if __name__ == "__main__"`.
269+
- `kwargs_nested.py`: nested kwargs structure to validate structured encoding.
270+
- Each script:
271+
- Has a brief module docstring stating its focus.
272+
- Defines `main()` and uses the `__name__ == "__main__"` guard.
273+
- Produces stable, minimal output without external dependencies.
274+
- Add `/examples/README.md` listing scripts, purpose, and how to run via:
275+
- `python -m codetracer_python_recorder --codetracer-format=json examples/<script>.py`
276+
277+
### Proposed solution
278+
- Keep scripts focused and short to spotlight specific behaviors.
279+
- Prefer deterministic flows; join threads and avoid non‑deterministic timing.
280+
- Use the provided module CLI so recorder activation is consistent across runs.
281+
282+
### Status
283+
Archived
284+
285+
Added the examples directory and scripts:
286+
`basic_args.py`, `exceptions.py`, `classes_methods.py`, `recursion.py`,
287+
`generators_async.py`, `context_and_closures.py`, `threading.py`,
288+
`imports_side_effects.py`, `kwargs_nested.py`, plus `examples/README.md`
289+
with usage instructions for running via
290+
`python -m codetracer_python_recorder --codetracer-format=json examples/<script>.py`.

issues-overview.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
# Issues Overview
2+
3+
| Issue | Status |
4+
|-------|--------|
5+
| ISSUE-001 | Archived |
6+
| ISSUE-002 | Archived |
7+
| ISSUE-003 | Archived |
8+
| ISSUE-004 | Archived |
9+
| ISSUE-005 | Archived |
10+
| ISSUE-006 | Archived |
11+
| ISSUE-007 | Archived |
12+
| ISSUE-008 | Archived |
13+
| ISSUE-009 | Low priority. We won't work on this unless it blocks another issue. |
14+
| ISSUE-010 | Low priority. We won't work on this until a user reports that it causes issues. |
15+
| ISSUE-011 | Archived |

0 commit comments

Comments
 (0)