Commit da36aad

Add a human-readable spec for the temporary JSON format (#15)
1 parent 1be514a commit da36aad

1 file changed

+277
-0
lines changed


docs/trace_json_spec.md

Lines changed: 277 additions & 0 deletions
@@ -0,0 +1,277 @@
1+
# CodeTracer Trace JSON Format
2+
3+
This document describes the JSON files produced by the `runtime_tracing` library. These files store recorded program execution data for the CodeTracer omniscient debugger.
4+
5+
## Files in a Trace
6+
7+
A trace directory typically contains the following entries:
8+
9+
* `trace.json` – array of program events forming the execution trace.
* `trace_metadata.json` – metadata about the recorded program.
* `trace_paths.json` – list of file paths referenced by the trace.
* `files/` – copies of the program source files for offline debugging.
13+
14+
Each file is encoded in UTF-8 and contains standard JSON produced by [Serde](https://serde.rs/). The structures below correspond to Rust types from `src/types.rs`.
15+
16+
## Trace Metadata
17+
18+
The file `trace_metadata.json` is a single JSON object with the following fields:
19+
20+
```json
{
  "workdir": "path to the working directory",
  "program": "name of the traced program",
  "args": ["list", "of", "command", "line", "arguments"]
}
```
27+
28+
`workdir` and `program` are strings. `args` is an array of strings representing the arguments supplied to the program when tracing started.
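
As a minimal sketch of a consumer, the following Python parses the metadata object and checks the three fields described above. The `TraceMetadata` class and the `parse_metadata` function are illustrative names chosen for this example; they are not part of `runtime_tracing`.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class TraceMetadata:
    workdir: str
    program: str
    args: List[str]


def parse_metadata(text: str) -> TraceMetadata:
    """Parse the contents of trace_metadata.json and check the field types."""
    raw = json.loads(text)
    if not isinstance(raw["workdir"], str) or not isinstance(raw["program"], str):
        raise ValueError("workdir and program must be strings")
    if not all(isinstance(arg, str) for arg in raw["args"]):
        raise ValueError("args must be an array of strings")
    return TraceMetadata(raw["workdir"], raw["program"], list(raw["args"]))


# Hypothetical file contents, used only for illustration.
example = '{"workdir": "/tmp/demo", "program": "demo", "args": ["--fast", "input.txt"]}'
metadata = parse_metadata(example)
```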
29+
30+
## Path List
31+
32+
`trace_paths.json` contains an array of strings. Each element is a path that was referenced in the trace. Paths are stored in the order they were discovered so that other events can refer to them by numeric identifier.
33+
34+
Example:
35+
36+
```json
["/path/to/main.rs", "/path/to/lib.rs"]
```
39+
40+
The index of a path within this array is used as the `PathId` elsewhere in the event stream.
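
The lookup can be sketched in Python as follows. The helper name `resolve_path` and the sample `step` payload are assumptions made for this example.

```python
import json


def resolve_path(paths, path_id):
    """Return the path for a PathId, i.e. its index in trace_paths.json."""
    if not 0 <= path_id < len(paths):
        raise KeyError(f"unknown PathId {path_id}")
    return paths[path_id]


paths = json.loads('["/path/to/main.rs", "/path/to/lib.rs"]')
step = {"path_id": 1, "line": 12}  # payload of a hypothetical Step event
location = f"{resolve_path(paths, step['path_id'])}:{step['line']}"
```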
41+
42+
## Event Stream (`trace.json`)
43+
44+
`trace.json` is an array. Each element represents one `TraceLowLevelEvent` value serialized as a JSON object. The outer object contains a single key naming the event variant. The value associated with the key holds the fields specific to that variant.
45+
46+
Example (simplified):
47+
48+
```json
[
  {"Path": "/path/to/main.rs"},
  {"Function": {"path_id": 0, "line": 1, "name": "main"}},
  {"Step": {"path_id": 0, "line": 1}},
  {"Call": {"function_id": 0, "args": []}},
  {"Return": {"return_value": {"kind": "None", "type_id": 0}}}
]
```
57+
58+
The recognized event variants and their payloads are listed below. Integer wrapper types such as `PathId`, `StepId`, `VariableId`, `FunctionId`, `Line`, and `Place` are encoded simply as numbers.
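
Because every element is an object with exactly one key, a reader can split each event into a variant name and a payload before dispatching on the name. The sketch below assumes that shape; `split_event` is a hypothetical helper, not a library function.

```python
import json


def split_event(event):
    """Return (variant_name, payload) for one element of trace.json."""
    if not isinstance(event, dict) or len(event) != 1:
        raise ValueError(f"expected an object with a single key, got {event!r}")
    (variant, payload), = event.items()
    return variant, payload


trace_text = """[
  {"Path": "/path/to/main.rs"},
  {"Function": {"path_id": 0, "line": 1, "name": "main"}},
  {"Step": {"path_id": 0, "line": 1}}
]"""
variants = [split_event(event)[0] for event in json.loads(trace_text)]
```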
59+
60+
### `Path`
61+
```json
{"Path": "absolute/or/relative/path"}
```
64+
Registers a new file path. The numeric identifier of the path is its position within `trace_paths.json`.
65+
66+
### `VariableName`
67+
```json
{"VariableName": "name"}
```
70+
Introduces a variable name and assigns it a `VariableId` based on the order of appearance.
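
A reader can rebuild the name table by collecting `VariableName` events in order. This sketch assumes that identifiers start at 0, which the text above does not state explicitly; `collect_variable_names` is an illustrative name.

```python
def collect_variable_names(events):
    """Return a list where the index is the VariableId (assumed to start at 0)."""
    names = []
    for event in events:
        if "VariableName" in event:
            names.append(event["VariableName"])
    return names


events = [
    {"VariableName": "x"},
    {"Step": {"path_id": 0, "line": 2}},
    {"VariableName": "total"},
]
variable_names = collect_variable_names(events)
```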
71+
72+
### `Type`
73+
```json
{"Type": {
  "kind": <numeric TypeKind>,
  "lang_type": "language specific name",
  "specific_info": {
    "kind": "None" | "Struct" | "Pointer",
    ...
  }
}}
```
83+
Describes a new type. `TypeKind` values are encoded as numbers. When `specific_info.kind` is `Struct`, the object also contains `fields` which is an array of `{ "name": String, "type_id": TypeId }`. When `Pointer`, it contains `dereference_type_id`.
84+
85+
### `Value`
86+
```json
{"Value": {"variable_id": <id>, "value": <ValueRecord>}}
```
89+
Stores the full value of a variable. `ValueRecord` objects are tagged by a `kind` field, in the form `{"kind": "<variant name>", ...}`, as described in the **Value Records** section below.
90+
91+
### `Function`
92+
```json
{"Function": {"path_id": <id>, "line": <line>, "name": "function name"}}
```
95+
Registers a function so that subsequent `Call` events can reference it.
96+
97+
### `Step`
98+
```json
{"Step": {"path_id": <id>, "line": <line>}}
```
101+
Marks execution of a particular line in a file.
102+
103+
### `Call`
104+
```json
{"Call": {"function_id": <id>, "args": [<FullValueRecord>, ...]}}
```
107+
Signals the start of a function call. Each argument is represented as a `FullValueRecord` (the same structure used by `Value`).
108+
109+
### `Return`
110+
```json
{"Return": {"return_value": <ValueRecord>}}
```
113+
Signals function return and provides the return value.
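
`Call` and `Return` events can be paired with a stack to recover call nesting. The sketch below rests on two assumptions that this document does not spell out: a `FunctionId` is the zero-based position of the corresponding `Function` event, and each `Return` closes the most recent unfinished `Call`. The function name `call_tree` is hypothetical.

```python
def call_tree(events):
    """Pair each Return with the most recent unfinished Call and record its depth."""
    functions = []  # index is assumed to be the FunctionId
    stack = []      # names of calls that have not returned yet
    finished = []   # (function name, depth, return value record)
    for event in events:
        if "Function" in event:
            functions.append(event["Function"]["name"])
        elif "Call" in event:
            stack.append(functions[event["Call"]["function_id"]])
        elif "Return" in event:
            name = stack.pop()
            finished.append((name, len(stack), event["Return"]["return_value"]))
    return finished


events = [
    {"Function": {"path_id": 0, "line": 1, "name": "main"}},
    {"Function": {"path_id": 0, "line": 5, "name": "helper"}},
    {"Call": {"function_id": 0, "args": []}},
    {"Call": {"function_id": 1, "args": []}},
    {"Return": {"return_value": {"kind": "Int", "i": 7, "type_id": 0}}},
    {"Return": {"return_value": {"kind": "None", "type_id": 0}}},
]
finished = call_tree(events)
```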
114+
115+
### `Event`
116+
```json
{"Event": {"kind": <numeric EventLogKind>, "metadata": "", "content": "text"}}
```
119+
A general‑purpose log entry. `EventLogKind` is encoded as a number. `metadata` is currently a free‑form string and may be empty.
120+
121+
### `Asm`
122+
```json
{"Asm": ["instruction", ...]}
```
125+
Embeds raw assembly or bytecode instructions relevant to the step.
126+
127+
### `BindVariable`
128+
```json
{"BindVariable": {"variable_id": <id>, "place": <place>}}
```
131+
Associates a variable with an opaque `Place` identifier. Places can be used to track mutations of complex values.
132+
133+
### `Assignment`
134+
```json
{"Assignment": {"to": <variable_id>, "pass_by": "Value" | "Reference", "from": <RValue>}}
```
137+
Records a by‑value or by‑reference assignment. `RValue` is described below.
138+
139+
### `DropVariables`
140+
```json
{"DropVariables": [<variable_id>, ...]}
```
143+
Signals that a set of variables went out of scope.
144+
145+
### `CompoundValue`
146+
```json
{"CompoundValue": {"place": <place>, "value": <ValueRecord>}}
```
149+
Defines a value located at a `Place` that consists of multiple parts (for example, the elements of a collection).
150+
151+
### `CellValue`
152+
```json
{"CellValue": {"place": <place>, "value": <ValueRecord>}}
```
155+
Stores the current value of a mutable cell located at a `Place`.
156+
157+
### `AssignCompoundItem`
158+
```json
{"AssignCompoundItem": {"place": <place>, "index": <number>, "item_place": <place>}}
```
161+
Connects an index within a compound value to a new `Place` containing the item.
162+
163+
### `AssignCell`
164+
```json
{"AssignCell": {"place": <place>, "new_value": <ValueRecord>}}
```
167+
Updates the value stored at a `Place`.
168+
169+
### `VariableCell`
170+
```json
{"VariableCell": {"variable_id": <id>, "place": <place>}}
```
173+
Binds a variable directly to a `Place`.
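
One possible way to replay the cell-related events is to keep a table of places and a table of variable bindings. This is a sketch of one reading of `CellValue`, `AssignCell`, and `VariableCell`, not a description of how CodeTracer itself processes them; `replay_cells` is a hypothetical name.

```python
def replay_cells(events):
    """Track the latest value at each Place and which Place each variable is bound to."""
    cells = {}      # place -> ValueRecord
    variables = {}  # variable_id -> place
    for event in events:
        if "CellValue" in event:
            cells[event["CellValue"]["place"]] = event["CellValue"]["value"]
        elif "AssignCell" in event:
            cells[event["AssignCell"]["place"]] = event["AssignCell"]["new_value"]
        elif "VariableCell" in event:
            variables[event["VariableCell"]["variable_id"]] = event["VariableCell"]["place"]
    return cells, variables


events = [
    {"CellValue": {"place": 5, "value": {"kind": "Int", "i": 1, "type_id": 0}}},
    {"VariableCell": {"variable_id": 0, "place": 5}},
    {"AssignCell": {"place": 5, "new_value": {"kind": "Int", "i": 2, "type_id": 0}}},
]
cells, variables = replay_cells(events)
current = cells[variables[0]]  # value currently seen through variable 0
```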
174+
175+
### `DropVariable`
176+
```json
{"DropVariable": <variable_id>}
```
179+
Removes the association of a variable with any value.
180+
181+
### `DropLastStep`
182+
```json
{"DropLastStep": null}
```
185+
A special marker used when a previously emitted `Step` should be ignored. It keeps the trace append‑only.
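
A reader that only needs the list of effective steps could handle the marker as sketched below. This simplified example assumes the marker cancels the most recently emitted `Step` and ignores any other events recorded in between; `effective_steps` is an illustrative name.

```python
def effective_steps(events):
    """Return (path_id, line) pairs, discarding steps cancelled by DropLastStep."""
    steps = []
    for event in events:
        if "Step" in event:
            steps.append((event["Step"]["path_id"], event["Step"]["line"]))
        elif "DropLastStep" in event and steps:
            steps.pop()  # ignore the previously emitted Step
    return steps


events = [
    {"Step": {"path_id": 0, "line": 1}},
    {"Step": {"path_id": 0, "line": 2}},
    {"DropLastStep": None},
    {"Step": {"path_id": 0, "line": 3}},
]
steps = effective_steps(events)
```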
186+
187+
## Value Records
188+
189+
Many events embed `ValueRecord` objects. They all use an internally tagged representation with a `kind` field. The possible variants are:
190+
191+
* `Int` – `{ "kind": "Int", "i": number, "type_id": TypeId }`
* `Float` – `{ "kind": "Float", "f": number, "type_id": TypeId }`
* `Bool` – `{ "kind": "Bool", "b": true|false, "type_id": TypeId }`
* `String` – `{ "kind": "String", "text": "...", "type_id": TypeId }`
* `Sequence` – `{ "kind": "Sequence", "elements": [<ValueRecord>], "is_slice": bool, "type_id": TypeId }`
* `Tuple` – `{ "kind": "Tuple", "elements": [<ValueRecord>], "type_id": TypeId }`
* `Struct` – `{ "kind": "Struct", "field_values": [<ValueRecord>], "type_id": TypeId }`
* `Variant` – `{ "kind": "Variant", "discriminator": "name", "contents": <ValueRecord>, "type_id": TypeId }`
* `Reference` – `{ "kind": "Reference", "dereferenced": <ValueRecord>, "address": number, "mutable": bool, "type_id": TypeId }`
* `Raw` – `{ "kind": "Raw", "r": "text", "type_id": TypeId }`
* `Error` – `{ "kind": "Error", "msg": "description", "type_id": TypeId }`
* `None` – `{ "kind": "None", "type_id": TypeId }`
* `Cell` – `{ "kind": "Cell", "place": <place> }`
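
Since records nest, a reader typically walks them recursively and dispatches on `kind`. The sketch below renders a record as text; the output notation and the `render` function are choices made for this example, and unknown kinds are reported rather than rejected.

```python
import json


def render(value):
    """Render a ValueRecord as a short human-readable string."""
    kind = value["kind"]
    if kind == "Int":
        return str(value["i"])
    if kind == "Float":
        return str(value["f"])
    if kind == "Bool":
        return "true" if value["b"] else "false"
    if kind == "String":
        return json.dumps(value["text"])
    if kind in ("Sequence", "Tuple"):
        inner = ", ".join(render(element) for element in value["elements"])
        return f"[{inner}]" if kind == "Sequence" else f"({inner})"
    if kind == "Struct":
        return "{" + ", ".join(render(field) for field in value["field_values"]) + "}"
    if kind == "Variant":
        return f"{value['discriminator']}({render(value['contents'])})"
    if kind == "Reference":
        return "&" + render(value["dereferenced"])
    if kind == "Raw":
        return value["r"]
    if kind == "Error":
        return f"<error: {value['msg']}>"
    if kind == "None":
        return "None"
    if kind == "Cell":
        return f"<cell {value['place']}>"
    return f"<unknown kind {kind}>"


sequence = {
    "kind": "Sequence",
    "elements": [
        {"kind": "Int", "i": 1, "type_id": 0},
        {"kind": "Int", "i": 2, "type_id": 0},
    ],
    "is_slice": False,
    "type_id": 1,
}
rendered = render(sequence)
```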
204+
205+
## RValue
206+
207+
`RValue` is used inside `Assignment` events to describe the right‑hand side of an assignment.
208+
209+
* `{"kind": "Simple", "0": <variable_id>}` – reference to a single variable.
* `{"kind": "Compound", "0": [<variable_id>, ...]}` – a composite value built from several variables.
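
Taking the two shapes above at face value, a reader can normalise either form into a list of source variables. `source_variables` is a hypothetical helper and the sample `Assignment` event is made up for illustration.

```python
def source_variables(rvalue):
    """Return the list of VariableIds referenced by an RValue."""
    if rvalue["kind"] == "Simple":
        return [rvalue["0"]]
    if rvalue["kind"] == "Compound":
        return list(rvalue["0"])
    raise ValueError(f"unknown RValue kind {rvalue['kind']!r}")


assignment = {
    "Assignment": {
        "to": 3,
        "pass_by": "Value",
        "from": {"kind": "Compound", "0": [1, 2]},
    }
}
sources = source_variables(assignment["Assignment"]["from"])
```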
211+
212+
## Numeric Enumerations
213+
214+
`TypeKind` and `EventLogKind` are serialized as numbers. Their numeric values correspond to the order of variants defined in `src/types.rs`.
215+
216+
Example: the default `TypeKind::Seq` has value `0`, `TypeKind::Set` has value `1`, and so on. Consumers should be prepared to handle unknown values gracefully as the enumeration may evolve.
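
One way to handle unknown values gracefully is to decode through an enumeration and fall back to a placeholder. The sketch below mirrors the `EventLogKind` values listed in this document; the `decode_event_kind` helper and the `Unknown(...)` placeholder format are assumptions of this example.

```python
from enum import IntEnum


class EventLogKind(IntEnum):
    Write = 0
    WriteFile = 1
    WriteOther = 2
    Read = 3
    ReadFile = 4
    ReadOther = 5
    ReadDir = 6
    OpenDir = 7
    CloseDir = 8
    Socket = 9
    Open = 10
    Error = 11
    TraceLogEvent = 12


def decode_event_kind(value):
    """Return the variant name, or a placeholder for values this reader does not know."""
    try:
        return EventLogKind(value).name
    except ValueError:
        return f"Unknown({value})"


known = decode_event_kind(3)
unknown = decode_event_kind(99)
```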
217+
218+
### `TypeKind` values
219+
220+
| Value | Variant |
| -----:| ------- |
| 0 | Seq |
| 1 | Set |
| 2 | HashSet |
| 3 | OrderedSet |
| 4 | Array |
| 5 | Varargs |
| 6 | Struct |
| 7 | Int |
| 8 | Float |
| 9 | String |
| 10 | CString |
| 11 | Char |
| 12 | Bool |
| 13 | Literal |
| 14 | Ref |
| 15 | Recursion |
| 16 | Raw |
| 17 | Enum |
| 18 | Enum16 |
| 19 | Enum32 |
| 20 | C |
| 21 | TableKind |
| 22 | Union |
| 23 | Pointer |
| 24 | Error |
| 25 | FunctionKind |
| 26 | TypeValue |
| 27 | Tuple |
| 28 | Variant |
| 29 | Html |
| 30 | None |
| 31 | NonExpanded |
| 32 | Any |
| 33 | Slice |
256+
257+
### `EventLogKind` values
258+
259+
| Value | Variant |
| -----:| ------- |
| 0 | Write |
| 1 | WriteFile |
| 2 | WriteOther |
| 3 | Read |
| 4 | ReadFile |
| 5 | ReadOther |
| 6 | ReadDir |
| 7 | OpenDir |
| 8 | CloseDir |
| 9 | Socket |
| 10 | Open |
| 11 | Error |
| 12 | TraceLogEvent |
274+
275+
## Summary
276+
277+
The JSON format is intentionally simple. Events are appended to `trace.json` in the order they occur. Auxiliary files (`trace_metadata.json`, `trace_paths.json`, and the `files/` directory) provide context so that the trace is self-contained.
