BENCHMARK.md: 26 additions & 14 deletions
@@ -10,7 +10,7 @@ This document explains what `benchmarks/bench.py` measures and **why** the suite

 ## How to run

-Because `agentjson` ships a top-level `orjson` shim, you must use **two separate environments** if you want to compare with the real `orjson` package:
+Because `agentjson` provides a top-level `orjson` drop-in module, you must use **two separate environments** if you want to compare with the real `orjson` package:

 ```bash
 # Env A: real orjson
@@ -47,6 +47,19 @@ python benchmarks/bench.py
 - **PR‑101 (parallel delimiter indexer)**: use `large_root_array_suite` and increase `BENCH_LARGE_MB` (e.g. `200,1000`) to find the crossover where parallel indexing starts paying off.
 - **PR‑102 (nested huge value / corpus)**: use `nested_corpus_suite` to benchmark `scale_target_keys=["corpus"]` with `allow_parallel` on/off.

+### Example: CLI mmap suite (PR‑006)
+
+On macOS this suite records **wall time** (the `/usr/bin/time -v` max-RSS path is Linux-friendly).
+
+Example run (Env B, `BENCH_CLI_MMAP_MB=256`, 2025-12-14):
+
+| Mode | Elapsed |
+|---|---:|
+| `mmap(default)` | 1.27 s |
+| `read(--no-mmap)` | 1.38 s |
+
+Interpretation: mmap’s main win is avoiding **upfront heap allocation / extra copies** on huge files; it may or may not be faster depending on OS page cache and IO patterns.
+
 ## Suite 1 — LLM messy JSON suite (primary)

 ### What it tests
@@ -93,24 +106,24 @@ With `agentjson` as an `orjson` drop-in (same call site):
@@ -145,15 +158,15 @@ In the benchmark run, this case shows up exactly as:
 - **Top‑1 hit** misses (not the expected value),
 - but **Top‑K hit (K=5)** succeeds (the expected value is present in the candidate list).

-### Example results (2025-12-13, Python 3.12.0, macOS 14.1 arm64)
+### Example results (2025-12-14, Python 3.12.0, macOS 14.1 arm64)

 | Metric | Value |
 |---|---:|
 | Top‑1 hit rate | 7/8 |
 | Top‑K hit rate (K=5) | 8/8 |
 | Avg candidates returned | 1.25 |
 | Avg best confidence | 0.57 |
-| Best time / case | 92.7 µs |
+| Best time / case | 38.2 µs |

 ## Suite 3 — Large root-array parsing (big data angle)

@@ -169,15 +182,15 @@ and measures how long `loads(...)` takes for sizes like 5MB and 20MB.

 For comparing `json/ujson/orjson`, use **Env A (real orjson)**. In Env B, `import orjson` is the shim.

-### Example results (Env A: real `orjson`, 2025-12-13)
+### Example results (Env A: real `orjson`, 2025-12-14)

 | Library | 5 MB | 20 MB |
 |---|---:|---:|
-| `json.loads(str)` | 52.3 ms | 209.6 ms |
-| `ujson.loads(str)` | 42.2 ms | 176.1 ms |
-| `orjson.loads(bytes)` (real) | 24.6 ms | 115.9 ms |
+| `json.loads(str)` | 53.8 ms | 217.2 ms |
+| `ujson.loads(str)` | 45.9 ms | 173.7 ms |
+| `orjson.loads(bytes)` (real) | 27.0 ms | 116.2 ms |

-`benchmarks/bench.py` also measures `agentjson.scale(serial|parallel)` (Env B). On 5–20MB inputs the parallel path is slower due to overhead; it’s intended for much larger payloads (GB‑scale root arrays).
+`benchmarks/bench.py` also measures `agentjson.scale(serial|parallel)` (Env B). On 5–20MB inputs the crossover depends on your machine; it’s intended for much larger payloads (GB‑scale root arrays).

 ## Suite 3b — Nested `corpus` suite (targeted huge value)

@@ -202,4 +215,3 @@ Important nuance:

 - This suite uses **DOM** mode (`scale_output="dom"`) so `split_mode` shows whether nested targeting triggered (see `rust/src/scale.rs::try_nested_target_split`).
 - Wiring nested targeting into **tape** mode (`scale_output="tape"`) is the next-step work for true “huge nested value without DOM” workloads.
-Most LLM/agent stacks already call `orjson.loads()` everywhere. `agentjson` bundles an `orjson`-compatible shim so you can keep those call sites unchanged and still recover from “near‑JSON” outputs:
+Most LLM/agent stacks already call `orjson.loads()` everywhere. `agentjson` provides an `orjson`-compatible drop-in module so you can keep those call sites unchanged and still recover from “near‑JSON” outputs:

 ```python
 import orjson
@@ -232,6 +234,12 @@ data = orjson.loads(b'{"a": 1}')
 blob = orjson.dumps({"a": 1})
 ```

+If you prefer to be explicit (or want to avoid `orjson` name conflicts), you can also do:
+
+```python
+import agentjson as orjson
+```
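
As a hedged illustration of the explicit-import pattern added in this hunk, the sketch below picks a JSON backend by trying module names in order. `pick_json_backend` is a hypothetical helper, not part of `agentjson`. It assumes only that each candidate module exposes a top-level `loads` that accepts `bytes`, which holds for the standard-library `json` and for `orjson`, and which the `import agentjson as orjson` line above implies for `agentjson`.

```python
import importlib
from types import ModuleType
from typing import Sequence


def pick_json_backend(
    names: Sequence[str] = ("agentjson", "orjson", "json"),
) -> ModuleType:
    """Return the first importable module from `names` (hypothetical helper)."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed in this environment; try the next name
    raise ImportError(f"none of {tuple(names)} could be imported")


# The call site stays the same whichever backend was found.
backend = pick_json_backend()
data = backend.loads(b'{"a": 1}')
```

The fallback is only transparent for `loads`: `orjson.dumps` returns `bytes`, whereas `json.dumps` returns `str`, so code that serializes would need to handle both.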
+
 By default the shim is strict (like real `orjson`). To enable repair/scale fallback without changing call sites:

 ```bash
@@ -255,9 +263,9 @@ This suite reflects the context: LLM outputs like “json입니다~ …”, mark
 Key point: **drop-in call sites** (`import orjson; orjson.loads(...)`) can go from *0% success* → *100% success* just by setting `JSONPROB_ORJSON_MODE=auto`.

@@ -271,19 +279,19 @@ This suite checks whether the “intended” JSON object is recovered as the **b
 | Top‑K hit rate (K=5) | 8/8 |
 | Avg candidates returned | 1.25 |
 | Avg best confidence | 0.57 |
-| Best time / case | 92.7 µs |
+| Best time / case | 38.2 µs |

 ### 3) Large root-array parsing (big data angle)

 Valid JSON only (parsing a single large root array).

 | Library | 5 MB | 20 MB |
 |---|---:|---:|
-| `json.loads(str)` | 52.3 ms | 209.6 ms |
-| `ujson.loads(str)` | 42.2 ms | 176.1 ms |
-| `orjson.loads(bytes)` (real) | 24.6 ms | 115.9 ms |
+| `json.loads(str)` | 53.8 ms | 217.2 ms |
+| `ujson.loads(str)` | 45.9 ms | 173.7 ms |
+| `orjson.loads(bytes)` (real) | 27.0 ms | 116.2 ms |

-`agentjson` also benchmarks `agentjson.scale(serial|parallel)` in the same script. On 5–20MB inputs the parallel path is slower due to overhead; it’s intended for much larger payloads (GB‑scale root arrays).
+`agentjson` also benchmarks `agentjson.scale(serial|parallel)` in the same script. On 5–20MB inputs the crossover depends on your machine: on this run the parallel path is slower at 5MB and slightly faster at 20MB; it’s intended for much larger payloads (GB‑scale root arrays).