
Commit 9483c23

Merge pull request #1 from sigridjineth/fix/abi3-wheel

release: abi3 wheels for py3.9+

2 parents: a43b4b2 + f37ea32

File tree: 9 files changed, +112 −106 lines

.cargo/config.toml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+[target.aarch64-apple-darwin]
+rustflags = ["-C", "link-arg=-undefined", "-C", "link-arg=dynamic_lookup"]
+
+[target.x86_64-apple-darwin]
+rustflags = ["-C", "link-arg=-undefined", "-C", "link-arg=dynamic_lookup"]
+

BENCHMARK.md

Lines changed: 26 additions & 14 deletions
@@ -10,7 +10,7 @@ This document explains what `benchmarks/bench.py` measures and **why** the suite
 
 ## How to run
 
-Because `agentjson` ships a top-level `orjson` shim, you must use **two separate environments** if you want to compare with the real `orjson` package:
+Because `agentjson` provides a top-level `orjson` drop-in module, you must use **two separate environments** if you want to compare with the real `orjson` package:
 
 ```bash
 # Env A: real orjson
@@ -47,6 +47,19 @@ python benchmarks/bench.py
 - **PR‑101 (parallel delimiter indexer)**: use `large_root_array_suite` and increase `BENCH_LARGE_MB` (e.g. `200,1000`) to find the crossover where parallel indexing starts paying off.
 - **PR‑102 (nested huge value / corpus)**: use `nested_corpus_suite` to benchmark `scale_target_keys=["corpus"]` with `allow_parallel` on/off.
 
+### Example: CLI mmap suite (PR‑006)
+
+On macOS this suite records **wall time** (the `/usr/bin/time -v` max-RSS path is Linux-only).
+
+Example run (Env B, `BENCH_CLI_MMAP_MB=256`, 2025-12-14):
+
+| Mode | Elapsed |
+|---|---:|
+| `mmap(default)` | 1.27 s |
+| `read(--no-mmap)` | 1.38 s |
+
+Interpretation: mmap’s main win is avoiding **upfront heap allocation / extra copies** on huge files; it may or may not be faster depending on OS page cache and IO patterns.
+
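The mmap-vs-read tradeoff above can be sketched with the stdlib alone (an illustration only: the file, sizes, and names below are arbitrary, and this is not the agentjson CLI's actual code path):

```python
import json
import mmap
import os
import tempfile

# Build a throwaway JSON file (contents arbitrary for illustration).
payload = json.dumps([{"i": i} for i in range(10_000)]).encode()
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

# read() path: allocates a fresh bytes object the size of the whole file.
with open(path, "rb") as f:
    via_read = json.loads(f.read())

# mmap path: the buffer is backed by the OS page cache, so there is no
# upfront full-file heap allocation (the slice below still copies; a real
# parser would consume the mapped buffer in place).
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    via_mmap = json.loads(mm[:])

os.remove(path)
assert via_read == via_mmap
```

Both paths decode the same document; the difference the suite measures is where the bytes live before parsing, which is why the gap depends on page cache state rather than parser speed.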
 ## Suite 1 — LLM messy JSON suite (primary)
 
 ### What it tests
@@ -93,24 +106,24 @@ With `agentjson` as an `orjson` drop-in (same call site):
 
 ```python
 import os
-import orjson # agentjson shim
+import orjson # provided by agentjson (drop-in)
 
 os.environ["JSONPROB_ORJSON_MODE"] = "auto"
 orjson.loads('preface```json\n{"a":1}\n```suffix') # -> {"a": 1}
 ```
 
 This is the core PR pitch: **don’t change code**, just switch the package and flip a mode when needed.
 
-### Example results (2025-12-13, Python 3.12.0, macOS 14.1 arm64)
+### Example results (2025-12-14, Python 3.12.0, macOS 14.1 arm64)
 
 | Library / mode | Success | Correct | Best time / case |
 |---|---:|---:|---:|
 | `json` (strict) | 0/10 | 0/10 | n/a |
 | `ujson` (strict) | 0/10 | 0/10 | n/a |
 | `orjson` (strict, real) | 0/10 | 0/10 | n/a |
-| `orjson` (auto, agentjson shim) | 10/10 | 10/10 | 45.9 µs |
-| `agentjson.parse(mode=auto)` | 10/10 | 10/10 | 39.8 µs |
-| `agentjson.parse(mode=probabilistic)` | 10/10 | 10/10 | 39.7 µs |
+| `agentjson` (drop-in `orjson.loads`, mode=auto) | 10/10 | 10/10 | 23.5 µs |
+| `agentjson.parse(mode=auto)` | 10/10 | 10/10 | 19.5 µs |
+| `agentjson.parse(mode=probabilistic)` | 10/10 | 10/10 | 19.5 µs |
 
 ## Suite 2 — Top‑K repair suite (secondary)
 
@@ -145,15 +158,15 @@ In the benchmark run, this case shows up exactly as:
 - **Top‑1 hit** misses (not the expected value),
 - but **Top‑K hit (K=5)** succeeds (the expected value is present in the candidate list).
 
-### Example results (2025-12-13, Python 3.12.0, macOS 14.1 arm64)
+### Example results (2025-12-14, Python 3.12.0, macOS 14.1 arm64)
 
 | Metric | Value |
 |---|---:|
 | Top‑1 hit rate | 7/8 |
 | Top‑K hit rate (K=5) | 8/8 |
 | Avg candidates returned | 1.25 |
 | Avg best confidence | 0.57 |
-| Best time / case | 92.7 µs |
+| Best time / case | 38.2 µs |
 
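The Top‑1 vs Top‑K distinction in the table can be made concrete with a small stdlib sketch (the cases here are hypothetical; this mirrors the metric definition, not `benchmarks/bench.py` itself):

```python
# Hypothetical cases: (expected object, ranked candidates from the repairer).
cases = [
    ({"a": 1}, [{"a": 1}]),              # top-1 hit
    ({"b": 2}, [{"b": "2"}, {"b": 2}]),  # top-1 miss, but top-K hit
]

def hit_rates(cases, k=5):
    # Top-1: the first-ranked candidate equals the expected value.
    top1 = sum(cands[:1] == [exp] for exp, cands in cases)
    # Top-K: the expected value appears anywhere in the first k candidates.
    topk = sum(exp in cands[:k] for exp, cands in cases)
    return top1, topk

assert hit_rates(cases) == (1, 2)
```

The second case is exactly the shape described above: the best-ranked repair is wrong, but the intended object is still in the candidate list, so Top‑K credit is given.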
 ## Suite 3 — Large root-array parsing (big data angle)
 
@@ -169,15 +182,15 @@ and measures how long `loads(...)` takes for sizes like 5MB and 20MB.
 
 For comparing `json/ujson/orjson`, use **Env A (real orjson)**. In Env B, `import orjson` is the shim.
 
-### Example results (Env A: real `orjson`, 2025-12-13)
+### Example results (Env A: real `orjson`, 2025-12-14)
 
 | Library | 5 MB | 20 MB |
 |---|---:|---:|
-| `json.loads(str)` | 52.3 ms | 209.6 ms |
-| `ujson.loads(str)` | 42.2 ms | 176.1 ms |
-| `orjson.loads(bytes)` (real) | 24.6 ms | 115.9 ms |
+| `json.loads(str)` | 53.8 ms | 217.2 ms |
+| `ujson.loads(str)` | 45.9 ms | 173.7 ms |
+| `orjson.loads(bytes)` (real) | 27.0 ms | 116.2 ms |
 
-`benchmarks/bench.py` also measures `agentjson.scale(serial|parallel)` (Env B). On 5–20MB inputs the parallel path is slower due to overhead; it’s intended for much larger payloads (GB‑scale root arrays).
+`benchmarks/bench.py` also measures `agentjson.scale(serial|parallel)` (Env B). On 5–20MB inputs the crossover depends on your machine; it’s intended for much larger payloads (GB‑scale root arrays).
 
 ## Suite 3b — Nested `corpus` suite (targeted huge value)
 
@@ -202,4 +215,3 @@ Important nuance:
 
 - This suite uses **DOM** mode (`scale_output="dom"`) so `split_mode` shows whether nested targeting triggered (see `rust/src/scale.rs::try_nested_target_split`).
 - Wiring nested targeting into **tape** mode (`scale_output="tape"`) is the next-step work for true “huge nested value without DOM” workloads.
-

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 [package]
 name = "agentjson"
-version = "0.1.1"
+version = "0.1.2"
 edition = "2021"
 license = "MIT OR Apache-2.0"
 description = "Probabilistic JSON repair library powered by Rust"
@@ -14,5 +14,5 @@ crate-type = ["cdylib"]
 path = "rust-pyo3/src/lib.rs"
 
 [dependencies]
-pyo3 = { version = "0.23", features = ["extension-module"] }
+pyo3 = { version = "0.23", features = ["extension-module", "abi3-py39"] }
 json_prob_parser = { package = "agentjson-core", path = "rust" }
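What `abi3-py39` buys: the built wheel carries the stable-ABI tag (`cp39-abi3`), which installers accept on any CPython from 3.9 up, so one artifact covers every supported interpreter. A stdlib sketch of reading that tag (the wheel filename below is hypothetical):

```python
# Wheel filenames encode name-version-pythontag-abitag-platformtag.
# This filename is hypothetical; actual release artifacts may differ.
wheel = "agentjson-0.1.2-cp39-abi3-manylinux_2_17_x86_64.whl"
name, version, python_tag, abi_tag, platform_tag = wheel[:-4].split("-", 4)

# cp39 = minimum CPython 3.9; abi3 = stable ABI, valid on 3.9, 3.10, 3.11, ...
assert python_tag == "cp39"
assert abi_tag == "abi3"
```

Without `abi3`, a separate `cp311`/`cp312`/… wheel would have to be built per interpreter version.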

README.md

Lines changed: 18 additions & 10 deletions
@@ -56,6 +56,8 @@ uv add agentjson
 # or: python -m pip install agentjson
 ```
 
+Note: `agentjson` ships **abi3** wheels (Python **3.9+**), so the same wheel works across CPython versions (e.g. 3.11, 3.12).
+
 ### Build from source (development)
 
 #### 1) Install Rust toolchain
@@ -221,9 +223,9 @@ cargo build --release
 - Disable mmap: `--no-mmap`
 - Reproducible beam ordering: `--deterministic-seed 42`
 
-## orjson Drop-in Shim
+## orjson Drop-in
 
-Most LLM/agent stacks already call `orjson.loads()` everywhere. `agentjson` bundles an `orjson`-compatible shim so you can keep those call sites unchanged and still recover from “near‑JSON” outputs:
+Most LLM/agent stacks already call `orjson.loads()` everywhere. `agentjson` provides an `orjson`-compatible drop-in module so you can keep those call sites unchanged and still recover from “near‑JSON” outputs:
 
 ```python
 import orjson
@@ -232,6 +234,12 @@ data = orjson.loads(b'{"a": 1}')
 blob = orjson.dumps({"a": 1})
 ```
 
+If you prefer to be explicit (or want to avoid `orjson` name conflicts), you can also do:
+
+```python
+import agentjson as orjson
+```
+
 By default the shim is strict (like real `orjson`). To enable repair/scale fallback without changing call sites:
 
 ```bash
@@ -255,9 +263,9 @@ This suite reflects the context: LLM outputs like “json입니다~ …”, mark
 | `json` (strict) | 0/10 | 0/10 | n/a |
 | `ujson` (strict) | 0/10 | 0/10 | n/a |
 | `orjson` (strict, real) | 0/10 | 0/10 | n/a |
-| `orjson` (auto, agentjson shim) | 10/10 | 10/10 | 45.9 µs |
-| `agentjson.parse(mode=auto)` | 10/10 | 10/10 | 39.8 µs |
-| `agentjson.parse(mode=probabilistic)` | 10/10 | 10/10 | 39.7 µs |
+| `agentjson` (drop-in `orjson.loads`, mode=auto) | 10/10 | 10/10 | 23.5 µs |
+| `agentjson.parse(mode=auto)` | 10/10 | 10/10 | 19.5 µs |
+| `agentjson.parse(mode=probabilistic)` | 10/10 | 10/10 | 19.5 µs |
 
 Key point: **drop-in call sites** (`import orjson; orjson.loads(...)`) can go from *0% success* to *100% success* just by setting `JSONPROB_ORJSON_MODE=auto`.
 
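What "repair fallback" means mechanically can be sketched in pure stdlib terms (illustrative only: agentjson's actual recovery is a probabilistic Rust parser, not this regex, and `lenient_loads` is a hypothetical helper name):

```python
import json
import re

def lenient_loads(text: str):
    """Strict parse first; on failure, strip markdown fences or surrounding
    chatter -- one example of the 'near-JSON' recovery a drop-in auto mode
    performs. Hypothetical sketch, not agentjson's implementation."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Pull the first fenced ```json block, else fall back to the outermost braces.
    m = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    candidate = m.group(1) if m else text[text.find("{") : text.rfind("}") + 1]
    return json.loads(candidate)

assert lenient_loads('preface```json\n{"a":1}\n```suffix') == {"a": 1}
```

Valid JSON still takes the strict fast path, which is why the drop-in stays compatible with real `orjson` semantics by default.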
@@ -271,19 +279,19 @@ This suite checks whether the “intended” JSON object is recovered as the **b
 | Top‑K hit rate (K=5) | 8/8 |
 | Avg candidates returned | 1.25 |
 | Avg best confidence | 0.57 |
-| Best time / case | 92.7 µs |
+| Best time / case | 38.2 µs |
 
 ### 3) Large root-array parsing (big data angle)
 
 Valid JSON only (parsing a single large root array).
 
 | Library | 5 MB | 20 MB |
 |---|---:|---:|
-| `json.loads(str)` | 52.3 ms | 209.6 ms |
-| `ujson.loads(str)` | 42.2 ms | 176.1 ms |
-| `orjson.loads(bytes)` (real) | 24.6 ms | 115.9 ms |
+| `json.loads(str)` | 53.8 ms | 217.2 ms |
+| `ujson.loads(str)` | 45.9 ms | 173.7 ms |
+| `orjson.loads(bytes)` (real) | 27.0 ms | 116.2 ms |
 
-`agentjson` also benchmarks `agentjson.scale(serial|parallel)` in the same script. On 5–20MB inputs the parallel path is slower due to overhead; it’s intended for much larger payloads (GB‑scale root arrays).
+`agentjson` also benchmarks `agentjson.scale(serial|parallel)` in the same script. On 5–20MB inputs the crossover depends on your machine: on this run the parallel path is slower at 5 MB and slightly faster at 20 MB; it’s intended for much larger payloads (GB‑scale root arrays).
 
 ### 3b) Nested `corpus` split (targeted huge value)
289297

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "maturin"
 
 [project]
 name = "agentjson"
-version = "0.1.1"
+version = "0.1.2"
 description = "Probabilistic JSON repair library powered by Rust - fixes broken JSON from LLMs"
 readme = "README.md"
 requires-python = ">=3.9"

rust-pyo3/src/lib.rs

Lines changed: 8 additions & 42 deletions
@@ -1,7 +1,7 @@
 use pyo3::prelude::*;
-use pyo3::buffer::PyBuffer;
-use pyo3::types::{PyByteArray, PyBytes, PyDict, PyList};
+use pyo3::types::{PyBytes, PyDict, PyList};
 use pyo3::IntoPyObjectExt;
+use std::borrow::Cow;
 
 use json_prob_parser::beam;
 use json_prob_parser::json::JsonValue;
@@ -36,57 +36,23 @@ fn json_to_py(py: Python<'_>, v: &JsonValue) -> PyObject {
 
 #[pyfunction]
 fn strict_loads_py(py: Python<'_>, input: &Bound<'_, PyAny>) -> PyResult<PyObject> {
-    let parsed = if let Ok(s) = input.extract::<&str>() {
-        strict::strict_parse(s)
-            .map_err(|e| pyo3::exceptions::PyValueError::new_err((e.message, e.pos)))?
-    } else if let Ok(b) = input.downcast::<PyBytes>() {
-        let s = std::str::from_utf8(b.as_bytes()).map_err(|_| {
+    let parsed = if let Ok(s) = input.extract::<Cow<str>>() {
+        strict::strict_parse(s.as_ref())
+    } else if let Ok(b) = input.extract::<Cow<[u8]>>() {
+        let s = std::str::from_utf8(b.as_ref()).map_err(|_| {
             pyo3::exceptions::PyValueError::new_err((
                 "str is not valid UTF-8: surrogates not allowed".to_string(),
                 0_usize,
             ))
         })?;
         strict::strict_parse(s)
-            .map_err(|e| pyo3::exceptions::PyValueError::new_err((e.message, e.pos)))?
-    } else if let Ok(ba) = input.downcast::<PyByteArray>() {
-        let parsed = {
-            // SAFETY: We do not call back into Python while using this slice.
-            let bytes = unsafe { ba.as_bytes() };
-            let s = std::str::from_utf8(bytes).map_err(|_| {
-                pyo3::exceptions::PyValueError::new_err((
-                    "str is not valid UTF-8: surrogates not allowed".to_string(),
-                    0_usize,
-                ))
-            })?;
-            strict::strict_parse(s)
-        };
-        parsed.map_err(|e| pyo3::exceptions::PyValueError::new_err((e.message, e.pos)))?
-    } else if let Ok(buf) = PyBuffer::<u8>::get(input) {
-        let parsed = {
-            let cells = buf.as_slice(py).ok_or_else(|| {
-                pyo3::exceptions::PyValueError::new_err((
-                    "input buffer must be C-contiguous".to_string(),
-                    0_usize,
-                ))
-            })?;
-
-            // ReadOnlyCell<u8> is repr(transparent) over UnsafeCell<u8>, so this is safe.
-            let bytes = unsafe { std::slice::from_raw_parts(cells.as_ptr() as *const u8, cells.len()) };
-            let s = std::str::from_utf8(bytes).map_err(|_| {
-                pyo3::exceptions::PyValueError::new_err((
-                    "str is not valid UTF-8: surrogates not allowed".to_string(),
-                    0_usize,
-                ))
-            })?;
-            strict::strict_parse(s)
-        };
-        parsed.map_err(|e| pyo3::exceptions::PyValueError::new_err((e.message, e.pos)))?
     } else {
         return Err(pyo3::exceptions::PyValueError::new_err((
            "input must be bytes, bytearray, memoryview, or str".to_string(),
            0_usize,
        )));
-    };
+    }
+    .map_err(|e| pyo3::exceptions::PyValueError::new_err((e.message, e.pos)))?;
 
     Ok(json_to_py(py, &parsed))
 }
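For reference, the input contract of the simplified Rust dispatch above, restated as a stdlib Python sketch (an illustration of the accepted types as stated by the error message, not code generated from the Rust source; `normalize_input` is a hypothetical name):

```python
def normalize_input(data) -> str:
    # Accepted types mirror the PyValueError message in the Rust code:
    # "input must be bytes, bytearray, memoryview, or str".
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray, memoryview)):
        # Raises UnicodeDecodeError on invalid UTF-8, as the Rust path rejects it.
        return bytes(data).decode("utf-8")
    raise TypeError("input must be bytes, bytearray, memoryview, or str")

assert normalize_input('{"a":1}') == '{"a":1}'
assert normalize_input(b'{"a":1}') == '{"a":1}'
assert normalize_input(bytearray(b'{"a":1}')) == '{"a":1}'
```

The Rust change collapses four hand-written branches into two `Cow`-based extractions, letting PyO3's conversion machinery decide whether to borrow or copy.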

rust/Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

0 commit comments