Conversation

@DePasqualeOrg (Contributor) commented Dec 27, 2025

JSON parsing is one of the biggest performance bottlenecks in tokenizer loading. yyjson, a high-performance C library, offers significant speed gains on large tokenizer files: raw JSON parsing is 3.4x faster and building the Config is 2.1x faster, saving around 600 ms in a typical tokenizer load.

Changes

  • Add yyjson 0.12.0 as a dependency
  • Add YYJSONParser with direct yyjson → Config conversion (no intermediate Foundation objects; see the sketch after this list)
  • Update HubApi.configuration(fileURL:) to use yyjson
  • Remove JSONSerialization+BOM.swift (yyjson handles BOM correctly)
  • Add Benchmarks test target (run with RUN_BENCHMARKS=1 swift test --filter Benchmarks)
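
For illustration, here is a minimal sketch of the direct-conversion approach referenced above. It assumes the yyjson C module is importable from Swift via SwiftPM, and it builds plain Swift values in place of the real Config initializer, whose shape isn't shown in this thread:

import Foundation
import yyjson  // C module exposed through SwiftPM (assumption)

enum YYJSONSketch {
    // Parse raw bytes with yyjson and convert the document tree straight
    // to Swift values, with no intermediate Foundation objects.
    static func parse(_ data: Data) throws -> Any? {
        try data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Any? in
            let bytes = buffer.bindMemory(to: CChar.self).baseAddress
            guard let doc = yyjson_read(bytes, buffer.count, 0) else {
                throw NSError(domain: "YYJSONSketch", code: 1)  // placeholder error
            }
            defer { yyjson_doc_free(doc) }  // release the parsed document
            return convert(yyjson_doc_get_root(doc))
        }
    }

    // Recursively convert a yyjson value into a Swift value.
    private static func convert(_ val: UnsafeMutablePointer<yyjson_val>?) -> Any? {
        guard let val else { return nil }
        if yyjson_is_str(val) { return String(cString: yyjson_get_str(val)) }
        if yyjson_is_bool(val) { return yyjson_get_bool(val) }
        if yyjson_is_int(val) { return yyjson_get_sint(val) }
        if yyjson_is_real(val) { return yyjson_get_real(val) }
        if yyjson_is_arr(val) {
            var items: [Any?] = []
            var iter = yyjson_arr_iter()
            yyjson_arr_iter_init(val, &iter)
            while let item = yyjson_arr_iter_next(&iter) {
                items.append(convert(item))
            }
            return items
        }
        if yyjson_is_obj(val) {
            var object: [String: Any?] = [:]
            var iter = yyjson_obj_iter()
            yyjson_obj_iter_init(val, &iter)
            while let key = yyjson_obj_iter_next(&iter) {
                let name = String(cString: yyjson_get_str(key))
                // updateValue stores explicit nils for JSON null values
                object.updateValue(convert(yyjson_obj_iter_get_val(key)), forKey: name)
            }
            return object
        }
        return nil  // JSON null
    }
}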

Performance

Tested with the 11.4 MB tokenizer.json from mlx-community/Qwen3-0.6B-Base-DQ5:

Benchmark          yyjson    JSONSerialization    Improvement
Raw JSON parsing   19 ms     66 ms                3.4x (47 ms saved)
JSON → Config      540 ms    1,160 ms             2.1x (620 ms saved)

This saves ~600 ms per tokenizer load on an M3 MacBook Pro.

All existing tests pass.

@DePasqualeOrg (Contributor Author):

@mattt, @pcuenca, I think this PR would be a good one to start with whenever you're ready, since #303 is based on it. For that reason, #303 looks bigger than it actually is. I added some refinements to all three of my PRs in this repo today, and I think they're now all ready for review.

Comment on lines -328 to 337
-    guard let parsed = try? JSONSerialization.bomPreservingJsonObject(with: data) else {
-        throw Hub.HubClientError.jsonSerialization(fileURL: fileURL, message: "JSON Serialization failed for \(fileURL). Please verify that you have set the HF_TOKEN environment variable.")
+    do {
+        return try YYJSONParser.parseToConfig(data)
+    } catch {
+        throw Hub.HubClientError.jsonSerialization(
+            fileURL: fileURL,
+            message: "JSON parsing failed for \(fileURL): \(error.localizedDescription). If this is a private model, verify that HF_TOKEN is set."
+        )
+    }
-    guard let dictionary = parsed as? [NSString: Any] else { throw Hub.HubClientError.parse }
-    return Config(dictionary)
 }
 }
Collaborator (@ZachNagengast):

2c on this:

I think there's an opportunity to protocolize JSON parsing, which would reduce the dependency footprint for this specific project while still enabling yyjson usage outside of it.

protocol JSONParser {
    func parseToConfig(_ data: Data) throws -> Config
}

Then

func configuration(fileURL: URL, parser: JSONParser = DefaultJSONParser()) throws -> Config {
    let data = try Data(contentsOf: fileURL)
    do {
        return try parser.parseToConfig(data)
    } catch {
        throw Hub.HubClientError.jsonSerialization(
            fileURL: fileURL,
            message: "JSON parsing failed for \(fileURL): \(error.localizedDescription). If this is a private model, verify that HF_TOKEN is set."
        )
    }
}

Then JSONParser could be passed to the HubApi init, or to an object that is passed into the configuration call.

let customParser = YYJSONParser()
let config = try hubApi.configuration(fileURL: someURL, parser: customParser)

Ideally this project would remain pure Swift with Swift dependencies, but still allow fast implementations via protocols.

Contributor Author (@DePasqualeOrg):

That's a nice idea, although the Python transformers library uses the Rust tokenizers library, which uses serde for JSON parsing. I think there is a good argument for just having a fast default like in the Python transformers, especially since what's available in Swift is so slow. People running MLX models in Swift are already using C++ libraries through C bridging. yyjson is in C, so Swift can call it directly with minimal overhead.
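
As a concrete illustration of that low bridging overhead: a C library like yyjson can compile as an ordinary SwiftPM target that Swift imports directly. This manifest is a sketch; the target names and paths are assumptions, not the PR's actual Package.swift:

// swift-tools-version: 5.9
// Package.swift -- an illustrative sketch; names and paths are assumed.
import PackageDescription

let package = Package(
    name: "swift-transformers",
    targets: [
        // yyjson builds as a plain C target; Swift imports the generated
        // module directly, with no hand-written bridging layer.
        .target(
            name: "yyjson",
            path: "Sources/yyjson",
            publicHeadersPath: "include"
        ),
        .target(
            name: "Hub",
            dependencies: ["yyjson"]
        )
    ]
)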

Collaborator (@mattt):

@DePasqualeOrg Amazing work! I just opened a PR demonstrating the effect of in-situ parsing on speed and memory here: DePasqualeOrg#2

@ZachNagengast I'm sympathetic to the idea of dependency injection, but in this case, it's hard to imagine a scenario in which an API consumer wouldn't opt in to faster JSON parsing. Assuming the performance is consistently better, and barring segfaults or incorrect behavior, this seems like a slam dunk.

If the additional dependency is a concern, I suppose we could compromise with a trait that's enabled by default and could be disabled on an opt-out basis.
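
For instance, with SwiftPM's package traits (available since Swift 6.1), an enabled-by-default, opt-out trait could look something like this sketch, where "FastJSON" is a hypothetical trait name:

// Package.swift sketch, assuming SwiftPM 6.1 package traits.
let package = Package(
    name: "swift-transformers",
    traits: [
        .trait(name: "FastJSON", description: "Use yyjson for fast JSON parsing"),
        .default(enabledTraits: ["FastJSON"])  // on by default; consumers can opt out
    ]
    // dependencies and targets omitted in this sketch
)

// An enabled trait also acts as a compilation condition in the package's sources:
#if FastJSON
typealias DefaultConfigParser = YYJSONParser
#endif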

Collaborator (@ZachNagengast):

A fast default would be great. On the other hand, Swift apps also have to weigh compilation time and distributable binary size, which should be optimized too. Testing the build on this branch, it appears to add 1.2 MB of C code, which, to be fair, compresses well to around 113 KB. Do you think this dependency could be moved via the protocol into the MLX repo, since that is already compiling C code?

Collaborator (@ZachNagengast):

Posted before reading your comment: the extra dependency is a concern, but it could be isolated with traits or a simple compiler check for canImport(yyjson), similar to this WIP branch that pulls Jinja out of the compilation: main...ZachNagengast:swift-transformers:optional-jinja-import-for-hub-and-tokenizers

Something like this would allow the Transformers library to import the fast solution by default, while more targeted implementations that just want Hub and Tokenizers could keep an optimal dependency footprint.
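
A minimal sketch of that canImport approach (FoundationJSONParser is a hypothetical name for the existing JSONSerialization-backed fallback):

#if canImport(yyjson)
import yyjson
// Fast path: the yyjson module is present, so use it by default.
typealias DefaultConfigParser = YYJSONParser
#else
// Fallback: a hypothetical parser backed by JSONSerialization.
typealias DefaultConfigParser = FoundationJSONParser
#endif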

@DePasqualeOrg (Contributor Author):

Thanks for this, @mattt. I dug into it, and it looks like both methods use identical memory (~68 MB) when measured in separate tests. The 0 KB measurement may have been due to memory reuse between sequential tests. Let me know what you think: https://github.com/DePasqualeOrg/swift-transformers/tree/benchmark-memory-use

@mattt (Collaborator) commented Jan 9, 2026

@DePasqualeOrg Running my own benchmarks, I found that YYJSON is actually ~8.7x faster than Foundation for parsing that ~10 MB tokenizer.json file:

Metric        Foundation   YYJSON    Improvement
Time (p50)    57.0 ms      6.5 ms    8.7x faster
Peak Memory   242 MB       52 MB     78% less

And according to Swift Benchmark, in-situ parsing correctly showed 0 allocations.
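
For context on how a harness reports numbers like these, here is a sketch of such a measurement, assuming the ordo-one package-benchmark library; the fixture path and the use of YYJSONParser are illustrative, not the thread's actual benchmark code:

import Benchmark  // ordo-one/package-benchmark (assumed harness)
import Foundation

let benchmarks = {
    Benchmark(
        "YYJSON parse",
        configuration: .init(metrics: [.wallClock, .peakMemoryResident, .mallocCountTotal])
    ) { benchmark in
        // Hypothetical fixture path; load once, outside the measured region.
        guard let data = try? Data(contentsOf: URL(fileURLWithPath: "tokenizer.json")) else { return }
        benchmark.startMeasurement()
        for _ in benchmark.scaledIterations {
            // blackHole prevents the optimizer from eliding the parse.
            blackHole(try? YYJSONParser.parseToConfig(data))
        }
    }
}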

All the more reason for us to move forward, in my opinion.

@pcuenca Any strong feelings about how to proceed?

@DePasqualeOrg (Contributor Author):

@mattt, I don't fully understand the implications of in-situ parsing, but I'm not sure there's a benefit. Here's the analysis from Claude Code, for the record:

The "0 allocations" result comes from measuring only the parse step, after the buffer is allocated and before the Config conversion. Since convertToConfig immediately copies all strings via String(cString:), the in-situ benefit is negated.
