Use yyjson for significantly faster JSON parsing #304
```diff
-    guard let parsed = try? JSONSerialization.bomPreservingJsonObject(with: data) else {
-        throw Hub.HubClientError.jsonSerialization(fileURL: fileURL, message: "JSON Serialization failed for \(fileURL). Please verify that you have set the HF_TOKEN environment variable.")
-    }
+    do {
+        return try YYJSONParser.parseToConfig(data)
+    } catch {
+        throw Hub.HubClientError.jsonSerialization(
+            fileURL: fileURL,
+            message: "JSON parsing failed for \(fileURL): \(error.localizedDescription). If this is a private model, verify that HF_TOKEN is set."
+        )
+    }
-    guard let dictionary = parsed as? [NSString: Any] else { throw Hub.HubClientError.parse }
-    return Config(dictionary)
 }
```
2c on this:

I think there's an opportunity to protocolize JSON parsing, which would reduce the dependency footprint for this specific project while still enabling yyjson usage outside of it.

```swift
protocol JSONParser {
    func parseToConfig(_ data: Data) throws -> Config
}
```

Then:

```swift
func configuration(fileURL: URL, parser: JSONParser = DefaultJSONParser()) throws -> Config {
    let data = try Data(contentsOf: fileURL)
    do {
        return try parser.parseToConfig(data)
    } catch {
        throw Hub.HubClientError.jsonSerialization(
            fileURL: fileURL,
            message: "JSON parsing failed for \(fileURL): \(error.localizedDescription). If this is a private model, verify that HF_TOKEN is set."
        )
    }
}
```

Then `JSONParser` could be passed to the `HubApi` init, or to an object that is passed into the `configuration` call:

```swift
let customParser = YYJSONParser()
let config = try hubApi.configuration(fileURL: someURL, parser: customParser)
```

Ideally this project would remain pure Swift with Swift dependencies but still allow fast implementations via protocols.
That's a nice idea, although the Python transformers library uses the Rust tokenizers library, which uses serde for JSON parsing. I think there is a good argument for just having a fast default like in the Python transformers, especially since what's available in Swift is so slow. People running MLX models in Swift are already using C++ libraries through C bridging. yyjson is in C, so Swift can call it directly with minimal overhead.
@DePasqualeOrg Amazing work! I just opened a PR demonstrating the effect of in-situ parsing on speed and memory here: DePasqualeOrg#2
@ZachNagengast I'm sympathetic to the idea of dependency injection, but in this case, it's hard to imagine a scenario in which an API consumer wouldn't opt in to faster JSON parsing. Assuming the performance is consistently better, and barring segfaults or incorrect behavior, this seems like a slam dunk.
If the additional dependency is a concern, I suppose we could compromise with a trait that's enabled by default and could be disabled on an opt-out basis.
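For reference, an opt-out trait along those lines might look roughly like the manifest fragment below, assuming Swift 6.1+ package traits (SE-0450). The trait name `YYJSON`, the package URL, and the dependency wiring are all illustrative, not the project's actual manifest.

```swift
// swift-tools-version: 6.1
import PackageDescription

let package = Package(
    name: "swift-transformers",
    // Hypothetical trait, enabled by default; consumers can disable it
    // to drop the C dependency entirely.
    traits: [
        "YYJSON",
        .default(enabledTraits: ["YYJSON"]),
    ],
    dependencies: [
        // Illustrative: assumes a SwiftPM-packaged yyjson.
        .package(url: "https://github.com/ibireme/yyjson.git", from: "0.10.0"),
    ],
    targets: [
        .target(
            name: "Hub",
            dependencies: [
                // Only linked when the trait is enabled; the trait name also
                // becomes a compilation condition usable as #if YYJSON.
                .product(name: "yyjson", package: "yyjson", condition: .when(traits: ["YYJSON"])),
            ]
        ),
    ]
)
```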
A fast default would be great; on the other hand, Swift apps also have to consider compilation time and distributable binary size. Testing the build on this branch, it appears to add 1.2 MB of C code, which, to be fair, compresses well to around 113 KB. Do you think this dependency could be transitioned via the protocol to the MLX repo, since that is already compiling C code?
I posted before reading your comment. The extra dependency is a concern, but it could be isolated with traits, or with simple compiler flags checking for canImport(yyjson), similar to this WIP branch that pulls Jinja out of the compilation: main...ZachNagengast:swift-transformers:optional-jinja-import-for-hub-and-tokenizers

Something like this would allow the Transformers library to import the fast solution by default, while more targeted implementations that just want Hub and Tokenizers could have an optimal dependency footprint.
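As a rough sketch of the `canImport` approach: the parser type and names below are illustrative, and in an environment without a yyjson module it is the Foundation branch that compiles.

```swift
import Foundation

// Illustrative Foundation-backed fallback (not the library's actual type).
struct FoundationJSONParser {
    func parse(_ data: Data) throws -> [String: Any] {
        guard let object = try JSONSerialization.jsonObject(with: data) as? [String: Any] else {
            throw CocoaError(.coderReadCorrupt)
        }
        return object
    }
}

#if canImport(yyjson)
import yyjson
// If the yyjson module is available, a fast C-backed parser would be selected here.
let activeParser = "yyjson"
#else
// Without the dependency, the library quietly falls back to Foundation.
let activeParser = "Foundation"
#endif

let data = Data(#"{"model_type": "qwen3"}"#.utf8)
let parsed = try! FoundationJSONParser().parse(data)
print("Parsing with \(activeParser)")
```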
Thanks for this, @mattt. I dug into it, and it looks like both methods use identical memory (~68 MB) when measured in separate tests. The 0 KB measurement may have been due to memory reuse between sequential tests. Let me know what you think: https://github.com/DePasqualeOrg/swift-transformers/tree/benchmark-memory-use
@DePasqualeOrg Running my own benchmarks, I found that YYJSON is actually ~8.7x faster than Foundation for parsing that ~10 MB file. And according to Swift Benchmark, in-situ parsing correctly showed 0 allocations. All the more reason for us to move forward, in my opinion. @pcuenca Any strong feelings about how to proceed?
@mattt, I don't fully understand the implications of in-situ parsing, but I'm not sure there's a benefit. Here's the analysis from Claude Code, for the record:
JSON parsing is one of the biggest performance bottlenecks for tokenizer loading, and yyjson, a high-performance C library, offers significant speed gains for large tokenizer files: it's 3.4x faster for raw JSON parsing and 2.1x faster for building the `Config`, saving around 600 ms in a typical tokenizer load.

Changes

- Added `YYJSONParser` with direct yyjson → `Config` conversion (no intermediate Foundation objects)
- Updated `HubApi.configuration(fileURL:)` to use yyjson
- Removed `JSONSerialization+BOM.swift` (yyjson handles BOM correctly)
- Added a `Benchmarks` test target (run with `RUN_BENCHMARKS=1 swift test --filter Benchmarks`)

Performance

Tested with the 11.4 MB `tokenizer.json` from `mlx-community/Qwen3-0.6B-Base-DQ5`. This saves ~600 ms per tokenizer load on an M3 MacBook Pro.

All existing tests pass.