Commit 472e7ca

Update readme
1 parent 14cd6f5 commit 472e7ca

1 file changed: +84 -20 lines changed

README.md

Lines changed: 84 additions & 20 deletions
@@ -8,7 +8,7 @@ To use MLX Structured in your project, add the following to your `Package.swift`
 
 ```swift
 dependencies: [
-    .package(url: "https://github.com/petrukha-ivan/mlx-swift-structured", from: "0.0.1")
+    .package(url: "https://github.com/petrukha-ivan/mlx-swift-structured", from: "0.0.2")
 ]
 ```
 
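For context, the version bump in this hunk would sit inside a full manifest roughly like the sketch below. Only the package URL and the `0.0.2` version come from the diff. The package and target name `MyApp`, the tools version, and the product name `MLXStructured` are assumptions for illustration and should be checked against the library's own `Package.swift`.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    // A `platforms:` entry may be required, depending on what the library declares.
    dependencies: [
        // URL and version taken from the diff above
        .package(url: "https://github.com/petrukha-ivan/mlx-swift-structured", from: "0.0.2")
    ],
    targets: [
        .executableTarget(
            name: "MyApp", // hypothetical target name
            dependencies: [
                // Product name is an assumption, not confirmed by the diff
                .product(name: "MLXStructured", package: "mlx-swift-structured")
            ]
        )
    ]
)
```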
@@ -27,7 +27,7 @@ dependencies: [
 Start by defining a `Grammar`. You can use JSON Schema to describe the desired output:
 
 ```swift
-let grammar = try Grammar.schema(.object(
+let schema = JSONSchema.object(
     description: "Person info",
     properties: [
         "name": .string(),
@@ -36,7 +36,9 @@ let grammar = try Grammar.schema(.object(
         "name",
         "age"
     ]
-))
+)
+
+let grammar = try Grammar.schema(schema)
 ```
 
 Starting with macOS 26 and iOS 26, you can use a `@Generable` type as a grammar source:
@@ -52,24 +54,87 @@ struct PersonInfo {
     let age: Int
 }
 
-let grammar = try Grammar.schema(generable: PersonInfo.self)
+let grammar = try Grammar.generable(PersonInfo.self)
 ```
 
-You can also use regex:
+You can also use a regex:
 
 ```swift
-let grammar = Grammar.regex(#"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"#) // Simple email regex
+let regex = #"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"# // Simple email regex
+let grammar = Grammar.regex(regex)
 ```
 
 Or define your own grammar rules with [EBNF](https://en.wikipedia.org/wiki/Extended_Backus–Naur_form) syntax:
 
 ```swift
-let grammar = Grammar.ebnf(#"root ::= ("YES" | "NO")"#) // Answer only "YES" or "NO"
+let ebnf = #"root ::= ("YES" | "NO")"# // Answer only "YES" or "NO"
+let grammar = Grammar.ebnf(ebnf)
+```
+
+### Complex Grammar
+
+You can define rich, composable grammar rules via a grammar builder. This enables you to describe structured output formats precisely:
+
+```swift
+let grammar = try Grammar {
+    SequenceFormat {
+        ConstTextFormat(text: "Hello!")
+        OrFormat {
+            JSONSchemaFormat(...)
+            RegexFormat(...)
+        }
+    }
+}
+```
+
+This can be used in different ways. Here is an example of a constrained Qwen3 tool-calling format:
+
+```swift
+let grammar = try Grammar {
+    SequenceFormat {
+        if forceThinking {
+            TagFormat(begin: "<think>", end: "</think>") {
+                AnyTextFormat()
+            }
+        }
+        TriggeredTagsFormat(triggers: ["<tool_call>"], options: [.atLeastOne, .stopAfterFirst]) {
+            for tool in tools {
+                TagFormat(begin: "<tool_call>\n{\"name\": \"\(tool.name)\", \"arguments\": ", end: "}\n</tool_call>") {
+                    JSONSchemaFormat(schema: tool.parameters)
+                }
+            }
+        }
+    }
+}
 ```
 
 ### Generation
 
-To use a defined grammar during text generation, create a logit processor and pass it to `TokenIterator`:
+To use a defined grammar during text generation, use the convenient `generate` method:
+
+```swift
+let result = try await generate(input: input, context: context, grammar: grammar)
+print(result.output) // Generated text
+```
+
+You can also pass a `Generable` type as an argument to generate it:
+
+```swift
+let (result, model) = try await generate(input: input, context: context, generating: PersonInfo.self)
+print(result.output) // Generated text
+print(model) // Generated model
+```
+
+With a `Generable` type, you can use streaming generation, which returns `PartiallyGenerated` content for your type:
+
+```swift
+let stream = try await generate(input: input, context: context, generating: PersonInfo.self)
+for await content in stream {
+    print("Partially generated:", content)
+}
+```
+
+You can also create a logit processor manually and pass it to `TokenIterator`:
 
 ```swift
 let processor = try await GrammarMaskedLogitProcessor.from(configuration: context.configuration, grammar: grammar)
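The email pattern that this hunk moves into a `regex` constant can be sanity-checked on its own, independently of the library. The sketch below uses Foundation's `NSRegularExpression` rather than the grammar engine, so it only shows which strings the pattern accepts. The helper name `matchesEmailPattern` is hypothetical, and the library's regex dialect is assumed (not confirmed) to treat this pattern the same way.

```swift
import Foundation

// Pattern copied from the diff; the raw string keeps backslashes literal.
let pattern = #"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"#

/// Hypothetical helper: returns true when the whole string matches the pattern.
func matchesEmailPattern(_ text: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return false }
    let range = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}

print(matchesEmailPattern("user@example.com")) // true
print(matchesEmailPattern("not-an-email"))     // false
```

Testing the pattern against a few accepted and rejected strings before handing it to `Grammar.regex` can catch an overly strict or overly loose expression before it constrains generation.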
@@ -82,7 +147,7 @@ You can find more usage examples in the `MLXStructuredCLI` target and in the uni
 
 ### Performance
 
-In synthetic tests with the Llama model and a vocabulary of 60,000 tokens, the performance drop was less than 10%. However, with real models the results are worse. In practice, you can expect generation speed to be about 15% slower.
+In synthetic tests with the Llama model and a vocabulary of 60,000 tokens, the performance drop was less than 10%. However, with real models, the results are worse. In practice, you can expect generation speed to be about 15% slower.
 The exact slowdown depends on the model, vocabulary size, and the complexity of your grammar.
 
 | Model | Vocab Size | Plain (tokens/s) | Constrained (tokens/s) |
@@ -109,10 +174,10 @@ let grammar = try Grammar.schema(.object(
     description: "Movie record",
     properties: [
         "title": .string(),
-        "year": .integer(),
+        "year": .integer(minimum: 1900, maximum: 2026),
         "genres": .array(items: .string(), maxItems: 3),
         "director": .string(),
-        "actors": .array(items: .string(), maxItems: 10)
+        "actors": .array(items: .string(), maxItems: 5)
     ], required: [
         "title",
         "year",
@@ -123,7 +188,7 @@ let grammar = try Grammar.schema(.object(
 ))
 ```
 
-For large proprietary models like ChatGPT, this is not a problem. With the right prompt, they can successfully generate valid JSON even without constrained decoding. But with smaller models like Gemma3 270M (especially when quantized to 4-bit) the output almost always contains invalid JSON, even if the schema is provided in the prompt.
+For large proprietary models like ChatGPT, this is not a problem. With the right prompt, they can successfully generate valid JSON even without constrained decoding. However, with smaller models like Gemma3 270M (especially when quantized to 4-bit), the output almost always contains invalid JSON, even if the schema is provided in the prompt.
 
 ```plain
 [
@@ -156,23 +221,22 @@ Here is the output using constrained decoding:
 
 ```plain
 {
-  "director": "Christian Bale",
-  "year": 2008,
   "title": "The Dark Knight",
+  "year": 2008,
+  "genres": [
+    "superhero",
+    "crime"
+  ],
+  "director": "Christopher Nolan",
   "actors": [
     "Christian Bale",
     "Heath Ledger",
     "Michael Caine"
-  ],
-  "genres": [
-    "crime",
-    "action",
-    "mystery"
   ]
 }
 ```
 
-The order of keys here is random because `Dictionary` in Swift is unordered. I plan to address this in the future. However, the output is fully valid JSON that exactly matches the provided schema. This shows that, with the right approach, even small models like Gemma3 270M 4-bit (which is just 150 MB) can produce correct structured output.
+The output is fully valid JSON that exactly matches the provided schema. This shows that, with the right approach, even small models like Gemma3 270M 4-bit (which is just 150 MB) can produce correct structured output.
 
 ## Troubleshooting
 
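The README's claim that the constrained output matches the schema can be checked mechanically by decoding it into a typed value. The sketch below uses only Foundation. `MovieRecord` and `decodeMovie` are hypothetical names, and treating all five properties as required is an assumption, since the diff shows only part of the `required` list.

```swift
import Foundation

// Hypothetical Codable mirror of the "Movie record" schema.
// Assumes every property is required; the diff truncates the `required` list.
struct MovieRecord: Codable {
    let title: String
    let year: Int
    let genres: [String]
    let director: String
    let actors: [String]
}

// The constrained output from the updated README, as shown in the hunk above.
let output = """
{
  "title": "The Dark Knight",
  "year": 2008,
  "genres": ["superhero", "crime"],
  "director": "Christopher Nolan",
  "actors": ["Christian Bale", "Heath Ledger", "Michael Caine"]
}
"""

/// Decodes a JSON string into a MovieRecord, throwing on malformed or mismatched input.
func decodeMovie(_ json: String) throws -> MovieRecord {
    try JSONDecoder().decode(MovieRecord.self, from: Data(json.utf8))
}

do {
    let movie = try decodeMovie(output)
    // Re-check the bounds the schema declares: year range and array sizes.
    let inBounds = (1900...2026).contains(movie.year)
        && movie.genres.count <= 3
        && movie.actors.count <= 5
    print(movie.title, inBounds) // The Dark Knight true
} catch {
    print("Invalid output:", error)
}
```

Running the same decode on the unconstrained output quoted earlier in the README would be expected to throw, since that output starts with `[` rather than an object.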