Commit 8bf58fb
[Release 0.7]: Fix lint (#12815)
1 parent bd0c30b

1 file changed: docs/source/llm/run-on-ios.md (+129, -129 lines)

# Running LLMs on iOS

ExecuTorch’s LLM-specific runtime components provide experimental Objective-C and Swift wrappers around the core C++ LLM runtime.

## Prerequisites

Make sure you have the model and tokenizer files ready, as described in the prerequisites section of the [Running LLMs with C++](run-with-c-plus-plus.md) guide.

## Runtime API

Once linked against the [`executorch_llm`](../using-executorch-ios.md) framework, you can import the necessary components.

### Importing

Objective-C:
```objectivec
#import <ExecuTorchLLM/ExecuTorchLLM.h>
```

Swift:
```swift
import ExecuTorchLLM
```

### TextLLMRunner

The `ExecuTorchTextLLMRunner` class (bridged to Swift as `TextLLMRunner`) provides a simple Objective-C/Swift interface for loading a text-generation model, configuring its tokenizer with custom special tokens, generating token streams, and stopping execution.
This API is experimental and subject to change.

#### Initialization

Create a runner by specifying paths to your serialized model (`.pte`) and tokenizer data, plus an array of special tokens to use during tokenization.
Initialization itself is lightweight and doesn’t load the program data immediately.

Objective-C:
```objectivec
NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"llama-3.2-instruct" ofType:@"pte"];
NSString *tokenizerPath = [[NSBundle mainBundle] pathForResource:@"tokenizer" ofType:@"model"];
NSArray<NSString *> *specialTokens = @[ @"<|bos|>", @"<|eos|>" ];

ExecuTorchTextLLMRunner *runner = [[ExecuTorchTextLLMRunner alloc] initWithModelPath:modelPath
                                                                        tokenizerPath:tokenizerPath
                                                                        specialTokens:specialTokens];
```

Swift:
```swift
let modelPath = Bundle.main.path(forResource: "llama-3.2-instruct", ofType: "pte")!
let tokenizerPath = Bundle.main.path(forResource: "tokenizer", ofType: "model")!
let specialTokens = ["<|bos|>", "<|eos|>"]

let runner = TextLLMRunner(
  modelPath: modelPath,
  tokenizerPath: tokenizerPath,
  specialTokens: specialTokens
)
```
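
The Swift snippet above force-unwraps the bundle lookups for brevity. In application code you may want to fail gracefully when a resource is missing; here is a minimal sketch with a hypothetical `makeRunner` helper and error type (not part of the API):

```swift
// Hypothetical factory that fails gracefully when bundled resources are missing.
enum RunnerSetupError: Error { case resourceMissing(String) }

func makeRunner() throws -> TextLLMRunner {
  guard let modelPath = Bundle.main.path(forResource: "llama-3.2-instruct", ofType: "pte") else {
    throw RunnerSetupError.resourceMissing("llama-3.2-instruct.pte")
  }
  guard let tokenizerPath = Bundle.main.path(forResource: "tokenizer", ofType: "model") else {
    throw RunnerSetupError.resourceMissing("tokenizer.model")
  }
  return TextLLMRunner(
    modelPath: modelPath,
    tokenizerPath: tokenizerPath,
    specialTokens: ["<|bos|>", "<|eos|>"]
  )
}
```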

#### Loading

Explicitly load the model before generation to avoid paying the load cost during your first `generate` call.

Objective-C:
```objectivec
NSError *error = nil;
BOOL success = [runner loadWithError:&error];
if (!success) {
  NSLog(@"Failed to load: %@", error);
}
```

Swift:
```swift
do {
  try runner.load()
} catch {
  print("Failed to load: \(error)")
}
```
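
Loading a large model can take a noticeable amount of time, so you may prefer to keep it off the main thread. A minimal sketch, assuming `load()` blocks its calling thread until the program data is ready:

```swift
// Load the model off the main thread so the UI stays responsive (sketch).
Task.detached(priority: .userInitiated) {
  do {
    try runner.load()
    print("Model loaded and ready")
  } catch {
    print("Failed to load: \(error)")
  }
}
```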

#### Generating

Generate up to a given number of tokens from an initial prompt. The callback block is invoked once per token as it’s produced.

Objective-C:
```objectivec
NSError *error = nil;
BOOL success = [runner generate:@"Once upon a time"
                 sequenceLength:50
              withTokenCallback:^(NSString *token) {
                NSLog(@"Generated token: %@", token);
              }
                          error:&error];
if (!success) {
  NSLog(@"Generation failed: %@", error);
}
```

Swift:
```swift
do {
  try runner.generate("Once upon a time", sequenceLength: 50) { token in
    print("Generated token:", token)
  }
} catch {
  print("Generation failed:", error)
}
```
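
The callback delivers tokens one at a time, so assembling the full response is up to the caller. A minimal sketch that concatenates the streamed tokens (the `output` variable is illustrative, not part of the API):

```swift
// Accumulate streamed tokens into the full generated text (sketch).
var output = ""
do {
  try runner.generate("Once upon a time", sequenceLength: 50) { token in
    output += token  // tokens arrive in order, one per callback
  }
  print("Full response:", output)
} catch {
  print("Generation failed:", error)
}
```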

#### Stopping Generation

If you need to interrupt a long-running generation, call:

Objective-C:
```objectivec
[runner stop];
```

Swift:
```swift
runner.stop()
```
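
A typical pattern is to run generation on a background task and call `stop()` from a UI action. A minimal sketch, assuming `generate` blocks its calling thread until completion and `stop()` may be called from another thread (the `GenerationController` type is illustrative, not part of the API):

```swift
// Run generation in the background and stop it from a UI action (sketch).
final class GenerationController {
  private let runner: TextLLMRunner

  init(runner: TextLLMRunner) {
    self.runner = runner
  }

  func start(prompt: String) {
    Task.detached { [runner] in
      do {
        try runner.generate(prompt, sequenceLength: 256) { token in
          print(token, terminator: "")
        }
      } catch {
        print("Generation failed: \(error)")
      }
    }
  }

  // Call from, e.g., a "Stop" button handler.
  func cancel() {
    runner.stop()
  }
}
```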

## Demo

Get hands-on with our [LLaMA iOS Demo App](llama-demo-ios.md) to see the LLM runtime APIs in action.
