# Running LLMs on iOS

ExecuTorch’s LLM-specific runtime components provide experimental Objective-C and Swift wrappers around the core C++ LLM runtime.

## Prerequisites

Make sure you have the model and tokenizer files ready, as described in the prerequisites section of the [Running LLMs with C++](run-with-c-plus-plus.md) guide.
## Runtime API

Once linked against the [`executorch_llm`](../using-executorch-ios.md) framework, you can import the necessary components.

### Importing

Objective-C:
```objectivec
#import <ExecuTorchLLM/ExecuTorchLLM.h>
```

Swift:
```swift
import ExecuTorchLLM
```

### TextLLMRunner

The `ExecuTorchTextLLMRunner` class (bridged to Swift as `TextLLMRunner`) provides a simple Objective-C/Swift interface for loading a text-generation model, configuring its tokenizer with custom special tokens, generating token streams, and stopping execution.
This API is experimental and subject to change.

#### Initialization

Create a runner by specifying paths to your serialized model (`.pte`) and tokenizer data, plus an array of special tokens to use during tokenization.
Initialization itself is lightweight and doesn’t load the program data immediately.

Objective-C:
```objectivec
NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"llama-3.2-instruct" ofType:@"pte"];
NSString *tokenizerPath = [[NSBundle mainBundle] pathForResource:@"tokenizer" ofType:@"model"];
NSArray<NSString *> *specialTokens = @[ @"<|bos|>", @"<|eos|>" ];

ExecuTorchTextLLMRunner *runner = [[ExecuTorchTextLLMRunner alloc] initWithModelPath:modelPath
                                                                        tokenizerPath:tokenizerPath
                                                                        specialTokens:specialTokens];
```

Swift:
```swift
let modelPath = Bundle.main.path(forResource: "llama-3.2-instruct", ofType: "pte")!
let tokenizerPath = Bundle.main.path(forResource: "tokenizer", ofType: "model")!
let specialTokens = ["<|bos|>", "<|eos|>"]

let runner = TextLLMRunner(
  modelPath: modelPath,
  tokenizerPath: tokenizerPath,
  specialTokens: specialTokens
)
```

#### Loading

Explicitly load the model before generation to avoid paying the load cost during your first `generate` call.

Objective-C:
```objectivec
NSError *error = nil;
BOOL success = [runner loadWithError:&error];
if (!success) {
  NSLog(@"Failed to load: %@", error);
}
```

Swift:
```swift
do {
  try runner.load()
} catch {
  print("Failed to load: \(error)")
}
```
#### Generating

Generate up to a given number of tokens from an initial prompt. The callback block is invoked once per token as it’s produced.

Objective-C:
```objectivec
NSError *error = nil;
BOOL success = [runner generate:@"Once upon a time"
                 sequenceLength:50
              withTokenCallback:^(NSString *token) {
                NSLog(@"Generated token: %@", token);
              }
                          error:&error];
if (!success) {
  NSLog(@"Generation failed: %@", error);
}
```

Swift:
```swift
do {
  try runner.generate("Once upon a time", sequenceLength: 50) { token in
    print("Generated token:", token)
  }
} catch {
  print("Generation failed:", error)
}
```
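
Because the callback delivers one token at a time, assembling the full completion is just a matter of concatenating them. A minimal sketch (the prompt and the `output` buffer are illustrative):

```swift
// Accumulate streamed tokens into a single response string.
var output = ""
do {
  try runner.generate("Once upon a time", sequenceLength: 50) { token in
    output += token  // each callback carries the next decoded token
  }
  print("Full response:", output)
} catch {
  print("Generation failed:", error)
}
```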
#### Stopping Generation

If you need to interrupt a long-running generation, call:

Objective-C:
```objectivec
[runner stop];
```

Swift:
```swift
runner.stop()
```
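
Since `generate` streams tokens until it finishes, `stop()` is typically called from a different thread or task than the one running generation, for example from a Stop button handler. A minimal sketch, assuming generation was started in a detached task (the task structure here is illustrative, not part of the API):

```swift
// Start generation off the main actor so the UI stays responsive.
Task.detached {
  do {
    try runner.generate("Tell me a story", sequenceLength: 256) { token in
      print(token, terminator: "")
    }
  } catch {
    print("Generation failed:", error)
  }
}

// Later, e.g. from a Stop button action:
runner.stop()
```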
## Demo

Get hands-on with our [LLaMA iOS Demo App](llama-demo-ios.md) to see the LLM runtime APIs in action.