
Commit 482b9fc

Update with +
1 parent 61d458d commit 482b9fc
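The commit prefixes each command-line override with `+`. In Hydra-style override grammar (which this export CLI appears to follow), a bare `key=value` may only change a key that already exists in the loaded config, while `+key=value` appends a key that is absent. As a rough sketch of that semantics (hypothetical helper, not the real `export_llm` parser):

```python
# Hedged sketch: mimics Hydra-style override semantics, where `key=value`
# may only override an existing key and `+key=value` appends a new one.
# `apply_overrides` and the sample config are illustrative, not the real API.

def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Apply dotted-key overrides to a nested config dict."""
    for item in overrides:
        append = item.startswith("+")
        key, _, value = item.lstrip("+").partition("=")
        parts = key.split(".")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        leaf = parts[-1]
        if not append and leaf not in node:
            # Without `+`, overriding a missing key is an error --
            # the likely reason this commit adds the `+` prefix.
            raise KeyError(f"unknown key {key!r}; use +{key}={value} to add it")
        node[leaf] = value
    return config

config = {"export": {"output_name": "model.pte"}}
apply_overrides(config, ["export.output_name=llama.pte",
                         "+base.params=params.json"])
```

Under this assumption, keys such as `base.checkpoint` are not present in the base YAML configs, so they must be appended with `+` rather than overridden.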

5 files changed: +31 −31 lines changed

examples/models/deepseek-r1-distill-llama-8B/README.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -54,9 +54,9 @@ torch.save(sd, "/tmp/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/checkpoint.pth")
 ```
 python -m extension.llm.export.export_llm \
 --config examples/models/deepseek-r1-distill-llama-8B/config/deepseek-r1-distill-llama-8B
-base.checkpoint=/tmp/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/checkpoint.pth \
-base.params=params.json \
-export.output_name="DeepSeek-R1-Distill-Llama-8B.pte"
++base.checkpoint=/tmp/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/checkpoint.pth \
++base.params=params.json \
++export.output_name="DeepSeek-R1-Distill-Llama-8B.pte"
 ```
 
 6. Run the model on your desktop for validation or integrate with iOS/Android apps. Instructions for these are available in the Llama [README](../llama/README.md) starting at Step 3.
````

examples/models/llama/README.md

Lines changed: 12 additions & 12 deletions

````diff
@@ -169,9 +169,9 @@ LLAMA_PARAMS=path/to/params.json
 
 python -m extension.llm.export.export_llm \
 --config examples/models/llamaconfig/llama_bf16.yaml
-base.model_class="llama3_2" \
-base.checkpoint="${LLAMA_CHECKPOINT:?}" \
-base.params="${LLAMA_PARAMS:?}" \
++base.model_class="llama3_2" \
++base.checkpoint="${LLAMA_CHECKPOINT:?}" \
++base.params="${LLAMA_PARAMS:?}" \
 ```
 For convenience, an [exported ExecuTorch bf16 model](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/llama3_2-1B.pte) is available on Hugging Face. The export was created using [this detailed recipe notebook](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/ExportRecipe_1B.ipynb).
 
@@ -187,9 +187,9 @@ LLAMA_PARAMS=path/to/spinquant/params.json
 
 python -m extension.llm.export.export_llm \
 --config examples/models/llama/config/llama_xnnpack_spinquant.yaml
-base.model_class="llama3_2" \
-base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
-base.params="${LLAMA_PARAMS:?}" \
++base.model_class="llama3_2" \
++base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
++base.params="${LLAMA_PARAMS:?}" \
 ```
 For convenience, an [exported ExecuTorch SpinQuant model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8.pte) is available on Hugging Face. The export was created using [this detailed recipe notebook](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_SpinQuant_INT4_EO8.ipynb).
 
@@ -204,9 +204,9 @@ LLAMA_PARAMS=path/to/qlora/params.json
 
 python -m extension.llm.export.export_llm \
 --config examples/models/llama/config/llama_xnnpack_qat.yaml
-base.model_class="llama3_2" \
-base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
-base.params="${LLAMA_PARAMS:?}" \
++base.model_class="llama3_2" \
++base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
++base.params="${LLAMA_PARAMS:?}" \
 ```
 For convenience, an [exported ExecuTorch QAT+LoRA model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-QLORA_INT4_EO8.pte) is available on Hugging Face. The export was created using [this detailed recipe notebook](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_QLORA_INT4_EO8.ipynb).
 
@@ -220,9 +220,9 @@ You can export and run the original Llama 3 8B instruct model.
 ```
 python -m extension.llm.export.export_llm \
 --config examples/models/llama/config/llama_q8da4w.yaml
-base.model_clas="llama3"
-base.checkpoint=<consolidated.00.pth.pth> \
-base.params=<params.json> \
++base.model_clas="llama3"
++base.checkpoint=<consolidated.00.pth.pth> \
++base.params=<params.json> \
 ```
 Due to the larger vocabulary size of Llama 3, we recommend quantizing the embeddings with `quantization.embedding_quantize=\'4,32\'` as shown above to further reduce the model size.
````
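The `'4,32'` argument in the embedding-quantization note above denotes 4-bit weights with a group size of 32. As a rough illustration only (this is not the quantizer ExecuTorch actually uses, and the helper names are hypothetical), symmetric groupwise 4-bit quantization of an embedding row can be sketched as:

```python
# Illustrative sketch of symmetric groupwise 4-bit quantization, as implied
# by `quantization.embedding_quantize='4,32'` (4 bits, group size 32).
# Hypothetical helpers; not the ExecuTorch implementation.

def quantize_group(values, bits=4):
    """Symmetrically quantize one group of floats to signed `bits`-bit ints."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit signed
    scale = max(abs(v) for v in values) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize_group(q, scale):
    return [v * scale for v in q]

row = [0.05 * i - 0.8 for i in range(64)]           # one fake embedding row
group_size = 32
out = []
for start in range(0, len(row), group_size):
    q, scale = quantize_group(row[start:start + group_size])
    out.extend(dequantize_group(q, scale))

# Per-group scales keep the rounding error proportional to each group's range.
max_err = max(abs(a - b) for a, b in zip(row, out))
```

Smaller group sizes cost more scale storage but bound the error more tightly per group, which matters for a vocabulary-sized embedding table.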

examples/models/phi_4_mini/README.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -34,9 +34,9 @@ PHI_CHECKPOINT=path/to/checkpoint.pth
 
 python -m extension.llm.export.export_llm \
 --config config/phi_4_mini_xnnpack.yaml
-base.checkpoint="${PHI_CHECKPOINT=path/to/checkpoint.pth:?}" \
-base.params="examples/models/phi-4-mini/config/config.json" \
-export.output_name="phi-4-mini.pte" \
++base.checkpoint="${PHI_CHECKPOINT=path/to/checkpoint.pth:?}" \
++base.params="examples/models/phi-4-mini/config/config.json" \
++export.output_name="phi-4-mini.pte" \
 ```
 
 Run using the executor runner:
````

examples/models/qwen2_5/README.md

Lines changed: 4 additions & 4 deletions

````diff
@@ -34,10 +34,10 @@ QWEN_CHECKPOINT=path/to/checkpoint.pth
 
 python -m extension.llm.export.export_llm \
 --config examples/models/qwen2_5/config/qwen2_5_xnnpack_q8da4w.yaml
-base.model_class="qwen2_5" \
-base.checkpoint="${QWEN_CHECKPOINT:?}" \
-base.params="examples/models/qwen2_5/1_5b_config.json" \
-export.output_name="qwen2_5-1_5b.pte" \
++base.model_class="qwen2_5" \
++base.checkpoint="${QWEN_CHECKPOINT:?}" \
++base.params="examples/models/qwen2_5/1_5b_config.json" \
++export.output_name="qwen2_5-1_5b.pte" \
 ```
 
 Run using the executor runner:
````

examples/models/qwen3/README.md

Lines changed: 9 additions & 9 deletions

````diff
@@ -18,28 +18,28 @@ Export 0.6b to XNNPack, quantized with 8da4w:
 ```
 python -m extension.llm.export.export_llm \
 --config examples/models/qwen3/config/qwen3_xnnpack_q8da4w.yaml
-base.model_class="qwen3_0_6b" \
-base.params="examples/models/qwen3/config/0_6b_config.json" \
-export.output_name="qwen3_0_6b.pte" \
++base.model_class="qwen3_0_6b" \
++base.params="examples/models/qwen3/config/0_6b_config.json" \
++export.output_name="qwen3_0_6b.pte" \
 
 ```
 
 Export 1.7b to XNNPack, quantized with 8da4w:
 ```
 python -m extension.llm.export.export_llm \
 --config examples/models/qwen3/config/qwen3_xnnpack_q8da4w.yaml
-base.model_class="qwen3_1_7b" \
-base.params="examples/models/qwen3/config/1_7b_config.json" \
-export.output_name="qwen3_1_7b.pte" \
++base.model_class="qwen3_1_7b" \
++base.params="examples/models/qwen3/config/1_7b_config.json" \
++export.output_name="qwen3_1_7b.pte" \
 ```
 
 Export 4b to XNNPack, quantized with 8da4w:
 ```
 python -m extension.llm.export.export_llm \
 --config examples/models/qwen3/config/qwen3_xnnpack_q8da4w.yaml
-base.model_class="qwen3_4b" \
-base.params="examples/models/qwen3/config/4b_config.json" \
-export.output_name="qwen3_4b.pte" \
++base.model_class="qwen3_4b" \
++base.params="examples/models/qwen3/config/4b_config.json" \
++export.output_name="qwen3_4b.pte" \
 ```
 
 ### Example run
````
