
Commit 9d7010f

Update README.md
1 parent f17199e commit 9d7010f

File tree: 1 file changed (+4, -5 lines)

examples/models/llama/README.md

Lines changed: 4 additions & 5 deletions
@@ -158,7 +158,7 @@ Llama 3 8B performance was measured on the Samsung Galaxy S22, S24, and OnePlus
 
 1. Download `consolidated.00.pth`, `params.json` and `tokenizer.model` from [Llama website](https://www.llama.com/llama-downloads/) or [Hugging Face](https://huggingface.co/meta-llama/Llama-3.2-1B). For chat use-cases, download the instruct models.
 
-2. Export model and generate `.pte` file. For convenience, here's an already ExecuTorch [exported model](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/llama3_2-1B.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/ExportRecipe_1B.ipynb) on Hugging Face.
+2. Export model and generate `.pte` file.
 
 - Use **original BF16** version, without any quantization.
 ```
@@ -177,6 +177,7 @@ python -m examples.models.llama.export_llama \
 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' \
 --output_name="llama3_2.pte"
 ```
+For convenience, here's an already ExecuTorch [exported bf16 model](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/llama3_2-1B.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-ET/blob/main/ExportRecipe_1B.ipynb) on Hugging Face.
 
 - To use **SpinQuant**, here are two ways:
 - Download directly from [Llama website](https://www.llama.com/llama-downloads). The model weights are prequantized and can be exported to `pte` file directly.
@@ -206,8 +207,7 @@ python -m examples.models.llama.export_llama \
 --use_spin_quant native \
 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
 ```
-
-For convenience, here's an already ExecuTorch [exported model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_SpinQuant_INT4_EO8.ipynb) on Hugging Face.
+For convenience, here's an already ExecuTorch [exported SpinQuant model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_SpinQuant_INT4_EO8.ipynb) on Hugging Face.
 
 
 - To use **QAT+LoRA**, download directly from [Llama website](https://www.llama.com/llama-downloads). The model weights are prequantized and can be exported to `pte` file directly by:
@@ -237,8 +237,7 @@ python -m examples.models.llama.export_llama \
 --output_name "llama3_2.pte" \
 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
 ```
-
-For convenience, here's an already ExecuTorch [exported model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-QLORA_INT4_EO8.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_QLORA_INT4_EO8.ipynb) on Hugging Face.
+For convenience, here's an already ExecuTorch [exported QAT+LoRA model](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Llama-3.2-1B-Instruct-QLORA_INT4_EO8.pte) using [this recipe](https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/blob/main/Export_Recipe_Llama_3_2_1B_Instruct_QLORA_INT4_EO8.ipynb) on Hugging Face.
 
 ### Option B: Download and export Llama 3 8B instruct model
 
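The hunks above show only the tail of each `python -m examples.models.llama.export_llama` invocation. As a rough sketch of how the visible pieces fit together for the BF16 export, the command takes approximately the following shape; only the `--metadata` and `--output_name` arguments appear verbatim in this diff, and every other flag is an assumption that may differ across ExecuTorch versions, so defer to the full `examples/models/llama/README.md`.

```
# Hypothetical reconstruction of the BF16 export command. Only --metadata and
# --output_name appear verbatim in the diff above; the remaining flags are
# assumptions and should be checked against the current README before use.
python -m examples.models.llama.export_llama \
  --checkpoint <path-to>/consolidated.00.pth \
  --params <path-to>/params.json \
  -kv \
  --use_sdpa_with_kv_cache \
  -X \
  -d bf16 \
  --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' \
  --output_name="llama3_2.pte"
```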
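Since the point of this change is to surface the pre-exported `.pte` files, here is a minimal sketch of fetching them directly, assuming Hugging Face's usual convention that a `blob/main` page URL maps to a `resolve/main` download URL.

```
# Download the pre-exported .pte files linked in the README changes above.
# Assumes the standard Hugging Face mapping of /blob/main/ -> /resolve/main/.
wget https://huggingface.co/executorch-community/Llama-3.2-1B-ET/resolve/main/llama3_2-1B.pte
wget https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8-ET/resolve/main/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8.pte
wget https://huggingface.co/executorch-community/Llama-3.2-1B-Instruct-QLORA_INT4_EO8-ET/resolve/main/Llama-3.2-1B-Instruct-QLORA_INT4_EO8.pte
```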