
Commit be85af8

Update README.md
1 parent 08461ec commit be85af8

1 file changed: +35 -5 lines changed


examples/openvino/llama/README.md

Lines changed: 35 additions & 5 deletions
@@ -1,11 +1,41 @@
 
-LLAMA_CHECKPOINT=<model_directory>/consolidated.00.pth
-LLAMA_PARAMS=<model_directory>/params.json
-LLAMA_TOKENIZER=<model_directory>/tokenizer.model
+# Export Llama with OpenVINO Backend
 
-python -m extension.llm.export.export_llm \
+## Download the Model
+Follow the [instructions](../../examples/models/llama#step-2-prepare-model) to download the required model files. Exporting Llama with the OpenVINO backend is currently verified only with Llama-3.2-1B variants.
+
+## Environment Setup
+Follow the **Prerequisites** and **Setup** [instructions](../../backends/openvino/README.md) in `backends/openvino/README.md` to set up the OpenVINO backend.
+
+## Export the Model
+Navigate to `<executorch_root>/examples/openvino/llama` and run the commands below to export the model. Update the model file paths to match the location where your model is downloaded.
+
+```
+LLAMA_CHECKPOINT=<path/to/model/folder>/consolidated.00.pth
+LLAMA_PARAMS=<path/to/model/folder>/params.json
+LLAMA_TOKENIZER=<path/to/model/folder>/tokenizer.model
+
+python -m executorch.extension.llm.export.export_llm \
 --config llama3_2_ov_4wo_config.yaml \
 +base.model_class="llama3_2" \
 +base.checkpoint="${LLAMA_CHECKPOINT:?}" \
 +base.params="${LLAMA_PARAMS:?}" \
-+base.tokenizer_path="${LLAMA_TOKENIZER:?}" \
++base.tokenizer_path="${LLAMA_TOKENIZER:?}"
+```
+
+## Build the OpenVINO C++ Runtime with the Llama Runner
+First, build the backend libraries by running the script below from the `<executorch_root>/backends/openvino/scripts` folder:
+```bash
+./openvino_build.sh --cpp_runtime
+```
+Then build the llama runner by running the same script with the `--llama_runner` argument, also from the `<executorch_root>/backends/openvino/scripts` folder:
+```bash
+./openvino_build.sh --llama_runner
+```
+The executable is saved to `<executorch_root>/cmake-out/examples/models/llama/llama_main`.
+
+## Execute Inference Using the Llama Runner
+Update the tokenizer file path to match the location where your model is downloaded, and replace the prompt with your own.
+```
+./cmake-out/examples/models/llama/llama_main --model_path=llama3_2.pte --tokenizer_path=<path/to/model/folder>/tokenizer.model --prompt="Your custom prompt"
+```
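One detail in the new export command worth calling out: the `${LLAMA_CHECKPOINT:?}`-style expansions make the shell abort with an error when a variable is unset or empty, so a mistyped or forgotten path fails fast instead of handing the exporter a blank argument. A minimal illustration in plain bash, independent of ExecuTorch:

```bash
# "${VAR:?}" expands to the value of VAR, but aborts with a diagnostic
# if VAR is unset or empty.
unset LLAMA_CHECKPOINT
echo "${LLAMA_CHECKPOINT:?}"   # bash: LLAMA_CHECKPOINT: parameter null or not set

LLAMA_CHECKPOINT=/models/Llama-3.2-1B/consolidated.00.pth   # hypothetical path
echo "${LLAMA_CHECKPOINT:?}"   # prints the path and the script continues
```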

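Putting the new steps together, the sketch below strings the README's commands into one script. It assumes the ExecuTorch repository root as the working directory, placeholder model paths, and that the export leaves `llama3_2.pte` in the folder it is run from; those locations are assumptions on top of what the README shows, so adjust them to your setup.

```bash
#!/usr/bin/env bash
# End-to-end sketch of the flow in the updated README (export -> build -> run).
# Run from the ExecuTorch repository root; <path/to/model/folder> is a placeholder.
set -euo pipefail

MODEL_DIR=<path/to/model/folder>            # downloaded Llama-3.2-1B files
LLAMA_CHECKPOINT=${MODEL_DIR}/consolidated.00.pth
LLAMA_PARAMS=${MODEL_DIR}/params.json
LLAMA_TOKENIZER=${MODEL_DIR}/tokenizer.model

# Export the model using the provided OpenVINO config (llama3_2_ov_4wo_config.yaml).
pushd examples/openvino/llama
python -m executorch.extension.llm.export.export_llm \
    --config llama3_2_ov_4wo_config.yaml \
    +base.model_class="llama3_2" \
    +base.checkpoint="${LLAMA_CHECKPOINT:?}" \
    +base.params="${LLAMA_PARAMS:?}" \
    +base.tokenizer_path="${LLAMA_TOKENIZER:?}"
popd

# Build the OpenVINO C++ runtime, then the llama runner.
pushd backends/openvino/scripts
./openvino_build.sh --cpp_runtime
./openvino_build.sh --llama_runner
popd

# Run inference (assumes the export wrote llama3_2.pte into examples/openvino/llama;
# adjust --model_path if your config writes it elsewhere).
./cmake-out/examples/models/llama/llama_main \
    --model_path=examples/openvino/llama/llama3_2.pte \
    --tokenizer_path="${LLAMA_TOKENIZER:?}" \
    --prompt="Your custom prompt"
```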