
Commit 5535b79

Merge pull request #2252 from annietllnd/fix
Update ExecuTorch with RPi 5 Learning Path
2 parents 73de356 + 94b66ae commit 5535b79

File tree

4 files changed: +41 -78 lines changed


content/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/2-env-setup.md

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ Run the commands below to set up the ExecuTorch internal dependencies:
 
 ```bash
 git submodule sync
-git submodule update --init
+git submodule update --init --recursive
 ./install_executorch.sh
 ```

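The change above (repeated in `executorch.md` below) adds `--recursive`, which matters because several ExecuTorch submodules contain nested submodules of their own. As a minimal sketch for confirming nothing was left uninitialized, using only standard `git` commands (this check is not part of the Learning Path itself):

```bash
# From inside the executorch checkout: list every submodule, including nested
# ones. A status line starting with '-' marks a submodule that is still
# uninitialized.
git submodule status --recursive | grep '^-' \
  && echo "Missing submodules; rerun: git submodule update --init --recursive" \
  || echo "All submodules initialized"
```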
content/learning-paths/embedded-and-microcontrollers/rpi-llama3/executorch.md

Lines changed: 2 additions & 2 deletions
@@ -60,9 +60,9 @@ After cloning the repository, the project's submodules are updated, and two scri
 git clone https://github.com/pytorch/executorch.git
 cd executorch
 git submodule sync
-git submodule update --init
+git submodule update --init --recursive
 ./install_executorch.sh
-./examples/models/llama2/install_requirements.sh
+./examples/models/llama/install_requirements.sh
 ```
 
 When these scripts finish successfully, ExecuTorch is all set up. That means it's time to dive into the world of Llama models!

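Since `install_executorch.sh` installs the `executorch` Python package into the active environment, a quick sanity check after the scripts finish is to import the package. A sketch, assuming you run it in the same Python environment the install script used:

```bash
# A successful install prints the package location; a failed one raises ImportError.
python -c "import executorch; print('executorch installed at:', executorch.__file__)"
```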
content/learning-paths/embedded-and-microcontrollers/rpi-llama3/llama3.md

Lines changed: 23 additions & 59 deletions
@@ -23,52 +23,21 @@ The next steps explain how to compile and run the Llama 3 model.
 
 ## Download and export the Llama 3 8B model
 
-To get started with Llama 3, you can obtain the pre-trained parameters by visiting [Meta's Llama Downloads](https://llama.meta.com/llama-downloads/) page.
+To get started with Llama 3, you can obtain the pre-trained parameters by visiting [Meta's Llama Downloads](https://llama.meta.com/llama-downloads/) page.
 
 Request access by filling out your details, and read through and accept the Responsible Use Guide. This grants you a license and a download link that is valid for 24 hours. The Llama 3 8B model is used for this part, but the same instructions apply for other models.
 
-Clone the Llama 3 Git repository and install the dependencies:
+Use the `llama-stack` library to download the model once the license is granted.
 
 ```bash
-git clone https://github.com/meta-llama/llama-models
-cd llama-models
-pip install -e .
-pip install buck torchao
+pip install llama-stack
+llama model download --source meta --model-id meta-llama/Llama-3.1-8B
 ```
 
-Run the script to download, and paste the download link from the email when prompted:
-
-```bash
-cd models/llama3_1
-./download.sh
-```
-
-You are asked which models you would like to download. Enter `meta-llama-3.1-8b` to get the model used for this Learning Path:
-
-```output
-**** Model list ***
-- meta-llama-3.1-405b
-- meta-llama-3.1-70b
-- meta-llama-3.1-8b
-- meta-llama-guard-3-8b
-- prompt-guard
-```
-
-After entering `meta-llama-3.1-8b` you are prompted again with the available models:
-
-```output
-**** Available models to download: ***
-- meta-llama-3.1-8b-instruct
-- meta-llama-3.1-8b
-Enter the list of models to download without spaces or press Enter for all:
-```
-
-Enter `meta-llama-3.1-8b` to start the download.
-
 When the download is finished, you can list the files in the new directory:
 
 ```bash
-ls Meta-Llama-3.1-8B
+ls /home/pi/.llama/checkpoints/Llama3.1-8B
 ```
 
 The output is:
@@ -85,34 +54,26 @@ If you encounter the error "Sorry, we could not process your request at this mom
 
 The next step is to generate a `.pte` file that can be used for prompts. From the `executorch` directory, compile the model executable. Note the quantization option, which reduces the model size significantly.
 
-If you've followed the tutorial, this should now take you to the `executorch` base directory.
-
-Navigate back to the top-level directory of the `executorch` repository:
-
-```bash {cwd="executorch"}
-cd ../../../
-```
-
-You are now in `$HOME/executorch` and ready to create the model file for ExecuTorch.
+If you've followed the tutorial, you should be in the `executorch` base directory.
 
-Run the Python command below to create the model file, `llama3_kv_sdpa_xnn_qe_4_32.pte`.
+Run the Python command below to create the model file, `llama3_kv_sdpa_xnn_qe_4_32.pte`.
 
 ```bash
-python -m examples.models.llama2.export_llama --checkpoint llama-models/models/llama3_1/Meta-Llama-3.1-8B/consolidated.00.pth \
--p llama-models/models/llama3_1/Meta-Llama-3.1-8B/params.json -kv --use_sdpa_with_kv_cache -X -qmode 8da4w \
+python -m examples.models.llama.export_llama --checkpoint /home/pi/.llama/checkpoints/Llama3.1-8B/consolidated.00.pth \
+-p /home/pi/.llama/checkpoints/Llama3.1-8B/params.json -kv --use_sdpa_with_kv_cache -X -qmode 8da4w \
 --group_size 128 -d fp32 --metadata '{"get_bos_id":128000, "get_eos_id":128001}' \
 --embedding-quantize 4,32 --output_name="llama3_kv_sdpa_xnn_qe_4_32.pte"
 ```
 
-Where `consolidated.00.pth` and `params.json` are the paths to the downloaded model files, found in `llama3/Meta-Llama-3-8B`.
+Where `consolidated.00.pth` and `params.json` are the paths to the downloaded model files, found in `/home/pi/.llama/checkpoints/Llama3.1-8B`.
 
-This step takes some time and will run out of memory if you have 32 GB RAM or less.
+This step takes some time and will run out of memory if you have 32 GB RAM or less.
 
 ## Compile and build the executable
 
 Follow the steps below to build ExecuTorch and the Llama runner to run models.
 
-The final step for running the model is to build `llama_main` and `llama_main` which are used to run the Llama 3 model.
+The final step for running the model is to build the `llama_main` runner, which is used to run the Llama 3 model.
 
 First, compile and build ExecuTorch with `cmake`:
 
@@ -127,6 +88,9 @@ cmake -DPYTHON_EXECUTABLE=python \
 -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
 -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
 -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
+-DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+-DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON \
+-DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
 -Bcmake-out .
 cmake --build cmake-out -j16 --target install --config Release
 ```
@@ -141,9 +105,9 @@ cmake -DPYTHON_EXECUTABLE=python \
 -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
 -DEXECUTORCH_BUILD_XNNPACK=ON \
 -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
--Bcmake-out/examples/models/llama2 \
-examples/models/llama2
-cmake --build cmake-out/examples/models/llama2 -j16 --config Release
+-Bcmake-out/examples/models/llama \
+examples/models/llama
+cmake --build cmake-out/examples/models/llama -j16 --config Release
 ```
 
 The CMake build options are available on [GitHub](https://github.com/pytorch/executorch/blob/main/CMakeLists.txt#L59).
@@ -152,17 +116,17 @@ When the build completes, you have everything you need to test the model.
 
 ## Run the model
 
-Use `llama_main` to run the model with a sample prompt:
+Use `llama_main` to run the model with a sample prompt:
 
 ``` bash
-cmake-out/examples/models/llama2/llama_main \
+cmake-out/examples/models/llama/llama_main \
 --model_path=llama3_kv_sdpa_xnn_qe_4_32.pte \
---tokenizer_path=./llama-models/models/llama3_1/Meta-Llama-3.1-8B/tokenizer.model \
+--tokenizer_path=/home/pi/.llama/checkpoints/Llama3.1-8B/tokenizer.model \
 --cpu_threads=4 \
 --prompt="Write a python script that prints the first 15 numbers in the Fibonacci series. Annotate the script with comments explaining what the code does."
 ```
 
-You can use `cmake-out/examples/models/llama2/llama_main --help` to read about the options.
+You can use `cmake-out/examples/models/llama/llama_main --help` to read about the options.
 
 If all goes well, you will see the model output along with some memory statistics. Some output has been omitted for better readability.
 
@@ -185,5 +149,5 @@ I 00:00:46.844400 executorch:runner.cpp:134] append_eos_to_prompt: 0
 
 You now know how to run a Llama model in Raspberry Pi OS using ExecuTorch. You can experiment with different prompts and different numbers of CPU threads.
 
-If you have access to the RPi 5, continue to the next section to see how to deploy the software to the board and run it.
+If you have access to the RPi 5, continue to the next section to see how to deploy the software to the board and run it.

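For context on the export flags above: `-qmode 8da4w` selects 8-bit dynamically quantized activations with 4-bit weights, and `--group_size 128` stores one scale per group of 128 weights. As a back-of-envelope estimate, 8B parameters at 4 bits each is about 8e9 × 0.5 bytes ≈ 4 GB plus per-group scales, versus roughly 16 GB for the bf16 checkpoint, so the `.pte` should land in the 4-5 GB range. A hedged size check (exact numbers will vary):

```bash
# Compare the quantized export against the original checkpoint; expect the
# .pte file to be roughly a quarter of the checkpoint's size.
ls -lh llama3_kv_sdpa_xnn_qe_4_32.pte \
       /home/pi/.llama/checkpoints/Llama3.1-8B/consolidated.00.pth
```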
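The two `cmake` builds hard-code `-j16`; on a machine with fewer cores, `-j$(nproc)` is a safer default. Before collecting files in the next section, it is also worth confirming that the runner binary actually appeared. A small sketch, with the path taken from the build commands above:

```bash
# Verify the Llama runner was built and is executable.
test -x cmake-out/examples/models/llama/llama_main \
  && echo "llama_main built" \
  || echo "llama_main missing: check the cmake build output"
```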
content/learning-paths/embedded-and-microcontrollers/rpi-llama3/run.md

Lines changed: 15 additions & 16 deletions
@@ -9,7 +9,7 @@ This final section explains how to test the model by experimenting with differen
 
 ## Set up your Raspberry Pi 5
 
-If you want to see how the LLM behaves in an embedded environment, you need a Raspberry Pi 5 running Raspberry Pi OS.
+If you want to see how the LLM behaves in an embedded environment, you need a Raspberry Pi 5 running Raspberry Pi OS.
 
 Install Raspberry Pi OS using the [Raspberry Pi documentation](https://www.raspberrypi.com/documentation/computers/getting-started.html). There are numerous ways to prepare an SD card, but Raspberry Pi recommends [Raspberry Pi Imager](https://www.raspberrypi.com/software/) on a Windows, Linux, or macOS computer with an SD card slot or SD card adapter.
 
@@ -19,22 +19,21 @@ The 8GB RAM Raspberry Pi 5 model is preferred for exploring an LLM.
 
 ## Collect the files into an archive
 
-There are just a few files that you need to transfer to the Raspberry Pi 5. You can bundle them together and transfer them from the running container to the development machine, and then to the Raspberry Pi 5.
+There are just a few files that you need to transfer to the Raspberry Pi 5. You can bundle them together and transfer them from the running container to the development machine, and then to the Raspberry Pi 5.
 
-You should still be in the container, in the `$HOME/executorch` directory.
+You should still be in the container, in the `$HOME/executorch` directory.
 
 The commands below copy the needed files to a new directory. The model file is very large and takes time to copy.
 
 Run the commands below to collect the files:
 
 ```bash
 mkdir llama3-files
-cp cmake-out/examples/models/llama2/llama_main ./llama3-files/llama_main
-cp llama-models/models/llama3_1/Meta-Llama-3.1-8B/params.json ./llama3-files/params.json
-cp llama-models/models/llama3_1/Meta-Llama-3.1-8B/tokenizer.model ./llama3-files/tokenizer.model
+cp cmake-out/examples/models/llama/llama_main ./llama3-files/llama_main
+cp /home/pi/.llama/checkpoints/Llama3.1-8B/params.json ./llama3-files/params.json
+cp /home/pi/.llama/checkpoints/Llama3.1-8B/tokenizer.model ./llama3-files/tokenizer.model
 cp llama3_kv_sdpa_xnn_qe_4_32.pte ./llama3-files/llama3_kv_sdpa_xnn_qe_4_32.pte
-cp ./cmake-out/examples/models/llama2/runner/libllama_runner.so ./llama3-files
-cp ./cmake-out/lib/libextension_module.so ./llama3-files
+cp ./cmake-out/examples/models/llama/runner/libllama_runner.so ./llama3-files
 ```
 
 Compress the files into an archive using the `tar` command:
@@ -45,7 +44,7 @@ tar czvf llama3-files.tar.gz ./llama3-files
 
 Next, copy the compressed tar file out of the container to the development computer. This is done using the `docker cp` command from the development machine.
 
-Open a new shell or terminal on the development machine where Docker is running the container.
+Open a new shell or terminal on the development machine where Docker is running the container.
 
 Find the `CONTAINER ID` for the running container:
 
@@ -60,7 +59,7 @@ CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAME
 88c34c899c8c rpi-os "/bin/bash" 7 hours ago Up 7 hours fervent_vaughan
 ```
 
-Your `CONTAINER ID` will be different so substitute your value.
+Your `CONTAINER ID` will be different, so substitute your value.
 
 Copy the compressed file out of the container:
 
@@ -70,17 +69,17 @@ docker cp 88c34c899c8c:/home/pi/executorch/llama3-files.tar.gz .
 
 ## Transfer the archive to the Raspberry Pi 5
 
-Now you can transfer the archive from the development machine to your Raspberry Pi 5.
+Now you can transfer the archive from the development machine to your Raspberry Pi 5.
 
-There are multiple ways to do this: via cloud storage services, with a USB thumb drive, or using SSH. Use any method that is convenient for you.
+There are multiple ways to do this: via cloud storage services, with a USB thumb drive, or using SSH. Use any method that is convenient for you.
 
 For example, you can use `scp` running from a terminal in your Raspberry Pi 5 device as shown. Follow the same option as you did in the previous step.
 
 ```bash
 scp llama3-files.tar.gz <pi-user>@<pi-ip>:~/
 ```
 
-Substitute the username and the IP address of the Raspberry Pi 5.
+Substitute the username and the IP address of the Raspberry Pi 5.
 
 The file is very large so you can also consider using a USB drive.
 
@@ -91,7 +90,7 @@ Finally, log in to the Raspberry Pi 5 and run the model in a terminal using the
 Extract the file:
 
 ```bash
-tar xvfz llama3-files.tar.gz
+tar xvfz llama3-files.tar.gz
 ```
 
 Change to the new directory:
@@ -108,9 +107,9 @@ LD_LIBRARY_PATH=. ./llama_main --model_path=llama3_kv_sdpa_xnn_qe_4_32.pte --to
 ```
 
 {{% notice Note %}}
-The `llama_main` program uses dynamic linking, so you need to inform the dynamic linker to look for the 2 libraries in the current directory.
+The `llama_main` program uses dynamic linking, so you need to inform the dynamic linker to look for `libllama_runner.so` in the current directory.
 {{% /notice %}}
 
 From here, you can experiment with different prompts and command line options on your Raspberry Pi 5.
 
-Make sure to exit your container and clean up any development resources you created.
+Make sure to exit your container and clean up any development resources you created.

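On the `docker cp` step: rather than copying the `CONTAINER ID` by hand, you can let Docker resolve it with a filter. This is a sketch that assumes the container was started from an image named `rpi-os`, as in the example `docker ps` output above:

```bash
# Resolve the ID of the running rpi-os container, then copy the archive out.
docker cp "$(docker ps -q --filter ancestor=rpi-os)":/home/pi/executorch/llama3-files.tar.gz .
```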
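For the dynamic-linking note at the end: `ldd` shows which shared libraries `llama_main` will load, which is a quick way to confirm that `libllama_runner.so` resolves to the copy in the current directory rather than showing up as "not found". A sketch, run from inside the extracted `llama3-files` directory:

```bash
# With LD_LIBRARY_PATH=. set, libllama_runner.so should resolve to the local copy.
LD_LIBRARY_PATH=. ldd ./llama_main | grep llama
```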