# ExecuTorch on Raspberry Pi

## TLDR

This tutorial demonstrates how to deploy **Llama models on Raspberry Pi 4/5 devices** using ExecuTorch:

- **Prerequisites**: Linux host machine, Python 3.10-3.12, conda environment, Raspberry Pi 4/5
- **Setup**: Automated cross-compilation using the `setup.sh` script for ARM toolchain installation
- **Export**: Convert Llama models to the optimized `.pte` format, with quantization options
- **Deploy**: Transfer binaries to the Raspberry Pi and configure runtime libraries
- **Optimize**: Apply build and runtime performance tuning
- **Result**: Efficient on-device Llama inference

## Prerequisites and Hardware Requirements

### Host Machine Requirements

**Operating System**: Linux x86_64 (Ubuntu 20.04+ or CentOS Stream 9+)

**Software Dependencies**:

- **Python 3.10-3.12** (ExecuTorch requirement)
- **conda** or **venv** for environment management
- **CMake 3.29.6+** for cross-compilation
- **Git** for repository cloning

### Target Device Requirements

**Supported Devices**: **Raspberry Pi 4** and **Raspberry Pi 5** with a **64-bit OS**

**Memory and Storage Requirements** (see the quick on-device check below):

- **Minimum 4GB RAM** (8GB recommended for larger models)
- **8GB+ storage** for model files and binaries
- **64-bit Raspberry Pi OS** (Bullseye or newer)
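
You can confirm these requirements directly on the Raspberry Pi with standard tools (a quick sketch; exact outputs vary across OS releases):

```bash
# Architecture check - should print: aarch64 (64-bit OS)
uname -m

# Available RAM and free storage
free -h
df -h ~

# OS release - should be Bullseye or newer
grep PRETTY_NAME /etc/os-release
```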

### Verification Commands

Verify your host machine compatibility:

```bash
# Check OS and architecture
uname -s           # Should output: Linux
uname -m           # Should output: x86_64

# Check Python version
python3 --version  # Should be 3.10-3.12

# Check required tools
which cmake git md5sum
cmake --version    # Should be 3.29.6 or newer
```

## Development Environment Setup

### Clone ExecuTorch Repository

First, clone the ExecuTorch repository with Raspberry Pi support:

```bash
# Create project directory
mkdir ~/executorch-rpi && cd ~/executorch-rpi

# Clone ExecuTorch repository
git clone -b release/1.0 https://github.com/pytorch/executorch.git
cd executorch
```

### Create Conda Environment

```bash
# Create conda environment
conda create -yn executorch python=3.10.0
conda activate executorch

# Upgrade pip
pip install --upgrade pip
```

### Alternative: Virtual Environment

If you prefer Python's built-in virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
```

Refer to {doc}`getting-started` for more details.
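
Depending on your checkout, you may also need to install ExecuTorch's Python components into the fresh environment before exporting models. The repository ships an install script for this; a minimal sketch, run from the `executorch` repository root:

```bash
# Install ExecuTorch Python packages and their dependencies
# (re-run after switching branches or pulling new commits)
./install_executorch.sh
```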

## Cross-Compilation Toolchain Setup

Run the automated cross-compilation script on your Linux host machine. It installs the ARM toolchain and builds the Llama runner for the target device:

```bash
# Run the Raspberry Pi setup script for Pi 5
./examples/raspberry_pi/setup.sh pi5
```

Expected output (abridged):

```
[100%] Linking CXX executable llama_main
[100%] Built target llama_main
[SUCCESS] LLaMA runner built successfully

==== Verifying Build Outputs ====
[SUCCESS] ✓ llama_main (6.1M)
[SUCCESS] ✓ libllama_runner.so (4.0M)
[SUCCESS] ✓ libextension_module.a (89K) - static library

✓ ExecuTorch cross-compilation setup completed successfully!
```

## Model Preparation and Export

### Download Llama Models

Download the Llama model from Hugging Face or another source, and make sure the following files are present:

- `consolidated.00.pth` (model weights)
- `params.json` (model config)
- `tokenizer.model` (tokenizer)

### Export Llama to ExecuTorch Format

After downloading the Llama model, export it to the ExecuTorch `.pte` format using the provided export script:

```bash
# Set these paths to point to the downloaded files. The following is an
# example invocation that exports a Llama 3.2 model with the XNNPACK
# SpinQuant configuration.
LLAMA_QUANTIZED_CHECKPOINT=path/to/consolidated.00.pth
LLAMA_PARAMS=path/to/params.json

python -m extension.llm.export.export_llm \
    --config examples/models/llama/config/llama_xnnpack_spinquant.yaml \
    +base.model_class="llama3_2" \
    +base.checkpoint="${LLAMA_QUANTIZED_CHECKPOINT:?}" \
    +base.params="${LLAMA_PARAMS:?}"
```

The file `llama3_2.pte` will be generated in the directory where you run the command.
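
Before moving on, it is worth sanity-checking the exported artifact; a small sketch using the tools from the prerequisites (this assumes `tokenizer.model` is also in the current directory, as the transfer step below does):

```bash
# Confirm the exported model exists and note its size
ls -lh llama3_2.pte

# Record checksums to verify the transfer later
md5sum llama3_2.pte tokenizer.model
```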

## Raspberry Pi Deployment

### Transfer Binaries to Raspberry Pi

After successful cross-compilation, transfer the required files:

```bash
# Set Raspberry Pi details
export RPI_UN="pi"                  # Your Raspberry Pi username
export RPI_IP="your-rpi-ip-address"

# Create deployment directory on the Raspberry Pi
ssh $RPI_UN@$RPI_IP 'mkdir -p ~/executorch-deployment'

# Copy main executable
scp cmake-out/examples/models/llama/llama_main $RPI_UN@$RPI_IP:~/executorch-deployment/

# Copy runtime library
scp cmake-out/examples/models/llama/runner/libllama_runner.so $RPI_UN@$RPI_IP:~/executorch-deployment/

# Copy model file and tokenizer
scp llama3_2.pte $RPI_UN@$RPI_IP:~/executorch-deployment/
scp ./tokenizer.model $RPI_UN@$RPI_IP:~/executorch-deployment/
```
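
To rule out a truncated or failed copy, compare the checksums recorded earlier against the copies on the device (a quick sketch using the same `md5sum` tool):

```bash
# Checksums on the Pi should match the ones recorded on the host
ssh $RPI_UN@$RPI_IP 'cd ~/executorch-deployment && md5sum llama3_2.pte tokenizer.model'
```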

### Configure Runtime Libraries on Raspberry Pi

SSH into your Raspberry Pi and configure the runtime:

#### Set up library environment

```bash
cd ~/executorch-deployment
echo 'export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH' > setup_env.sh
chmod +x setup_env.sh

# Make the binary executable
chmod +x llama_main
```

## Dry Run

```bash
source setup_env.sh
./llama_main --help
```

Make sure the output does not contain any GLIBC or other library-mismatch errors. If it does, follow the troubleshooting steps below.
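
You can also check for unresolved libraries and GLIBC requirements explicitly with standard tools (a quick sketch, run from the deployment directory):

```bash
# Flag any shared libraries that fail to resolve
ldd ./llama_main | grep "not found" || echo "all shared libraries resolved"

# Show the highest GLIBC version the binary requires
strings ./llama_main | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -n 1
```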

## Troubleshooting

### Issue 1: GLIBC Version Mismatch

**Problem:** The binary was compiled with a newer GLIBC version (2.38) than what's available on your Raspberry Pi (2.36).

**Error Symptoms:**

```bash
./llama_main: /lib/aarch64-linux-gnu/libm.so.6: version `GLIBC_2.38' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.15' not found (required by ./llama_main)
./llama_main: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/libllama_runner.so)
```

#### Solution A: Upgrade GLIBC on Raspberry Pi (Recommended)

1. **Check your current GLIBC version:**

```bash
ldd --version
# Output: ldd (Debian GLIBC 2.36-9+rpt2+deb12u12) 2.36
```

2. **Upgrade to a newer GLIBC:**

```bash
# Add Debian unstable repository
echo "deb http://deb.debian.org/debian sid main contrib non-free" | sudo tee -a /etc/apt/sources.list

# Update package lists
sudo apt update

# Install newer GLIBC packages
sudo apt-get -t sid install libc6 libstdc++6

# Reboot system
sudo reboot
```

3. **Test the fix:**

```bash
cd ~/executorch-deployment
source setup_env.sh
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "Hello"
```

**Important Notes:**

- Select "Yes" when prompted to restart services
- Press Enter to keep the current version of configuration files
- Back up important data before upgrading

#### Solution B: Rebuild with Raspberry Pi's GLIBC (Advanced)

If you prefer not to upgrade your Raspberry Pi system:

1. **Copy the Pi's filesystem to the host machine:**

```bash
# On the Raspberry Pi - install rsync
ssh pi@<your-rpi-ip>
sudo apt update && sudo apt install rsync
exit

# On the host machine - copy the Pi's filesystem
mkdir -p ~/rpi5-sysroot
rsync -aAXv --exclude={"/proc","/sys","/dev","/run","/tmp","/mnt","/media","/lost+found"} \
    pi@<your-rpi-ip>:/ ~/rpi5-sysroot
```

2. **Update the CMake toolchain file** (`arm-toolchain-pi5.cmake`), replacing the existing sysroot line:

```cmake
# Replace this line:
# set(CMAKE_SYSROOT "${TOOLCHAIN_PATH}/aarch64-none-linux-gnu/libc")

# With this, pointing at the copied sysroot:
set(CMAKE_SYSROOT "/home/yourusername/rpi5-sysroot")
set(CMAKE_FIND_ROOT_PATH "${CMAKE_SYSROOT}")
```

3. **Rebuild the binaries:**

```bash
# Clean and rebuild
rm -rf cmake-out
./examples/raspberry_pi/setup.sh pi5 --force-rebuild

# Verify the required GLIBC version
strings ./cmake-out/examples/models/llama/llama_main | grep GLIBC_
# Should show at most GLIBC_2.36 (matching your Pi)
```

---

### Issue 2: Library Not Found

**Problem:** Required libraries are not found at runtime.

**Error Symptoms:**

```bash
./llama_main: error while loading shared libraries: libllama_runner.so: cannot open shared object file
```

**Solution:**

```bash
# Ensure you're in the correct directory and the environment is set
cd ~/executorch-deployment
source setup_env.sh
./llama_main --help
```

**Root Cause:** Either `LD_LIBRARY_PATH` is not set or you're not in the deployment directory.
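
If you would rather not source `setup_env.sh` in every new SSH session, one option is to pin the library path in your shell profile instead (a sketch; note that `setup_env.sh` resolves `$(pwd)` at source time, so a fixed path is safer here):

```bash
# Make the deployment directory's libraries visible in all new shells
echo 'export LD_LIBRARY_PATH=$HOME/executorch-deployment:$LD_LIBRARY_PATH' >> ~/.bashrc
```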

---

### Issue 3: Tokenizer JSON Parsing Warnings

**Problem:** Warning messages about JSON parsing errors when running the `llama_main` binary.

**Error Symptoms:**

```bash
E tokenizers:hf_tokenizer.cpp:60] Error parsing json file: [json.exception.parse_error.101]
```

**Solution:** These warnings can be safely ignored. They don't affect model inference.

---

## Quick Test Command

After resolving issues, test with:

```bash
cd ~/executorch-deployment
source setup_env.sh
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "What is the meaning of life?"
```

## Debugging Tools

Enable ExecuTorch logging:

```bash
# Set log level for debugging
export ET_LOG_LEVEL=Info
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --verbose
```
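
For a first-order latency number on-device (relevant to the tuning goals in the TLDR), the standard `time` utility is enough; a short prompt and a small `--seq_len` keep the run quick (a sketch reusing the flags from the commands above):

```bash
# Wall-clock timing of a short generation run
time ./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model \
    --seq_len 64 --prompt "Hello"
```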

## Final Run Command

```bash
cd ~/executorch-deployment
source setup_env.sh
./llama_main --model_path ./llama3_2.pte --tokenizer_path ./tokenizer.model --seq_len 128 --prompt "What is the meaning of life?"
```

Happy Inferencing!