
Commit 217c9c4

Refactor program-data separation example
1 parent b6ada3d commit 217c9c4

5 files changed: +68 −45 lines changed


program-data-separation/README.md

Lines changed: 16 additions & 44 deletions
@@ -1,6 +1,10 @@
# Program Data Separation Examples

-This directory provides an example of the Program Data Separation APIs in ExecuTorch.
+This directory provides an example of the Program Data Separation APIs in ExecuTorch. Specifically, it showcases:
+1. Simple program-data separation examples using portable operators and XNNPACK.
+2. A LoRA inference example in which a LoRA model and a non-LoRA model share foundation weights.

## Program Data Separation

The program-data separation APIs allow users to generate a separate data file when exporting and lowering a model, i.e. generate a PTE file containing the model execution program and one (or more) [PTD](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/README.md) files containing only weights.
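As a rough illustration of that flow, the sketch below captures a small linear model with `torch.export` and lowers it to ExecuTorch. The model, file name, and the `external_constants` option are assumptions for illustration only; the example's actual export code lives in `export_linear.py` (see below).

```python
import torch
from torch.export import export
from executorch.exir import ExecutorchBackendConfig, to_edge

# Illustrative stand-in for the example's linear model.
class LinearModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

# Capture the program with torch.export.
exported_program = export(LinearModule(), (torch.randn(1, 4),))

# Lower to ExecuTorch. `external_constants=True` is an *assumed* option name
# for routing weights into a separate PTD file; check export_linear.py for
# the API the example actually uses.
et_program = to_edge(exported_program).to_executorch(
    ExecutorchBackendConfig(external_constants=True)
)

# The execution program goes into the PTE file; the weights are written to a
# companion PTD file by the ExecuTorch data-separation APIs.
with open("linear.pte", "wb") as f:
    f.write(et_program.buffer)
```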

@@ -9,13 +13,6 @@ PTD files are used to store data outside of the PTE file. Some use-cases:
- Deduplication: sharing model weights between multiple executable PTE files. This can significantly reduce binary file size and runtime memory usage.
- Flexible deployment: allow async updates between program and data, especially if they are updated with different cadences.

-## LoRA
-A major use-case that program-data separation enables is inference with multiple LoRA adapters. LoRA is a fine-tuning technique introduced in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA fine-tuning produces lightweight 'adapter' weights that can be applied to an existing model to adapt it to a new task. LoRA adapters are typically small in comparison to LLM foundation weights, generally on the order of KB to MB depending on the finetuning setup and model size.
-
-With program-data separation, users can generate a PTE file containing the program and LoRA weights, and save the original foundation weights to a separate PTD file. Provided they are based on the same underlying model, multiple LoRA-adapted PTE files can share the same foundation weights. This means adding a model adapted to a new task incurs minimal binary size and runtime memory overhead: the cost of the LoRA adapter weights.
-
-An example of this usage is coming soon.

## Virtual environment setup
Create and activate a Python virtual environment:
```bash
@@ -27,23 +24,20 @@ conda create -yn executorch-ptd python=3.10.0 && conda activate executorch-ptd
```

Install dependencies:
-[Please install ExecuTorch pip package from source](https://docs.pytorch.org/executorch/stable/using-executorch-building-from-source.html#install-executorch-pip-package-from-source), until executorch==0.7.0 is released.
```
pip install executorch==0.7.0
```

## Export a model with program-data separation
To export a non-delegated linear model into the current directory:
```python
-python export.py --outdir .
+python export_linear.py --outdir .
```
Expect the files 'linear.pte' and 'linear.ptd'.

To export a linear model delegated to XNNPACK into the current directory:
```python
-python export.py --outdir . --xnnpack
+python export_linear.py --outdir . --xnnpack
```
Expect the files 'linear_xnnpack.pte' and 'linear_xnnpack.ptd'.
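After either export command, a quick sanity check is to confirm that the program and the data really landed in separate files. The snippet below simply lists the expected artifacts and their sizes (file names as produced above; nothing ExecuTorch-specific is assumed).

```python
import os

# The program (.pte) and the weights (.ptd) are written as separate artifacts.
for name in ("linear.pte", "linear.ptd", "linear_xnnpack.pte", "linear_xnnpack.ptd"):
    if os.path.exists(name):
        print(f"{name}: {os.path.getsize(name):,} bytes")
    else:
        print(f"{name}: not found (run the matching export command first)")
```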

@@ -53,38 +47,16 @@ Note:

For more information on the PTD data format, please see the [flat_tensor](https://github.com/pytorch/executorch/blob/main/extension/flat_tensor/README.md) directory.

-## Runtime (cpp)
-The cpp/ directory contains the executorch submodule along with a main.cpp file that demonstrates how to load the PTE and PTD files and execute the program.
-
-First, export your PTE and PTD files using the instructions above.
-
-**Build instructions**
-
-Change to the cpp directory.
-```
-cd cpp
-```
-
-Create build directory if it doesn't exist.
-```
-mkdir -p build
-cd build
-```
+Please see [program-data-separation/cpp](cpp/) for instructions on running the exported models.

-Configure CMake.
-```
-cmake -DCMAKE_BUILD_TYPE=Release ..
-```
+## Export a model with LoRA
+A major use-case that program-data separation enables is inference with multiple LoRA adapters. LoRA is a fine-tuning technique introduced in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA fine-tuning produces lightweight 'adapter' weights that can be applied to an existing model to adapt it to a new task. LoRA adapters are typically small in comparison to LLM foundation weights, on the order of KB to MB depending on the finetuning setup and model size.

-Build the project.
-```
-cmake --build . -j$(nproc)
-echo "Build complete! Executable located at: ./bin/executorch_program_data_separation"
-```
+To enable LoRA, we generate:
+- PTE file(s): containing the program and LoRA adapter weights.
+- PTD file: containing the foundation weights.

-Run the executable.
-```
-./bin/executorch_program_data_separation --model-path ../../linear.pte --data-path ../../linear.ptd
+Multiple LoRA-adapted PTE files can share the same foundation weights, so adding a model adapted to a new task incurs minimal binary size and runtime memory overhead.

-./bin/executorch_program_data_separation --model-path ../../linear_xnnpack.pte --data-path ../../linear_xnnpack.ptd
-```
+### Requirements
+LoRA is currently supported on ExecuTorch main. Until executorch==1.0 is released, please [install the ExecuTorch pip package from source](https://docs.pytorch.org/executorch/stable/using-executorch-building-from-source.html#install-executorch-pip-package-from-source).
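To make the size argument above concrete, here is a minimal LoRA-style linear layer in plain PyTorch (illustrative only; the real example uses the ExecuTorch LLM export flow, and the dimensions are hypothetical). The frozen base weight is the kind of tensor that would live in the shared PTD file, while only the small adapter matrices differ per task.

```python
import torch

class LoRALinear(torch.nn.Module):
    """Minimal LoRA-style layer: y = base(x) + scale * B(A(x))."""
    def __init__(self, in_features=4096, out_features=4096, rank=16, alpha=32.0):
        super().__init__()
        # Foundation weight: frozen and shared across tasks (candidate for the PTD file).
        self.base = torch.nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False
        # Adapter weights: small and trained per task (kept with each PTE).
        self.lora_a = torch.nn.Linear(in_features, rank, bias=False)
        self.lora_b = torch.nn.Linear(rank, out_features, bias=False)
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

layer = LoRALinear()
base_params = sum(p.numel() for p in layer.base.parameters())
adapter_params = sum(
    p.numel() for p in (*layer.lora_a.parameters(), *layer.lora_b.parameters())
)
print(f"base: {base_params:,} params, adapter: {adapter_params:,} params")
# base: 16,777,216 params, adapter: 131,072 params -> the adapter is under 1% of the base.
```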

program-data-separation/cpp/README.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# ExecuTorch Program Data Separation Demo (C++)
+
+This directory contains the C++ code to run the examples generated in [program-data-separation](../program-data-separation/README.md).
+
+## Build instructions
+0. Export the model(s). See [program-data-separation](../program-data-separation/README.md) for instructions.
+1. The ExecuTorch repository is configured as a git submodule at `~/executorch-examples/program-data-separation/cpp/executorch`. To initialize it:
+```bash
+cd ~/executorch-examples/
+git submodule sync
+git submodule update --init --recursive
+```
+2. Install the dev requirements for ExecuTorch:
+```bash
+cd ~/executorch-examples/program-data-separation/cpp/executorch
+pip install -r requirements-dev.txt
+```
+
+## Program-data separation demo
+**Build instructions**
+
+Build the executable:
+```bash
+cd ~/executorch-examples/program-data-separation/cpp
+chmod +x build_example.sh
+./build_example.sh
+```
+
+Run the executable:
+```
+./bin/executorch_program_data_separation --model-path ../../linear.pte --data-path ../../linear.ptd
+
+./bin/executorch_program_data_separation --model-path ../../linear_xnnpack.pte --data-path ../../linear_xnnpack.ptd
+```
+
+## LoRA demo
program-data-separation/cpp/build_example.sh

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+#!/bin/bash
+set -e
+
+# Create build directory if it doesn't exist
+mkdir -p build
+cd build
+
+# Configure CMake
+cmake -DCMAKE_BUILD_TYPE=Release ..
+
+# Build the project
+cmake --build . -j$(nproc)
+
+echo "Build complete! Executable located at: ./bin/executorch_program_data_separation"
Submodule executorch updated 1095 files
