Commit 1135dbb

add preliminary notes for lora (#47)
1 parent 2f4542b commit 1135dbb

File tree

1 file changed: +7 -0 lines changed

program-data-separation/README.md

Lines changed: 7 additions & 0 deletions
@@ -9,6 +9,13 @@ PTD files are used to store data outside of the PTE file. Some use-cases:
 - Deduplication: sharing model weights between multiple executable PTE files. This can significantly reduce binary file size and runtime memory usage.
 - Flexible deployment: allow async updates between program and data, especially if they are updated with different cadences.
 
+## LoRA
+A major use-case that program-data separation enables is inference with multiple LoRA adapters. LoRA is a fine-tuning technique introduced in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA fine-tuning produces lightweight 'adapter' weights that can be applied to an existing model to adapt it to a new task. LoRA adapters are typically small compared to the LLM foundation weights, generally on the order of KBs to MBs depending on the fine-tuning setup and model size.
+
+With program-data separation, users can generate a PTE file containing the program and LoRA weights, and save the original foundation weights to a separate PTD file. Provided they are based on the same underlying model, multiple LoRA-adapted PTE files can share the same foundation weights. This means adding a model adapted to a new task incurs minimal binary size and runtime memory overhead: only the cost of the LoRA adapter weights.
+
+An example of this usage is coming soon.
+
 ## Virtual environment setup
 Create and activate a Python virtual environment:
 ```bash
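
Note (not part of this commit): the LoRA paragraph above is easier to picture with a small sketch. The following is an illustrative PyTorch-only example of the standard low-rank update from the linked paper; it is not the ExecuTorch export flow (that example is still "coming soon"), but it shows why adapter weights are tiny relative to the frozen foundation weights that multiple adapters can share.

```python
# Illustrative sketch only -- not the ExecuTorch PTE/PTD export flow.
# Uses the standard LoRA update W' = W + (alpha / r) * B @ A from the paper
# linked in the README to show why adapters stay small and why several
# adapters can share one set of frozen foundation weights.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A linear layer whose foundation weight is frozen and shared;
    only the low-rank A/B matrices are adapter-specific."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # foundation weights stay frozen
        # Adapter-specific parameters: rank * (in_features + out_features) values.
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)


if __name__ == "__main__":
    foundation = nn.Linear(4096, 4096, bias=False)  # shared, frozen foundation weights
    adapters = [LoRALinear(foundation, rank=8) for _ in range(3)]  # 3 tasks, 1 foundation

    base_params = foundation.weight.numel()
    adapter_params = sum(p.numel() for p in adapters[0].parameters() if p.requires_grad)
    # ~16.8M foundation params vs ~65K per adapter (~0.4%), i.e. KBs-MBs per adapter.
    print(f"foundation: {base_params:,} params, per-adapter: {adapter_params:,} params")
```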
