
Commit 264c92b

Add Fine-Tuning Diffusion Models with Olive blog (#2315)
## Describe your changes

1. Add Fine-Tuning Diffusion Models with Olive blog.
2. Update CLI doc.

## Checklist before requesting a review

- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

## (Optional) Issue link
1 parent 2c92e7b commit 264c92b

File tree

3 files changed: +267 -0 lines changed

docs/source/blogs/index.md

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,9 @@
 - header: "{octicon}`cpu` Exploring Optimal Quantization Settings for Small Language Models"
   content: "An exploration of how Olive applies different quantization strategies such as GPTQ, mixed precision, and QuaRot to optimize small language models for efficiency and accuracy.<br/>{octicon}`arrow-right` [Exploring Optimal Quantization Settings for Small Language Models](quant-slms.md)"
+- header: "{octicon}`image` Fine-Tuning Diffusion Models with Olive"
+  content: "Learn how to train LoRA adapters for Stable Diffusion and Flux models using Olive CLI or JSON configuration.<br/>{octicon}`arrow-right` [Fine-Tuning Diffusion Models with Olive](sd-lora.md)"
 ```
@@ -14,4 +17,5 @@
 :hidden:

 quant-slms.md
+sd-lora.md
 ```

docs/source/blogs/sd-lora.md

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
# Fine-Tuning Diffusion Models with Olive

*Author: Xiaoyu Zhang*
*Created: 2026-01-26*

This guide shows you how to fine-tune Stable Diffusion and Flux models with LoRA adapters using Olive. You can use either:

- **CLI**: quick start with the `olive diffusion-lora` command
- **JSON configuration**: full control over data preprocessing and training options
## Overview

Olive provides a simple CLI command to train LoRA (Low-Rank Adaptation) adapters for diffusion models. This allows you to:

- Teach your model new artistic styles
- Train it to generate specific subjects (DreamBooth)
- Customize image generation without modifying the full model weights

### Supported Models

| Model Type | Example Models | Default Resolution |
|------------|----------------|--------------------|
| SD 1.5 | `runwayml/stable-diffusion-v1-5` | 512x512 |
| SDXL | `stabilityai/stable-diffusion-xl-base-1.0` | 1024x1024 |
| Flux | `black-forest-labs/FLUX.1-dev` | 1024x1024 |
## Quick Start

### Basic LoRA Training

Train a LoRA adapter on your own images:

```bash
# Using a local image folder
olive diffusion-lora \
  -m runwayml/stable-diffusion-v1-5 \
  -d /path/to/your/images \
  -o my-style-lora

# Using a HuggingFace dataset
olive diffusion-lora \
  -m runwayml/stable-diffusion-v1-5 \
  --data_name linoyts/Tuxemon \
  --caption_column prompt \
  -o tuxemon-lora
```
### DreamBooth Training

Train the model to generate a specific subject (person, pet, object):

```bash
olive diffusion-lora \
  -m stabilityai/stable-diffusion-xl-base-1.0 \
  --model_variant sdxl \
  -d /path/to/subject/images \
  --dreambooth \
  --instance_prompt "a photo of sks dog" \
  --with_prior_preservation \
  --class_prompt "a photo of a dog" \
  -o my-dog-lora
```
## Data Sources

Olive supports two ways to provide training data:

### 1. Local Image Folder

Organize your images in a folder with optional caption files:

```
my_training_data/
├── image1.jpg
├── image1.txt      # Caption: "a beautiful sunset over mountains"
├── image2.png
├── image2.txt      # Caption: "a cat sitting on a couch"
└── subfolder/
    ├── image3.jpg
    └── image3.txt
```

Each `.txt` file contains the caption/prompt for the corresponding image.

**No captions?** No problem! Use the `auto_caption` preprocessing step to automatically generate captions using BLIP-2 or Florence-2 models. See the [Data Preprocessing](#data-preprocessing) section for details.
### 2. HuggingFace Dataset

Use any image dataset from the HuggingFace Hub. Specify `--data_name` with optional `--image_column` and `--caption_column` parameters.
## Command Reference

For the complete list of CLI options, see the [Diffusion LoRA CLI Reference](../../reference/cli.rst#diffusion-lora), or run:

```bash
olive diffusion-lora --help
```
## Using the Trained LoRA

After training, load your LoRA adapter with diffusers:

```python
from diffusers import DiffusionPipeline
import torch

# Load base model (works for SD, SDXL, Flux)
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA adapter
pipe.load_lora_weights("./my-lora-output/adapter")

# Generate images
image = pipe("a beautiful landscape").images[0]
image.save("output.png")
```
## Tips and Best Practices

### Dataset Preparation

1. **Image Quality**: Use high-quality, consistent images. Aim for 10-50 images for style transfer and 5-20 for DreamBooth.

2. **Captions**: Write descriptive captions that include the key elements you want the model to learn. For DreamBooth, use a unique trigger word (e.g., "sks") that doesn't conflict with existing concepts.

3. **Resolution**: Images don't need to match the training resolution exactly. Olive automatically handles aspect ratio bucketing and resizing, but remember to set `--model_variant` (`sdxl` or `flux`) or `--base_resolution 1024` when training SDXL/Flux so preprocessing runs at the correct size.
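Aspect ratio bucketing conceptually groups images into a fixed set of resolutions of roughly constant pixel area, then assigns each image the bucket whose aspect ratio is closest to its own. A minimal sketch of the idea, with bucket generation and selection rules that are illustrative assumptions, not Olive's exact implementation:

```python
import math

def make_buckets(base_resolution=1024, step=64, max_ar=2.0):
    """Generate (w, h) buckets whose area is close to base_resolution**2,
    with dimensions divisible by `step` and aspect ratio within [1/max_ar, max_ar]."""
    target_area = base_resolution * base_resolution
    buckets = set()
    w = step
    while w <= base_resolution * max_ar:
        h = round(target_area / w / step) * step
        if h >= step and 1 / max_ar <= w / h <= max_ar:
            buckets.add((w, h))
        w += step
    return sorted(buckets)

def nearest_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest (in log space) to the image's."""
    ar = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ar))
```

Because every bucket has roughly the same area, batches drawn from any one bucket cost about the same VRAM while still preserving each image's composition.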
### Training Parameters

1. **LoRA Rank (`-r`)**:
   - SD 1.5/SDXL: 4-16 is usually sufficient
   - Flux: use 16-64 for better quality

2. **Training Steps**:
   - Style transfer: 1000-3000 steps
   - DreamBooth: 500-1500 steps

3. **Learning Rate**:
   - Start with `1e-4` and adjust based on results
   - Lower it (e.g., `5e-5`) if overfitting, raise it (e.g., `2e-4`) if underfitting

4. **Prior Preservation**: Always use `--with_prior_preservation` for DreamBooth to prevent the model from forgetting general concepts.
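To see why small ranks suffice: a LoRA adapter on a linear layer of shape `(d_out, d_in)` adds only `r * (d_in + d_out)` trainable parameters, versus `d_in * d_out` for full fine-tuning. A back-of-the-envelope sketch (the 4096x4096 layer shape is illustrative, not a specific model's):

```python
def lora_params(d_in, d_out, r):
    """Trainable parameters added by one LoRA adapter: A is (r, d_in), B is (d_out, r)."""
    return r * (d_in + d_out)

# A hypothetical 4096x4096 projection layer:
full = 4096 * 4096                       # 16,777,216 weights to train fully
adapter = lora_params(4096, 4096, r=16)  # 131,072 -- under 1% of the full layer
```

Doubling the rank doubles the adapter size, which is why Flux's larger recommended ranks still stay tiny relative to the base model.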
### Hardware Requirements (guidelines)

| Model | Minimum VRAM | Recommended VRAM |
|-------|--------------|------------------|
| SD 1.5 | 8 GB | 12+ GB |
| SDXL | 16 GB | 24+ GB |
| Flux | 24 GB | 40+ GB |
## Advanced: Custom Configuration

For more control, you can use Olive's configuration file instead of CLI options:

```json
{
    "input_model": {
        "type": "DiffusersModel",
        "model_path": "stabilityai/stable-diffusion-xl-base-1.0"
    },
    "data_configs": [{
        "name": "train_data",
        "type": "ImageDataContainer",
        "load_dataset_config": {
            "type": "huggingface_dataset",
            "params": {
                "data_name": "linoyts/Tuxemon",
                "split": "train",
                "image_column": "image",
                "caption_column": "prompt"
            }
        },
        "pre_process_data_config": {
            "type": "image_lora_preprocess",
            "params": {
                "base_resolution": 1024,
                "steps": {
                    "auto_caption": {"model_type": "florence2"},
                    "aspect_ratio_bucketing": {}
                }
            }
        }
    }],
    "passes": {
        "sd_lora": {
            "type": "SDLoRA",
            "train_data_config": "train_data",
            "r": 16,
            "alpha": 16,
            "training_args": {
                "max_train_steps": 2000,
                "learning_rate": 1e-4,
                "train_batch_size": 1,
                "gradient_accumulation_steps": 4,
                "mixed_precision": "bf16"
            }
        }
    },
    "systems": {
        "local_system": {
            "type": "LocalSystem",
            "accelerators": [{"device": "gpu"}]
        }
    },
    "host": "local_system",
    "target": "local_system",
    "output_dir": "my-lora-output"
}
```
Run with:

```bash
olive run --config my_lora_config.json
```
## Data Preprocessing

Olive supports automatic data preprocessing, including image filtering, auto-captioning, tagging, and aspect ratio bucketing.

The **CLI** supports only basic aspect ratio bucketing via `--base_resolution`. For advanced preprocessing (auto-captioning, filtering, tagging), use a JSON configuration file.

For detailed preprocessing options and examples, see the [SD LoRA Feature Documentation](../../features/sd-lora.md#data-configuration).
## Export to ONNX and Run Inference

After fine-tuning, you can merge the LoRA adapter into the base model and export the pipeline to ONNX with Olive's CLI, then run inference using ONNX Runtime.

### 1. Export with the CLI

Use `capture-onnx-graph` to export the base components together with your LoRA adapter:

```bash
olive capture-onnx-graph \
  -m stabilityai/stable-diffusion-xl-base-1.0 \
  -a my-lora-output/adapter \
  --output_path sdxl-lora-onnx
```
### 2. Multi-LoRA and Inference

Want to combine multiple adapters or see a full inference notebook? Check [sd_multilora.ipynb](https://github.com/microsoft/Olive/blob/main/notebooks/sd_multilora/sd_multilora.ipynb) for an end-to-end example covering multi-LoRA composition and ONNX Runtime inference.
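Conceptually, composing adapters just sums their weighted low-rank updates on top of the frozen base weights, `W' = W + sum_i w_i * (alpha_i / r_i) * B_i A_i`. A toy NumPy sketch of that idea (the matrix sizes and blend weights are made up, and this is not Olive's or diffusers' implementation):

```python
import numpy as np

def lora_delta(A, B, alpha, weight=1.0):
    """Weighted low-rank update from one adapter: weight * (alpha / r) * B @ A."""
    r = A.shape[0]
    return weight * (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.normal(size=(d, d))                                 # frozen base weight
A1, B1 = rng.normal(size=(r, d)), rng.normal(size=(d, r))   # "style" adapter
A2, B2 = rng.normal(size=(r, d)), rng.normal(size=(d, r))   # "subject" adapter

# Blend the two adapters at different strengths
W_mixed = W + lora_delta(A1, B1, alpha=2, weight=0.8) + lora_delta(A2, B2, alpha=2, weight=0.6)
```

Because each update has rank at most `r`, adapters can be blended, rescaled, or dropped at load time without ever retraining the base model.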
## Related Resources

- [DreamBooth Paper](https://arxiv.org/abs/2208.12242)
- [LoRA Paper](https://arxiv.org/abs/2106.09685)

docs/source/reference/cli.rst

Lines changed: 11 additions & 0 deletions
@@ -69,6 +69,17 @@ Fine-tune a model on a dataset using HuggingFace peft. Huggingface training argu
 :prog: olive
 :path: finetune

+Diffusion LoRA
+==============
+
+Train LoRA adapters for diffusion models (Stable Diffusion 1.5, SDXL, Flux). Supports both local image folders and HuggingFace datasets.
+
+.. argparse::
+   :module: olive.cli.launcher
+   :func: get_cli_parser
+   :prog: olive
+   :path: diffusion-lora
+
 Auto-Optimization
 =================
