
Commit 73d340d

Image Gen/Edit demo change (#3621)
- change model to stable-diffusion-v1.5, which is supported on NPU
- change images and prompts
- note about using --cache_dir on NPU

Preview: https://openvino-doc.iotg.sclab.intel.com/dkalinow-image-edit-v2/model-server/ovms_demos_image_generation.html
1 parent 7de1895 commit 73d340d

File tree

6 files changed: +79 / -68 lines

demos/image_generation/README.md

Lines changed: 79 additions & 68 deletions
@@ -3,6 +3,8 @@
 This demo shows how to deploy image generation models (Stable Diffusion / Stable Diffusion 3 / Stable Diffusion XL / FLUX) to create and edit images with the OpenVINO Model Server.
 Image generation pipelines are exposed via the [OpenAI API](https://platform.openai.com/docs/api-reference/images/create) `images/generations` and `images/edits` endpoints.
 
+Check the [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).
+
 > **Note:** This demo was tested on Intel® Xeon®, Intel® Core™, Intel® Arc™ A770 and Intel® Arc™ B580, on Ubuntu 22/24, Red Hat 9 and Windows 11.
 
 ## Prerequisites
@@ -22,7 +24,7 @@ Image generation pipelines are exposed via [OpenAI API](https://platform.openai.
 
 > **NOTE:** The model downloading feature is described in depth on a separate documentation page: [Pulling HuggingFace Models](../../docs/pull_hf_models.md).
 
-This command pulls the `OpenVINO/FLUX.1-schnell-int4-ov` quantized model directly from HuggingFaces and starts the serving. If the model already exists locally, it will skip the downloading and immediately start the serving.
+This command pulls the `OpenVINO/stable-diffusion-v1-5-int8-ov` quantized model directly from HuggingFace and starts serving it. If the model already exists locally, downloading is skipped and serving starts immediately.
 
 > **NOTE:** Optionally, to only download the model and skip the serving part, use the `--pull` parameter.
 
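For reference, a pull-only invocation might look like the sketch below. It reuses the Docker image and flags from this demo plus the `--pull` parameter mentioned in the note above; see [Pulling HuggingFace Models](../../docs/pull_hf_models.md) for the authoritative options.

```bash
# Download OpenVINO/stable-diffusion-v1-5-int8-ov into ./models without starting the server
docker run --rm --user $(id -u):$(id -g) -v $(pwd)/models:/models/:rw \
  openvino/model_server:latest \
  --pull \
  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov \
  --model_repository_path /models/ \
  --task image_generation
```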
@@ -41,7 +43,7 @@ docker run -d --rm --user $(id -u):$(id -g) -p 8000:8000 -v $(pwd)/models:/model
   --rest_port 8000 \
   --model_repository_path /models/ \
   --task image_generation \
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov
 ```
 :::
 
@@ -62,7 +64,7 @@ mkdir models
 ovms --rest_port 8000 ^
   --model_repository_path ./models/ ^
   --task image_generation ^
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov
 ```
 :::
 
@@ -85,7 +87,7 @@ docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw \
   --rest_port 8000 \
   --model_repository_path /models/ \
   --task image_generation \
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov \
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov \
   --target_device GPU
 ```
 :::
@@ -101,7 +103,7 @@ mkdir models
 ovms --rest_port 8000 ^
   --model_repository_path ./models/ ^
   --task image_generation ^
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov ^
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov ^
   --target_device GPU
 ```
 :::
@@ -111,7 +113,7 @@ ovms --rest_port 8000 ^
 
 ### NPU or mixed device
 
-Image generation endpoints consist of 3 steps: text encoding, denoising and vae decoder. It is possible to select device for each step separately. In this example, we will use NPU for text encoding and denoising, and GPU for vae decoder. This is useful when the model is too large to fit into NPU memory, but the NPU can still be used for the first two steps.
+Image generation pipelines consist of 3 models: an encoder, a denoiser and a VAE decoder. It is possible to select the device for each model separately, which is useful when one of the models is too large to fit into NPU memory but the NPU can still run the others. In this example, we use NPU for all three models; a mixed NPU/GPU variant is sketched after the commands below.
 
 ::::{tab-set}
 :::{tab-item} Docker (Linux)
@@ -121,23 +123,26 @@ In this specific case, we also need to use `--device /dev/dri`, because we also
 
 > **NOTE:** The NPU device requires the pipeline to be reshaped to a static shape, which is why the `--resolution` parameter is used to define the input resolution.
 
-> **NOTE:** This feature will be available in 2025.3 and later releases, so until next release, it is required to build the model server from source from the `main` branch.
-
+> **NOTE:** If the model loading phase takes too long, consider caching the compiled models with the `--cache_dir` parameter, as shown in the example below.
 
 It can be applied using the commands below:
 ```bash
 mkdir -p models
+mkdir -p cache
 
-docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw \
+docker run -d --rm -p 8000:8000 \
+  -v $(pwd)/models:/models/:rw \
+  -v $(pwd)/cache:/cache/:rw \
   --user $(id -u):$(id -g) --device /dev/accel --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
   -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
   openvino/model_server:latest-gpu \
   --rest_port 8000 \
   --model_repository_path /models/ \
   --task image_generation \
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov \
-  --target_device 'NPU NPU GPU' \
-  --resolution 512x512
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov \
+  --target_device 'NPU NPU NPU' \
+  --resolution 512x512 \
+  --cache_dir /cache
 ```
 :::
 
@@ -147,13 +152,15 @@ docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw \
 
 ```bat
 mkdir models
+mkdir cache
 
 ovms --rest_port 8000 ^
   --model_repository_path ./models/ ^
   --task image_generation ^
-  --source_model OpenVINO/FLUX.1-schnell-int4-ov ^
-  --target_device 'NPU NPU GPU' ^
-  --resolution 512x512
+  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov ^
+  --target_device "NPU NPU NPU" ^
+  --resolution 512x512 ^
+  --cache_dir ./cache
 ```
 :::
 
@@ -174,7 +181,7 @@ mkdir models
 
 Run the `export_model.py` script to download and quantize the model:
 
-> **Note:** Before downloading the model, access must be requested. Follow the instructions on the [HuggingFace model page](https://huggingface.co/black-forest-labs/FLUX.1-schnell) to request access. When access is granted, create an authentication token in the HuggingFace account -> Settings -> Access Tokens page. Issue the following command and enter the authentication token. Authenticate via `huggingface-cli login`.
+> **Note:** Before downloading the model, access must be requested. Follow the instructions on the [HuggingFace model page](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5) to request access. Once access is granted, create an authentication token in your HuggingFace account under Settings -> Access Tokens, then run `huggingface-cli login` and enter the token when prompted.
 
 > **Note:** Users in China need to set the environment variable HF_ENDPOINT="https://hf-mirror.com" before running the export script to connect to the HF Hub.
 
@@ -183,8 +190,8 @@ Run `export_model.py` script to download and quantize the model:
 ### Export model for CPU
 ```console
 python export_model.py image_generation \
-    --source_model black-forest-labs/FLUX.1-schnell \
-    --weight-format int4 \
+    --source_model stable-diffusion-v1-5/stable-diffusion-v1-5 \
+    --weight-format int8 \
     --config_file_path models/config.json \
     --model_repository_path models \
     --extra_quantization_params "--group-size 64" \
@@ -194,8 +201,8 @@ python export_model.py image_generation \
 ### Export model for GPU
 ```console
 python export_model.py image_generation \
-    --source_model black-forest-labs/FLUX.1-schnell \
-    --weight-format int4 \
+    --source_model stable-diffusion-v1-5/stable-diffusion-v1-5 \
+    --weight-format int8 \
     --target_device GPU \
     --config_file_path models/config.json \
     --model_repository_path models \
@@ -205,19 +212,20 @@ python export_model.py image_generation \
 
 ### Export model for NPU or mixed device
 
-Image generation endpoints consist of 3 steps: text encoding, denoising and vae decoder. It is possible to select device for each step separately. In this example, we will use NPU for text encoding and denoising, and GPU for vae decoder. This is useful when the model is too large to fit into NPU memory, but the NPU can still be used for the first two steps.
+Image generation pipelines consist of 3 models: an encoder, a denoiser and a VAE decoder, and the device can be selected for each model separately. In this example, we use NPU for all three.
 
 > **NOTE:** The NPU device requires the pipeline to be reshaped to a static shape, which is why the `--resolution` parameter is used to define the input resolution.
 
-> **NOTE:** This feature will be available in 2025.3 and later releases, so until next release, it is required to use export script from the `main` branch.
+> **NOTE:** If the model loading phase takes too long, consider caching the compiled models with the `--ov_cache_dir` parameter, as shown in the example below.
 
 
 ```console
 python export_model.py image_generation \
-    --source_model black-forest-labs/FLUX.1-schnell \
-    --weight-format int4 \
-    --target_device 'NPU NPU GPU' \
+    --source_model stable-diffusion-v1-5/stable-diffusion-v1-5 \
+    --weight-format int8 \
+    --target_device 'NPU NPU NPU' \
     --resolution '512x512' \
+    --ov_cache_dir /cache \
     --config_file_path models/config.json \
    --model_repository_path models \
     --overwrite_models
@@ -249,8 +257,8 @@ Start docker container:
 docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models:ro \
   openvino/model_server:latest \
   --rest_port 8000 \
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov \
-  --model_path /models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov \
+  --model_path /models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 :::
 

@@ -267,8 +275,8 @@ as mentioned in [deployment guide](../../docs/deploying_server_baremetal.md), in
 
 ```bat
 ovms --rest_port 8000 ^
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov ^
-  --model_path ./models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov ^
+  --model_path ./models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 :::
 

@@ -288,8 +296,8 @@ docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models:ro \
   --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
   openvino/model_server:latest-gpu \
   --rest_port 8000 \
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov \
-  --model_path /models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov \
+  --model_path /models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 
 :::
@@ -301,17 +309,15 @@ Depending on how you prepared models in the first step of this demo, they are de
 
 ```bat
 ovms --rest_port 8000 ^
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov ^
-  --model_path ./models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov ^
+  --model_path ./models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 :::
 
 ::::
 
 **NPU or mixed device**
 
-This feature will be available in 2025.3 and later releases. Until then, please build the model server from source from the `main` branch.
-
 ::::{tab-set}
 :::{tab-item} Docker (Linux)
 :sync: docker
@@ -322,12 +328,15 @@ In this specific case, we also need to use `--device /dev/dri`, because we also
 
 It can be applied using the commands below:
 ```bash
-docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models:ro \
+mkdir -p cache
+docker run -d --rm -p 8000:8000 \
+  -v $(pwd)/models:/models:ro \
+  -v $(pwd)/cache:/cache:rw \
   --device /dev/accel --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
   openvino/model_server:latest-gpu \
   --rest_port 8000 \
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov \
-  --model_path /models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov \
+  --model_path /models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 
 :::
@@ -339,8 +348,8 @@ Depending on how you prepared models in the first step of this demo, they are de
 
 ```bat
 ovms --rest_port 8000 ^
-  --model_name OpenVINO/FLUX.1-schnell-int4-ov ^
-  --model_path ./models/black-forest-labs/FLUX.1-schnell
+  --model_name OpenVINO/stable-diffusion-v1-5-int8-ov ^
+  --model_path ./models/stable-diffusion-v1-5/stable-diffusion-v1-5
 ```
 :::
 
@@ -353,9 +362,10 @@ Wait for the model to load. You can check the status with a simple command:
 ```console
 curl http://localhost:8000/v1/config
 ```
+
 ```json
 {
-  "OpenVINO/FLUX.1-schnell-int4-ov" :
+  "OpenVINO/stable-diffusion-v1-5-int8-ov" :
   {
     "model_version_status": [
       {
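To wait for readiness programmatically, the same `/v1/config` endpoint can be polled until the model reports the AVAILABLE state. A minimal sketch, assuming Python with the `requests` package installed and the server from this demo on localhost:8000:

```python
import time

import requests

MODEL = "OpenVINO/stable-diffusion-v1-5-int8-ov"
CONFIG_URL = "http://localhost:8000/v1/config"

for _ in range(60):  # poll for up to ~5 minutes
    try:
        config = requests.get(CONFIG_URL, timeout=5).json()
        statuses = config.get(MODEL, {}).get("model_version_status", [])
        if any(s.get("state") == "AVAILABLE" for s in statuses):
            print("Model is ready")
            break
    except requests.RequestException:
        pass  # the server may still be starting up
    time.sleep(5)
else:
    raise SystemExit("Model did not become ready in time")
```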
@@ -389,31 +399,31 @@ Linux
 curl http://localhost:8000/v3/images/generations \
   -H "Content-Type: application/json" \
   -d '{
-    "model": "OpenVINO/FLUX.1-schnell-int4-ov",
-    "prompt": "three cute cats sitting on a bench",
-    "rng_seed": 45,
-    "num_inference_steps": 3,
+    "model": "OpenVINO/stable-diffusion-v1-5-int8-ov",
+    "prompt": "Three astronauts on the moon, cold color palette, muted colors, detailed, 8k",
+    "rng_seed": 409,
+    "num_inference_steps": 50,
     "size": "512x512"
-  }' | jq -r '.data[0].b64_json' | base64 --decode > output.png
+  }' | jq -r '.data[0].b64_json' | base64 --decode > generate_output.png
 ```
 
 Windows PowerShell
 ```powershell
 $response = Invoke-WebRequest -Uri "http://localhost:8000/v3/images/generations" `
   -Method POST `
   -Headers @{ "Content-Type" = "application/json" } `
-  -Body '{"model": "OpenVINO/FLUX.1-schnell-int4-ov", "prompt": "three cute cats sitting on a bench", "rng_seed": 45, "num_inference_steps": 3, "size": "512x512"}'
+  -Body '{"model": "OpenVINO/stable-diffusion-v1-5-int8-ov", "prompt": "Three astronauts on the moon, cold color palette, muted colors, detailed, 8k", "rng_seed": 409, "num_inference_steps": 50, "size": "512x512"}'
 
 $base64 = ($response.Content | ConvertFrom-Json).data[0].b64_json
 
-[IO.File]::WriteAllBytes('output.png', [Convert]::FromBase64String($base64))
+[IO.File]::WriteAllBytes('generate_output.png', [Convert]::FromBase64String($base64))
 ```
 
 Windows Command Prompt
 ```bat
 curl http://localhost:8000/v3/images/generations ^
   -H "Content-Type: application/json" ^
-  -d "{\"model\": \"OpenVINO/FLUX.1-schnell-int4-ov\", \"prompt\": \"three cute cats sitting on a bench\", \"rng_seed\": 45, \"num_inference_steps\": 3, \"size\": \"512x512\"}"
+  -d "{\"model\": \"OpenVINO/stable-diffusion-v1-5-int8-ov\", \"prompt\": \"Three astronauts on the moon, cold color palette, muted colors, detailed, 8k\", \"rng_seed\": 409, \"num_inference_steps\": 50, \"size\": \"512x512\"}"
 ```
 
@@ -428,9 +438,9 @@ Expected Response
 }
 ```
 
-The commands will have the generated image saved in output.png.
+The commands above save the generated image to `generate_output.png`.
 
-![output](./output.png)
+![output](./generate_output.png)
 
 
 ### Requesting image generation with OpenAI Python package
@@ -454,28 +464,25 @@ client = OpenAI(
 )
 
 response = client.images.generate(
-    model="OpenVINO/FLUX.1-schnell-int4-ov",
-    prompt="three cute cats sitting on a bench",
+    model="OpenVINO/stable-diffusion-v1-5-int8-ov",
+    prompt="Three astronauts on the moon, cold color palette, muted colors, detailed, 8k",
     extra_body={
-        "rng_seed": 60,
+        "rng_seed": 409,
         "size": "512x512",
-        "num_inference_steps": 3
+        "num_inference_steps": 50
     }
 )
 base64_image = response.data[0].b64_json
 
 image_data = base64.b64decode(base64_image)
 image = Image.open(BytesIO(image_data))
-image.save('output2.png')
-
+image.save('generate_output.png')
 ```
 
-Output file (`output2.png`):
-![output2](./output2.png)
-
-
 ### Requesting image edit with OpenAI Python package
 
+Example: editing the previously generated image with the prompt `Three astronauts in the jungle, vibrant color palette, live colors, detailed, 8k`:
+
 ```python
 from openai import OpenAI
 import base64
@@ -488,31 +495,35 @@ client = OpenAI(
 )
 
 response = client.images.edit(
-    model="OpenVINO/FLUX.1-schnell-int4-ov",
-    image=open("output2.png", "rb"),
-    prompt="pink cats",
+    model="OpenVINO/stable-diffusion-v1-5-int8-ov",
+    image=open("generate_output.png", "rb"),
+    prompt="Three astronauts in the jungle, vibrant color palette, live colors, detailed, 8k",
     extra_body={
-        "rng_seed": 60,
+        "rng_seed": 409,
         "size": "512x512",
-        "num_inference_steps": 3,
-        "strength": 0.7
+        "num_inference_steps": 50,
+        "strength": 0.67
     }
 )
 base64_image = response.data[0].b64_json
 
 image_data = base64.b64decode(base64_image)
 image = Image.open(BytesIO(image_data))
 image.save('edit_output.png')
-
 ```
 
 Output file (`edit_output.png`):
 ![edit_output](./edit_output.png)
 
+### Strength influence on the final image
+
+![strength](./strength.png)
 
+Please follow the [OpenVINO notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/image-to-image-genai/image-to-image-genai.ipynb) to understand how the other parameters affect editing.
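To reproduce a comparison strip like the one above, the edit request can be repeated over several `strength` values; lower values stay closer to the input image while higher values follow the prompt more freely. A sketch reusing the request from the edit example; the `base_url`, `api_key` and output file names are illustrative:

```python
import base64
from io import BytesIO

from openai import OpenAI
from PIL import Image

# Assumes the model server from this demo is running on localhost:8000
client = OpenAI(base_url="http://localhost:8000/v3", api_key="unused")

for strength in (0.3, 0.5, 0.67, 0.8):
    response = client.images.edit(
        model="OpenVINO/stable-diffusion-v1-5-int8-ov",
        image=open("generate_output.png", "rb"),
        prompt="Three astronauts in the jungle, vibrant color palette, live colors, detailed, 8k",
        extra_body={
            "rng_seed": 409,
            "size": "512x512",
            "num_inference_steps": 50,
            "strength": strength,
        },
    )
    # Decode the base64 payload and save one image per strength value
    image_data = base64.b64decode(response.data[0].b64_json)
    Image.open(BytesIO(image_data)).save(f"edit_output_strength_{strength}.png")
```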
 
 ## References
 - [Image Generation API](../../docs/model_server_rest_api_image_generation.md)
 - [Image Edit API](../../docs/model_server_rest_api_image_edit.md)
 - [Writing client code](../../docs/clients_genai.md)
 - [Image Generation/Edit calculator reference](../../docs/image_generation/reference.md)
+- [Supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models)
Two image files changed (22.1 KB and 523 KB); previews not shown.

demos/image_generation/output.png (-567 KB): binary file not shown.

demos/image_generation/output2.png (-437 KB): binary file not shown.

One image file changed (1.93 MB); preview not shown.

Comments (0)