This demo shows how to deploy image generation models (Stable Diffusion/Stable Diffusion 3/Stable Diffusion XL/FLUX) to create and edit images with the OpenVINO Model Server.
Image generation pipelines are exposed via the [OpenAI API](https://platform.openai.com/docs/api-reference/images/create) `images/generations` and `images/edits` endpoints.
> **Note:** This demo was tested on Intel® Xeon®, Intel® Core™, Intel® Arc™ A770 and Intel® Arc™ B580 on Ubuntu 22/24, RedHat 9 and Windows 11.
## Prerequisites
> **NOTE:** The model downloading feature is described in depth on a separate documentation page: [Pulling HuggingFace Models](../../docs/pull_hf_models.md).
This command pulls the `OpenVINO/stable-diffusion-v1-5-int8-ov` quantized model directly from HuggingFace and starts serving it. If the model already exists locally, the download is skipped and serving starts immediately.
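A minimal sketch of such a command (the `openvino/model_server` image name and the `--source_model`/`--task` flags are assumptions based on recent model server releases; adjust to your environment):

```bash
# Sketch: pull OpenVINO/stable-diffusion-v1-5-int8-ov from HuggingFace and serve it on port 8000.
mkdir -p models
docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models:rw openvino/model_server:latest \
  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov --model_repository_path /models \
  --rest_port 8000 --task image_generation
```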
> **NOTE:** Optionally, to only download the model and skip the serving part, use the `--pull` parameter.
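For example (a sketch reusing the assumed flags from the command above; only `--pull` itself comes from the note):

```bash
# Download the model into ./models and exit without starting the server.
docker run --rm -v $(pwd)/models:/models:rw openvino/model_server:latest \
  --pull --source_model OpenVINO/stable-diffusion-v1-5-int8-ov --model_repository_path /models
```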
Image generation endpoints are served by three models: a text encoder, a denoising model and a VAE decoder. A device can be selected for each model separately. In this example, we will use NPU for text encoding and denoising, and GPU for the VAE decoder. This is useful when the model is too large to fit into NPU memory, but the NPU can still be used for the first two models.
::::{tab-set}
:::{tab-item} Docker (Linux)
In this specific case, we also need to use `--device /dev/dri`, because part of the pipeline runs on the GPU.

> **NOTE:** The NPU device requires the pipeline to be reshaped to a static shape, which is why the `--resolution` parameter is used to define the input resolution.
> **NOTE:** In case the model loading phase takes too long, consider caching the model with the `--cache_dir` parameter, as shown in the example below.
It can be applied using the commands below:
```bash
mkdir -p models
mkdir -p cache
# The remainder of this command is reconstructed; the devices, flags and the
# cache mount are assumptions — adjust them to your setup.
docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw -v $(pwd)/cache:/cache:rw \
  --device /dev/accel --device /dev/dri openvino/model_server:latest \
  --source_model OpenVINO/stable-diffusion-v1-5-int8-ov --model_repository_path /models \
  --rest_port 8000 --task image_generation --cache_dir /cache --resolution 512x512
```
:::
::::
Run the `export_model.py` script to download and quantize the model:

> **Note:** Before downloading the model, access must be requested. Follow the instructions on the [HuggingFace model page](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5) to request access. Once access is granted, create an authentication token on the HuggingFace account -> Settings -> Access Tokens page, then authenticate via `huggingface-cli login` and enter the token when prompted.
> **Note:** Users in China need to set the environment variable `HF_ENDPOINT="https://hf-mirror.com"` before running the export script to connect to the HF Hub.
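The invocation might look like this (a sketch: the `image_generation` subcommand and flag names are assumptions modeled on other model server demos, so check `python export_model.py --help` for the actual interface):

```bash
# Download stable-diffusion-v1-5, quantize it and write the serving config.
python export_model.py image_generation \
  --source_model stable-diffusion-v1-5/stable-diffusion-v1-5 \
  --weight-format int8 --config_file_path models/config.json
```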
Image generation endpoints are served by three models: a text encoder, a denoising model and a VAE decoder. A device can be selected for each model separately. In this example, we will use NPU for all of them.
> **NOTE:** The NPU device requires the pipeline to be reshaped to a static shape, which is why the `--resolution` parameter is used to define the input resolution.
> **NOTE:** In case the model loading phase takes too long, consider caching the model with the `--cache_dir` parameter, as shown in the example below.
-d "{\"model\": \"OpenVINO/FLUX.1-schnell-int4-ov\", \"prompt\": \"three cute cats sitting on a bench\", \"rng_seed\": 45, \"num_inference_steps\": 3, \"size\": \"512x512\"}"
426
+
-d "{\"model\": \"OpenVINO/stable-diffusion-v1-5-int8-ov\", \"prompt\": \"Three astronauts on the moon, cold color palette, muted colors, detailed, 8k\", \"rng_seed\": 409, \"num_inference_steps\": 50, \"size\": \"512x512\"}"
417
427
```
Expected response (the shape matches what the Python example below reads via `response.data[0].b64_json`; the base64 payload is abridged here):

```json
{
  "data": [
    {
      "b64_json": "..."
    }
  ]
}
```
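To turn that JSON into a viewable file from the shell, the base64 payload can be decoded directly (a sketch assuming `jq` is installed and reusing the assumed endpoint above):

```bash
# Extract the base64 image from the response and decode it into a PNG.
curl http://localhost:8000/v3/images/generations -H "Content-Type: application/json" \
  -d "{\"model\": \"OpenVINO/stable-diffusion-v1-5-int8-ov\", \"prompt\": \"Three astronauts on the moon, cold color palette, muted colors, detailed, 8k\", \"rng_seed\": 409, \"num_inference_steps\": 50, \"size\": \"512x512\"}" \
  | jq -r '.data[0].b64_json' | base64 --decode > generate_output.png
```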
The commands above save the generated image to `generate_output.png`.

### Requesting image generation with OpenAI Python package
In the snippet below, the client setup is reconstructed from context; the `base_url` assumes the model server from the previous steps is listening on port 8000:

```python
import base64
from io import BytesIO

from openai import OpenAI
from PIL import Image

# base_url assumed: the model server started earlier on port 8000.
client = OpenAI(
    base_url="http://localhost:8000/v3",
    api_key="unused"
)

response = client.images.generate(
    model="OpenVINO/stable-diffusion-v1-5-int8-ov",
    prompt="Three astronauts on the moon, cold color palette, muted colors, detailed, 8k",
    extra_body={
        "rng_seed": 409,
        "size": "512x512",
        "num_inference_steps": 50
    }
)

# The server returns the image base64-encoded; decode and save it.
base64_image = response.data[0].b64_json
image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('generate_output.png')
```
### Requesting image edit with OpenAI Python package
Example of editing the previously generated image with the prompt `Three astronauts in the jungle, vibrant color palette, live colors, detailed, 8k`:
```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

# Client setup reconstructed from context; base_url assumes the model
# server started earlier on port 8000.
client = OpenAI(
    base_url="http://localhost:8000/v3",
    api_key="unused"
)

response = client.images.edit(
    model="OpenVINO/stable-diffusion-v1-5-int8-ov",
    image=open("generate_output.png", "rb"),
    prompt="Three astronauts in the jungle, vibrant color palette, live colors, detailed, 8k",
    extra_body={
        "rng_seed": 409,
        "size": "512x512",
        "num_inference_steps": 50,
        "strength": 0.67
    }
)

# Decode the base64 response and save the edited image.
base64_image = response.data[0].b64_json
image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('edit_output.png')
```
Output file (`edit_output.png`):

### Strength influence on the final image

The `strength` parameter controls how far the edit may deviate from the input image: values close to 0 preserve the original, while values close to 1 give the prompt almost full control.

Please follow the [OpenVINO notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/image-to-image-genai/image-to-image-genai.ipynb) to understand how other parameters affect editing.