# Image Edit API

vLLM-Omni provides an OpenAI DALL-E-compatible API for image editing using diffusion models.

Each server instance runs a single model (specified at startup via `vllm serve <model> --omni`).

## Quick Start

### Start the Server

For example, to serve Qwen-Image-Edit:

```bash
# Qwen-Image
vllm serve Qwen/Qwen-Image-Edit-2511 --omni --port 8000
```

### Generate Images

**Using curl:**

```bash
curl -s -D >(grep -i x-request-id >&2) \
  -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \
  -X POST "http://localhost:8000/v1/images/edits" \
  -F "model=Qwen-Image-Edit-2511" \
  -F "image=@./input.png" \
  -F "prompt=This bear is wearing sportswear, holding a basketball, and bending one leg." \
  -F "size=1024x1024" \
  -F "output_format=png"
```


**Using OpenAI SDK:**

```python
import base64

from openai import OpenAI

client = OpenAI(
    api_key="None",
    base_url="http://localhost:8000/v1"
)

input_image_url = "https://vllm-public-assets.s3.us-west-2.amazonaws.com/omni-assets/qwen-bear.png"

result = client.images.edit(
    image=[],  # empty when input images are passed by URL via extra_body
    model="Qwen-Image-Edit-2511",
    prompt="Change the bears in the two input images into walking together.",
    size="512x512",
    stream=False,
    output_format="jpeg",
    extra_body={
        # Pass input images by URL (vLLM-Omni extension); the same image is
        # used twice here so the prompt's "two input images" has two inputs
        "url": [input_image_url, input_image_url],
        "num_inference_steps": 50,
        "guidance_scale": 1,
        "seed": 777,
    }
)

image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("edit_out_http.jpeg", "wb") as f:
    f.write(image_bytes)
```

| 70 | + |
| 71 | +## API Reference |
| 72 | + |
| 73 | +### Endpoint |
| 74 | + |
| 75 | +``` |
| 76 | +POST /v1/images/edits |
| 77 | +Content-Type: multipart/form-data |
| 78 | +``` |

### Request Parameters

#### OpenAI Standard Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | string | **required** | A text description of the desired edit |
| `model` | string | server's model | Model to use (optional; should match the server's model if specified) |
| `image` | string or array | **required** | The image file(s) to edit (required unless the `url` extension parameter is provided) |
| `n` | integer | 1 | Number of images to generate (1-10) |
| `size` | string | "auto" | Image dimensions in WxH format (e.g., "1024x1024", "512x512"). When set to "auto", the size is inferred from the first input image |
| `response_format` | string | "b64_json" | Response format (only "b64_json" is supported) |
| `user` | string | null | User identifier for tracking |
| `output_format` | string | "png" | Format of the returned images. Must be one of "png", "jpg", "jpeg", or "webp" |
| `output_compression` | integer | 100 | Compression level (0-100%) for the generated images |
| `background` | string or null | "auto" | Sets the transparency of the background of the generated image(s) |

#### vLLM-Omni Extension Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `url` | string or array | None | URL(s) of the image(s) to edit (alternative to uploading files via `image`) |
| `negative_prompt` | string | null | Text describing what to avoid in the image |
| `num_inference_steps` | integer | model default | Number of diffusion steps |
| `guidance_scale` | float | model default | Classifier-free guidance scale (typically 0.0-20.0) |
| `true_cfg_scale` | float | model default | True CFG scale (model-specific; may be ignored if not supported) |
| `seed` | integer | null | Random seed for reproducibility |
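Extension parameters travel in the same request body as the standard ones (in the SDK example above they are passed via `extra_body`). A sketch of assembling such a payload; `build_edit_form` is a hypothetical client-side helper, not part of the API:

```python
def build_edit_form(prompt: str, url: list[str], **extensions) -> dict:
    """Assemble form fields for POST /v1/images/edits (hypothetical helper).

    Extension parameters are sent as ordinary fields alongside the
    standard ones; unknown names are rejected here as a convenience.
    """
    allowed = {"negative_prompt", "num_inference_steps",
               "guidance_scale", "true_cfg_scale", "seed"}
    unknown = set(extensions) - allowed
    if unknown:
        raise ValueError(f"Unknown extension parameters: {sorted(unknown)}")
    form = {"prompt": prompt, "url": url}
    form.update(extensions)
    return form

form = build_edit_form(
    "Make the bear wear sunglasses.",
    ["https://example.com/bear.png"],
    seed=777,
    num_inference_steps=50,
)
```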

### Response Format

```json
{
  "created": 1701234567,
  "data": [
    {
      "b64_json": "<base64-encoded image>",
      "url": null,
      "revised_prompt": null
    }
  ],
  "output_format": null,
  "size": null
}
```
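Decoding a response follows directly from the shape above; a minimal sketch using stand-in data rather than a live server (with `n` > 1, `data` holds one entry per generated image):

```python
import base64

# A response in the documented shape; the b64_json payloads below are
# stand-in bytes, not real images
response = {
    "created": 1701234567,
    "data": [
        {"b64_json": base64.b64encode(b"\x89PNG\r\nfirst").decode(),
         "url": None, "revised_prompt": None},
        {"b64_json": base64.b64encode(b"\x89PNG\r\nsecond").decode(),
         "url": None, "revised_prompt": None},
    ],
    "output_format": None,
    "size": None,
}

# Decode every generated image in order
images = [base64.b64decode(item["b64_json"]) for item in response["data"]]
print(len(images))  # 2
```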

## Examples

### Multiple Image Inputs

```bash
curl -s -D >(grep -i x-request-id >&2) \
  -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \
  -X POST "http://localhost:8000/v1/images/edits" \
  -F "model=Qwen-Image-Edit-2511" \
  -F "image=@input1.png" \
  -F "image=@input2.png" \
  -F "prompt=This bear is wearing sportswear, holding a basketball, and bending one leg." \
  -F "size=1024x1024" \
  -F "output_format=png"
```


## Parameter Handling

The API passes parameters directly to the diffusion pipeline without model-specific transformation:

- **Default values**: When a parameter is not specified, the underlying model uses its own default
- **Pass-through design**: User-provided values are forwarded directly to the diffusion engine
- **Minimal validation**: Only basic type checking and range validation happen at the API level

### Parameter Compatibility

Because parameters are passed through without model-specific validation:

- Unsupported parameters may be silently ignored by the model
- Incompatible values will result in errors from the underlying pipeline
- Recommended values vary by model; consult the model's documentation

**Best Practice:** Start with the model's recommended parameters, then adjust based on your needs.
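The best practice above amounts to a dict merge: start from a base of recommended values and overlay your own choices. A sketch with illustrative base numbers only (consult the model card for real recommendations; `guidance_scale=1` matches the Quick Start example):

```python
# Illustrative base values only, not official recommendations
RECOMMENDED = {
    "num_inference_steps": 50,
    "guidance_scale": 1,
}

def with_overrides(recommended: dict, **overrides) -> dict:
    """Start from recommended parameters, then apply user overrides."""
    return {**recommended, **overrides}

params = with_overrides(RECOMMENDED, num_inference_steps=30, seed=777)
print(params)  # {'num_inference_steps': 30, 'guidance_scale': 1, 'seed': 777}
```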

## Error Responses

### 400 Bad Request

Invalid parameter values (e.g., a malformed `size`):

```json
{
  "detail": "Invalid size format: '1024x'. Expected format: 'WIDTHxHEIGHT' (e.g., '1024x1024')."
}
```

### 422 Unprocessable Entity

Validation errors (e.g., a missing required field):

```json
{
  "detail": "Field 'image' or 'url' is required"
}
```

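The 422 case above can be caught before a request leaves the client; a minimal sketch with a hypothetical guard that mirrors the server's documented message:

```python
def require_image_source(form: dict) -> None:
    """Client-side guard mirroring the server's 422 check (hypothetical)."""
    if not form.get("image") and not form.get("url"):
        raise ValueError("Field 'image' or 'url' is required")

require_image_source({"url": ["https://example.com/bear.png"]})  # passes
```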
## Troubleshooting

### Server Not Running

```bash
# Check if the server is responding
# (any response, even a 4xx error, confirms the server is up)
curl -X POST http://localhost:8000/v1/images/edits \
  -F "prompt=test"
```

### Out of Memory

If you encounter OOM errors:

1. Reduce the image size: `"size": "512x512"`
2. Reduce the number of inference steps: `"num_inference_steps": 25`

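Both mitigations can be combined in a single request; a sketch of the relevant override fields only (values taken from the tips above):

```python
# Overrides that reduce memory pressure
low_memory_overrides = {
    "size": "512x512",
    "num_inference_steps": 25,
}

# Merge into any request payload before sending
request_form = {"prompt": "test edit", **low_memory_overrides}
```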
## Development

Enable debug logging to see prompts and generation details:

```bash
vllm serve Qwen/Qwen-Image-Edit-2511 --omni \
  --uvicorn-log-level debug
```