# Building an AI Image Editor with Gradio and Inference Providers

In this guide, we'll build an AI-powered image editor that lets users upload images and edit them using natural language prompts. This project demonstrates how to combine Inference Providers with image-to-image models like [Qwen's Image Edit](https://huggingface.co/Qwen/Qwen-Image-Edit).

Our app will:

1. **Accept image uploads** through a web interface
2. **Process natural language editing instructions** like "Turn the cat into a tiger"
3. **Transform images** using Qwen Image Edit
4. **Display results** in a Gradio interface

<Tip>

This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

</Tip>

## Step 1: Set Up Authentication

Before we start coding, authenticate with Hugging Face using your token:

```bash
# Get your token from https://huggingface.co/settings/tokens
export HF_TOKEN="your_token_here"
```

With this environment variable set, authentication is handled automatically for all your inference calls. You can generate a fine-grained token from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).
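
If you want the app to fail fast with a clear message when the token is missing, you can add a small guard before creating the client. A minimal sketch; the `require_token` helper name is our own, not part of any library:

```python
import os

def require_token() -> str:
    """Read HF_TOKEN from the environment, failing early with a clear error."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it first: export HF_TOKEN=\"your_token_here\""
        )
    return token
```

Calling `require_token()` at startup turns a confusing mid-request authentication failure into an immediate, actionable error.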

## Step 2: Project Setup

Create a new project directory and initialize it with uv:

```bash
mkdir image-editor-app
cd image-editor-app
uv init
```

This creates a basic project structure with a `pyproject.toml` file. Now add the required dependencies (quote the version specifiers so your shell doesn't interpret `>` as a redirect):

```bash
uv add "huggingface-hub>=0.34.4" "gradio>=5.0.0" "pillow>=11.3.0"
```

The dependencies are now installed, and `uv` records them in `pyproject.toml` so you can manage them as a project.

<Tip>

We're using `uv` because it's a fast Python package manager that handles dependency resolution and virtual environment management automatically. If you're not familiar with `uv`, check it out [here](https://docs.astral.sh/uv/).

</Tip>

## Step 3: Build the Core Image Editing Function

Now let's create the main logic for our application: the image editing function that transforms images using AI.

Create `main.py`, then import the necessary libraries and instantiate the `InferenceClient`. We're using the `fal-ai` provider for fast image processing, but other providers are available.

```python
import io
import os

import gradio as gr
from huggingface_hub import InferenceClient

# Initialize the client with the fal-ai provider for fast image processing
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)
```

Now let's create the image editing function. It takes an input image and a prompt, and returns an edited image. We also handle errors gracefully and return the original image if anything fails, so our UI always shows something.

```python
def edit_image(input_image, prompt):
    """
    Edit an image using the given prompt.

    Args:
        input_image: PIL Image object from Gradio
        prompt: String prompt for image editing

    Returns:
        PIL Image object (edited image)
    """
    if input_image is None:
        return None

    if not prompt or prompt.strip() == "":
        return input_image

    try:
        # Convert PIL Image to bytes
        img_bytes = io.BytesIO()
        input_image.save(img_bytes, format="PNG")
        img_bytes = img_bytes.getvalue()

        # Use the image_to_image method with Qwen's image editing model
        edited_image = client.image_to_image(
            img_bytes,
            prompt=prompt.strip(),
            model="Qwen/Qwen-Image-Edit",
        )

        return edited_image

    except Exception as e:
        print(f"Error editing image: {e}")
        return input_image
```
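
The guard clauses at the top of the function can also be factored into a small helper if you want to reuse them elsewhere. A minimal sketch; the `normalize_prompt` name is our own, not part of any library:

```python
def normalize_prompt(prompt):
    """Return a stripped prompt, or None when there is nothing usable to send."""
    if prompt is None:
        return None
    cleaned = prompt.strip()
    return cleaned if cleaned else None
```

With this helper, an editing function can simply return the input image unchanged whenever `normalize_prompt(prompt)` is `None`.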

<Tip>

We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications. In some use cases you might want to switch between providers for maximum performance, while in others you might prefer the consistency of a single provider.

You can experiment with different providers for various performance characteristics:

```python
client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"])
client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"])  # Automatic selection
```

</Tip>
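
If you'd rather have resilience than commit to one provider, one option is to try several clients in order and fall back on failure. A minimal, provider-agnostic sketch; the `call_with_fallback` helper is our own invention, not part of `huggingface_hub`:

```python
def call_with_fallback(clients, method_name, *args, **kwargs):
    """Call the named method on each client in turn, returning the first success."""
    last_error = None
    for client in clients:
        try:
            return getattr(client, method_name)(*args, **kwargs)
        except Exception as error:
            last_error = error  # remember the failure and try the next provider
    raise RuntimeError(f"All providers failed; last error: {last_error}")
```

For example, passing a `fal-ai` client followed by an `auto` client would try fast dedicated inference first and fall back to automatic provider selection only when needed.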

## Step 4: Create the Gradio Interface

Now let's build a simple, user-friendly interface using Gradio.

```python
# Create the Gradio interface
with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface:
    gr.Markdown(
        """
        # 🎨 AI Image Editor
        Upload an image and describe how you want to edit it using natural language!
        """
    )

    with gr.Row():
        with gr.Column():
            input_image = gr.Image(label="Upload Image", type="pil", height=400)
            prompt = gr.Textbox(
                label="Edit Prompt",
                placeholder="Describe how you want to edit the image...",
                lines=2,
            )
            edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg")

        with gr.Column():
            output_image = gr.Image(label="Edited Image", type="pil", height=400)

    # Example images and prompts
    with gr.Row():
        gr.Examples(
            examples=[
                ["cat.png", "Turn the cat into a tiger"],
                ["cat.png", "Make it look like a watercolor painting"],
                ["cat.png", "Change the background to a forest"],
            ],
            inputs=[input_image, prompt],
            outputs=output_image,
            fn=edit_image,
            cache_examples=False,
        )

    # Event handlers
    edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

    # Allow Enter key to trigger editing
    prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)
```

This app uses a few practical Gradio features to stay user-friendly:

- Blocks gives us a two-column layout, with the image upload on the left and the edited image on the right.
- A short Markdown block explains what the app does.
- `gr.Examples` shows some example inputs to give the user inspiration.

Finally, add the launch configuration at the end of `main.py`:

```python
if __name__ == "__main__":
    interface.launch(
        share=True,  # Creates a public link
        server_name="0.0.0.0",  # Allow external access
        server_port=7860,  # Default Gradio port
        show_error=True,  # Show errors in the interface
    )
```

Now run your application:

```bash
python main.py
```

Your app will launch locally at `http://localhost:7860`, and Gradio will also provide a public shareable link!

## Complete Working Code

<details>
<summary><strong>📋 Click to view the complete main.py file</strong></summary>

```python
import io
import os

import gradio as gr
from huggingface_hub import InferenceClient

# Initialize the client
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)

def edit_image(input_image, prompt):
    """
    Edit an image using the given prompt.

    Args:
        input_image: PIL Image object from Gradio
        prompt: String prompt for image editing

    Returns:
        PIL Image object (edited image)
    """
    if input_image is None:
        return None

    if not prompt or prompt.strip() == "":
        return input_image

    try:
        # Convert PIL Image to bytes
        img_bytes = io.BytesIO()
        input_image.save(img_bytes, format="PNG")
        img_bytes = img_bytes.getvalue()

        # Use the image_to_image method
        edited_image = client.image_to_image(
            img_bytes,
            prompt=prompt.strip(),
            model="Qwen/Qwen-Image-Edit",
        )

        return edited_image

    except Exception as e:
        print(f"Error editing image: {e}")
        return input_image

# Create Gradio interface
with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface:
    gr.Markdown(
        """
        # 🎨 AI Image Editor
        Upload an image and describe how you want to edit it using natural language!
        """
    )

    with gr.Row():
        with gr.Column():
            input_image = gr.Image(label="Upload Image", type="pil", height=400)
            prompt = gr.Textbox(
                label="Edit Prompt",
                placeholder="Describe how you want to edit the image...",
                lines=2,
            )
            edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg")

        with gr.Column():
            output_image = gr.Image(label="Edited Image", type="pil", height=400)

    # Example images and prompts
    with gr.Row():
        gr.Examples(
            examples=[
                ["cat.png", "Turn the cat into a tiger"],
                ["cat.png", "Make it look like a watercolor painting"],
                ["cat.png", "Change the background to a forest"],
            ],
            inputs=[input_image, prompt],
            outputs=output_image,
            fn=edit_image,
            cache_examples=False,
        )

    # Event handlers
    edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

    # Allow Enter key to trigger editing
    prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

if __name__ == "__main__":
    interface.launch(
        share=True,  # Creates a public link
        server_name="0.0.0.0",  # Allow external access
        server_port=7860,  # Default Gradio port
        show_error=True,  # Show errors in the interface
    )
```

</details>

## Deploy on Hugging Face Spaces

1. **Create a new Space**: Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. **Choose the Gradio SDK** and make the Space public
3. **Upload your files**: Upload `main.py` and any example images
4. **Add your token**: In the Space settings, add `HF_TOKEN` as a secret
5. **Launch**: Your app will be live at `https://huggingface.co/spaces/your-username/your-space-name`
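
Spaces reads app configuration from a YAML block at the top of the Space's `README.md`. A minimal example; the title, emoji, and version values here are placeholders to adapt to your Space:

```yaml
---
title: AI Image Editor
emoji: 🎨
sdk: gradio
sdk_version: 5.0.0
app_file: main.py
pinned: false
---
```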

## Next Steps

Congratulations! You've built a working AI image editor. Here are some ideas to extend it:

- **Batch processing**: Edit multiple images at once
- **Object removal**: Remove unwanted objects from images
- **Provider comparison**: Benchmark different providers for your use case
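
The batch-processing idea, for instance, can reuse the single-image editing function unchanged. A minimal sketch; the `edit_batch` helper is our own, and it keeps the original image whenever an edit fails, mirroring the error handling above:

```python
def edit_batch(images, prompt, edit_fn):
    """Apply the same edit prompt to each image, keeping the original on failure."""
    results = []
    for image in images:
        try:
            results.append(edit_fn(image, prompt))
        except Exception:
            results.append(image)  # fall back to the unedited image
    return results
```

Passing `edit_image` as `edit_fn` would edit a whole list of uploads with one prompt.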

Happy building! And remember to share your app with the community on the Hub.
