
Commit f13b315

init
1 parent 5d4b049 commit f13b315


1 file changed

+130
-0
lines changed

@@ -0,0 +1,130 @@
1+
---
2+
title: "SGLang Diffusion: Diffusion Generation in SGLang"
3+
author: "The SGLang Team"
4+
date: "November 6, 2025"
5+
previewImg: /images/blog/sglang/cover.jpg
6+
---
7+
8+
## Intro
9+
10+
We are excited to release **SGLang Diffusion**, accelerating image and video generation ...
11+
12+
13+
14+
15+
The source code is available [here](https://github.com/sgl-project/sglang/tree/main/python/sglang/multimodal_gen).
16+
17+
## Why Diffusion in SGLang?
18+
19+
With diffusion models becoming the backbone for state-of-the-art image and video generation, we have heard strong community demand for bringing SGLang's signature performance and seamless user experience to these new modalities.
20+
21+
SGLang Diffusion is built to meet this need, providing a unified API for both language and diffusion tasks. The future of generation lies in combining modalities, as seen in pioneering models like ByteDance's [Bagel](https://github.com/ByteDance-Seed/Bagel) that fuse autoregressive and diffusion architectures.
22+
23+
24+
SGLang Diffusion is designed to be a future-proof, high-performance solution ready to power these innovative systems.
25+
26+
27+
28+
29+
## Architecture
30+
31+
- Follow SGLang scheduler architecture
32+
- Reuse layers and sgl-kernel
33+
- Focus on ease of use: CLI, Python engine API, OpenAI-compatible API
34+
- Built on a fork of FastVideo with additional enhancements, in collaboration with the FastVideo team
35+
36+
## Model Support
37+
38+
SGLang Diffusion supports popular open-source video and image generation models, including:
39+
- Video models: Wan-series, FastWan, Hunyuan
40+
- Image models: Qwen-Image, Qwen-Image-Edit, Flux
41+
42+
## Usage
43+
44+
### Install
45+
46+
SGLang Diffusion can be installed via:
47+
48+
49+
See the [install guide](https://github.com/sgl-project/sglang/blob/main/python/sglang/multimodal_gen/docs/install.md) for more information.
50+
51+
52+
53+
### Demo
54+
55+
#### Text to Video: Wan-AI/Wan2.1
56+
57+
```bash
sglang generate --model-path Wan-AI/Wan2.1-T2V-1.3B-Diffusers \
    --prompt "A curious raccoon" \
    --save-output
```
62+
63+
64+
#### Image to Video: Wan-AI/Wan2.1-I2V
65+
66+
```bash
# Every line except the last ends with a backslash so the shell reads one command.
# The URL is quoted so the shell does not treat '?' as a glob character.
sglang generate \
    --prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." \
    --image-path="https://github.com/Wan-Video/Wan2.2/blob/990af50de458c19590c245151197326e208d7191/examples/i2v_input.JPG?raw=true" \
    --save-output --model-path=Wan-AI/Wan2.1-I2V-14B-480P-Diffusers --num-gpus 2 --enable-cfg-parallel
```
72+
73+
#### Text to Image: FLUX
74+
75+
```bash
sglang generate --model-path black-forest-labs/FLUX.1-dev \
    --prompt "A Logo With Bold Large Text: SGL Diffusion" \
    --save-output
```
80+
81+
#### Text to Image: Qwen-Image
82+
83+
```bash
sglang generate \
    --prompt='A curious raccoon' --save-output \
    --width=720 --height=720 --model-path=Qwen/Qwen-Image
```
88+
89+
#### Image to Image: Qwen-Image-Edit
90+
91+
92+
93+
```bash
# Inside single quotes, double quotes need no escaping; a backslash would be passed through literally.
sglang generate \
    --prompt='keep the original style, but change the text to: "SGL Diffusion"' \
    --save-output \
    --image-path=https://raw.githubusercontent.com/sgl-project/sgl-test-files/refs/heads/main/images/sgl_logo.png \
    --model-path=Qwen/Qwen-Image-Edit
```
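
The text-to-video and text-to-image commands above share the same three flags (`--model-path`, `--prompt`, `--save-output`), so running several prompts against one model can be wrapped in a small shell function. The following is a minimal sketch: `generate_batch` and the `DRY_RUN` variable are hypothetical names introduced here for illustration and are not part of SGLang, and only flags shown in this post are used. Each prompt is a separate `sglang generate` invocation, so the model is presumably loaded once per prompt.

```bash
# Hypothetical helper, not part of SGLang: run `sglang generate` once per prompt.
# Usage: generate_batch MODEL PROMPT [PROMPT...]
# Set DRY_RUN=1 to print each command instead of executing it.
generate_batch() {
  model="$1"
  shift
  for prompt in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command that would be run.
      printf 'sglang generate --model-path %s --prompt "%s" --save-output\n' "$model" "$prompt"
    else
      sglang generate --model-path "$model" --prompt "$prompt" --save-output
    fi
  done
}

# Dry run with the model and prompts used earlier in this post.
DRY_RUN=1 generate_batch black-forest-labs/FLUX.1-dev \
  "A curious raccoon" \
  "A Logo With Bold Large Text: SGL Diffusion"
```

Dropping `DRY_RUN=1` executes the commands for real, which requires SGLang Diffusion to be installed and the model to be available.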
98+
99+
100+
## Performance Benchmark
101+
102+
TODO: convert this table to a bar chart (with Google Sheets or PPT)
103+
104+
| Model            | Diffusers baseline (s) | SGLang Diffusion (s) |
|------------------|-----------------------:|---------------------:|
| Flux             |                   38.7 |                 6.50 |
| Wan 2.1 T2V 1.3B |                  102.7 |                78.83 |
| Qwen-Image       |                  13.30 |                10.99 |
| Qwen-Image-Edit  |                   52.5 |                 34.4 |
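
To read the table in relative terms, the speedup for each model is the baseline time divided by the SGLang Diffusion time. Here is a minimal sketch of that arithmetic using the four rows above; the `speedup` helper is an illustrative name, not part of SGLang, and model names are hyphenated only so that `read` can split each row on whitespace.

```bash
# Illustrative helper: speedup = baseline seconds / SGLang Diffusion seconds.
speedup() {
  awk -v base="$1" -v sgl="$2" 'BEGIN { printf "%.2f", base / sgl }'
}

# Rows copied from the benchmark table: model, baseline (s), SGLang Diffusion (s).
while read -r model base sgl; do
  printf '%-18s %sx\n' "$model" "$(speedup "$base" "$sgl")"
done <<'EOF'
Flux 38.7 6.50
Wan-2.1-T2V-1.3B 102.7 78.83
Qwen-Image 13.30 10.99
Qwen-Image-Edit 52.5 34.4
EOF
```

On these numbers Flux comes out at roughly 6x, while the other three models land between about 1.2x and 1.5x.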
110+
111+
## Roadmap and Diffusion Ecosystem
112+
113+
- Call for contribution
114+
- Performance optimization
115+
- Blackwell optimizations (Flash Attention 4 integration)
116+
- Model coverage
117+
118+
## Acknowledgment
119+
120+
SGLang Diffusion Team: Mick..
121+
122+
FastVideo Team: …
123+
124+
## Learn more
125+
126+
- Roadmap: TBD
127+
- Slack channel: [slack.sglang.ai](https://slack.sglang.ai) (`#diffusion`)
128+
129+
130+
