| 1 | +--- |
| 2 | +title: "SGLang Diffusion: Diffusion Generation in SGLang" |
| 3 | +author: "The SGLang Team" |
| 4 | +date: "November 6, 2025" |
| 5 | +previewImg: /images/blog/sglang/cover.jpg |
| 6 | +--- |
| 7 | + |
| 8 | +## Intro |
| 9 | + |
| 10 | +We are excited to release **SGLang Diffusion**, accelerating image and video generation ... |
| 11 | + |
| 12 | + |
| 13 | + |
| 14 | + |
| 15 | +Source code is available [here](https://github.com/sgl-project/sglang/tree/main/python/sglang/multimodal_gen). |
| 16 | + |
| 17 | +## Why Diffusion in SGLang? |
| 18 | + |
| 19 | +With diffusion models becoming the backbone for state-of-the-art image and video generation, we have heard strong community demand for bringing SGLang's signature performance and seamless user experience to these new modalities. |
| 20 | + |
| 21 | +SGLang Diffusion is built to meet this need, providing a unified API for both language and diffusion tasks. The future of generation lies in combining modalities, as seen in pioneering models like ByteDance's [Bagel](https://github.com/ByteDance-Seed/Bagel), which fuse autoregressive and diffusion architectures. |
| 22 | + |
| 23 | + |
| 24 | +SGLang Diffusion is designed to be a future-proof, high-performance solution ready to power these innovative systems. |
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +## Architecture |
| 30 | + |
| 31 | +- Follows the SGLang scheduler architecture |
| 32 | +- Reuses SGLang layers and sgl-kernel |
| 33 | +- Focuses on ease of use: CLI, Python engine API, OpenAI-compatible API |
| 34 | +- Built on a fork of FastVideo with additional enhancements, in collaboration with the FastVideo team |
| 35 | + |
| 36 | +## Model Support |
| 37 | + |
| 38 | +We support various popular open-source video & image generation models, including: |
| 39 | + - Video models: Wan-series, FastWan, Hunyuan |
| 40 | + - Image models: Qwen-Image, Qwen-Image-Edit, Flux |
| 41 | + |
| 42 | +## Usage |
| 43 | + |
| 44 | +### Install |
| 45 | + |
| 46 | +SGLang Diffusion can be installed via: |
| 47 | + |
| 48 | + |
| 49 | +Refer to the [install guide](https://github.com/sgl-project/sglang/blob/main/python/sglang/multimodal_gen/docs/install.md) for more information. |
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | +### Demo |
| 54 | + |
| 55 | +#### Text to Video: Wan-AI/Wan2.1 |
| 56 | + |
| 57 | +```bash |
| 58 | +sglang generate --model-path Wan-AI/Wan2.1-T2V-1.3B-Diffusers \ |
| 59 | + --prompt "A curious raccoon" \ |
| 60 | + --save-output |
| 61 | +``` |
| 62 | + |
| 63 | + |
| 64 | +#### Image to Video: Wan-AI/Wan2.1-I2V |
| 65 | + |
| 66 | +```bash |
| 67 | +sglang generate --prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." \ |
| 68 | + --image-path="https://github.com/Wan-Video/Wan2.2/blob/990af50de458c19590c245151197326e208d7191/examples/i2v_input.JPG?raw=true" \ |
| 69 | + --save-output --model-path=Wan-AI/Wan2.1-I2V-14B-480P-Diffusers \ |
| 70 | + --num-gpus 2 --enable-cfg-parallel |
| 71 | +``` |
| 72 | + |
| 73 | +#### Text to Image: FLUX |
| 74 | + |
| 75 | +```bash |
| 76 | +sglang generate --model-path black-forest-labs/FLUX.1-dev \ |
| 77 | + --prompt "A Logo With Bold Large Text: SGL Diffusion" \ |
| 78 | + --save-output |
| 79 | +``` |
| 80 | + |
| 81 | +#### Text to Image: Qwen-Image |
| 82 | + |
| 83 | +```bash |
| 84 | +sglang generate \ |
| 85 | + --prompt='A curious raccoon' --save-output \ |
| 86 | + --width=720 --height=720 --model-path=Qwen/Qwen-Image |
| 87 | +``` |
| 88 | + |
| 89 | +#### Image to Image: Qwen-Image-Edit |
| 90 | + |
| 91 | + |
| 92 | + |
| 93 | +```bash |
| 94 | +sglang generate --prompt='keep the original style, but change the text to: "SGL Diffusion"' \ |
| 95 | + --save-output --image-path=https://raw.githubusercontent.com/sgl-project/sgl-test-files/refs/heads/main/images/sgl_logo.png \ |
| 96 | + --model-path=Qwen/Qwen-Image-Edit |
| 97 | +``` |
| 98 | + |
| 99 | + |
| 100 | +## Performance Benchmark |
| 101 | + |
| 102 | +Todo: convert this to a bar chart (with Google sheet or PPT) |
| 103 | + |
| 104 | +| Model | Diffusers baseline (s) | SGLang Diffusion (s) | |
| 105 | +|------------------|-------------------------:|---------------------:| |
| 106 | +| Flux | 38.7 | 6.50 | |
| 107 | +| Wan 2.1 T2V 1.3B | 102.7 | 78.83 | |
| 108 | +| Qwen-Image | 13.30 | 10.99 | |
| 109 | +| Qwen-Image-Edit | 52.5 | 34.4 | |
| 110 | + |
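The table gives raw times only, so a quick way to compare the rows is to divide the baseline by the SGLang Diffusion time. The sketch below does that with the figures copied from the table; the `speedup` helper is a hypothetical name introduced here, and the result is only meaningful under the assumption that both columns time the same workload on the same hardware, which the table does not spell out.

```shell
#!/bin/sh
# Speedup = baseline seconds / SGLang Diffusion seconds.
# `speedup` is an illustrative helper, not part of the sglang CLI.
speedup() {
  # $1: baseline time in seconds, $2: SGLang Diffusion time in seconds
  awk -v base="$1" -v ours="$2" 'BEGIN { printf "%.2f", base / ours }'
}

# Figures copied from the benchmark table above.
printf '%-18s %s\n'  "Model"            "Speedup"
printf '%-18s %sx\n' "Flux"             "$(speedup 38.7 6.50)"
printf '%-18s %sx\n' "Wan 2.1 T2V 1.3B" "$(speedup 102.7 78.83)"
printf '%-18s %sx\n' "Qwen-Image"       "$(speedup 13.30 10.99)"
printf '%-18s %sx\n' "Qwen-Image-Edit"  "$(speedup 52.5 34.4)"
```

With these numbers the ratios come out to roughly 5.95x for Flux, 1.30x for Wan 2.1 T2V 1.3B, 1.21x for Qwen-Image, and 1.53x for Qwen-Image-Edit.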
| 111 | +## Roadmap and Diffusion Ecosystem |
| 112 | + |
| 113 | +- Call for contribution |
| 114 | +- Performance optimization |
| 115 | +- Blackwell optimizations (Flash Attention 4 integration) |
| 116 | +- Model coverage |
| 117 | + |
| 118 | +## Acknowledgment |
| 119 | + |
| 120 | +SGLang Diffusion Team: Mick.. |
| 121 | + |
| 122 | +FastVideo Team: … |
| 123 | + |
| 124 | +## Learn more |
| 125 | + |
| 126 | +- Roadmap: TBD |
| 127 | +- Slack channel: [slack.sglang.ai](https://slack.sglang.ai) (`#diffusion`) |
| 128 | + |
| 129 | + |
| 130 | + |