Commit ac5c9e6

Update examples/server-async/README.md
1 parent 0ecdfc3 commit ac5c9e6

3 files changed: +54 -17 lines


examples/server-async/Pipelines.py

Lines changed: 0 additions & 3 deletions
@@ -1,4 +1,3 @@
-# Pipelines.py
 from diffusers.pipelines.stable_diffusion_3.pipeline_stable_diffusion_3 import StableDiffusion3Pipeline
 from diffusers.pipelines.flux.pipeline_flux import FluxPipeline
 import torch
@@ -102,8 +101,6 @@ def initialize_pipeline(self):
             self.model_type = "SD3_5"
         elif self.model in preset_models.Flux:
             self.model_type = "Flux"
-        else:
-            self.model_type = "SD"

         # Create appropriate pipeline based on model type and type_models
         if self.type_models == 't2im':

examples/server-async/README.md

Lines changed: 49 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -5,24 +5,24 @@
55
66
## ⚠️ IMPORTANT
77

8-
* The server and inference harness live in this repo: `https://github.com/F4k3r22/DiffusersServer`.
9-
The example demonstrates how to run pipelines like `StableDiffusion3-3.5` and `Flux.1` concurrently while keeping a single copy of the heavy model parameters on GPU.
8+
* The example demonstrates how to run pipelines like `StableDiffusion3-3.5` and `Flux.1` concurrently while keeping a single copy of the heavy model parameters on GPU.
109

1110
## Necessary components
1211

13-
All the components needed to create the inference server are in `DiffusersServer/`
12+
All the components needed to create the inference server are in the current directory:
1413

1514
```
16-
DiffusersServer/
15+
server-async/
1716
├── utils/
1817
├─────── __init__.py
19-
├─────── scheduler.py # BaseAsyncScheduler wrapper and async_retrieve_timesteps for secure inferences
20-
├─────── requestscopedpipeline.py # RequestScoped Pipeline for inference with a single in-memory model
21-
├── __init__.py
22-
├── create_server.py # helper script to build/run the app programmatically
23-
├── Pipelines.py # pipeline loader classes (SD3, Flux, legacy SD, video)
24-
├── serverasync.py # FastAPI app factory (create\_app\_fastapi)
25-
├── uvicorn_diffu.py # convenience script to start uvicorn with recommended flags
18+
├─────── scheduler.py # BaseAsyncScheduler wrapper and async_retrieve_timesteps for secure inferences
19+
├─────── requestscopedpipeline.py # RequestScoped Pipeline for inference with a single in-memory model
20+
├─────── utils.py # Image/video saving utilities and service configuration
21+
├── Pipelines.py # pipeline loader classes (SD3, Flux, legacy SD, video)
22+
├── serverasync.py # FastAPI app with lifespan management and async inference endpoints
23+
├── test.py # Client test script for inference requests
24+
├── requirements.txt # Dependencies
25+
└── README.md # This documentation
2626
```
2727

2828
## What `diffusers-async` adds / Why we needed it
@@ -69,13 +69,28 @@ pip install -r requirements.txt
6969

7070
### 2) Start the server
7171

72-
Using the `server.py` file that already has everything you need:
72+
Using the `serverasync.py` file that already has everything you need:
7373

7474
```bash
75-
python server.py
75+
python serverasync.py
7676
```
7777

78-
### 3) Example request
78+
The server will start on `http://localhost:8500` by default with the following features:
79+
- FastAPI application with async lifespan management
80+
- Automatic model loading and pipeline initialization
81+
- Request counting and active inference tracking
82+
- Memory cleanup after each inference
83+
- CORS middleware for cross-origin requests
84+
85+
### 3) Test the server
86+
87+
Use the included test script:
88+
89+
```bash
90+
python test.py
91+
```
92+
93+
Or send a manual request:
7994

8095
`POST /api/diffusers/inference` with JSON body:
8196

@@ -95,6 +110,13 @@ Response example:
95110
}
96111
```
97112

113+
### 4) Server endpoints
114+
115+
- `GET /` - Welcome message
116+
- `POST /api/diffusers/inference` - Main inference endpoint
117+
- `GET /images/{filename}` - Serve generated images
118+
- `GET /api/status` - Server status and memory info
119+
98120
## Advanced Configuration
99121

100122
### RequestScopedPipeline Parameters
@@ -117,6 +139,19 @@ RequestScopedPipeline(
117139
* Enhanced debugging with `__repr__` and `__str__` methods
118140
* Full compatibility with existing scheduler APIs
119141

142+
### Server Configuration
143+
144+
The server configuration can be modified in `serverasync.py` through the `ServerConfigModels` dataclass:
145+
146+
```python
147+
@dataclass
148+
class ServerConfigModels:
149+
model: str = 'stabilityai/stable-diffusion-3-medium'
150+
type_models: str = 't2im'
151+
host: str = '0.0.0.0'
152+
port: int = 8500
153+
```
154+
120155
## Troubleshooting (quick)
121156

122157
* `Already borrowed` — previously a Rust tokenizer concurrency error.
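
For reference, a manual request to the inference endpoint described above could look like the following minimal client sketch. The `prompt` field name and the response shape are assumptions, since the exact JSON body is not shown in this diff; `test.py` in the example holds the authoritative payload.

```python
# Minimal client sketch for the async server.
# Assumptions: the request body uses a "prompt" field and the server runs
# on the default host/port; see examples/server-async/test.py for the real payload.
import requests

SERVER = "http://localhost:8500"

payload = {"prompt": "a photo of an astronaut riding a horse"}  # assumed field name
resp = requests.post(f"{SERVER}/api/diffusers/inference", json=payload, timeout=600)
resp.raise_for_status()

# Generated images are served back via GET /images/{filename};
# print the raw JSON to inspect what the server returned.
print(resp.json())
```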

examples/server-async/serverasync.py

Lines changed: 5 additions & 0 deletions
@@ -221,3 +221,8 @@ async def get_status():
     allow_methods=["*"],
     allow_headers=["*"],
 )
+
+if __name__ == "__main__":
+    import uvicorn
+
+    uvicorn.run(app, host=server_config.host, port=server_config.port)
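
The added `__main__` block launches the server with the configured host and port. As a sketch of the equivalent programmatic launch from another script, assuming `app` is created at module level in `serverasync.py` (as the middleware registration and `uvicorn.run(app, ...)` call in this commit suggest):

```python
# Sketch: run the FastAPI app from a separate script on a non-default port.
# Assumption: serverasync.py exposes a module-level `app` object.
import uvicorn

from serverasync import app

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8600)  # 8600 is an arbitrary example port
```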
