
Commit 6b69367 (parent edd550b): Update examples/server-async/README.md

File tree: 1 file changed, +3 −3 lines


examples/server-async/README.md: 3 additions, 3 deletions

@@ -1,11 +1,11 @@
 # Asynchronous server and parallel execution of models
 
-> Example/demo server that keeps a single model in memory while safely running parallel inference requests by creating per-request lightweight views and cloning only small, stateful components (schedulers, RNG state, small mutable attrs). Works with StableDiffusion3/Flux pipelines.
+> Example/demo server that keeps a single model in memory while safely running parallel inference requests by creating per-request lightweight views and cloning only small, stateful components (schedulers, RNG state, small mutable attrs). Works with StableDiffusion3 pipelines.
 > We recommend running 10 to 50 inferences in parallel for optimal performance, averaging between 25 and 30 seconds to 1 minute and 1 minute and 30 seconds. (This is only recommended if you have a GPU with 35GB of VRAM or more; otherwise, keep it to one or two inferences in parallel to avoid decoding or saving errors due to memory shortages.)
 
 
 ## ⚠️ IMPORTANT
 
-* The example demonstrates how to run pipelines like `StableDiffusion3-3.5` and `Flux.1` concurrently while keeping a single copy of the heavy model parameters on GPU.
+* The example demonstrates how to run pipelines like `StableDiffusion3-3.5` concurrently while keeping a single copy of the heavy model parameters on GPU.
 
 ## Necessary components
 
@@ -18,7 +18,7 @@ server-async/
 ├─────── scheduler.py             # BaseAsyncScheduler wrapper and async_retrieve_timesteps for secure inferences
 ├─────── requestscopedpipeline.py # RequestScoped Pipeline for inference with a single in-memory model
 ├─────── utils.py                 # Image/video saving utilities and service configuration
-├── Pipelines.py                  # pipeline loader classes (SD3, Flux, legacy SD, video)
+├── Pipelines.py                  # pipeline loader classes (SD3)
 ├── serverasync.py                # FastAPI app with lifespan management and async inference endpoints
 ├── test.py                       # Client test script for inference requests
 ├── requirements.txt              # Dependencies
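The idea the README describes, a single heavy model kept in memory while each request gets a lightweight view that clones only small stateful components, can be sketched roughly as below. This is an illustrative assumption of how such a view might work, not the actual code in `requestscopedpipeline.py`; the class name `RequestScopedView` and the `stateful_attrs` parameter are hypothetical.

```python
import copy


class RequestScopedView:
    """Sketch of a per-request view over a shared pipeline.

    Heavy components (model weights) are shared by reference; small
    mutable state (e.g. the scheduler) is deep-copied per request so
    concurrent inferences cannot corrupt each other's state.
    """

    def __init__(self, shared_pipeline, stateful_attrs=("scheduler",)):
        self._shared = shared_pipeline
        # Clone only the small, stateful components into this view.
        for attr in stateful_attrs:
            setattr(self, attr, copy.deepcopy(getattr(shared_pipeline, attr)))

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for
        # everything not cloned above: delegate to the shared pipeline,
        # so the heavy weights exist exactly once in memory.
        return getattr(self._shared, name)
```

Each incoming request would construct its own view over the one loaded pipeline; mutating one view's scheduler leaves every other view (and the shared original) untouched, while large attributes resolve through `__getattr__` to the single shared copy.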
