Commit e635406

feedback
1 parent 4750ce6 commit e635406

File tree: 1 file changed (+17 −8 lines) in docs/source/en/api/pipelines

docs/source/en/api/pipelines/flux.md

Lines changed: 17 additions & 8 deletions
````diff
@@ -357,11 +357,16 @@ image.save('flux_ip_adapter_output.jpg')
 
 ## Optimize
 
-Flux is a very large model and requires ~50GB of RAM. Enable some of the optimizations below to lower the memory requirements.
+Flux is a very large model and requires ~50GB of RAM/VRAM to load all the modeling components. Enable some of the optimizations below to lower the memory requirements.
 
 ### Group offloading
 
-[Group offloading](../../optimization/memory#group-offloading) saves memory by offloading groups of internal layers rather than the whole model or weights. Use [`~hooks.apply_group_offloading`] on a model and you can optionally specify the `offload_type`. Setting it to `leaf_level` offloads the lowest leaf-level parameters to the CPU instead of offloading at the module-level.
+[Group offloading](../../optimization/memory#group-offloading) lowers VRAM usage by offloading groups of internal layers rather than the whole model or weights. You need to use [`~hooks.apply_group_offloading`] on all the model components of a pipeline. The `offload_type` parameter allows you to toggle between block and leaf-level offloading. Setting it to `leaf_level` offloads the lowest leaf-level parameters to the CPU instead of offloading at the module-level.
+
+On CUDA devices that support asynchronous data streaming, set `use_stream=True` to overlap data transfer and computation to accelerate inference.
+
+> [!TIP]
+> It is possible to mix block and leaf-level offloading for different components in a pipeline.
 
 ```py
 import torch
@@ -380,34 +385,38 @@ apply_group_offloading(
     offload_type="leaf_level",
     offload_device=torch.device("cpu"),
     onload_device=torch.device("cuda"),
+    use_stream=True,
 )
 apply_group_offloading(
     pipe.text_encoder,
     offload_device=torch.device("cpu"),
     onload_device=torch.device("cuda"),
-    offload_type="leaf_level"
+    offload_type="leaf_level",
+    use_stream=True,
 )
 apply_group_offloading(
     pipe.text_encoder_2,
     offload_device=torch.device("cpu"),
     onload_device=torch.device("cuda"),
-    offload_type="leaf_level"
+    offload_type="leaf_level",
+    use_stream=True,
 )
 apply_group_offloading(
     pipe.vae,
     offload_device=torch.device("cpu"),
     onload_device=torch.device("cuda"),
-    offload_type="leaf_level"
+    offload_type="leaf_level",
+    use_stream=True,
 )
 
 prompt="A cat wearing sunglasses and working as a lifeguard at pool."
 
 generator = torch.Generator().manual_seed(181201)
 image = pipe(
     prompt,
-    width=576,
-    height=1024,
-    num_inference_steps=30,
+    width=576,
+    height=1024,
+    num_inference_steps=30,
     generator=generator
 ).images[0]
 image
````
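The block vs leaf-level distinction the revised doc text describes can be illustrated with a toy module tree. This is a minimal pure-Python sketch, not the diffusers implementation: the `Module` class and `offload` helper are hypothetical stand-ins showing how the two `offload_type` modes partition a model into independently offloadable groups.

```python
# Toy sketch of group offloading granularity. Hypothetical classes/helpers;
# the real logic lives in diffusers' `hooks.apply_group_offloading`.

class Module:
    """A tiny stand-in for a torch.nn.Module: children plus named params."""
    def __init__(self, name, children=None, params=None):
        self.name = name
        self.children = children or []
        self.params = params or {}  # param name -> device string

def leaf_modules(module):
    """Yield the lowest-level modules (no children): leaf_level targets."""
    if not module.children:
        yield module
    else:
        for child in module.children:
            yield from leaf_modules(child)

def offload(module, offload_type="leaf_level"):
    """Move params to CPU per group; return how many groups were formed."""
    if offload_type == "leaf_level":
        groups = list(leaf_modules(module))   # one group per leaf layer
    else:
        groups = module.children              # one group per top-level block
    for group in groups:
        for leaf in leaf_modules(group):
            for name in leaf.params:
                leaf.params[name] = "cpu"
    return len(groups)

# A toy model: two blocks, each holding two leaf layers with one param.
model = Module("model", children=[
    Module("block0", children=[Module("attn", params={"w": "cuda"}),
                               Module("mlp", params={"w": "cuda"})]),
    Module("block1", children=[Module("attn", params={"w": "cuda"}),
                               Module("mlp", params={"w": "cuda"})]),
])

print(offload(model, "leaf_level"))   # 4 groups: one per leaf layer
print(offload(model, "block_level"))  # 2 groups: one per top-level block
```

Finer groups (leaf level) mean less resident memory at any moment, at the cost of more frequent transfers, which is exactly the trade-off `use_stream=True` in the diff is meant to hide.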
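The `use_stream=True` addition overlaps weight transfer with computation. A rough sketch of that idea, with threads standing in for CUDA streams and hypothetical `onload`/`compute` helpers (purely illustrative, not the diffusers mechanism): while the current layer group computes, the next group's transfer is already in flight.

```python
# Toy sketch of stream-style prefetching: onload layer i+1 while computing
# with layer i. Threads simulate async CUDA copies; names are hypothetical.
from concurrent.futures import ThreadPoolExecutor
import time

def onload(layer):
    """Pretend to copy a layer group's weights to the GPU (slow transfer)."""
    time.sleep(0.01)
    return layer  # "on-device" weights

def compute(x, layer):
    """Pretend forward pass: scale the activation by the layer's weight."""
    time.sleep(0.01)
    return x * layer

def run_overlapped(x, layers):
    """Prefetch layer i+1 in the background while layer i computes."""
    if not layers:
        return x
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(onload, layers[0])
        for i in range(len(layers)):
            current = pending.result()                        # wait for copy
            if i + 1 < len(layers):
                pending = pool.submit(onload, layers[i + 1])  # start next copy
            x = compute(x, current)                           # overlaps with it
    return x

print(run_overlapped(1.0, [2.0, 3.0, 4.0]))  # 24.0, same result as sequential
```

The result is identical to running the layers sequentially; only the wall-clock cost of the transfers is (partially) hidden behind compute, which is why the doc change recommends it only on devices that support asynchronous data streaming.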
