Commit 43273e2

feedback
1 parent c194358 commit 43273e2

2 files changed, +26 -21 lines changed

docs/source/en/quicktour.md

Lines changed: 11 additions & 12 deletions
@@ -53,9 +53,8 @@ import torch
 from diffusers import DiffusionPipeline

 pipeline = DiffusionPipeline.from_pretrained(
-    "Qwen/Qwen-Image",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16, device_map="cuda"
+)

 prompt = """
 cinematic film still of a cat sipping a margarita in a pool in Palm Springs, California
@@ -84,7 +83,8 @@ pipeline = DiffusionPipeline.from_pretrained(
     "Wan-AI/Wan2.2-T2V-A14B-Diffusers",
     vae=vae,
     torch_dtype=torch.bfloat16,
-).to("cuda")
+    device_map="cuda"
+)

 prompt = """
 Cinematic video of a sleek cat lounging on a colorful inflatable in a crystal-clear turquoise pool in Palm Springs,
@@ -110,13 +110,11 @@ import torch
 from diffusers import DiffusionPipeline

 pipeline = DiffusionPipeline.from_pretrained(
-    "Qwen/Qwen-Image",
-    torch_dtype=torch.bfloat16
+    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16, device_map="cuda"
 )
 pipeline.load_lora_weights(
     "flymy-ai/qwen-image-realism-lora",
 )
-pipeline.to("cuda")

 prompt = """
 super Realism cinematic film still of a cat sipping a margarita in a pool in Palm Springs in the style of umempart, California
@@ -149,7 +147,8 @@ pipeline = DiffusionPipeline.from_pretrained(
     "Qwen/Qwen-Image",
     torch_dtype=torch.bfloat16,
     quantization_config=quant_config,
-).to("cuda")
+    device_map="cuda"
+)

 prompt = """
 cinematic film still of a cat sipping a margarita in a pool in Palm Springs, California
@@ -187,7 +186,8 @@ pipeline = DiffusionPipeline.from_pretrained(
     "Qwen/Qwen-Image",
     torch_dtype=torch.bfloat16,
     quantization_config=quant_config,
-).to("cuda")
+    device_map="cuda"
+)
 pipeline.enable_model_cpu_offload()

 prompt = """
@@ -213,9 +213,8 @@ import torch
 from diffusers import DiffusionPipeline

 pipeline = DiffusionPipeline.from_pretrained(
-    "Qwen/Qwen-Image",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16, device_map="cuda"
+)

 pipeline.transformer.compile_repeated_blocks(
     fullgraph=True,
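
Reading the hunk headers above: in `@@ -53,9 +53,8 @@`, the hunk covers 9 lines of the old file starting at line 53 and 8 lines of the new file starting at line 53. Context lines count toward both sides, `-` lines only toward the old side, and `+` lines only toward the new side. The following is a minimal sketch of that bookkeeping using only the Python standard library. The helper names `parse_hunk_header` and `count_hunk_lines` are illustrative assumptions, not part of git or diffusers, and the body is the first `quicktour.md` hunk written out in unified-diff form.

```python
import re

# Matches "@@ -old_start[,old_count] +new_start[,new_count] @@"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(header):
    """Return (old_start, old_count, new_start, new_count) from an '@@' line."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a hunk header: {header!r}")
    old_start, old_count, new_start, new_count = match.groups()
    # In the unified format an omitted count means a single line.
    return (int(old_start), int(old_count or 1), int(new_start), int(new_count or 1))


def count_hunk_lines(body_lines):
    """Count how many lines a hunk body contributes to the old and new file."""
    old = new = 0
    for line in body_lines:
        if line.startswith("-"):
            old += 1  # removed: exists only in the old file
        elif line.startswith("+"):
            new += 1  # added: exists only in the new file
        else:
            old += 1  # context: exists in both files
            new += 1
    return old, new


header = "@@ -53,9 +53,8 @@ import torch"
body = [
    " from diffusers import DiffusionPipeline",
    "",
    " pipeline = DiffusionPipeline.from_pretrained(",
    '-    "Qwen/Qwen-Image",',
    "-    torch_dtype=torch.bfloat16",
    '-).to("cuda")',
    '+    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16, device_map="cuda"',
    "+)",
    "",
    ' prompt = """',
    " cinematic film still of a cat sipping a margarita in a pool in Palm Springs, California",
]

_, old_count, _, new_count = parse_hunk_header(header)
print(count_hunk_lines(body) == (old_count, new_count))
```

The same check can be applied to any other hunk on this page to confirm that its header counts match its body.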

docs/source/en/stable_diffusion.md

Lines changed: 15 additions & 9 deletions
@@ -22,14 +22,17 @@ This guide recommends some basic performance tips for using the [`DiffusionPipel

 Reducing the amount of memory used indirectly speeds up generation and can help a model fit on device.

+The [`~DiffusionPipeline.enable_model_cpu_offload`] method moves a model to the CPU when it is not in use to save GPU memory.
+
 ```py
 import torch
 from diffusers import DiffusionPipeline

 pipeline = DiffusionPipeline.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    torch_dtype=torch.bfloat16,
+    device_map="cuda"
+)
 pipeline.enable_model_cpu_offload()

 prompt = """
@@ -44,7 +47,7 @@ print(f"Max memory reserved: {torch.cuda.max_memory_allocated() / 1024**3:.2f} G

 Denoising is the most computationally demanding process during diffusion. Methods that optimize this process accelerate inference. Try the following methods for a speedup.

-- Add `.to("cuda")` to place the pipeline on a GPU. Placing a model on an accelerator, like a GPU, increases speed because it performs computations in parallel.
+- Add `device_map="cuda"` to place the pipeline on a GPU. Placing a model on an accelerator, like a GPU, increases speed because it performs computations in parallel.
 - Set `torch_dtype=torch.bfloat16` to execute the pipeline in half-precision. Reducing the data type precision increases speed because it takes less time to perform computations in a lower precision.

 ```py
@@ -54,8 +57,9 @@ from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

 pipeline = DiffusionPipeline.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    torch_dtype=torch.bfloat16,
+    device_map="cuda"
+)
 ```

 - Use a faster scheduler, such as [`DPMSolverMultistepScheduler`], which only requires ~20-25 steps.
@@ -88,8 +92,9 @@ Many modern diffusion models deliver high-quality images out-of-the-box. However

 pipeline = DiffusionPipeline.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    torch_dtype=torch.bfloat16,
+    device_map="cuda"
+)

 prompt = """
 cinematic film still of a cat sipping a margarita in a pool in Palm Springs, California
@@ -109,8 +114,9 @@ Many modern diffusion models deliver high-quality images out-of-the-box. However

 pipeline = DiffusionPipeline.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
-    torch_dtype=torch.bfloat16
-).to("cuda")
+    torch_dtype=torch.bfloat16,
+    device_map="cuda"
+)
 pipeline.scheduler = HeunDiscreteScheduler.from_config(pipeline.scheduler.config)

 prompt = """
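
The `torch_dtype=torch.bfloat16` setting used throughout these snippets also bears on the memory point made in `stable_diffusion.md`: a `bfloat16` value occupies 2 bytes, while a `float32` value occupies 4. The following is a rough sketch of that arithmetic in plain Python. The parameter count is a hypothetical round figure chosen for illustration, not a measured size of any model named in this diff, and the estimate covers weights only, ignoring activations and other buffers.

```python
# Storage size in bytes for one value of each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(num_params, dtype):
    """Estimate the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Hypothetical parameter count, used only to make the ratio concrete.
num_params = 2_600_000_000

fp32 = weight_memory_gib(num_params, "float32")
bf16 = weight_memory_gib(num_params, "bfloat16")
print(f"float32: {fp32:.2f} GiB, bfloat16: {bf16:.2f} GiB")
```

Whatever the parameter count, the half-precision estimate is half the full-precision one, which is why loading in `bfloat16` can help a model fit on the device.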
