
Commit 3d9cd16

[examples] add --enable-cpu-offload args (vllm-project#930)

Signed-off-by: David Chen <530634352@qq.com>
1 parent 28a6597

8 files changed: +28 −0 lines changed


examples/offline_inference/image_to_image/image_edit.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -279,6 +279,11 @@ def parse_args() -> argparse.Namespace:
         action="store_true",
         help="Disable torch.compile and force eager execution.",
     )
+    parser.add_argument(
+        "--enable-cpu-offload",
+        action="store_true",
+        help="Enable CPU offloading for diffusion models.",
+    )
     return parser.parse_args()


@@ -344,6 +349,7 @@ def main():
         cache_config=cache_config,
         parallel_config=parallel_config,
         enforce_eager=args.enforce_eager,
+        enable_cpu_offload=args.enable_cpu_offload,
     )
     print("Pipeline loaded")
```
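Each script's change follows the same argparse pattern. Isolated as a runnable sketch (the scripts' other options are omitted here):

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    # The flag the commit adds to each example script; store_true means the
    # attribute is False by default and True when the flag is present.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--enable-cpu-offload",
        action="store_true",
        help="Enable CPU offloading for diffusion models.",
    )
    return parser.parse_args(argv)


# argparse maps "--enable-cpu-offload" to the attribute enable_cpu_offload
args = parse_args(["--enable-cpu-offload"])
print(args.enable_cpu_offload)  # True
```

The dashes in the flag name become underscores on the namespace, which is why the diffs read `args.enable_cpu_offload` when forwarding the value to the pipeline constructor.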

examples/offline_inference/image_to_image/image_to_image.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -47,3 +47,4 @@ Key arguments:
 - `--guidance_scale`: guidance scale for guidance-distilled models (default: 1.0, disabled). Unlike classifier-free guidance (--cfg_scale), guidance-distilled models take the guidance scale directly as an input parameter. Enabled when guidance_scale > 1. Ignored when not using guidance-distilled models.
 - `--num_inference_steps`: diffusion sampling steps (more steps = higher quality, slower).
 - `--output`: path to save the generated PNG.
+- `--enable-cpu-offload`: enable CPU offloading for diffusion models.
```

examples/offline_inference/image_to_video/README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -54,3 +54,4 @@ Key arguments:
 - `--num_inference_steps`: Number of denoising steps (default 50).
 - `--fps`: Frames per second for the saved MP4 (requires `diffusers` export_to_video).
 - `--output`: Path to save the generated video.
+- `--enable-cpu-offload`: enable CPU offloading for diffusion models.
```

examples/offline_inference/image_to_video/image_to_video.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -58,6 +58,11 @@ def parse_args() -> argparse.Namespace:
     )
     parser.add_argument("--output", type=str, default="i2v_output.mp4", help="Path to save the video (mp4).")
     parser.add_argument("--fps", type=int, default=16, help="Frames per second for the output video.")
+    parser.add_argument(
+        "--enable-cpu-offload",
+        action="store_true",
+        help="Enable CPU offloading for diffusion models.",
+    )
     return parser.parse_args()


@@ -105,6 +110,7 @@ def main():
         vae_use_tiling=vae_use_tiling,
         boundary_ratio=args.boundary_ratio,
         flow_shift=args.flow_shift,
+        enable_cpu_offload=args.enable_cpu_offload,
     )

     if profiler_enabled:
```

examples/offline_inference/text_to_image/README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -61,6 +61,7 @@ Key arguments:
 - `--num_inference_steps`: diffusion sampling steps (more steps = higher quality, slower).
 - `--height/--width`: output resolution (defaults 1024x1024).
 - `--output`: path to save the generated PNG.
+- `--enable-cpu-offload`: enable CPU offloading for diffusion models.

 > ℹ️ Qwen-Image currently publishes best-effort presets at `1328x1328`, `1664x928`, `928x1664`, `1472x1140`, `1140x1472`, `1584x1056`, and `1056x1584`. Adjust `--height/--width` accordingly for the most reliable outcomes.
```
examples/offline_inference/text_to_image/text_to_image.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -96,6 +96,11 @@ def parse_args() -> argparse.Namespace:
         action="store_true",
         help="Disable torch.compile and force eager execution.",
     )
+    parser.add_argument(
+        "--enable-cpu-offload",
+        action="store_true",
+        help="Enable CPU offloading for diffusion models.",
+    )
     parser.add_argument(
         "--tensor_parallel_size",
         type=int,
@@ -162,6 +167,7 @@ def main():
         cache_config=cache_config,
         parallel_config=parallel_config,
         enforce_eager=args.enforce_eager,
+        enable_cpu_offload=args.enable_cpu_offload,
     )

     if profiler_enabled:
```

examples/offline_inference/text_to_video/text_to_video.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -28,3 +28,4 @@ Key arguments:
 - `--boundary_ratio`: Boundary split ratio for low/high DiT.
 - `--fps`: frames per second for the saved MP4 (requires `diffusers` export_to_video).
 - `--output`: path to save the generated video.
+- `--enable-cpu-offload`: enable CPU offloading for diffusion models.
```

examples/offline_inference/text_to_video/text_to_video.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -35,6 +35,11 @@ def parse_args() -> argparse.Namespace:
     )
     parser.add_argument("--output", type=str, default="wan22_output.mp4", help="Path to save the video (mp4).")
     parser.add_argument("--fps", type=int, default=24, help="Frames per second for the output video.")
+    parser.add_argument(
+        "--enable-cpu-offload",
+        action="store_true",
+        help="Enable CPU offloading for diffusion models.",
+    )
     return parser.parse_args()


@@ -56,6 +61,7 @@ def main():
         vae_use_tiling=vae_use_tiling,
         boundary_ratio=args.boundary_ratio,
         flow_shift=args.flow_shift,
+        enable_cpu_offload=args.enable_cpu_offload,
     )

     if profiler_enabled:
```
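For context on what the new flag trades off: CPU offloading generally keeps model components in host memory and moves each onto the accelerator only while it runs, lowering peak GPU memory at the cost of transfer time. A minimal sketch of the idea using toy classes; the names and mechanics here are illustrative, not this project's actual pipeline internals:

```python
class ToyBlock:
    """Stand-in for a model component; real code would wrap an nn.Module."""

    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self

    def __call__(self, x):
        return x + 1


class Offloaded:
    """Run a component on the accelerator, then park it back on the CPU."""

    def __init__(self, module, device="cuda"):
        self.module = module
        self.device = device

    def __call__(self, x):
        self.module.to(self.device)  # load weights for this stage only
        out = self.module(x)
        self.module.to("cpu")        # release accelerator memory for the next stage
        return out


block = Offloaded(ToyBlock())
result = block(41)
print(result, block.module.device)  # 42 cpu
```

Because only one stage's weights occupy the accelerator at a time, this pattern is most useful when the full diffusion pipeline would not otherwise fit in GPU memory.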
