Commit e4111a4

micah-wil and gshtras
authored and committed
[Core] Run garbage collector after CUDA graph capture to fix throughput regression (vllm-project#24128)
Signed-off-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
1 parent d0f9a92 commit e4111a4

1 file changed, +1 -0 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 1 addition & 0 deletions
@@ -2885,6 +2885,7 @@ def freeze_gc():
             finally:
                 if should_freeze:
                     gc.unfreeze()
+                    gc.collect()

         # Trigger CUDA graph capture for specific shapes.
         # Capture the large shapes first so that the smaller shapes
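
For context, the hunk shows only the tail of `freeze_gc()`. The standalone sketch below illustrates the freeze / unfreeze / collect pattern the change relies on. It is not the vLLM implementation: the `should_freeze` parameter, the initial `gc.collect()` before freezing, and the `Node` and `demo` helpers are assumptions made for illustration. Per the Python `gc` documentation, `gc.unfreeze()` moves frozen objects back into the oldest generation, and the added `gc.collect()` then runs a full collection right after the guarded section.

```python
import gc
import weakref
from contextlib import contextmanager


@contextmanager
def freeze_gc(should_freeze=True):
    """Sketch: keep existing objects out of GC scans during a critical section.

    `should_freeze` is a hypothetical parameter; in the diff it is a variable
    whose origin is not shown.
    """
    gc.collect()  # assumption: clean up before freezing
    if should_freeze:
        gc.freeze()  # move all tracked objects to the permanent generation
    try:
        yield
    finally:
        if should_freeze:
            gc.unfreeze()  # objects return to the oldest generation
            gc.collect()   # the line added by this commit


class Node:
    """Hypothetical object that keeps itself alive through a reference cycle."""

    def __init__(self):
        self.me = self


def demo():
    with freeze_gc():
        frozen_inside = gc.get_freeze_count()
        node = Node()
        ref = weakref.ref(node)
        del node  # only the cycle references the object now
    # After the block: nothing is frozen and the cycle has been collected.
    return frozen_inside, gc.get_freeze_count(), ref() is None
```

Calling `demo()` returns the number of objects frozen inside the block, the freeze count after leaving it, and whether the cyclic object created inside the block was reclaimed by the time the context manager exited.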
