
Commit 2518230

[MISC] Fix misleading batch_size_capture_list when cuda_graph_sizes < 4 (#25829)
Signed-off-by: billishyahao <[email protected]>
Co-authored-by: Luka Govedic <[email protected]>
1 parent a332b84 commit 2518230

File tree

1 file changed: +6 −3 lines changed


vllm/config/vllm.py

Lines changed: 6 additions & 3 deletions
@@ -580,9 +580,12 @@ def _set_cudagraph_sizes(self):
                 not self.model_config.enforce_eager:
             cuda_graph_sizes = self.scheduler_config.cuda_graph_sizes
             if len(cuda_graph_sizes) == 1:
-                batch_size_capture_list = [1, 2, 4] + [
-                    i for i in range(8, cuda_graph_sizes[0] + 1, 8)
-                ]
+                max_graph_size = cuda_graph_sizes[0]
+                assert max_graph_size >= 1, "Maximum cudagraph size should be" \
+                    " greater than or equal to 1."
+                batch_size_capture_list = [
+                    i for i in [1, 2, 4] if i <= max_graph_size
+                ] + list(range(8, max_graph_size + 1, 8))
             elif len(cuda_graph_sizes) > 1:
                 batch_size_capture_list = sorted(cuda_graph_sizes)
             else:
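To see why the old list was misleading when the requested cudagraph size is below 4, here is a minimal standalone sketch of the before/after behavior. The helper names (old_capture_list, new_capture_list) are illustrative only and are not part of the vLLM codebase; the bodies mirror the removed and added lines of the diff.

```python
def old_capture_list(max_graph_size: int) -> list[int]:
    # Pre-fix behavior: 1, 2 and 4 were always included, even when they
    # exceed the requested maximum cudagraph size.
    return [1, 2, 4] + [i for i in range(8, max_graph_size + 1, 8)]


def new_capture_list(max_graph_size: int) -> list[int]:
    # Post-fix behavior: only sizes up to max_graph_size are kept.
    assert max_graph_size >= 1, "Maximum cudagraph size should be" \
        " greater than or equal to 1."
    return [i for i in [1, 2, 4] if i <= max_graph_size] + \
        list(range(8, max_graph_size + 1, 8))


print(old_capture_list(2))   # [1, 2, 4]      -- 4 exceeds the requested maximum
print(new_capture_list(2))   # [1, 2]
print(new_capture_list(16))  # [1, 2, 4, 8, 16] -- unchanged for sizes >= 4
```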

0 commit comments