Commit 0664d85

Delayed approach v0.1
Signed-off-by: Diego-Castan <[email protected]>
1 parent 19cf84f commit 0664d85

1 file changed: 3 additions, 3 deletions

vllm/v1/worker/gpu_worker.py

Lines changed: 3 additions & 3 deletions
@@ -399,9 +399,9 @@ def compile_cuda_graph(input_size: int):
                     scheduler_output.total_num_scheduled_tokens)
                 compile_cuda_graph(scheduler_output.total_num_scheduled_tokens)
             else:
-                next_comp = list(
-                    warmup_sizes_set.difference(
-                        self._token_compiled_cudagraphs))[0]
+                next_comp_set = warmup_sizes_set.difference(self._token_compiled_cudagraphs)
+                if len(next_comp_set) != 0:
+                    next_comp = list(next_comp_set)[0]
                 self._token_compiled_cudagraphs.add(next_comp)
                 compile_cuda_graph(next_comp)
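
Review note: the hunk adds only the `next_comp` assignment under the new `if`, while the unchanged `add(next_comp)` and `compile_cuda_graph(next_comp)` lines appear to stay at the `else` level. If that reading is right, an empty difference would still reach those calls with `next_comp` unbound. Below is a minimal standalone sketch of the guard with every dependent step inside it. The helper name `compile_next_pending` and its parameters are hypothetical and not part of vLLM, and `min()` is swapped in for `list(...)[0]` only to make the pick deterministic.

```python
from typing import Callable, Optional, Set


def compile_next_pending(
    warmup_sizes: Set[int],
    compiled: Set[int],
    compile_fn: Callable[[int], None],
) -> Optional[int]:
    """Compile one not-yet-compiled warmup size, if any remain.

    Hypothetical helper for illustration; it mirrors the roles of
    warmup_sizes_set, self._token_compiled_cudagraphs and
    compile_cuda_graph in the diff above.
    """
    pending = warmup_sizes - compiled
    if not pending:
        # Nothing left to compile: skip the add/compile steps entirely.
        return None
    # Assumption: any pending size is acceptable, so take the smallest.
    # list(pending)[0] would also work but depends on set iteration order.
    next_size = min(pending)
    compiled.add(next_size)
    compile_fn(next_size)
    return next_size


# Usage: two sizes are pending, so the third call is a no-op.
calls = []
done = {1, 2}
first = compile_next_pending({1, 2, 4, 8}, done, calls.append)
second = compile_next_pending({1, 2, 4, 8}, done, calls.append)
third = compile_next_pending({1, 2, 4, 8}, done, calls.append)
```

Returning `None` on the empty case keeps the caller free to decide whether a fully warmed-up state needs any further handling.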
