Commit b3ac3db

Delayed approach v0.2
Signed-off-by: Diego-Castan <[email protected]>
1 parent 0664d85 commit b3ac3db

1 file changed: +3 -3 lines changed

vllm/v1/worker/gpu_worker.py

Lines changed: 3 additions & 3 deletions
@@ -401,9 +401,9 @@ def compile_cuda_graph(input_size: int):
 else:
     next_comp_set = warmup_sizes_set.difference(self._token_compiled_cudagraphs)
     if len(next_comp_set) != 0:
-        next_comp = list(next_comp_set)[0]
-        self._token_compiled_cudagraphs.add(next_comp)
-        compile_cuda_graph(next_comp)
+        next_comp = list(next_comp_set)
+        self._token_compiled_cudagraphs.add(next_comp[0])
+        compile_cuda_graph(next_comp[0])
 
 output = self.model_runner.execute_model(scheduler_output,
                                          intermediate_tensors)
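
The hunk above selects one warmup size that has not been compiled yet, records it, and compiles it before the model runs. The following is a minimal standalone sketch of that pattern, not the vLLM implementation: `DelayedCompiler`, `compile_fn`, and `step` are hypothetical names introduced for illustration, and `compiled` only stands in for `self._token_compiled_cudagraphs`. One deliberate difference is that the sketch picks `min(pending)` rather than `list(pending)[0]`, because the iteration order of a set is not guaranteed by the language.

```python
from typing import Callable, Iterable, Optional, Set


class DelayedCompiler:
    """Hypothetical helper: compiles at most one pending size per step."""

    def __init__(self, warmup_sizes: Iterable[int],
                 compile_fn: Callable[[int], None]) -> None:
        self.warmup_sizes: Set[int] = set(warmup_sizes)
        # Stands in for self._token_compiled_cudagraphs in the diff.
        self.compiled: Set[int] = set()
        # Stands in for compile_cuda_graph; any callable taking a size works.
        self.compile_fn = compile_fn

    def step(self) -> Optional[int]:
        """Compile one not-yet-compiled size; return it, or None if done."""
        pending = self.warmup_sizes.difference(self.compiled)
        if not pending:
            return None
        # Assumption: smallest-first ordering, chosen here for determinism.
        next_size = min(pending)
        self.compiled.add(next_size)
        self.compile_fn(next_size)
        return next_size


# Usage: record which sizes get "compiled", one per step.
calls = []
compiler = DelayedCompiler([8, 1, 4], calls.append)
while compiler.step() is not None:
    pass
```

Each call to `step` does a bounded amount of work, so the compilation cost is spread across steps instead of being paid up front, which appears to be the intent of the "delayed approach" in the commit title.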
