Commit 3ee4333

Delayed approach v0.2
Signed-off-by: Diego-Castan <[email protected]>
1 parent b3ac3db commit 3ee4333

1 file changed, 2 additions and 2 deletions

vllm/v1/worker/gpu_worker.py

Lines changed: 2 additions & 2 deletions
@@ -402,8 +402,8 @@ def compile_cuda_graph(input_size: int):
 next_comp_set = warmup_sizes_set.difference(self._token_compiled_cudagraphs)
 if len(next_comp_set) != 0:
 next_comp = list(next_comp_set)
-self._token_compiled_cudagraphs.add(next_comp[0])
-compile_cuda_graph(next_comp[0])
+self._token_compiled_cudagraphs.add(next_comp[0])
+compile_cuda_graph(next_comp[0])

 output = self.model_runner.execute_model(scheduler_output,
 intermediate_tensors)
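
Note: the removed and added lines are textually identical in this view, so the change is presumably whitespace only (indentation), which this rendering does not preserve. The surrounding logic appears to compile at most one not-yet-compiled warmup size per step before running the model. A minimal, self-contained sketch of that delayed-compilation pattern follows. The names `DelayedCompiler`, `compile_next`, and `run_steps` are hypothetical and not part of vLLM, and picking `max(pending)` is an assumption made for determinism (the diff takes the first element of `list(set)`, whose order is arbitrary).

```python
from typing import Callable, Iterable, List, Optional, Set


class DelayedCompiler:
    """Hypothetical helper: compiles at most one pending size per call."""

    def __init__(self, warmup_sizes: Iterable[int],
                 compile_fn: Callable[[int], None]) -> None:
        self._warmup_sizes: Set[int] = set(warmup_sizes)
        self._compiled: Set[int] = set()
        self._compile_fn = compile_fn

    def compile_next(self) -> Optional[int]:
        """Compile one not-yet-compiled size; return it, or None if done."""
        pending = self._warmup_sizes.difference(self._compiled)
        if not pending:
            return None
        # Assumption: pick the largest pending size so the order is
        # deterministic; the diff uses list(next_comp_set)[0] instead.
        size = max(pending)
        # Mark the size as compiled before compiling, as in the diff.
        self._compiled.add(size)
        self._compile_fn(size)
        return size


def run_steps(num_steps: int) -> List[int]:
    """Simulate num_steps model steps; return the compilation order."""
    compiled_order: List[int] = []
    compiler = DelayedCompiler([1, 2, 4, 8], compiled_order.append)
    for _ in range(num_steps):
        compiler.compile_next()
        # The real worker would call the model runner here.
    return compiled_order
```

Spreading compilation across steps this way trades a long up-front warmup for a small extra cost on each of the first few steps; once every warmup size has been compiled, `compile_next` returns `None` and adds only a set difference per step.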
