Commit 11ca288

fix test
Signed-off-by: Yibin Li <[email protected]>
1 parent 7c8faa0 commit 11ca288

1 file changed, +2 -1 lines changed

tensorrt_llm/_torch/pyexecutor/py_executor.py

Lines changed: 2 additions & 1 deletion
@@ -2593,7 +2593,8 @@ def _handle_responses(self):
             if request.is_finished:
                 # Finalize any remaining logits transfers for the finished request in chunked mode
                 if request.py_use_chunked_generation_logits and request.py_return_generation_logits:
-                    request.py_result.transfer_remaining_device_logits()
+                    with torch.inference_mode():
+                        request.py_result.transfer_remaining_device_logits()
 
             request_done = False
             if request.py_decoding_iter == 1 or request.is_finished or \
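
The commit message does not say why the call is now wrapped in `torch.inference_mode()`. One plausible reading, offered here only as an assumption, is that the logits buffers are allocated while inference mode is active, and PyTorch rejects in-place writes to such "inference tensors" once that mode has been left. The sketch below reproduces that general PyTorch behavior in isolation. The names `buffer`, `chunk`, and `transfer_remaining` are hypothetical stand-ins and are not part of TensorRT-LLM.

```python
import torch

# Hypothetical stand-in for a logits buffer allocated while inference mode is on.
with torch.inference_mode():
    buffer = torch.zeros(4)

chunk = torch.ones(4)  # hypothetical remaining logits to be copied over


def transfer_remaining(dst: torch.Tensor, src: torch.Tensor) -> None:
    """Illustrative in-place copy; not the TensorRT-LLM method of a similar name."""
    dst.copy_(src)


# Outside inference mode, writing in place to an inference tensor raises.
try:
    transfer_remaining(buffer, chunk)
    failed_outside = False
except RuntimeError:
    failed_outside = True

# Re-entering inference mode around the call lets the same write go through.
with torch.inference_mode():
    transfer_remaining(buffer, chunk)

print(buffer.is_inference(), failed_outside, buffer.tolist())
```

Scoping the context manager to the single call, as the diff does, keeps the rest of `_handle_responses` unaffected by the mode switch.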
