Commit 65b6d2c

fix test
Signed-off-by: Yibin Li <[email protected]>
1 parent 02ed233 commit 65b6d2c

1 file changed (+2, -1 lines changed)

tensorrt_llm/_torch/pyexecutor/py_executor.py

Lines changed: 2 additions & 1 deletion
@@ -2642,7 +2642,8 @@ def _handle_responses(self):
 if request.is_finished:
     # Finalize any remaining logits transfers for the finished request in chunked mode
     if request.py_use_chunked_generation_logits and request.py_return_generation_logits:
-        request.py_result.transfer_remaining_device_logits()
+        with torch.inference_mode():
+            request.py_result.transfer_remaining_device_logits()
 
 request_done = False
 if request.py_decoding_iter == 1 or request.is_finished or \
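
The commit message does not say why the call needed the wrapper. One plausible reading, stated here as an assumption and not confirmed by the commit, is that the logits storage is allocated under inference mode and the final transfer writes into it in place. PyTorch rejects in-place updates to such "inference tensors" outside inference mode. The sketch below reproduces only that general PyTorch behavior; `make_inference_buffer`, `copy_outside_inference_mode`, and `copy_inside_inference_mode` are hypothetical helpers, not TensorRT-LLM code.

```python
import torch


def make_inference_buffer(n: int) -> torch.Tensor:
    """Allocate a buffer under inference mode, which makes it an inference tensor."""
    with torch.inference_mode():
        return torch.zeros(n)


def copy_outside_inference_mode(buf: torch.Tensor, src: torch.Tensor) -> bool:
    """Attempt an in-place copy in normal mode; return True only if it succeeded."""
    try:
        buf.copy_(src)
        return True
    except RuntimeError:
        # PyTorch disallows in-place updates to inference tensors outside inference mode.
        return False


def copy_inside_inference_mode(buf: torch.Tensor, src: torch.Tensor) -> None:
    """Same in-place copy, wrapped the way the commit wraps the transfer call."""
    with torch.inference_mode():
        buf.copy_(src)


buf = make_inference_buffer(4)
src = torch.ones(4)

failed_outside = not copy_outside_inference_mode(buf, src)
copy_inside_inference_mode(buf, src)
```

If the assumption holds, any other call site that finalizes logits outside an inference-mode scope would likely need the same wrapper.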
