Commit 4b54ca3

achartier authored and yufeiwu-nv committed
[None][feat] Pass KvCacheRetentionConfig to torch LlmRequest (NVIDIA#8634)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
1 parent 73da0cd commit 4b54ca3

1 file changed: +2 -1 lines changed

tensorrt_llm/_torch/pyexecutor/llm_request.py

Lines changed: 2 additions & 1 deletion
@@ -764,7 +764,8 @@ def executor_request_to_llm_request(
         cache_salt_id=executor_request.cache_salt_id,
         arrival_time=getattr(executor_request, "py_arrival_time", None),
         py_multimodal_data=getattr(executor_request, "py_multimodal_data",
-                                   None))
+                                   None),
+        kv_cache_retention_config=executor_request.kv_cache_retention_config)
     if child_req_ids:
         for child_id in child_req_ids:
             llm_request.create_child_request(child_id)
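
The change above forwards one more field from the executor request into the constructed LlmRequest. A minimal, self-contained sketch of that forwarding pattern follows. The classes `ExecutorRequestStub` and `LlmRequestStub` and the function `convert` are hypothetical stand-ins invented for illustration, not the real TensorRT-LLM types, and the string used as a retention config is only a placeholder value.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExecutorRequestStub:
    """Hypothetical stand-in for the executor-side request object."""
    cache_salt_id: Optional[int] = None
    kv_cache_retention_config: Optional[Any] = None


@dataclass
class LlmRequestStub:
    """Hypothetical stand-in for the torch-side LlmRequest."""
    cache_salt_id: Optional[int] = None
    arrival_time: Optional[float] = None
    kv_cache_retention_config: Optional[Any] = None


def convert(executor_request: ExecutorRequestStub) -> LlmRequestStub:
    """Build the torch-side request, copying fields across one by one."""
    return LlmRequestStub(
        cache_salt_id=executor_request.cache_salt_id,
        # Optional Python-side attribute: fall back to None when it is absent.
        arrival_time=getattr(executor_request, "py_arrival_time", None),
        # Read directly, mirroring the added line in the diff; without this
        # keyword the field would silently keep its default of None.
        kv_cache_retention_config=executor_request.kv_cache_retention_config,
    )


# Placeholder value; the real config object has its own structure.
request = ExecutorRequestStub(cache_salt_id=7,
                              kv_cache_retention_config="placeholder-config")
converted = convert(request)
```

In this sketch the retention config is read as a plain attribute while `arrival_time` goes through `getattr` with a default, which matches the distinction visible in the diff between fields accessed directly and the optional `py_`-prefixed ones.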
