Commit 2e4fe48

[NIXL] Increase default KV block eviction timeout on P (vllm-project#25897)
Signed-off-by: NickLucche <[email protected]>
1 parent 8eb0a1d commit 2e4fe48

2 files changed: +3 -3 lines changed

docs/features/nixl_connector_usage.md

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@ python tests/v1/kv_connector/nixl_integration/toy_proxy_server.py \
     - Connection info is passed via KVTransferParams from prefiller to decoder for handshake
 
 - `VLLM_NIXL_ABORT_REQUEST_TIMEOUT`: Timeout (in seconds) for automatically releasing the prefiller’s KV cache for a particular request. (Optional)
-    - Default: 120
+    - Default: 480
     - If a request is aborted and the decoder has not yet read the KV-cache blocks through the nixl channel, the prefill instance will release its KV-cache blocks after this timeout to avoid holding them indefinitely.
 
 ## Multi-Instance Setup
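
The timeout behaviour described in the doc hunk above can be pictured as a per-request deadline. The following is a minimal sketch under stated assumptions, not the NixlConnector implementation: `ExpiryTracker` and its methods are hypothetical names introduced only for illustration, and the clock value is passed in explicitly so the logic is easy to follow.

```python
# Hypothetical illustration only; this is not vLLM or NixlConnector code.
# It models "release the prefiller's KV blocks if the decoder has not read
# them within VLLM_NIXL_ABORT_REQUEST_TIMEOUT seconds".


class ExpiryTracker:
    """Tracks a release deadline for each request's KV-cache blocks."""

    def __init__(self, timeout_s: int) -> None:
        self.timeout_s = timeout_s
        self._deadlines: dict[str, float] = {}

    def track(self, request_id: str, now: float) -> None:
        # Start the countdown for this request's blocks.
        self._deadlines[request_id] = now + self.timeout_s

    def mark_read(self, request_id: str) -> None:
        # The decoder read the blocks, so no timed release is needed.
        self._deadlines.pop(request_id, None)

    def pop_expired(self, now: float) -> list[str]:
        # Return (and forget) every request whose deadline has passed.
        expired = [rid for rid, d in self._deadlines.items() if d <= now]
        for rid in expired:
            del self._deadlines[rid]
        return expired
```

With a timeout of 480, a request tracked at time 0 that is never read becomes eligible for release at time 480, while a request that was read in the meantime is never reported as expired.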

vllm/envs.py

Lines changed: 2 additions & 2 deletions
@@ -174,7 +174,7 @@
         "NONE"] = "NONE"
     VLLM_ROCM_QUICK_REDUCE_CAST_BF16_TO_FP16: bool = True
     VLLM_ROCM_QUICK_REDUCE_MAX_SIZE_BYTES_MB: Optional[int] = None
-    VLLM_NIXL_ABORT_REQUEST_TIMEOUT: int = 120
+    VLLM_NIXL_ABORT_REQUEST_TIMEOUT: int = 480
     VLLM_USE_CUDNN_PREFILL: bool = False
     VLLM_ENABLE_CUDAGRAPH_GC: bool = False
     VLLM_LOOPBACK_IP: str = ""
@@ -1330,7 +1330,7 @@ def get_vllm_port() -> Optional[int]:
     # consumer. This is only applicable when using NixlConnector in a
     # disaggregated decode-prefill setup.
     "VLLM_NIXL_ABORT_REQUEST_TIMEOUT":
-    lambda: int(os.getenv("VLLM_NIXL_ABORT_REQUEST_TIMEOUT", "120")),
+    lambda: int(os.getenv("VLLM_NIXL_ABORT_REQUEST_TIMEOUT", "480")),
 
     # Controls whether or not to use cudnn prefill
     "VLLM_USE_CUDNN_PREFILL":
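
To make the effect of the changed default concrete, here is a small sketch that mirrors the `int(os.getenv(...))` lookup from the hunk above. The function name `read_abort_timeout` and the injectable `environ` mapping are assumptions added for illustration; they are not part of `vllm/envs.py`.

```python
import os
from typing import Mapping


def read_abort_timeout(environ: Mapping[str, str] = os.environ) -> int:
    """Mirror of the lookup in the diff: fall back to "480" when unset.

    `read_abort_timeout` is a hypothetical helper, not a vLLM API.
    """
    return int(environ.get("VLLM_NIXL_ABORT_REQUEST_TIMEOUT", "480"))


if __name__ == "__main__":
    # Unset: the new default applies.
    print(read_abort_timeout({}))
    # Explicit override, e.g. to restore the previous default.
    print(read_abort_timeout({"VLLM_NIXL_ABORT_REQUEST_TIMEOUT": "120"}))
```

Because the value goes through `int()`, a non-integer setting such as `8m` raises `ValueError` instead of silently falling back to the default.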
