
Commit 6cc619a (parent: e30d9ac)

[TRTLLM-7964][infra] set nixl to default cache transceiver backend

Signed-off-by: Bo Deng <deemod@nvidia.com>

File tree

10 files changed: +10 additions, -13 deletions

cpp/tensorrt_llm/batch_manager/cacheTransceiver.cpp

Lines changed: 1 addition & 1 deletion

@@ -89,7 +89,7 @@ std::unique_ptr<BaseCacheTransceiver> CacheTransceiverFactory::createCacheTransc
         }
         else
         {
-            backendType = executor::CacheTransceiverConfig::BackendType::UCX;
+            backendType = executor::CacheTransceiverConfig::BackendType::NIXL;
         }
     }
     cacheTransceiverConfig.value().setBackendType(backendType);

docker/common/install_nixl.sh

Lines changed: 0 additions & 2 deletions

@@ -40,5 +40,3 @@ cd builddir && ninja install
 cd ../..
 rm -rf nixl* # Remove NIXL source tree to save space
 export LD_LIBRARY_PATH=$OLD_LD_LIBRARY_PATH
-
-echo "export LD_LIBRARY_PATH=/opt/nvidia/nvda_nixl/lib/${ARCH_NAME}:/opt/nvidia/nvda_nixl/lib64:\$LD_LIBRARY_PATH" >> "${ENV}"

docker/common/install_ucx.sh

Lines changed: 0 additions & 1 deletion

@@ -26,4 +26,3 @@ cd ucx
 make install -j$(nproc)
 cd ..
 rm -rf ucx # Remove UCX source to save space
-echo "export LD_LIBRARY_PATH=${UCX_INSTALL_PATH}/lib:\$LD_LIBRARY_PATH" >> "${ENV}"

docs/source/features/auto_deploy/auto-deploy.md

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ AutoDeploy provides an alternative method for deploying models using the LLM API
 AutoDeploy is included with the TRT-LLM installation.
 
 ```bash
-sudo apt-get -y install libopenmpi-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
+sudo apt-get -y install libopenmpi-dev libzmq3-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
 ```
 
 You can refer to [TRT-LLM installation guide](../../installation/linux.md) for more information.

docs/source/features/disagg-serving.md

Lines changed: 1 addition & 1 deletion

@@ -106,7 +106,7 @@ cache_transceiver_config:
   max_tokens_in_buffer: <int>
 ```
 
-`backend` specifies the communication backend for transferring the kvCache, valid options include `DEFAULT`, `UCX`, `NIXL`, and `MPI`, the default backend is UCX.
+`backend` specifies the communication backend for transferring the kvCache, valid options include `DEFAULT`, `UCX`, `NIXL`, and `MPI`, the default backend is NIXL.
 
 `max_tokens_in_buffer` defines the buffer size for kvCache transfers, it is recommended to set this value greater than or equal to the maximum ISL (Input Sequence Length) of all requests for optimal performance.
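Since `DEFAULT` now resolves to NIXL, a deployment that wants to keep the previous UCX behavior can pin the backend explicitly in the config above. A minimal sketch (the `4096` buffer size is an illustrative value, not from this commit):

```yaml
# Illustrative extra-llm-config.yaml fragment: pin UCX explicitly instead of
# relying on DEFAULT, which resolves to NIXL after this commit.
cache_transceiver_config:
  backend: UCX
  max_tokens_in_buffer: 4096  # recommended: >= max input sequence length
```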

docs/source/installation/linux.md

Lines changed: 1 addition & 1 deletion

@@ -16,7 +16,7 @@
 # Optional step: Only required for NVIDIA Blackwell GPUs and SBSA platform
 pip3 install torch==2.7.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
 
-sudo apt-get -y install libopenmpi-dev
+sudo apt-get -y install libopenmpi-dev libzmq3-dev
 ```
 
 PyTorch CUDA 12.8 package is required for supporting NVIDIA Blackwell GPUs and SBSA platform. On prior GPUs or Linux x86_64 platform, this extra installation is not required.

docs/source/torch/auto_deploy/auto-deploy.md

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ AutoDeploy provides an alternative method for deploying models using the LLM API
 AutoDeploy is included with the TRT-LLM installation.
 
 ```bash
-sudo apt-get -y install libopenmpi-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
+sudo apt-get -y install libopenmpi-dev libzmq3-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
 ```
 
 You can refer to [TRT-LLM installation guide](../../installation/linux.md) for more information.

examples/auto_deploy/README.md

Lines changed: 1 addition & 1 deletion

@@ -9,7 +9,7 @@ ______________________________________________________________________
 AutoDeploy is included with the TRT-LLM installation.
 
 ```bash
-sudo apt-get -y install libopenmpi-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
+sudo apt-get -y install libopenmpi-dev libzmq3-dev && pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm
 ```
 
 You can refer to [TRT-LLM installation guide](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation/linux.md) for more information.

examples/disaggregated/README.md

Lines changed: 1 addition & 1 deletion

@@ -12,7 +12,7 @@ The `trtllm-serve` command supports the `extra-llm-config.yaml` parameter. In th
 
 ```yaml
 cache_transceiver_config:
-  # KV cache transmission backend. Valid options include `DEFAULT` (i.e., UCX), `UCX`, `NIXL`.
+  # KV cache transmission backend. Valid options include `DEFAULT` (i.e., NIXL), `UCX`, `NIXL`.
   backend: <str>
   # KV cache buffer size. Set it ≥ the maximum ISL (Input Sequence Length) for best performance.
   max_tokens_in_buffer: <int>

tensorrt_llm/_torch/pyexecutor/kv_cache_transceiver.py

Lines changed: 3 additions & 3 deletions

@@ -38,10 +38,10 @@ def create_kv_cache_transceiver(
 
     if cache_transceiver_config.backend == BackendTypeCpp.DEFAULT:
         # When cache_transceiver_config.backend is not set, fallback to env_vars settings
-        # UCX is the default backend
-        cache_transceiver_config.backend = BackendTypeCpp.UCX
+        # NIXL is the default backend
+        cache_transceiver_config.backend = BackendTypeCpp.NIXL
         # Ordered by priority
-        env_vars = [("TRTLLM_USE_NIXL_KVCACHE", BackendTypeCpp.NIXL),
+        env_vars = [("TRTLLM_USE_UCX_KVCACHE", BackendTypeCpp.UCX),
                     ("TRTLLM_USE_MPI_KVCACHE", BackendTypeCpp.MPI)]
         for env_var, be_type in env_vars:
            if getenv(env_var) == "1":
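The fallback logic in this hunk can be sketched in isolation. This is a hedged approximation, not the TRT-LLM implementation: `BackendType` and `pick_backend` are illustrative stand-ins for `BackendTypeCpp` and `create_kv_cache_transceiver`, and an explicit `env` mapping stands in for the real code's `getenv` calls.

```python
# Sketch of the priority-ordered env-var fallback shown in the diff above.
# After this commit, NIXL is the default; UCX and MPI can still be forced
# via their respective environment variables.
import os
from enum import Enum


class BackendType(Enum):
    NIXL = "NIXL"
    UCX = "UCX"
    MPI = "MPI"


def pick_backend(env=os.environ):
    backend = BackendType.NIXL  # new default when no backend is configured
    # Ordered by priority: the first variable set to "1" wins.
    for var, be_type in [("TRTLLM_USE_UCX_KVCACHE", BackendType.UCX),
                         ("TRTLLM_USE_MPI_KVCACHE", BackendType.MPI)]:
        if env.get(var) == "1":
            backend = be_type
            break
    return backend
```

For example, `pick_backend({})` yields `BackendType.NIXL`, while setting `TRTLLM_USE_UCX_KVCACHE=1` restores the pre-commit UCX behavior.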
