Commit 33d19c5

fix docs
Author: lijiachen19
1 parent 4f3ca7e commit 33d19c5

4 files changed, 10 additions and 7 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ in either a local filesystem for single-machine scenarios or through NFS mount p
 
 ## Quick Start
 
-please refer to [Quick Start](https://ucm.readthedocs.io/en/latest/getting-started/quick_start.html).
+please refer to [Quick Start for vLLM](https://ucm.readthedocs.io/en/latest/getting-started/quickstart_vllm.html) and [Quick Start for vLLM-Ascend](https://ucm.readthedocs.io/en/latest/getting-started/quickstart_vllm_ascend.html).
 
 ---

docs/source/getting-started/quickstart_vllm.md

Lines changed: 4 additions & 2 deletions
@@ -163,12 +163,14 @@ vllm serve Qwen/Qwen2.5-14B-Instruct \
 --kv-transfer-config \
 '{
 "kv_connector": "UCMConnector",
+"kv_connector_module_path": "ucm.integration.vllm.ucm_connector",
 "kv_role": "kv_both",
-"kv_connector_extra_config": {"UCM_CONFIG_FILE": "/vllm-workspace/unified-cache-management/examples/ucm_config_example.yaml"}
+"kv_connector_extra_config": {"UCM_CONFIG_FILE": "/workspace/unified-cache-management/examples/ucm_config_example.yaml"}
 }'
 ```
+**⚠️ The parameter `--no-enable-prefix-caching` is for SSD performance testing, please remove it for production.**
 
-**⚠️ Make sure to replace `"/vllm-workspace/unified-cache-management/examples/ucm_config_example.yaml"` with your actual config file path.**
+**⚠️ Make sure to replace `"/workspace/unified-cache-management/examples/ucm_config_example.yaml"` with your actual config file path.**
 
 
 If you see log as below:

docs/source/getting-started/quickstart_vllm_ascend.md

Lines changed: 4 additions & 2 deletions
@@ -131,12 +131,14 @@ vllm serve Qwen/Qwen2.5-14B-Instruct \
 --kv-transfer-config \
 '{
 "kv_connector": "UCMConnector",
+"kv_connector_module_path": "ucm.integration.vllm.ucm_connector",
 "kv_role": "kv_both",
-"kv_connector_extra_config": {"UCM_CONFIG_FILE": "/vllm-workspace/unified-cache-management/examples/ucm_config_example.yaml"}
+"kv_connector_extra_config": {"UCM_CONFIG_FILE": "/workspace/unified-cache-management/examples/ucm_config_example.yaml"}
 }'
 ```
+**⚠️ The parameter `--no-enable-prefix-caching` is for SSD performance testing, please remove it for production.**
 
-**⚠️ Make sure to replace `"/vllm-workspace/unified-cache-management/examples/ucm_config_example.yaml"` with your actual config file path.**
+**⚠️ Make sure to replace `"/workspace/unified-cache-management/examples/ucm_config_example.yaml"` with your actual config file path.**
 
 
 If you see log as below:

docs/source/user-guide/prefix-cache/nfs_store.md

Lines changed: 1 addition & 2 deletions
@@ -109,8 +109,6 @@ Explanation:
 
 ## Launching Inference
 
-### Offline Inference
-
 In this guide, we describe **online inference** using vLLM with the UCM connector, deployed as an OpenAI-compatible server. For best performance with UCM, it is recommended to set `block_size` to 128.
 
 To start the vLLM server with the Qwen/Qwen2.5-14B-Instruct model, run:
@@ -129,6 +127,7 @@ vllm serve Qwen/Qwen2.5-14B-Instruct \
 '{
 "kv_connector": "UCMConnector",
 "kv_role": "kv_both",
+"kv_connector_module_path": "ucm.integration.vllm.ucm_connector",
 "kv_connector_extra_config": {"UCM_CONFIG_FILE": "/vllm-workspace/unified-cache-management/examples/ucm_config_example.yaml"}
 }'
 ```
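
Reviewer note: all three docs in this commit hand-write the same JSON object for `--kv-transfer-config`, which makes it easy to drop a comma or leave the example path in place. A minimal sketch of generating that argument instead is shown here. The four keys and their values are copied from the diff; the helper names `build_kv_transfer_config` and `check_config_path` are hypothetical and not part of UCM or vLLM.

```python
import json
import shlex
from pathlib import Path


def build_kv_transfer_config(ucm_config_file: str) -> str:
    """Return the JSON string for `--kv-transfer-config` (keys taken from the diff)."""
    config = {
        "kv_connector": "UCMConnector",
        "kv_connector_module_path": "ucm.integration.vllm.ucm_connector",
        "kv_role": "kv_both",
        "kv_connector_extra_config": {"UCM_CONFIG_FILE": ucm_config_file},
    }
    return json.dumps(config)


def check_config_path(ucm_config_file: str) -> bool:
    """Hypothetical helper: True only if the UCM config file exists on disk."""
    return Path(ucm_config_file).is_file()


if __name__ == "__main__":
    # Example path from the docs; replace it with your actual config file path.
    path = "/workspace/unified-cache-management/examples/ucm_config_example.yaml"
    if not check_config_path(path):
        print(f"warning: {path} not found; replace it with your actual config file path")
    # shlex.quote wraps the JSON so it can be pasted into a shell command.
    print("--kv-transfer-config", shlex.quote(build_kv_transfer_config(path)))
```

Building the object with `json.dumps` guarantees well-formed JSON, and the path check surfaces the "replace with your actual config file path" warning before the server is launched.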
