Commit dd00969

[bugfix] ascend schedule encountered an incorrect req block length in the check_watermark_for_prefill function (#2394)

### What this PR does / why we need it?
The Ascend scheduler used an incorrect request block length in the `_check_watermark_for_prefill` function: as previously written, `len(req_blocks)` always evaluates to 1.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
before: http://image.huawei.com/tiny-lts/v1/images/mdstorm/c6cff7cf33d500a3833f5f80352df373_1183x377.png
after: http://image.huawei.com/tiny-lts/v1/images/mdstorm/57207a490d8ac0a70fc87dd08d02dee6_1470x954.png

Signed-off-by: liziyu <[email protected]>
1 parent 17c2884 commit dd00969

2 files changed: +3 -3 lines changed


docs/source/developer_guide/performance/distributed_dp_server_with_large_ep.md

Lines changed: 2 additions & 2 deletions
@@ -173,7 +173,7 @@ In the PD separation scenario, we provide a recommended optimized configuration.
 - **prefiller node**

 1. set HCCL_BUFFSIZE=256
-2. add '--enforce-eager' commond to 'vllm serve'
+2. add '--enforce-eager' command to 'vllm serve'
 3. Take '--additional-config' as follow

 ```shell
@@ -231,7 +231,7 @@ python load_balance_proxy_server_example.py \
 ```

 :::{note}
-Each node local ip should repeat the same times as its '**dp_size_local**', at the same time, each node has the same number of ports as '**dp_size_local**', and ther ports increase sequentially starting from '**engine_port**'.
+Each node local ip should repeat the same times as its '**dp_size_local**', at the same time, each node has the same number of ports as '**dp_size_local**', and their ports increase sequentially starting from '**engine_port**'.
 :::

 You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)
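
The rule in the note above (repeat each node's local IP `dp_size_local` times, with ports counting up from `engine_port`) can be sketched as follows. This is only an illustration: `expand_endpoints` is a hypothetical helper that is not part of the proxy script, and the addresses and port numbers are made-up placeholders.

```python
# Hypothetical helper, not part of load_balance_proxy_server_example.py.
# Each node is described by (local_ip, dp_size_local, engine_port).
def expand_endpoints(nodes):
    hosts, ports = [], []
    for local_ip, dp_size_local, engine_port in nodes:
        for rank in range(dp_size_local):
            hosts.append(local_ip)            # IP repeated dp_size_local times
            ports.append(engine_port + rank)  # ports increase from engine_port
    return hosts, ports


# Two placeholder nodes, each with dp_size_local = 2 and engine_port = 9000.
hosts, ports = expand_endpoints([("192.0.2.1", 2, 9000), ("192.0.2.2", 2, 9000)])
print(hosts)
print(ports)
```

The two lists line up index by index, so each host entry pairs with the port at the same position.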

vllm_ascend/core/scheduler.py

Lines changed: 1 addition & 1 deletion
@@ -433,7 +433,7 @@ def _check_watermark_for_prefill(self,
                                    self.block_size)
         req_blocks = self.kv_cache_manager.coordinator.get_blocks(
             request.request_id)
-        num_new_blocks = (num_required_blocks - len(req_blocks) -
+        num_new_blocks = (num_required_blocks - len(req_blocks[0]) -
                           len(computed_blocks))
         num_evictable_computed_blocks = sum(1 for blk in computed_blocks
                                             if blk.ref_cnt == 0)
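
A minimal sketch of why the old expression was off, assuming that `get_blocks` returns a sequence with one list of blocks per KV cache group (an assumption inferred from the `[0]` index in the fix and from the commit message saying the length was always 1). `fake_get_blocks` and the numbers are hypothetical stand-ins, not the real coordinator API.

```python
# Hypothetical stand-in for kv_cache_manager.coordinator.get_blocks().
# Assumption: the result holds one list of blocks per KV cache group.
def fake_get_blocks(request_id):
    blocks_in_group_0 = ["blk0", "blk1", "blk2", "blk3", "blk4"]
    return (blocks_in_group_0,)  # a single KV cache group


req_blocks = fake_get_blocks("req-0")
num_required_blocks = 8
computed_blocks = []

# Before the fix: len(req_blocks) counts groups, so it is 1 here.
old_num_new_blocks = num_required_blocks - len(req_blocks) - len(computed_blocks)

# After the fix: len(req_blocks[0]) counts the blocks the request already holds.
new_num_new_blocks = num_required_blocks - len(req_blocks[0]) - len(computed_blocks)

print(old_num_new_blocks, new_num_new_blocks)
```

Under this assumption the old form overestimates the number of new blocks needed (7 instead of 3 in the sketch), which makes the watermark check stricter than intended.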
