
hust17yixuan
Contributor

What this PR does / why we need it?

This PR presents a large-EP deployment solution based on vllm-ascend, using DeepSeek as an example. It outlines the end-to-end workflow for model deployment and serves as a reference for developers.

Does this PR introduce any user-facing change?

No

How was this patch tested?

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Aug 12, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds documentation for deploying a large Expert Parallelism (EP) model using a distributed DP server. The documentation is comprehensive, but I've found a few issues in the code examples that could cause problems for users.
Specifically, a Python script example has incorrect data types for variables, which will lead to runtime errors. There's also a likely copy-paste error in a shell command example, and another command example is illustrative rather than a concrete, runnable command, which might be confusing. Finally, there's a syntax error in a markdown note block that will likely break rendering.
I've provided suggestions to fix these issues to improve the clarity and correctness of the documentation.

Comment on lines 80 to 85
dp_size = "all dp size for decode/prefill"
dp_size_local = "dp size for current node"
dp_rank_start = "dp beginning number for current node"
dp_ip = "master node ip"
dp_port = "port used to communication"
engine_port = "the beginning port for all dp group in current node"
Contributor

critical

The placeholder values for dp_size, dp_size_local, dp_rank_start, dp_port, and engine_port are strings, but they are used in arithmetic operations or as port numbers, which requires integers. This will cause a TypeError when the script is run. The placeholder values should be integers, and it would be helpful to provide example integer values.

Suggested change
- dp_size = "all dp size for decode/prefill"
- dp_size_local = "dp size for current node"
- dp_rank_start = "dp beginning number for current node"
- dp_ip = "master node ip"
- dp_port = "port used to communication"
- engine_port = "the beginning port for all dp group in current node"
+ dp_size = 8 # Example: total number of DP workers for decode/prefill
+ dp_size_local = 2 # Example: number of DP workers on the current node
+ dp_rank_start = 0 # Example: starting DP rank for the current node
+ dp_ip = "192.168.1.100" # Example: master node IP
+ dp_port = 12345 # Example: port used for communication
+ engine_port = 8000 # Example: the starting port for all DP groups on the current node
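
The failure described above can be reproduced with a minimal sketch. The script under review is not shown in this thread, so `engine_ports` below is a hypothetical helper that only assumes the script derives one port per local DP rank by adding an offset to `engine_port`:

```python
def engine_ports(engine_port, dp_size_local):
    """Hypothetical helper: one port per local DP rank, counting up from engine_port."""
    return [engine_port + i for i in range(dp_size_local)]


# Integer settings work as expected.
print(engine_ports(8000, 2))

# String placeholders fail as soon as they reach arithmetic or range().
for bad_args in [("the beginning port", 2), (8000, "dp size for current node")]:
    try:
        engine_ports(*bad_args)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

Either string placeholder is enough to trigger the error, which is why the suggestion replaces all of the numeric settings with integers.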

Contributor Author

fixed


```shell
cd vllm-ascend/examples/disaggregate_prefill_v1/
bash gen_ranktable.sh --ips prefiller_node1_local_ip prefiller_node2_local_ip decoder_node1_local_ip decoder_node1_local_ip \
```
Contributor

high

There appears to be a copy-paste error in the example command for gen_ranktable.sh. The placeholder decoder_node1_local_ip appears twice; the second occurrence was probably meant to be decoder_node2_local_ip, so that two different decoder nodes are listed. As written, users following this guide could misconfigure their deployment.

Suggested change
- bash gen_ranktable.sh --ips prefiller_node1_local_ip prefiller_node2_local_ip decoder_node1_local_ip decoder_node1_local_ip \
+ bash gen_ranktable.sh --ips prefiller_node1_local_ip prefiller_node2_local_ip decoder_node1_local_ip decoder_node2_local_ip \

Contributor Author

fixed

Comment on lines 195 to 230
python load_balance_proxy_server_example.py \
--port "proxy port" \
--host 0.0.0.0 \
--prefiller-hosts \
prefiller node1 local ip \
prefiller node2 local ip \
--prefiller-ports \
engine_port engine_port \
--decoder-hosts \
decoder node1 local ip \
decoder node1 local ip \
decoder node2 local ip \
decoder node2 local ip \
--decoder-ports \
engine_port ... engine_port + dp_size_local \
engine_port ... engine_port + dp_size_local \
Contributor

high

The example for running load_balance_proxy_server_example.py is illustrative and contains placeholders that are not valid command-line arguments (e.g., prefiller node1 local ip, engine_port ...). This could be confusing for users. It would be more helpful to provide a complete, concrete example that they can adapt.

Suggested change
- python load_balance_proxy_server_example.py \
- --port "proxy port" \
- --host 0.0.0.0 \
- --prefiller-hosts \
- prefiller node1 local ip \
- prefiller node2 local ip \
- --prefiller-ports \
- engine_port engine_port \
- --decoder-hosts \
- decoder node1 local ip \
- decoder node1 local ip \
- decoder node2 local ip \
- decoder node2 local ip \
- --decoder-ports \
- engine_port ... engine_port + dp_size_local \
- engine_port ... engine_port + dp_size_local \
+ python load_balance_proxy_server_example.py \
+ --port 8000 \
+ --host 0.0.0.0 \
+ --prefiller-hosts \
+ 192.168.1.101 \
+ 192.168.1.102 \
+ --prefiller-ports \
+ 9000 9001 \
+ --decoder-hosts \
+ 192.168.1.103 \
+ 192.168.1.103 \
+ 192.168.1.104 \
+ 192.168.1.104 \
+ --decoder-ports \
+ 9002 9003 9004 9005

Contributor Author

fixed

Comment on lines +213 to +235
:::{note}
Each node local ip should repeat the same times as its '**dp_size_local**', at the same time, each node has the same number of ports as '**dp_size_local**', and ther ports increase sequentially starting from '**engine_port**'.
:::
Contributor

high

The syntax for the note block is incorrect. It uses ::::{note} and :::: which will likely cause rendering issues. The correct MyST syntax for a note directive is :::{note} and :::.

Suggested change
:::{note}
Each node local ip should repeat the same times as its '**dp_size_local**', at the same time, each node has the same number of ports as '**dp_size_local**', and ther ports increase sequentially starting from '**engine_port**'.
:::
:::{note}
Each node local ip should repeat the same times as its '**dp_size_local**', at the same time, each node has the same number of ports as '**dp_size_local**', and ther ports increase sequentially starting from '**engine_port**'.
:::
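
The rule stated in the note (repeat each node IP `dp_size_local` times, with ports counting up from `engine_port` on every node) can be sketched as a small helper. This is only an illustration: `expand_proxy_args` is a hypothetical name and not part of vllm-ascend, the addresses and port are example values, and the sketch assumes every node uses the same `dp_size_local` and the same `engine_port`.

```python
def expand_proxy_args(node_ips, dp_size_local, engine_port):
    """Hypothetical helper: build parallel host and port lists for the proxy.

    Each node IP is repeated dp_size_local times, and each node's ports
    start at engine_port and increase by one per local DP rank.
    """
    hosts, ports = [], []
    for ip in node_ips:
        for local_rank in range(dp_size_local):
            hosts.append(ip)
            ports.append(engine_port + local_rank)
    return hosts, ports


# Example values only: two decoder nodes, two DP ranks per node.
hosts, ports = expand_proxy_args(["192.168.1.103", "192.168.1.104"], 2, 9000)
print("--decoder-hosts", *hosts)
print("--decoder-ports", *ports)
```

Because the two lists are built in the same loop, the n-th host always lines up with the n-th port, which is the pairing the note describes.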

@hust17yixuan hust17yixuan force-pushed the v0.9.1-dev branch 7 times, most recently from 2d23bc7 to 5a7a03c Compare August 14, 2025 01:18
Signed-off-by: hust17yixuan <[email protected]>

@ganyi1996ppo ganyi1996ppo merged commit 17c2884 into vllm-project:v0.9.1-dev Aug 14, 2025
5 checks passed
JC-ut0 pushed a commit to JC-ut0/vllm-ascend that referenced this pull request Aug 16, 2025
### What this PR does / why we need it?

This PR presents a large-EP deployment solution based on vllm-ascend,
using DeepSeek as an example. It outlines the end-to-end workflow for
model deployment and serves as a reference for developers.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

Signed-off-by: hust17yixuan <[email protected]>
Signed-off-by: xuyexiong <[email protected]>
Labels
documentation Improvements or additions to documentation

2 participants