Commit b9f715d

[Docs] Sync main doc to v0.9.1-dev (#2227)

### What this PR does / why we need it?

Sync main doc to v0.9.1-dev, excluded:

- multi_npu_moge.md
- single_node_300i.md
- single_npu_audio.md
- single_npu_qwen3_embedding.md
- multi_npu_qwen3_moe.md (without ep enable)
- faqs.md (exclude 300I part)
- cn docs
- remove eplb_swift_balancer, we will add this back after main backport

### Does this PR introduce _any_ user-facing change?

No, doc only

### How was this patch tested?

Preview: https://vllm-ascend--2227.org.readthedocs.build/en/2227/

Signed-off-by: Yikun Jiang <[email protected]>

1 parent e3636c7 · commit b9f715d

52 files changed: +1950 −375 lines

Binary file changed (−58 KB): not shown

docs/source/assets/multi_node_dp.png (115 KB)

docs/source/community/contributors.md

Lines changed: 19 additions & 1 deletion

@@ -7,15 +7,33 @@
 | Xiyuan Wang| [@wangxiyuan](https://github.com/wangxiyuan) | 2025/01 |
 | Yikun Jiang| [@Yikun](https://github.com/Yikun) | 2025/02 |
 | Yi Gan| [@ganyi1996ppo](https://github.com/ganyi1996ppo) | 2025/02 |
+| Shoujian Zheng| [@jianzs](https://github.com/jianzs) | 2025/06 |
 
 ## Contributors
 
 Every vLLM Ascend release would not have been possible without the following contributors:
 
-Updated on 2025-06-10:
+Updated on 2025-07-07:
 
 | Number | Contributor | Date | Commit ID |
 |:------:|:-----------:|:-----:|:---------:|
+| 83 | [@ZhengWG](https://github.com/) | 2025/7/7 | [3a469de](https://github.com/vllm-project/vllm-ascend/commit/9c886d0a1f0fc011692090b0395d734c83a469de) |
+| 82 | [@wm901115nwpu](https://github.com/) | 2025/7/7 | [a2a47d4](https://github.com/vllm-project/vllm-ascend/commit/f08c4f15a27f0f27132f4ca7a0c226bf0a2a47d4) |
+| 81 | [@Agonixiaoxiao](https://github.com/) | 2025/7/2 | [6f84576](https://github.com/vllm-project/vllm-ascend/commit/7fc1a984890bd930f670deedcb2dda3a46f84576) |
+| 80 | [@zhanghw0354](https://github.com/zhanghw0354) | 2025/7/2 | [d3df9a5](https://github.com/vllm-project/vllm-ascend/commit/9fb3d558e5b57a3c97ee5e11b9f5dba6ad3df9a5) |
+| 79 | [@GDzhu01](https://github.com/GDzhu01) | 2025/6/28 | [de256ac](https://github.com/vllm-project/vllm-ascend/commit/b308a7a25897b88d4a23a9e3d583f4ec6de256ac) |
+| 78 | [@leo-pony](https://github.com/leo-pony) | 2025/6/26 | [3f2a5f2](https://github.com/vllm-project/vllm-ascend/commit/10253449120307e3b45f99d82218ba53e3f2a5f2) |
+| 77 | [@zeshengzong](https://github.com/zeshengzong) | 2025/6/26 | [3ee25aa](https://github.com/vllm-project/vllm-ascend/commit/192dbbcc6e244a8471d3c00033dc637233ee25aa) |
+| 76 | [@sharonyunyun](https://github.com/sharonyunyun) | 2025/6/25 | [2dd8666](https://github.com/vllm-project/vllm-ascend/commit/941269a6c5bbc79f6c1b6abd4680dc5802dd8666) |
+| 75 | [@Pr0Wh1teGivee](https://github.com/Pr0Wh1teGivee) | 2025/6/25 | [c65dd40](https://github.com/vllm-project/vllm-ascend/commit/2fda60464c287fe456b4a2f27e63996edc65dd40) |
+| 74 | [@xleoken](https://github.com/xleoken) | 2025/6/23 | [c604de0](https://github.com/vllm-project/vllm-ascend/commit/4447e53d7ad5edcda978ca6b0a3a26a73c604de0) |
+| 73 | [@lyj-jjj](https://github.com/lyj-jjj) | 2025/6/23 | [5cbd74e](https://github.com/vllm-project/vllm-ascend/commit/5177bef87a21331dcca11159d3d1438075cbd74e) |
+| 72 | [@farawayboat](https://github.com/farawayboat) | 2025/6/21 | [bc7d392](https://github.com/vllm-project/vllm-ascend/commit/097e7149f75c0806774bc68207f0f6270bc7d392) |
+| 71 | [@yuancaoyaoHW](https://github.com/yuancaoyaoHW) | 2025/6/20 | [7aa0b94](https://github.com/vllm-project/vllm-ascend/commit/00ae250f3ced68317bc91c93dc1f1a0977aa0b94) |
+| 70 | [@songshanhu07](https://github.com/songshanhu07) | 2025/6/18 | [5e1de1f](https://github.com/vllm-project/vllm-ascend/commit/2a70dbbdb8f55002de3313e17dfd595e1de1f) |
+| 69 | [@wangyanhui-cmss](https://github.com/wangyanhui-cmss) | 2025/6/12 | [40c9e88](https://github.com/vllm-project/vllm-ascend/commit/2a5fb4014b863cee6abc3009f5bc5340c9e88) |
+| 68 | [@chenwaner](https://github.com/chenwaner) | 2025/6/11 | [c696169](https://github.com/vllm-project/vllm-ascend/commit/e46dc142bf1180453c64226d76854fc1ec696169) |
+| 67 | [@yzim](https://github.com/yzim) | 2025/6/11 | [aaf701b](https://github.com/vllm-project/vllm-ascend/commit/4153a5091b698c2270d160409e7fee73baaf701b) |
 | 66 | [@Yuxiao-Xu](https://github.com/Yuxiao-Xu) | 2025/6/9 | [6b853f1](https://github.com/vllm-project/vllm-ascend/commit/6b853f15fe69ba335d2745ebcf14a164d0bcc505) |
 | 65 | [@ChenTaoyu-SJTU](https://github.com/ChenTaoyu-SJTU) | 2025/6/7 | [20dedba](https://github.com/vllm-project/vllm-ascend/commit/20dedba5d1fc84b7ae8b49f9ce3e3649389e2193) |
 | 64 | [@zxdukki](https://github.com/zxdukki) | 2025/6/7 | [87ebaef](https://github.com/vllm-project/vllm-ascend/commit/87ebaef4e4e519988f27a6aa378f614642202ecf) |

docs/source/community/governance.md

Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
 # Governance
 
 ## Mission
-As a vital component of vLLM, the vLLM Ascend project is dedicated to providing an easy, fast, and cheap LLM Serving for Everyone on Ascend NPU, and to actively contribute to the enrichment of vLLM.
+As a vital component of vLLM, the vLLM Ascend project is dedicated to providing easy, fast, and cheap LLM serving for everyone on Ascend NPU, and to actively contributing to the enrichment of vLLM.
 
 ## Principles
 vLLM Ascend follows the vLLM community's code of conduct: [vLLM - CODE OF CONDUCT](https://github.com/vllm-project/vllm/blob/main/CODE_OF_CONDUCT.md)
@@ -13,7 +13,7 @@ vLLM Ascend is an open-source project under the vLLM community, where the author
 
 **Responsibility:** Help new contributors with onboarding, handle and respond to community questions, review RFCs and code
 
-**Requirements:** Complete at least 1 contribution. Contributor is someone who consistently and actively participates in a project, included but not limited to issue/review/commits/community involvement.
+**Requirements:** Complete at least 1 contribution. A contributor is someone who consistently and actively participates in a project, including but not limited to issues/reviews/commits/community involvement.
 
 Contributors will be granted `Triage` permissions (`Can read and clone this repository. Can also manage issues and pull requests`) on the [vllm-project/vllm-ascend](https://github.com/vllm-project/vllm-ascend) GitHub repo to help community developers collaborate more efficiently.
Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+# User Stories
+
+Read case studies on how users and developers solve real, everyday problems with vLLM Ascend:
+
+- [LLaMA-Factory](./llamafactory.md) is an easy-to-use and efficient platform for training and fine-tuning large language models. It has supported vLLM Ascend to speed up inference since [LLaMA-Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739), gaining a 2x performance enhancement at inference.
+
+- [Huggingface/trl](https://github.com/huggingface/trl) is a cutting-edge library designed for post-training foundation models using advanced techniques like SFT, PPO and DPO. It has used vLLM Ascend since [v0.17.0](https://github.com/huggingface/trl/releases/tag/v0.17.0) to support RLHF on Ascend NPU.
+
+- [MindIE Turbo](https://pypi.org/project/mindie-turbo) is an LLM inference engine acceleration plug-in library developed by Huawei for Ascend hardware, which includes self-developed large language model optimization algorithms and optimizations related to the inference engine framework. It has supported vLLM Ascend since [2.0rc1](https://www.hiascend.com/document/detail/zh/mindie/20RC1/AcceleratePlugin/turbodev/mindie-turbo-0001.html).
+
+- [GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU cluster manager for running AI models. It has supported vLLM Ascend since [v0.6.2](https://github.com/gpustack/gpustack/releases/tag/v0.6.2); see more GPUStack performance evaluation info at [this link](https://mp.weixin.qq.com/s/pkytJVjcH9_OnffnsFGaew).
+
+- [verl](https://github.com/volcengine/verl) is a flexible, efficient and production-ready RL training library for large language models (LLMs). It has used vLLM Ascend since [v0.4.0](https://github.com/volcengine/verl/releases/tag/v0.4.0); see more info in [verl x Ascend Quickstart](https://verl.readthedocs.io/en/latest/ascend_tutorial/ascend_quick_start.html).
+
+:::{toctree}
+:caption: More details
+:maxdepth: 1
+llamafactory
+:::

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+# LLaMA-Factory
+
+**About / Introduction**
+
+[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) is an easy-to-use and efficient platform for training and fine-tuning large language models. With LLaMA-Factory, you can fine-tune hundreds of pre-trained models locally without writing any code.
+
+LLaMA-Factory users need to evaluate and run inference on the model after fine-tuning it.
+
+**The Business Challenge**
+
+LLaMA-Factory used transformers to perform inference on Ascend NPU, but the speed was slow.
+
+**Solving Challenges and Benefits with vLLM Ascend**
+
+With the joint efforts of LLaMA-Factory and vLLM Ascend ([LLaMA-Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)), the performance of LLaMA-Factory in the model inference stage has been significantly improved. According to the test results, the inference speed of LLaMA-Factory has been increased to 2x that of the transformers version.
+
+**Learn more**
+
+See more about LLaMA-Factory and how it uses vLLM Ascend for inference on Ascend NPU in the following documentation: [LLaMA-Factory Ascend NPU Inference](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html).

docs/source/developer_guide/versioning_policy.md renamed to docs/source/community/versioning_policy.md

Lines changed: 6 additions & 2 deletions

@@ -22,6 +22,8 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:
 
 | vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu | MindIE Turbo |
 |-------------|--------------|------------------|-------------|--------------------|--------------|
+| v0.9.2rc1 | v0.9.2 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1.dev20250619 | |
+| v0.9.1rc1 | v0.9.1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1.dev20250528 | |
 | v0.9.0rc2 | v0.9.0 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
 | v0.9.0rc1 | v0.9.0 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
 | v0.8.5rc1 | v0.8.5.post1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
@@ -35,6 +37,8 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:
 
 | Date | Event |
 |------------|-------------------------------------------|
+| 2025.07.11 | Release candidates, v0.9.2rc1 |
+| 2025.06.22 | Release candidates, v0.9.1rc1 |
 | 2025.06.10 | Release candidates, v0.9.0rc2 |
 | 2025.06.09 | Release candidates, v0.9.0rc1 |
 | 2025.05.29 | v0.7.x post release, v0.7.3.post1 |
@@ -72,8 +76,8 @@ Usually, each minor version of vLLM (such as 0.7) will correspond to a vLLM Asce
 
 | Branch | Status | Note |
 |------------|--------------|--------------------------------------|
-| main | Maintained | CI commitment for vLLM main branch and vLLM 0.9.x branch |
-| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.0 and 0.9.1 version |
+| main | Maintained | CI commitment for vLLM main branch and vLLM 0.9.2 branch |
+| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version |
 | v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version |
 | v0.7.1-dev | Unmaintained | Replaced by v0.7.3-dev |
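
For illustration, here is a minimal sketch of pinning a matching release pair from the compatibility matrix above. It assumes the PyPI package names `vllm` and `vllm-ascend`; treat the official installation guide as the authoritative source for the exact steps.

```bash
# Pick one row of the matrix and pin both sides to it, e.g. the v0.9.2rc1 row
# (vLLM Ascend v0.9.2rc1 pairs with vLLM v0.9.2 on Python >= 3.9, < 3.12).
pip install vllm==0.9.2
pip install vllm-ascend==0.9.2rc1
```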

docs/source/developer_guide/contributing.md renamed to docs/source/developer_guide/contribution/index.md

Lines changed: 44 additions & 46 deletions

@@ -4,82 +4,74 @@
 It's recommended to set up a local development environment to build and test
 before you submit a PR.
 
-### Prepare environment and build
+### Setup development environment
 
 Theoretically, the vllm-ascend build is only supported on Linux because
 the `vllm-ascend` dependency `torch_npu` only supports Linux.
 
 But you can still set up a dev env on Linux/Windows/macOS for linting and basic
 tests with the following commands:
 
+#### Run lint locally
+
 ```bash
 # Choose a base dir (~/vllm-project/) and set up venv
 cd ~/vllm-project/
 python3 -m venv .venv
 source ./.venv/bin/activate
 
-# Clone vllm code and install
-git clone https://github.com/vllm-project/vllm.git
-cd vllm
-pip install -r requirements/build.txt
-VLLM_TARGET_DEVICE="empty" pip install .
-cd ..
-
 # Clone vllm-ascend and install
 git clone https://github.com/vllm-project/vllm-ascend.git
 cd vllm-ascend
-# Install system requirements
-apt install -y gcc g++ cmake libnuma-dev
-# Install project requirements
-pip install -r requirements-dev.txt
-
-# Then you can run lint and mypy test
-bash format.sh
 
-# Build:
-# - only supported on Linux (torch_npu available)
-# pip install -e .
-# - build without deps for debugging in other OS
-# pip install -e . --no-deps
-# - build without custom ops
-# COMPILE_CUSTOM_KERNELS=0 pip install -e .
+# Install lint requirements and enable the pre-commit hook
+pip install -r requirements-lint.txt
 
-# Commit changed files using `-s`
-git commit -sm "your commit info"
+# Run lint (you need to install pre-commit deps via a proxy network the first time)
+bash format.sh
 ```
 
-### Testing
+#### Run CI locally
 
-Although vllm-ascend CI provides integration tests on [Ascend](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml), you can run them
-locally. The simplest way to run these integration tests locally is through a container:
-
-```bash
-# Under Ascend NPU environment
-git clone https://github.com/vllm-project/vllm-ascend.git
-cd vllm-ascend
+After completing the "Run lint" setup, you can run CI locally:
 
-export IMAGE=vllm-ascend-dev-image
-export CONTAINER_NAME=vllm-ascend-dev
-export DEVICE=/dev/davinci1
+```{code-block} bash
+:substitutions:
 
-# The first build will take about 10 mins (10MB/s) to download the base image and packages
-docker build -t $IMAGE -f ./Dockerfile .
-# You can also specify the mirror repo via setting VLLM_REPO to speedup
-# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm
+cd ~/vllm-project/
 
-docker run --rm --name $CONTAINER_NAME --network host --device $DEVICE \
-  --device /dev/davinci_manager --device /dev/devmm_svm \
-  --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
-  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-  -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-  -ti $IMAGE bash
+# Running CI requires vLLM to be installed
+git clone --branch |vllm_version| https://github.com/vllm-project/vllm.git
+cd vllm
+pip install -r requirements/build.txt
+VLLM_TARGET_DEVICE="empty" pip install .
+cd ..
 
+# Install requirements
 cd vllm-ascend
+# For Linux:
 pip install -r requirements-dev.txt
+# For non-Linux:
+cat requirements-dev.txt | grep -Ev '^#|^--|^$|^-r' | while read PACKAGE; do pip install "$PACKAGE"; done
+cat requirements.txt | grep -Ev '^#|^--|^$|^-r' | while read PACKAGE; do pip install "$PACKAGE"; done
+
+# Run CI:
+bash format.sh ci
+```
+
+#### Submit the commit
 
-pytest tests/
+```bash
+# Commit changed files using `-s`
+git commit -sm "your commit info"
 ```
 
+🎉 Congratulations! You have completed the development environment setup.
+
+### Test locally
+
+You can refer to the [Testing](./testing.md) doc to set up the testing environment and run tests locally.
+
 ## DCO and Signed-off-by
 
 When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header which certifies agreement with the terms of the DCO.
@@ -111,3 +103,9 @@ If the PR spans more than one category, please include all relevant prefixes.
 
 You may find more information about contributing to the vLLM Ascend backend plugin on [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
 If you find any problem when contributing, feel free to submit a PR to improve the doc to help other developers.
+
+:::{toctree}
+:caption: Index
+:maxdepth: 1
+testing
+:::
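
As a quick illustration of the DCO rule described in this guide, here is a minimal sketch of producing a signed-off commit; the name, email, and commit message are placeholders.

```bash
# Configure the identity that the Signed-off-by trailer is taken from
git config user.name "Your Name"
git config user.email "you@example.com"

# `git commit -s` appends "Signed-off-by: Your Name <you@example.com>",
# which is what the DCO check on the PR looks for
git commit -s -m "[Doc] fix typo in the contribution guide"
```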
