
Commit 0740d10

[0.9.1][doc] add release note for 0.9.1-dev branch (#2728)
Add release note for 0.9.1-dev branch. Cherry-picked from #2646. Signed-off-by: wangxiyuan <[email protected]>
1 parent ae9a8bf commit 0740d10

4 files changed: +51 -8 lines changed

docs/source/_templates/sections/header.html

Lines changed: 1 addition & 1 deletion

@@ -54,5 +54,5 @@
 </style>

 <div class="notification-bar">
-<p>You are viewing the latest developer preview docs. <a href="https://vllm-ascend.readthedocs.io/en/v0.7.3-dev">Click here</a> to view docs for the latest stable release(v0.7.3.post1).</p>
+<p>You are viewing the latest official docs.</p>
 </div>

docs/source/community/versioning_policy.md

Lines changed: 2 additions & 5 deletions

@@ -22,14 +22,10 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:

 | vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu | MindIE Turbo |
 |-------------|--------------|------------------|-------------|--------------------|--------------|
-| v0.9.2rc1 | v0.9.2 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1.dev20250619 | |
+| v0.9.1 | v0.9.1 | >= 3.9, < 3.12 | 8.2.RC1 | 2.5.1 / 2.5.1.post1 | |
 | v0.9.1rc3 | v0.9.1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1 | |
 | v0.9.1rc2 | v0.9.1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1 | |
 | v0.9.1rc1 | v0.9.1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1.dev20250528 | |
-| v0.9.0rc2 | v0.9.0 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
-| v0.9.0rc1 | v0.9.0 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
-| v0.8.5rc1 | v0.8.5.post1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
-| v0.8.4rc2 | v0.8.4 | >= 3.9, < 3.12 | 8.0.0 | 2.5.1 / 2.5.1 | |
 | v0.7.3.post1| v0.7.3 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | 2.0rc1 |
 | v0.7.3 | v0.7.3 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | 2.0rc1 |

@@ -39,6 +35,7 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:

 | Date | Event |
 |------------|-------------------------------------------|
+| 2025.09.03 | v0.9.1 Final release |
 | 2025.08.22 | Release candidates, v0.9.1rc3 |
 | 2025.08.06 | Release candidates, v0.9.1rc2 |
 | 2025.07.11 | Release candidates, v0.9.2rc1 |

docs/source/conf.py

Lines changed: 2 additions & 2 deletions

@@ -69,10 +69,10 @@
 # the branch of vllm-ascend, used in vllm-ascend clone and image tag
 # - main branch: 'main'
 # - vX.Y.Z branch: latest vllm-ascend release tag
-'vllm_ascend_version': 'v0.9.1rc3',
+'vllm_ascend_version': 'v0.9.1',
 # the newest release version of vllm-ascend and matched vLLM, used in pip install.
 # This value should be updated when cut down release.
-'pip_vllm_ascend_version': "0.9.1rc3",
+'pip_vllm_ascend_version': "0.9.1",
 'pip_vllm_version': "0.9.1",
 # CANN image tag
 'cann_image_tag': "8.2.rc1-910b-ubuntu22.04-py3.11",

docs/source/user_guide/release_notes.md

Lines changed: 46 additions & 0 deletions

@@ -1,5 +1,51 @@
# Release note

## v0.9.1 - 2025.09.03

We are excited to announce the newest official release of vLLM Ascend. This release includes many new features, performance improvements, and bug fixes. We recommend that users upgrade from 0.7.3 to this version. Please always set `VLLM_USE_V1=1` to use the V1 engine; a minimal usage sketch is shown after this introduction.

In this release, we added many enhancements for large-scale expert parallel use cases. It is recommended to follow the [official guide](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/tutorials/large_scale_ep.html).

Please note that this release note lists all the important changes since the last official release (v0.7.3).
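
As a quick reference for the V1-only requirement above, here is a minimal offline-inference sketch. The model name is only an illustrative placeholder and is not taken from this release note; everything else uses vLLM's standard offline API.

```python
import os

# The V0 engine is not available in this release; select V1 explicitly,
# and do so before vllm is imported.
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM, SamplingParams

# "Qwen/Qwen2.5-7B-Instruct" is a placeholder model used for illustration only.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")
outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(temperature=0.8, max_tokens=32),
)
print(outputs[0].outputs[0].text)
```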

### Highlights

- DeepSeek V3/R1 is supported with high quality and performance. MTP works with DeepSeek as well. Please refer to the [multi-node tutorial](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/tutorials/multi_node.html) and the [Large Scale Expert Parallelism](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/tutorials/large_scale_ep.html) guide.
- Qwen series models now work with graph mode, which is enabled by default with the V1 engine. Please refer to the [Qwen tutorials](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/tutorials/index.html).
- Disaggregated Prefilling is supported for the V1 engine. Please refer to the [Large Scale Expert Parallelism](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/tutorials/large_scale_ep.html) tutorial.
- Automatic prefix caching and chunked prefill are supported.
- Speculative decoding works with the Ngram and MTP methods.
- MoE and dense w4a8 quantization are now supported. Please refer to the [quantization guide](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/user_guide/feature_guide/quantization.html).
- Sleep Mode is supported for the V1 engine (a short usage sketch follows this list). Please refer to the [Sleep Mode tutorial](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/user_guide/feature_guide/sleep_mode.html).
- Dynamic and static EPLB support is added. This feature is still experimental.
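
Since Sleep Mode is called out above, the sketch below shows how it is typically exercised through vLLM's offline API; the model name is a placeholder, and the tutorial linked in the list is the authoritative reference.

```python
import os

os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

# enable_sleep_mode plus sleep()/wake_up() follow vLLM's sleep-mode API;
# the model below is a placeholder.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct", enable_sleep_mode=True)

print(llm.generate(["Ping"])[0].outputs[0].text)
llm.sleep(level=1)  # release NPU memory while the engine is idle
llm.wake_up()       # restore the engine before handling requests again
print(llm.generate(["Pong"])[0].outputs[0].text)
```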

### Note

The following notes are especially relevant when upgrading from the last final release (v0.7.3):

- The V0 engine is not supported from this release onward. Please always set `VLLM_USE_V1=1` to use the V1 engine with vLLM Ascend.
- MindIE Turbo is not needed with this release, and older versions of MindIE Turbo are not compatible, so please do not install it. All of its functionality and enhancements are already included in vLLM Ascend; we will consider adding it back in the future if needed.
- torch_npu is upgraded to 2.5.1.post1 and CANN is upgraded to 8.2.RC1. Don't forget to upgrade them; a quick version check is sketched below.
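
After upgrading, a quick way to confirm what is installed is to query pip metadata from the standard library. The distribution names below are the usual pip names and are assumptions that may need adjusting for your environment; CANN is not a pip package and is not covered here.

```python
from importlib.metadata import PackageNotFoundError, version

# Expected for this release: torch 2.5.1 and torch_npu 2.5.1.post1 (see the note above).
for dist in ("torch", "torch-npu", "vllm", "vllm-ascend"):
    try:
        print(f"{dist}: {version(dist)}")
    except PackageNotFoundError:
        print(f"{dist}: not installed")
```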

### Core

- The Ascend scheduler is added for the V1 engine. This scheduler has better affinity with Ascend hardware.
- The structured output feature now works on the V1 engine.
- A batch of custom ops is added to improve performance.

### Changes

- EPLB support for the Qwen3-moe model. [#2000](https://github.com/vllm-project/vllm-ascend/pull/2000)
- Fix the bug that MTP doesn't work well with Prefill Decode Disaggregation. [#2610](https://github.com/vllm-project/vllm-ascend/pull/2610) [#2554](https://github.com/vllm-project/vllm-ascend/pull/2554) [#2531](https://github.com/vllm-project/vllm-ascend/pull/2531)
- Fix a few bugs to make sure Prefill Decode Disaggregation works well. [#2538](https://github.com/vllm-project/vllm-ascend/pull/2538) [#2509](https://github.com/vllm-project/vllm-ascend/pull/2509) [#2502](https://github.com/vllm-project/vllm-ascend/pull/2502)
- Fix a file-not-found error with `shutil.rmtree` in torchair mode. [#2506](https://github.com/vllm-project/vllm-ascend/pull/2506)

### Known Issues

- When running MoE models, Aclgraph mode only works with tensor parallelism; DP/EP doesn't work in this release.
- Pipeline parallelism is not supported in this release for the V1 engine.
- If you use w4a8 quantization with eager mode, please set `VLLM_ASCEND_MLA_PARALLEL=1` to avoid out-of-memory errors; see the sketch after this list.
- Accuracy tests with some tools may not report correct results. This doesn't affect real user cases. We'll fix it in the next post release. [#2654](https://github.com/vllm-project/vllm-ascend/pull/2654)
- We notice that there are still some problems when running vLLM Ascend with Prefill Decode Disaggregation; for example, memory may leak and the service may get stuck. These are caused by known issues in vLLM and vLLM Ascend, and we'll fix them in the next post release. [#2650](https://github.com/vllm-project/vllm-ascend/pull/2650) [#2604](https://github.com/vllm-project/vllm-ascend/pull/2604) [vLLM#22736](https://github.com/vllm-project/vllm/pull/22736) [vLLM#23554](https://github.com/vllm-project/vllm/pull/23554) [vLLM#23981](https://github.com/vllm-project/vllm/pull/23981)
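
For the w4a8 eager-mode issue above, a minimal sketch of the workaround is shown below. The model path is a placeholder, and `quantization="ascend"` is assumed based on the quantization guide linked earlier; the environment variable itself is the documented workaround.

```python
import os

# Workaround for the known issue above: w4a8 with eager mode may otherwise run out of memory.
os.environ["VLLM_ASCEND_MLA_PARALLEL"] = "1"
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

llm = LLM(
    model="/path/to/your-w4a8-quantized-model",  # placeholder path
    quantization="ascend",                       # assumed flag for Ascend-quantized weights
    enforce_eager=True,                          # the known issue applies to eager mode
)
print(llm.generate(["Hello"])[0].outputs[0].text)
```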

## v0.9.1rc3 - 2025.08.22

This is the 3rd release candidate of v0.9.1 for vLLM Ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/) to get started.
