3 changes: 2 additions & 1 deletion .github/workflows/schedule_image_build_and_push.yaml
@@ -30,11 +30,12 @@ on:
type: choice
options:
- main
- v0.18.0rc1
- v0.17.0rc1
- v0.16.0rc1
- v0.15.0rc1
- v0.14.0rc1
- v0.13.0rc3
- v0.13.0

jobs:
image_build:
3 changes: 2 additions & 1 deletion .github/workflows/schedule_release_code_and_wheel.yml
@@ -33,11 +33,12 @@ on:
type: choice
options:
- main
- v0.18.0rc1
- v0.17.0rc1
- v0.16.0rc1
- v0.15.0rc1
- v0.14.0rc1
- v0.13.0rc3
- v0.13.0

jobs:
build_and_release_code:
6 changes: 3 additions & 3 deletions README.md
@@ -53,7 +53,7 @@ By using vLLM Ascend plugin, popular open-source models, including Transformer-l
- OS: Linux
- Software:
- Python >= 3.10, < 3.12
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.1 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.9.0, torch-npu == 2.9.0
- vLLM (the same version as vllm-ascend)
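The Python bound in the requirement list above can be checked mechanically. A minimal sketch, assuming only the documented range `>= 3.10, < 3.12`; the helper name is illustrative and not part of vllm-ascend:

```python
import sys

def python_version_supported(major: int, minor: int) -> bool:
    """Return True if (major, minor) falls in the documented range >= 3.10, < 3.12."""
    return (3, 10) <= (major, minor) < (3, 12)

if __name__ == "__main__":
    # Report whether the current interpreter satisfies the documented bound.
    ok = python_version_supported(sys.version_info.major, sys.version_info.minor)
    print("Python OK" if ok else "Unsupported Python version")
```

Tuple comparison makes the half-open range check a one-liner; the same pattern extends to other pinned dependencies.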

@@ -63,7 +63,7 @@ Please use the following recommended versions to get started quickly:

| Version | Release type | Doc |
|------------|--------------|--------------------------------------|
| v0.17.0rc1 | Latest release candidate | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/latest/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/latest/installation.html) for more details |
| v0.18.0rc1 | Latest release candidate | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/latest/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/latest/installation.html) for more details |
| v0.13.0 | Latest stable version | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/v0.13.0/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/v0.13.0/installation.html) for more details |

## Contributing
@@ -86,7 +86,7 @@ Below are the maintained branches:

| Branch | Status | Note |
|------------|--------------|--------------------------------------|
| main | Maintained | CI commitment for vLLM main branch and vLLM v0.17.0 tag |
| main | Maintained | CI commitment for vLLM main branch and vLLM v0.18.0 tag |
| v0.7.1-dev | Unmaintained | Only doc fixes are allowed |
| v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version, only bug fixes are allowed, and no new release tags anymore. |
| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version |
6 changes: 3 additions & 3 deletions README.zh.md
@@ -47,7 +47,7 @@ vLLM Ascend plugin (`vllm-ascend`) is a community-maintained plugin that lets vLLM run on Ascend NP
- OS: Linux
- Software:
- Python >= 3.10, < 3.12
- CANN == 8.5.0 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- CANN == 8.5.1 (Ascend HDK version refers to [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
- PyTorch == 2.9.0, torch-npu == 2.9.0
- vLLM (the same version as vllm-ascend)

@@ -57,7 +57,7 @@ vLLM Ascend plugin (`vllm-ascend`) is a community-maintained plugin that lets vLLM run on Ascend NP

| Version | Release type | Doc |
|------------|--------------|--------------------------------------|
| v0.17.0rc1 | Latest release candidate | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/latest/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/latest/installation.html) for more details |
| v0.18.0rc1 | Latest release candidate | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/latest/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/latest/installation.html) for more details |
| v0.13.0 | Latest stable version | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/v0.13.0/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/v0.13.0/installation.html) for more details |

## Contributing
@@ -80,7 +80,7 @@ vllm-ascend has a main branch and dev branches.

| Branch | Status | Note |
|------------|------------|---------------------|
| main | Maintained | CI commitment for vLLM main branch and the latest vLLM release (v0.17.0) |
| main | Maintained | CI commitment for vLLM main branch and the latest vLLM release (v0.18.0) |
| v0.7.1-dev | Unmaintained | Only doc fixes are allowed |
| v0.7.3-dev | Maintained | CI commitment for vLLM v0.7.3 version, only bug fixes are allowed, and no new release tags anymore |
| v0.9.1-dev | Maintained | CI commitment for vLLM v0.9.1 version |
22 changes: 20 additions & 2 deletions docs/source/community/contributors.md
@@ -20,14 +20,32 @@
| HeXiang Wang| [@whx-sjtu](https://github.com/whx-sjtu) | 2026/01 |

## Contributors
<!-- last_commit: 23bf5d4d48e6ec09e2b4f726279591a1b42f033b -->
<!-- last_commit: 8e3f8bab57cff0a98dc75ad43d8bf5bb4113f34e -->

Every release of vLLM Ascend would not have been possible without the following contributors:

Updated on 2026-03-09:
Updated on 2026-03-25:

| Number | Contributor | Date | Commit ID |
|:------:|:-----------:|:-----:|:---------:|
| 363 | [@GoMarck](https://github.com/GoMarck) | 2026/03/25 | [17da966](https://github.com/vllm-project/vllm-ascend/commit/17da96658f0b53a7e9b5932e64ced69a334f035c) |
| 362 | [@drizzlezyk](https://github.com/drizzlezyk) | 2026/03/24 | [5487946](https://github.com/vllm-project/vllm-ascend/commit/54879467c41784a446aa5b486a391d9bfbf488fa) |
| 361 | [@liuhy1213-cell](https://github.com/liuhy1213-cell) | 2026/03/23 | [fb283b5](https://github.com/vllm-project/vllm-ascend/commit/fb283b5820effe930d7f60952aca48177d710e94) |
| 360 | [@ZhuQi-seu](https://github.com/ZhuQi-seu) | 2026/03/23 | [e942b62](https://github.com/vllm-project/vllm-ascend/commit/e942b62d742ebc5bf128e85bc086d728df8d4935) |
| 359 | [@ksiyuan](https://github.com/ksiyuan) | 2026/03/20 | [a16c991](https://github.com/vllm-project/vllm-ascend/commit/a16c99141b0830240eeff0cbe01bfc3c833c62fb) |
| 358 | [@idouba](https://github.com/idouba) | 2026/03/20 | [f39f566](https://github.com/vllm-project/vllm-ascend/commit/f39f566e22b87ee75bd1205f982e4255a882c3a4) |
| 357 | [@yesyue-w](https://github.com/yesyue-w) | 2026/03/20 | [c860535](https://github.com/vllm-project/vllm-ascend/commit/c860535246cc751b6be7d1da2092e4380013598c) |
| 356 | [@jiangmengyu18](https://github.com/jiangmengyu18) | 2026/03/18 | [305820f](https://github.com/vllm-project/vllm-ascend/commit/305820f1a982ed9597932778891b5da64ecccae9) |
| 355 | [@SparrowMu](https://github.com/SparrowMu) | 2026/03/18 | [fb8e22e](https://github.com/vllm-project/vllm-ascend/commit/fb8e22ec00aef2b2d42a5f2d3ae7267848ec5016) |
| 354 | [@ppppeng](https://github.com/ppppeng) | 2026/03/17 | [a457d0f](https://github.com/vllm-project/vllm-ascend/commit/a457d0f0e8d91060c62d7ff2b1741bfc74d79560) |
| 353 | [@asunxiao](https://github.com/asunxiao) | 2026/03/17 | [a370dfa](https://github.com/vllm-project/vllm-ascend/commit/a370dfa9623e648439b724569931988a852e462e) |
| 352 | [@GGGGua](https://github.com/GGGGua) | 2026/03/16 | [b1a7888](https://github.com/vllm-project/vllm-ascend/commit/b1a78886a928cd7b5881026302fba79609972bd2) |
| 351 | [@bazingazhou233-hub](https://github.com/bazingazhou233-hub) | 2026/03/14 | [9e6c547](https://github.com/vllm-project/vllm-ascend/commit/9e6c547d9808eb5fa532d49102969c91b79be905) |
| 350 | [@tfhddd](https://github.com/tfhddd) | 2026/03/12 | [21fea86](https://github.com/vllm-project/vllm-ascend/commit/21fea86b08edf4a016749a0d637d18cf7017dd2a) |
| 349 | [@ZRJ026](https://github.com/ZRJ026) | 2026/03/10 | [a398fa6](https://github.com/vllm-project/vllm-ascend/commit/a398fa6a0b024f59aaa823c483529bcf2357540f) |
| 348 | [@xmpp777](https://github.com/xmpp777) | 2026/03/10 | [9216e1b](https://github.com/vllm-project/vllm-ascend/commit/9216e1b0505c7e290d8c02cc64cb8817bfdd49f5) |
| 347 | [@wanghuanjun2113](https://github.com/wanghuanjun2113) | 2026/03/09 | [dec04ec](https://github.com/vllm-project/vllm-ascend/commit/dec04ec8d884a45f1946b72dea129bc686cc2f44) |
| 346 | [@liuchenbing2026](https://github.com/liuchenbing2026) | 2026/03/09 | [542258a](https://github.com/vllm-project/vllm-ascend/commit/542258ac9d9229aab4e8822de42443245a93f001) |
| 345 | [@chenxi-hh](https://github.com/chenxi-hh) | 2026/03/09 | [737dfcf](https://github.com/vllm-project/vllm-ascend/commit/737dfcf638eae71d6c24c340dee20ff205f21ed9) |
| 344 | [@xiaocongtou6](https://github.com/xiaocongtou6) | 2026/03/06 | [bc0fd7c](https://github.com/vllm-project/vllm-ascend/commit/bc0fd7ca7217498d5faa91504b0e8c3f822a5cc6) |
| 343 | [@wanghengkang](https://github.com/wanghengkang) | 2026/03/06 | [c49ce18](https://github.com/vllm-project/vllm-ascend/commit/c49ce18ea544970510ebb04fff49a484533fe2a3) |
6 changes: 4 additions & 2 deletions docs/source/community/versioning_policy.md
@@ -23,6 +23,7 @@ The table below is the release compatibility matrix for vLLM Ascend release.

| vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu | Triton Ascend |
|-------------|-------------------|-----------------|-------------|---------------------------------|---------------|
| v0.18.0rc1 | v0.18.0 | >= 3.10, < 3.12 | 8.5.1 | 2.9.0 / 2.9.0 | 3.2.0 |
| v0.17.0rc1 | v0.17.0 | >= 3.10, < 3.12 | 8.5.1 | 2.9.0 / 2.9.0 | 3.2.0 |
| v0.16.0rc1 | v0.16.0 | >= 3.10, < 3.12 | 8.5.1 | 2.9.0 / 2.9.0 | 3.2.0 |
| v0.15.0rc1 | v0.15.0 | >= 3.10, < 3.12 | 8.5.0 | 2.9.0 / 2.9.0 | 3.2.0 |
@@ -59,14 +60,15 @@ For main branch of vLLM Ascend, we usually make it compatible with the latest vL

| vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu |
|-------------|--------------|------------------|-------------|--------------------|
| main | ed359c497a728f08b5b41456c07a688ccd510fbc, v0.18.0 tag | >= 3.10, < 3.12 | 8.5.0 | 2.9.0 / 2.9.0 |
| main | ed359c497a728f08b5b41456c07a688ccd510fbc, v0.18.0 tag | >= 3.10, < 3.12 | 8.5.1 | 2.9.0 / 2.9.0 |

## Release cadence

### Release window

| Date | Event |
|------------|-------------------------------------------|
| 2026.03.27 | Release candidates, v0.18.0rc1 |
| 2026.03.15 | Release candidates, v0.17.0rc1 |
| 2026.03.10 | Release candidates, v0.16.0rc1 |
| 2026.02.27 | Release candidates, v0.15.0rc1 |
@@ -126,7 +128,7 @@ Usually, each minor version of vLLM (such as 0.7) corresponds to a vLLM Ascend v

| Branch | State | Note |
| ---------- | ------------ | -------------------------------------------------------- |
| main | Maintained | CI commitment for vLLM main branch and vLLM 0.16.0 tag |
| main | Maintained | CI commitment for vLLM main branch and vLLM 0.18.0 tag |
| releases/v0.13.0 | Maintained | CI commitment for vLLM 0.13.0 version |
| v0.11.0-dev| Maintained | CI commitment for vLLM 0.11.0 version |
| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version |
8 changes: 4 additions & 4 deletions docs/source/conf.py
@@ -65,15 +65,15 @@
# the branch of vllm, used in vllm clone
# - main branch: 'main'
# - vX.Y.Z branch: 'vX.Y.Z'
"vllm_version": "v0.17.0",
"vllm_version": "v0.18.0",
# the branch of vllm-ascend, used in vllm-ascend clone and image tag
# - main branch: 'main'
# - vX.Y.Z branch: latest vllm-ascend release tag
"vllm_ascend_version": "v0.17.0rc1",
"vllm_ascend_version": "v0.18.0rc1",
# the newest release version of vllm-ascend and matched vLLM, used in pip install.
# This value should be updated when cut down release.
"pip_vllm_ascend_version": "0.17.0rc1",
"pip_vllm_version": "0.17.0",
"pip_vllm_ascend_version": "0.18.0rc1",
"pip_vllm_version": "0.18.0",
# CANN image tag
"cann_image_tag": "8.5.1-910b-ubuntu22.04-py3.11",
# vllm version in ci
3 changes: 2 additions & 1 deletion docs/source/faqs.md
@@ -2,6 +2,7 @@

## Version Specific FAQs

- [[v0.18.0rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/7633)
- [[v0.17.0rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/7173)
- [[v0.13.0] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/6583)

@@ -104,7 +105,7 @@ If all above steps are not working, feel free to submit a GitHub issue.

### 7. How vllm-ascend work with vLLM?

`vllm-ascend` is a hardware plugin for vLLM. The version of `vllm-ascend` is the same as the version of `vllm`. For example, if you use `vllm` 0.9.1, you should use vllm-ascend 0.9.1 as well. For the main branch, we ensure that `vllm-ascend` and `vllm` are compatible at every commit.
`vllm-ascend` is a hardware plugin for vLLM. Stable releases usually align with the same vLLM version, while RC releases may use the corresponding vLLM final release version. For example, `vllm-ascend` `v0.18.0rc1` matches vLLM `v0.18.0`. For the main branch, we ensure that `vllm-ascend` and `vllm` are compatible at every commit.
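The version-matching rule above (an rc tag such as `v0.18.0rc1` pairs with the vLLM final release `v0.18.0`) can be sketched as a small check. The helper is purely illustrative, assuming the tag formats shown in this FAQ; it is not an API of either package:

```python
def versions_compatible(vllm_version: str, vllm_ascend_version: str) -> bool:
    """True if a vllm-ascend tag pairs with the given vLLM release.

    Strips an 'rcN' suffix, so 'v0.18.0rc1' reduces to its base 'v0.18.0';
    stable tags like 'v0.13.0' are compared as-is.
    """
    base = vllm_ascend_version.split("rc")[0]
    return base == vllm_version
```

Example: `versions_compatible("v0.18.0", "v0.18.0rc1")` holds, matching the pairing described above.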

### 8. Does vllm-ascend support Prefill Disaggregation feature?

40 changes: 40 additions & 0 deletions docs/source/user_guide/release_notes.md
@@ -1,5 +1,45 @@
# Release Notes

## v0.18.0rc1 - 2026.03.27

This is the first release candidate of v0.18.0 for vLLM Ascend. Please follow the [official doc](https://docs.vllm.ai/projects/ascend/en/latest) to get started.

### Highlights

- Balance scheduling is now supported via `VLLM_ASCEND_BALANCE_SCHEDULING` for better data-parallel load balancing. [#7611](https://github.com/vllm-project/vllm-ascend/pull/7611)
- Flash Comm V1 now supports VL models with MLA, removing a previous limitation for multimodal serving. [#7390](https://github.com/vllm-project/vllm-ascend/pull/7390)
- DeepSeek models are now supported on A5 through new MLA operators. [#7232](https://github.com/vllm-project/vllm-ascend/pull/7232)
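Balance scheduling is toggled through the environment variable named in the first highlight. A minimal sketch of enabling it before launch, assuming a conventional `"1"`-means-on boolean flag; the helper is illustrative and how vllm-ascend consumes the variable internally is not shown here:

```python
import os

# Set the flag before the engine starts; the variable name comes from
# release highlight #7611, the "1" convention is an assumption.
os.environ["VLLM_ASCEND_BALANCE_SCHEDULING"] = "1"

def balance_scheduling_enabled() -> bool:
    """Illustrative helper mirroring a typical boolean env-var check."""
    return os.environ.get("VLLM_ASCEND_BALANCE_SCHEDULING", "0") == "1"
```

Exporting the variable in the launch shell (`export VLLM_ASCEND_BALANCE_SCHEDULING=1`) achieves the same effect for a server started from the command line.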

### Features

- Support separate attention backends for target and draft models in speculative decoding, allowing finer backend tuning per model. [#7342](https://github.com/vllm-project/vllm-ascend/pull/7342)
- VL MoE models now support SP, and `sp_threshold` is removed in favor of `sp_min_token_num` from vLLM. [#7044](https://github.com/vllm-project/vllm-ascend/pull/7044)
- Qwen VL models now support `w8a8_mxfp8` quantization. [#7417](https://github.com/vllm-project/vllm-ascend/pull/7417)
- LayerwiseConnector now supports virtual push on decode nodes in PD deployment. [#7361](https://github.com/vllm-project/vllm-ascend/pull/7361)

### Performance

- Optimized the Qwen3.5 and Qwen3-Next GDN prefill path by prebuilding chunk metadata, reducing host-device synchronization overhead. [#7487](https://github.com/vllm-project/vllm-ascend/pull/7487)
- Simplified the FIA prefill context merge path for better runtime efficiency. [#7293](https://github.com/vllm-project/vllm-ascend/pull/7293)

### Dependencies

- vLLM is upgraded to v0.18.0 for docker and release flows. [#7523](https://github.com/vllm-project/vllm-ascend/pull/7523) [#7502](https://github.com/vllm-project/vllm-ascend/pull/7502)

### Documentation

- Added configuration documentation for `enable_sparse_c8`. [#7600](https://github.com/vllm-project/vllm-ascend/pull/7600)
- Refreshed deployment and model docs for Kimi-K2.5, GLM-4.7, DeepSeek-V3.2, MiniMax-M2.5, and PD disaggregation guides. [#7371](https://github.com/vllm-project/vllm-ascend/pull/7371) [#7403](https://github.com/vllm-project/vllm-ascend/pull/7403) [#7292](https://github.com/vllm-project/vllm-ascend/pull/7292) [#7296](https://github.com/vllm-project/vllm-ascend/pull/7296) [#7300](https://github.com/vllm-project/vllm-ascend/pull/7300)

### Others

- Lowered the log level in PD disaggregation to reduce noisy deployment logs. [#7589](https://github.com/vllm-project/vllm-ascend/pull/7589)
- Fixed a PD separation issue where decode nodes could get stuck because shapes were not aligned across DP nodes. [#7534](https://github.com/vllm-project/vllm-ascend/pull/7534)
- Fixed a regression where hybrid attention plus mamba models on Ascend could start with an incorrect block size after the v0.18.0 upgrade. [#7528](https://github.com/vllm-project/vllm-ascend/pull/7528)
- Fixed multi-instance serving OOM calculation on single-card deployments. [#7427](https://github.com/vllm-project/vllm-ascend/pull/7427)
- Fixed the speculative decoding proposer path for v0.18.0. [#7544](https://github.com/vllm-project/vllm-ascend/pull/7544)
- Fixed DeepSeek v3.1 C8 when overlaying MTP with full decode and full graph modes. [#7571](https://github.com/vllm-project/vllm-ascend/pull/7571)

## v0.17.0rc1 - 2026.03.15

This is the first release candidate of v0.17.0 for vLLM Ascend. Please follow the [official doc](https://docs.vllm.ai/projects/ascend/en/latest) to get started.