Commit 13fc844

[0.9.1] cancel the verification between deepseek_mtp and non-ascend scheduler in disaggregated_prefill deployment (#2368)
### What this PR does / why we need it?

Currently, deepseek_mtp can be enabled with the original vLLM scheduler only in disaggregated prefill scenarios (experimental). This PR changes the verification logic so that users can enable deepseek_mtp without the ascend scheduler in disaggregated prefill deployments.

### Does this PR introduce _any_ user-facing change?

Users can enable the deepseek_mtp model without the ascend scheduler in disaggregated_prefill deployments.

### How was this patch tested?

CI and e2e vLLM serving passed.

Signed-off-by: linfeng-yuan <[email protected]>
1 parent f5226e3 commit 13fc844

1 file changed: +13 -4 lines changed

vllm_ascend/ascend_config.py

Lines changed: 13 additions & 4 deletions
@@ -204,13 +204,22 @@ def check_ascend_config(vllm_config, enforce_eager):
                 "Ascend scheduler is only supported for V1 Engine.")
     # for v1 engine
     else:
-        # TODO(yexiong): remove this verification after mtp model supports original vllm scheduler
+        # TODO(yexiong): Currently deepseek_mtp can be enabled with original vllm
+        # scheduler only in disaggregated prefill scenarios. Otherwise, it should
+        # be enabled with only ascend scheduler. This block will be removed once
+        # mtp is completely compatible with all scenarios.
         if (not ascend_config.ascend_scheduler_config.enabled
                 and vllm_config.speculative_config
                 and vllm_config.speculative_config.method == 'deepseek_mtp'):
-            raise NotImplementedError(
-                "Currently deepseek MTP model is only supported for ascend scheduler."
-            )
+            if vllm_config.kv_transfer_config is None:
+                raise NotImplementedError(
+                    "Using deepseek_mtp without the ascend scheduler is only supported for disaggregated prefill deployments now."
+                )
+            else:
+                logger.warning(
+                    "Deepseek MTP model is enabled without ascend scheduler. Combination of deepseek MTP and original vllm "
+                    "scheduler is an experimental setting in disaggregated prefill scenarios and will be completely released soon."
+                )
         # for eager mode
         if enforce_eager:
             # torchair_graph cannot be enabled with eager mode.
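
The decision this change introduces can be exercised in isolation. The following is a minimal stand-alone sketch, not the vllm-ascend code itself: `check_mtp_scheduler` and `make_config` are hypothetical helpers, the config objects are `SimpleNamespace` stand-ins that carry only the attributes read in the diff (`speculative_config.method` and `kv_transfer_config`), and the string return value is added purely so the outcome is easy to inspect.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("mtp_check_sketch")


def check_mtp_scheduler(vllm_config, ascend_scheduler_enabled):
    """Hypothetical mirror of the deepseek_mtp / scheduler check in the diff.

    Returns "ok" when nothing needs checking and "experimental" when the
    combination is allowed with a warning. Raises NotImplementedError when
    deepseek_mtp is requested without the ascend scheduler and without a
    kv_transfer_config.
    """
    spec = vllm_config.speculative_config
    uses_mtp = spec is not None and spec.method == "deepseek_mtp"
    if ascend_scheduler_enabled or not uses_mtp:
        return "ok"
    # As in the diff, a missing kv_transfer_config is taken to mean that the
    # deployment is not a disaggregated prefill one.
    if vllm_config.kv_transfer_config is None:
        raise NotImplementedError(
            "Using deepseek_mtp without the ascend scheduler is only "
            "supported for disaggregated prefill deployments now.")
    logger.warning(
        "Deepseek MTP model is enabled without ascend scheduler "
        "(experimental in disaggregated prefill scenarios).")
    return "experimental"


def make_config(method=None, kv_transfer=None):
    """Build a stand-in config holding only the fields the check reads."""
    spec = SimpleNamespace(method=method) if method else None
    return SimpleNamespace(speculative_config=spec,
                           kv_transfer_config=kv_transfer)
```

Under these assumptions there are three outcomes: the check is skipped when the ascend scheduler is enabled or deepseek_mtp is not in use, it passes with a warning when a `kv_transfer_config` is present, and it raises otherwise.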
