
Commit 4e578f5

[0.9.1][bugfix]fix wrong cached torchair graph directory of deepseek_mtp model (#2531)
### What this PR does / why we need it?
Fixes the inconsistency between the torchair graph cache directories used by deepseek and by its mtp model. This bug led to an assertion error when running deepseek_mtp.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI and e2e vllm serving pass.

Signed-off-by: linfeng-yuan <[email protected]>
1 parent a5ca6a5 commit 4e578f5

1 file changed: +2 -1 lines changed


vllm_ascend/worker/mtp_proposer_v1.py

Lines changed: 2 additions & 1 deletion
@@ -19,7 +19,7 @@
 from vllm_ascend.attention.utils import AscendCommonAttentionMetadata
 from vllm_ascend.distributed.utils import is_lmhead_tp
 from vllm_ascend.models.deepseek_mtp import CustomDeepSeekMTP
-from vllm_ascend.utils import ProfileExecuteDuration
+from vllm_ascend.utils import TORCHAIR_CACHE_DIR, ProfileExecuteDuration
 
 
 # FIXME(woosuk): The logic here is duplicated with the main sampling code.
@@ -423,6 +423,7 @@ def _get_torchair_lazy_compiled_model(self, batch_size: int):
                     self.model.__dict__[forward_proxy_name],
                     dynamic=True,
                     fullgraph=envs_vllm.VLLM_TEST_DYNAMO_FULLGRAPH_CAPTURE,
+                    cache_dir=TORCHAIR_CACHE_DIR,
                     config=config,
                     ge_cache=False)
             return self.torchair_compiled_models[batch_size]
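
A minimal sketch of the failure mode this change addresses, assuming the main model already passed a shared cache directory while the MTP proposer omitted the argument and fell back to a default. The names `SHARED_CACHE_DIR` and `resolve_cache_dir` are hypothetical stand-ins for illustration only; they do not mirror torchair's API or its actual default path.

```python
import os
import tempfile

# Hypothetical stand-in for a shared constant such as TORCHAIR_CACHE_DIR.
SHARED_CACHE_DIR = os.path.join(tempfile.gettempdir(), "demo_graph_cache")


def resolve_cache_dir(cache_dir=None):
    """Return the directory a (hypothetical) compiler would cache graphs in.

    When no directory is passed, fall back to a default under the current
    working directory, which is how two callers can end up disagreeing.
    """
    if cache_dir is None:
        return os.path.join(os.getcwd(), ".default_graph_cache")
    return cache_dir


# Before the fix (illustrative): the main model passes the shared constant,
# while the proposer omits it and silently gets the default.
main_dir = resolve_cache_dir(SHARED_CACHE_DIR)
proposer_dir_before = resolve_cache_dir()

# After the fix (illustrative): both callers pass the same constant.
proposer_dir_after = resolve_cache_dir(SHARED_CACHE_DIR)

print(main_dir == proposer_dir_before)  # False: directories differ
print(main_dir == proposer_dir_after)   # True: directories match
```

Importing a single constant, as the diff does, means the directory can only be changed in one place, so the two compile call sites cannot drift apart again.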
