Commit aadc75c

[Fix] Resolve data-parallel (DP) assertion errors in TorchAir (#2626)
### What this PR does / why we need it?

It is confirmed that `num_input_tokens` must be assigned the value of `maybe_padded_num_tokens` under all circumstances.

### Does this PR introduce _any_ user-facing change?

None.

### How was this patch tested?

Waiting for the daily TorchAir test.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@006477e

Signed-off-by: Yizhou Liu <[email protected]>
1 parent 600b08f commit aadc75c

File tree

2 files changed: +4 −4 lines changed


vllm_ascend/torchair/torchair_model_runner.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -100,7 +100,7 @@ def _sync_metadata_across_dp(
             num_tokens_across_dp = torch.full((self.dp_size, ),
                                               maybe_padded_num_tokens,
                                               dtype=torch.int32,
-                                              device="cpu")
+                                              device="npu")
         else:
             maybe_padded_num_tokens = num_tokens
```

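The padding invariant this commit enforces can be sketched in plain Python (no NPU or `torch` required; `sync_padded_num_tokens` is a hypothetical stand-in for the collective performed across real DP ranks): every rank adopts the group-wide maximum token count, so the tensors entering collective ops have matching shapes on all ranks.

```python
def sync_padded_num_tokens(local_num_tokens_per_rank):
    """Hypothetical stand-in for the cross-DP metadata sync.

    Each entry is one rank's local token count; every rank must pad
    to the group-wide maximum, or collective ops across the DP group
    hit shape-mismatch assertions.
    """
    maybe_padded = max(local_num_tokens_per_rank)
    # All ranks run with the same padded count.
    return [maybe_padded for _ in local_num_tokens_per_rank]

# Ranks with uneven batches all pad to the largest batch (128 tokens).
print(sync_padded_num_tokens([96, 128, 64, 112]))  # → [128, 128, 128, 128]
```

This mirrors what `_sync_metadata_across_dp` does with `torch.full((self.dp_size,), maybe_padded_num_tokens, ...)`, minus the device placement that the hunk above corrects.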
vllm_ascend/worker/model_runner_v1.py

Lines changed: 3 additions & 3 deletions
```diff
@@ -1095,9 +1095,9 @@ def _prepare_inputs(
         enable_dbo) = self._sync_metadata_across_dp(num_input_tokens,
                                                     with_prefill, enable_dbo)

-        if self.use_aclgraph:
-            # When using TorchAir with DP, we have other plans for padding
-            num_input_tokens = maybe_padded_num_tokens
+        # TODO: Now that num_input_tokens is basically identical with maybe_padded_num_tokens
+        # We should consider removing maybe_padded_num_tokens later
+        num_input_tokens = maybe_padded_num_tokens

         # Hot-Swap lora model
         if self.lora_config:
```
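A hedged sketch of what this second hunk changes (names mirror the diff; `choose_num_input_tokens` and the `after_fix` flag are illustrative, not real code): before the patch only the ACL-graph path adopted the padded count, so a rank running without ACL graph could end up with a different token count than its DP peers, which is the source of the assertion errors.

```python
def choose_num_input_tokens(num_tokens, maybe_padded_num_tokens,
                            use_aclgraph, after_fix=True):
    """Illustrative control-flow comparison, not the actual vLLM code.

    Before the fix, only the ACL-graph path used the padded count;
    after the fix, every path does, keeping DP ranks in lockstep.
    """
    if after_fix:
        return maybe_padded_num_tokens
    return maybe_padded_num_tokens if use_aclgraph else num_tokens

# Without the fix, an eager-mode rank would run with 100 tokens
# while its padded DP peers run with 128 — a shape mismatch.
print(choose_num_input_tokens(100, 128, use_aclgraph=False, after_fix=False))  # → 100
print(choose_num_input_tokens(100, 128, use_aclgraph=False, after_fix=True))   # → 128
```

This is also why the in-diff TODO suggests the two variables may be mergeable: after the fix, `num_input_tokens` always ends up equal to `maybe_padded_num_tokens`.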
