Commit 349a945

[Bugfix]fix ds3.2+dcp
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
1 parent 2a99f68 commit 349a945

File tree

1 file changed: +1 −1 lines changed
  • vllm_ascend/attention/context_parallel


vllm_ascend/attention/context_parallel/sfa_cp.py

Lines changed: 1 addition & 1 deletion
@@ -258,7 +258,7 @@ def _execute_sparse_flash_attention_process(

     def _align_to_graph_bucket_tokens(self, attn_output: torch.Tensor | None, attn_metadata: M) -> torch.Tensor | None:
         if attn_output is None or self.pcp_size == 1:
-            return None
+            return attn_output
         # In graph/piecewise mode, output buffer uses graph bucket token size
         # (forward_context.num_tokens), while PCP path may compute only valid
         # tokens. Align to the larger one to avoid later write-back mismatch.
