@weiguihua2 weiguihua2 commented Aug 14, 2025

What this PR does / why we need it?

  1. Decouple attention metadata from the runner.
  2. Use common input parameters for the attention metadata build and build_torchair_graph_dummy methods.
  3. Use the community method split_decodes_and_prefills in MLA to obtain values such as num_decodes.
  4. MTP supports the v1 scheduler.
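To make item 3 concrete, here is a minimal sketch of what splitting a batch into decode and prefill requests might look like. It is not the community implementation: the class `CommonAttentionMetadataSketch`, the function `split_decodes_and_prefills_sketch`, and the `decode_threshold` parameter are hypothetical names used for illustration, and the sketch assumes decode requests are ordered before prefill requests in the batch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CommonAttentionMetadataSketch:
    """Hypothetical stand-in for the shared metadata handed to a builder."""
    query_start_loc: List[int]  # cumulative token offsets, length num_reqs + 1
    num_reqs: int
    num_actual_tokens: int


def split_decodes_and_prefills_sketch(
    meta: CommonAttentionMetadataSketch,
    decode_threshold: int = 1,
) -> Tuple[int, int, int, int]:
    """Return (num_decodes, num_prefills, num_decode_tokens, num_prefill_tokens).

    Assumption: requests with at most `decode_threshold` query tokens are
    decodes, and all decodes come before any prefill in the batch.
    """
    query_lens = [
        meta.query_start_loc[i + 1] - meta.query_start_loc[i]
        for i in range(meta.num_reqs)
    ]
    # The first request longer than the threshold starts the prefill region.
    num_decodes = meta.num_reqs
    for i, query_len in enumerate(query_lens):
        if query_len > decode_threshold:
            num_decodes = i
            break
    num_decode_tokens = meta.query_start_loc[num_decodes]
    num_prefills = meta.num_reqs - num_decodes
    num_prefill_tokens = meta.num_actual_tokens - num_decode_tokens
    return num_decodes, num_prefills, num_decode_tokens, num_prefill_tokens


# Two single-token decodes followed by prefills of 5 and 3 tokens.
meta = CommonAttentionMetadataSketch(
    query_start_loc=[0, 1, 2, 7, 10], num_reqs=4, num_actual_tokens=10
)
result = split_decodes_and_prefills_sketch(meta)
```

Because the split depends only on the metadata object, a builder written this way needs no reference to the runner to derive num_decodes.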

Does this PR introduce any user-facing change?

NO

How was this patch tested?

Tested manually in a real environment.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the attention metadata building process by introducing AscendCommonAttentionMetadata and TorchairCommonAttentionMetadata dataclasses. This is a positive change that improves code structure and decouples components from the main runner. However, the refactoring has introduced a few critical issues where the runner object is still being accessed after its removal from class initializers, which will lead to AttributeError exceptions. Additionally, there are some inconsistencies in the new dataclasses that will cause TypeError exceptions. I've provided detailed comments and suggestions to fix these issues.
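The two failure modes named in this review can be illustrated with a small sketch. It does not reproduce the PR's code: `CommonMetadataSketch`, `MetadataBuilderSketch`, and every field name are hypothetical. It only shows how a builder that no longer stores a runner raises AttributeError when old code still reads it, and how a dataclass called with a keyword that does not match its fields raises TypeError.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommonMetadataSketch:
    """Hypothetical shared input passed to every metadata builder."""
    num_reqs: int
    num_actual_tokens: int
    seq_lens: List[int]


class MetadataBuilderSketch:
    """Hypothetical builder that is decoupled from the runner."""

    def __init__(self, block_size: int):
        # Only plain configuration is stored; there is no self.runner.
        self.block_size = block_size

    def build(self, common: CommonMetadataSketch) -> dict:
        max_seq_len = max(common.seq_lens) if common.seq_lens else 0
        # Ceiling division: blocks needed to hold the longest sequence.
        max_blocks = -(-max_seq_len // self.block_size)
        return {"num_reqs": common.num_reqs, "max_blocks": max_blocks}


builder = MetadataBuilderSketch(block_size=128)
built = builder.build(
    CommonMetadataSketch(num_reqs=2, num_actual_tokens=301, seq_lens=[300, 1])
)

# Failure mode 1: leftover code that still reads the removed attribute.
try:
    builder.runner
    stale_access_failed = False
except AttributeError:
    stale_access_failed = True

# Failure mode 2: a call site whose keyword does not match the dataclass.
try:
    CommonMetadataSketch(num_reqs=2, num_tokens=301, seq_lens=[300, 1])
    mismatched_field_failed = False
except TypeError:
    mismatched_field_failed = True
```

Both errors surface only when the affected code path runs, which is one reason unit tests over each builder are useful after this kind of refactor.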

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: weiguihua2 <[email protected]>

refact attn metadata build

Signed-off-by: weiguihua2 <[email protected]>

refact model runner


This pull request has conflicts, please resolve those before we can evaluate the pull request.

@weiguihua2 weiguihua2 closed this Aug 20, 2025