[4/N][Refactor] Refactor AscendAttentionMetadataBuilder for better extensibility and make the builder class of torchair extend from it #2375
Conversation
Code Review

This pull request refactors the `build` method in `AscendAttentionMetadataBuilder` by extracting `_prepare_build_info` and `_assemble_build_info`. This is a good change that improves modularity and extensibility, as demonstrated by the new `AscendAttentionTorchairMetadataBuilder`. The implementation is solid, but I've found one area in the new `torchair_attention.py` file with some confusing code that could be clarified.
```python
pad_value = 0
num_token_pad_size = graph_pad_size - num_actual_tokens
num_reqs_pad_size = (
    graph_pad_size // self.runner.decode_token_per_req -
    num_reqs)
```
The variable `pad_value` is assigned the value `0` on line 247, but this value is never used because it is unconditionally reassigned to `1` on line 252 before its first use. This makes the assignment on line 247 dead code, which is confusing and should be removed to improve clarity:

```python
num_token_pad_size = graph_pad_size - num_actual_tokens
num_reqs_pad_size = (
    graph_pad_size // self.runner.decode_token_per_req -
    num_reqs)
```
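The dead-store pattern flagged above can be reduced to a minimal illustration. The function below is hypothetical scaffolding, not the actual vllm-ascend code; only the variable names and arithmetic come from the diff:

```python
# Hypothetical reduction of the flagged pattern: the initial assignment
# to pad_value is dead because it is unconditionally overwritten before
# any read, so it can simply be deleted.
def pad_sizes(graph_pad_size, num_actual_tokens, num_reqs, decode_token_per_req):
    pad_value = 0  # dead store: never read before the reassignment below
    pad_value = 1  # the first (and only) value that is actually used
    num_token_pad_size = graph_pad_size - num_actual_tokens
    num_reqs_pad_size = graph_pad_size // decode_token_per_req - num_reqs
    return pad_value, num_token_pad_size, num_reqs_pad_size
```

Static analyzers such as linters typically report this as an "unused variable assignment"; removing the `= 0` line leaves behavior unchanged.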
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Title changed from "Extract `_prepare_build_info()` and `_assemble_build_info()` from `build()` in `AscendAttentionMetadataBuilder`" to "Refactor `AscendAttentionMetadataBuilder` for better extensibility and make the builder class of torchair extend from it".
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Codecov Report

❌ Patch coverage is

```
@@            Coverage Diff             @@
##             main    #2375      +/-   ##
==========================================
+ Coverage   77.99%   78.01%   +0.02%
==========================================
  Files         134      134
  Lines       18498    18515      +17
==========================================
+ Hits        14427    14444      +17
  Misses       4071     4071
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Force-pushed from cafae88 to 5ae0c6f.
…make the builder class of torchair extend from it

Signed-off-by: shen-shanshan <[email protected]>
What this PR does / why we need it?

Refactor `AscendAttentionMetadataBuilder` for better extensibility and make the builder class of torchair extend from it.

Extract the `_assemble_build_info()` and `_assemble_attn_metadata()` methods from `build()` in `AscendAttentionMetadataBuilder` for better extensibility.

Workflow of the `build()` method:

- `_assemble_build_info()`: the custom logic that can be overwritten in `torchair_attention.py`.
- `_assemble_attn_metadata()`: the custom logic that can be overwritten in `torchair_attention.py`.

After this refactor, we can remove the `build()` method in `AscendAttentionTorchairMetadataBuilder`, and just need to overwrite these two methods: `_assemble_build_info()` and `_assemble_attn_metadata()`.

Note: Do not merge this PR before #2017.
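The workflow described above is a template-method refactor: the base class keeps `build()` and exposes two hooks that the torchair builder overrides. The sketch below uses the class and method names from this PR, but the signatures, bodies, and returned dictionaries are illustrative assumptions, not the actual vllm-ascend implementation:

```python
# Minimal sketch of the template-method structure described in the PR.
# Hook bodies and the dict-based "metadata" are hypothetical placeholders.

class AscendAttentionMetadataBuilder:
    def build(self, common_attn_metadata):
        # Shared orchestration stays in the base class; subclasses never
        # need to override build() itself, only the two hooks below.
        build_info = self._assemble_build_info(common_attn_metadata)
        return self._assemble_attn_metadata(build_info)

    def _assemble_build_info(self, common_attn_metadata):
        # Default (non-torchair) preparation logic.
        return {"common": common_attn_metadata}

    def _assemble_attn_metadata(self, build_info):
        # Default assembly of the final attention metadata.
        return {"kind": "ascend", **build_info}


class AscendAttentionTorchairMetadataBuilder(AscendAttentionMetadataBuilder):
    # No build() override needed after the refactor: only the hooks change.
    def _assemble_build_info(self, common_attn_metadata):
        info = super()._assemble_build_info(common_attn_metadata)
        info["graph_padded"] = True  # torchair-specific extras (illustrative)
        return info

    def _assemble_attn_metadata(self, build_info):
        return {"kind": "torchair", **build_info}
```

With this shape, `AscendAttentionTorchairMetadataBuilder().build(...)` runs the shared `build()` from the base class while picking up both torchair hooks, which is exactly why the subclass's own `build()` can be deleted.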
Does this PR introduce any user-facing change?
How was this patch tested?