[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly #28074
ProExpertProg
left a comment
…ases Co-authored-by: ProExpertProg <[email protected]>
…update test filters Co-authored-by: ProExpertProg <[email protected]>
2d596f4 to d789a5c
💡 Codex Review
Here are some automated review suggestions for this pull request.
.buildkite/test-pipeline.yaml (outdated)
    - pytest -v -s tests/compile/test_silu_mul_quant_fusion.py
    # this runner has 2 GPUs available even though num_gpus=2 is not set
    - pytest -v -s tests/compile/test_fusion_all_reduce.py
    - pytest -v -s tests/compile/test_fusions_e2e.py::test_tp2_attn_quant_allreduce_rmsnorm -k "True and Llama-3.1 and -quant_fp8 and -rms_norm"
Ensure Blackwell fusion smoke test actually runs target case
The new command filters test_tp2_attn_quant_allreduce_rmsnorm with -k "True and Llama-3.1 and -quant_fp8 and -rms_norm". In a pytest -k expression the leading - is parsed as logical NOT, so this expression matches parameterizations that do not contain quant_fp8 or rms_norm. Because every parameterization of this test includes rms_norm (and the FP8 ones include quant_fp8), the filter deselects every case and the step exits with 0 selected tests. The non-optional pipeline thus never runs any Blackwell fusion end-to-end test, which defeats the purpose of adding the focused coverage.
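A claim like this can be checked without running any test by asking pytest to collect only. The sketch below is a hypothetical helper: the names `parse_collected` and `collect` and the sample output are illustrative and not part of this PR. It relies only on pytest's `--collect-only`, `-q`, and `-k` options, and assumes that quiet collection prints one node ID per line.

```python
import subprocess
import sys
from typing import List


def parse_collected(output: str) -> List[str]:
    """Extract test node IDs from `pytest --collect-only -q` output.

    Assumes each collected test is printed on its own line as a node ID
    containing '::'; the summary line does not contain '::'.
    """
    return [line.strip() for line in output.splitlines() if "::" in line]


def collect(target: str, keyword_expr: str) -> List[str]:
    """Ask pytest which tests under `target` the -k expression selects.

    Nothing is executed; pytest only collects and applies the filter.
    """
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "--collect-only", "-q",
         target, "-k", keyword_expr],
        capture_output=True,
        text=True,
        check=False,  # pytest returns a non-zero code when nothing is selected
    )
    return parse_collected(result.stdout)


# Hypothetical collection output, for illustration only.
sample_output = """\
tests/test_demo.py::test_case[model_a--quant_fp8]
tests/test_demo.py::test_case[model_a-+quant_fp8]

2/8 tests collected (6 deselected) in 0.01s
"""
selected = parse_collected(sample_output)
print(len(selected), "test(s) selected")
```

Run from the repository root, `collect("tests/compile/test_fusions_e2e.py::test_tp2_attn_quant_allreduce_rmsnorm", "True and Llama-3.1 and -quant_fp8 and -rms_norm")` would list the parameterizations the filter keeps. An empty list would mean the CI step selects nothing.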
ProExpertProg
left a comment
@copilot bootstrap CI fails with the following error:
[2025-11-05T21:29:30Z] 2025-11-05 21:29:30 INFO Found 1 files that match "pipeline.yaml"
[2025-11-05T21:29:30Z] 2025-11-05 21:29:30 INFO Uploading to default Buildkite artifact storage
[2025-11-05T21:29:30Z] 2025-11-05 21:29:30 INFO Creating (0-1)/1 artifacts
[2025-11-05T21:29:31Z] 2025-11-05 21:29:31 INFO Uploading artifact 019a55ec-c258-4c18-a014-d1467dca774c pipeline.yaml (146 KiB)
[2025-11-05T21:29:31Z] 2025-11-05 21:29:31 INFO Successfully uploaded artifact "pipeline.yaml"
[2025-11-05T21:29:33Z] 2025-11-05 21:29:33 INFO Artifact uploads completed successfully
[2025-11-05T21:29:33Z] 2025-11-05 21:29:33 INFO Reading pipeline config from "pipeline.yaml"
[2025-11-05T21:29:33Z] 2025-11-05 21:29:33 INFO Updating BUILDKITE_COMMIT to "d789a5cb5a001905dd36eee66be71d3f031ef054"
[2025-11-05T21:29:33Z] fatal: pipeline parsing of "pipeline.yaml" failed: line 1103: did not find expected ',' or ']'
[2025-11-05T21:29:33Z] 🚨 Error: The command exited with status 1
[2025-11-05T21:29:33Z] user command error: exit status 1
Co-authored-by: ProExpertProg <[email protected]>
Head branch was pushed to by a user without write access
Fixed the YAML parsing error by changing from double quotes to single quotes in the …
@copilot still same error
Co-authored-by: ProExpertProg <[email protected]>
Signed-off-by: Luka Govedič <[email protected]>
Signed-off-by: Luka Govedič <[email protected]>
@copilot update the PR description
Updated the PR description to reflect all changes made across the 6 commits. The new description includes:
- Summary
- Key Changes
- Test Strategy
The description now accurately captures the test runtime reduction strategy and comprehensive coverage approach.
mgoin
left a comment
Thank you!
… run all tests in nightly (vllm-project#28074)
… run all tests in nightly (vllm-project#28074) Signed-off-by: xuebwang-amd <[email protected]>
Summary

This PR reduces Blackwell Fusion test runtime in CI by:
- … + and -) for comprehensive testing in optional group

Changes Made

1. .buildkite/test-pipeline.yaml

PyTorch Fullgraph Test (Line 474-476):
- -k 'TRITON and -quant_fp8' to limit to TRITON backend and exclude +quant_fp8 custom ops

Blackwell Fusion Tests (Line 930-932) - Non-optional, runs on every PR:
- test_tp2_attn_quant_allreduce_rmsnorm
- -k 'True and Llama-3.1 and -quant_fp8 and -rms_norm'
- … -quant_fp8 and -rms_norm)

Blackwell Fusion E2E Tests (Line 934-953) - Optional, runs on nightly:
- optional: true
- pytest -v -s tests/compile/test_fusions_e2e.py

2. tests/compile/test_fusions_e2e.py

Expanded Custom Ops Coverage (Lines 98, 173):
- CUSTOM_OPS_FP8: Added both "-quant_fp8" and "+quant_fp8" (previously only had -quant_fp8)
- CUSTOM_OPS_RMS_NORM: Added both "-rms_norm" and "+rms_norm" (previously only had -rms_norm)

Model Changes (Lines 57-62):
- nvidia/Llama-4-Scout-17B-16E-Instruct-FP4 to nvidia/Llama-3.1-8B-Instruct-FP4
- attention_fusions=32, allreduce_fusions=65 (from 48 and 96)

Test Strategy

Non-Optional Tests (Run on every PR)
- -quant_fp8 only

Optional Tests (Run on nightly builds)
- … nvidia/Llama-4-Scout-17B-16E-Instruct-FP8), FP4 (nvidia/Llama-3.1-8B-Instruct-FP4), unquantized (meta-llama/Llama-3.1-8B-Instruct)
- … +quant_fp8/-quant_fp8, +rms_norm/-rms_norm)

Technical Notes

Validation
- yaml.safe_load()
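The description lists yaml.safe_load() under Validation. Here is a minimal sketch of that kind of check, assuming PyYAML is installed. The helper name `yaml_error` and both fragments are hypothetical and are not copied from the pipeline file. The second fragment nests unescaped double quotes inside a double-quoted item of a flow sequence, one common way to trigger an "expected ',' or ']'" style error; PyYAML's wording may differ from the Buildkite agent's.

```python
from typing import Optional

import yaml  # PyYAML, assumed to be installed


def yaml_error(text: str) -> Optional[str]:
    """Return None if `text` parses as YAML, otherwise the parser's message."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        return str(exc)
    return None


# Hypothetical fragment: single quotes inside a plain block-sequence item.
good = "commands:\n  - pytest -v -s tests/demo.py -k 'TRITON and -quant_fp8'\n"

# Hypothetical fragment: the inner double quotes end the quoted string early,
# so the parser meets another scalar where it expects ',' or ']'.
bad = 'commands: ["pytest -v -s tests/demo.py -k "TRITON and -quant_fp8""]\n'

for name, fragment in (("good", good), ("bad", bad)):
    error = yaml_error(fragment)
    print(name, "parses" if error is None else "fails to parse")
```

A check like this only confirms that the file parses locally. A pipeline file generated from a template can still fail if quoting changes during generation, so the generated file is the one worth validating.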