Commit 7c8bcdb

tests: xfail moe quantization classes mxfp8_bf16 UTs on sm103 (#1754)
<!-- .github/pull_request_template.md -->

## 📌 Description

Temporarily marking the test_trtllm_gen_fused_moe mxfp8_bf16 cases as xfail until we converge on a fix that does not cause a regression on B200.

## 🔍 Related Issues

<!-- Link any related issues here -->

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [ ] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

## Reviewer Notes

<!-- Optional: anything you'd like reviewers to focus on, concerns, etc. -->

---------

Co-authored-by: jimmzhou <[email protected]>
1 parent 5afeac1 commit 7c8bcdb

1 file changed, +11 -0 lines changed

tests/test_trtllm_gen_fused_moe.py

Lines changed: 11 additions & 0 deletions
@@ -2034,6 +2034,17 @@ def test_moe_quantization_classes(
             f"Incompatible: {moe_impl.name} + {weight_processing['use_shuffled_weight']} + {weight_processing['layout']}"
         )
 
+    # TODO(jimmzhou): enable MxFP4xBf16 on SM103
+    if (
+        type(moe_impl) is FP4Moe
+        and moe_impl.quant_mode == QuantMode.FP4_MXFP4_Bf16
+        and compute_capability[0] == 10
+        and compute_capability[1] == 3
+    ):
+        pytest.xfail(
+            "Note(jimmzhou): Make MxFP4xBf16 nonfunctional on SM103 to avoid B200 regression"
+        )
+
     moe_impl._cache_permute_indices = cache_permute_indices
 
     seed = 0
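
The gating condition in this change can be read as a small predicate over the quantization mode and the device's compute capability. Below is a minimal sketch of that idea, not the code in the test file: the `QuantMode` enum here is a hypothetical stand-in with a placeholder `OTHER` member, `is_sm103` and `should_xfail` are illustrative helper names, and the `type(moe_impl) is FP4Moe` check from the diff is left out for brevity.

```python
from enum import Enum
from typing import Tuple


class QuantMode(Enum):
    # Hypothetical stand-in for the QuantMode used by the test module.
    # Only FP4_MXFP4_Bf16 appears in the diff; OTHER is a placeholder.
    FP4_MXFP4_Bf16 = "fp4_mxfp4_bf16"
    OTHER = "other"


def is_sm103(compute_capability: Tuple[int, int]) -> bool:
    # Compute capability is a (major, minor) pair; SM103 is (10, 3).
    return tuple(compute_capability) == (10, 3)


def should_xfail(quant_mode: QuantMode, compute_capability: Tuple[int, int]) -> bool:
    # Mirrors the condition in the diff: expect failure only for the
    # MxFP4 x Bf16 mode, and only on SM103.
    return quant_mode is QuantMode.FP4_MXFP4_Bf16 and is_sm103(compute_capability)
```

Inside a test body, a predicate like this would typically guard a call to `pytest.xfail(reason)`, which marks the test as an expected failure and stops executing it at that point. Keeping the condition this narrow leaves other devices with major version 10, such as (10, 0), running the test normally.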
