Commit 27dccc1
Drop flash_attn skip for quantizing_moe example tests (#1396)
SUMMARY:
Drop the skip that required `flash_attn` to be installed in the tests
for the `quantizing_moe` examples. Recent CI failures involving this
package's CUDA compatibility with the newly released PyTorch 2.7.0
have shown that it is not required for these tests.
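The skip being dropped amounts to a runtime check for the package. A minimal sketch of that pattern, assuming a plain importability check (the helper name below is illustrative, not this repo's actual code):

```python
import importlib.util


def package_installed(name: str) -> bool:
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(name) is not None


# Before this change, the example tests were skipped roughly like:
#
#   @pytest.mark.skipif(not package_installed("flash_attn"),
#                       reason="requires flash_attn")
#   def test_deepseek_example_script(...):
#       ...
#
# Dropping the skip means the tests run whether or not flash_attn is
# present, since the package turned out not to be needed.
```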
TEST PLAN:
An [internal test run][1] that drops the installation of `flash-attn`
and runs the changes on this branch indicates that the tests will pass
(one successful so far; the PR will be marked ready once the run
completes and the remaining tests show the expected results).
Specific relevant output (will update with other tests’ results):
```
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_int8.py] PASSED
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_fp8.py] PASSED
```
[1]:
https://github.com/neuralmagic/llm-compressor-testing/actions/runs/14712618904
Signed-off-by: Domenic Barbuzzi <dbarbuzz@redhat.com>
1 file changed: 0 additions, 6 deletions