Commit 20db99c
[CI Bugfix] Make sure TRTLLM attention is available in test_blackwell_moe (#26188)
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Michael Goin <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent 6431be8 commit 20db99c

File tree

1 file changed: +9 −1 lines changed


tests/quantization/test_blackwell_moe.py

Lines changed: 9 additions & 1 deletion
```diff
@@ -15,7 +15,15 @@
     "This test only runs on Blackwell GPUs (SM100).", allow_module_level=True
 )
 
-os.environ["FLASHINFER_NVCC_THREADS"] = "16"
+
+@pytest.fixture(scope="module", autouse=True)
+def set_test_environment():
+    """Sets environment variables required for this test module."""
+    # Make sure TRTLLM attention is available
+    os.environ["VLLM_HAS_FLASHINFER_CUBIN"] = "1"
+    # Set compilation threads to 16 to speed up startup
+    os.environ["FLASHINFER_NVCC_THREADS"] = "16"
+
 
 
 # dummy_hf_overrides = {"num_layers": 4, "num_hidden_layers": 4,
 #     "text_config": {"num_layers": 4, "num_hidden_layers": 4}}
```

0 commit comments