
Commit bfc5919

[https://nvbugs/5745152][fix] Fix some GPTOSS test setups (#10085)
Signed-off-by: Dongfeng Yu <[email protected]>
1 parent 4a5ef84 commit bfc5919

File tree (3 files changed, +7 -2 lines changed):
  tests/integration/defs/accuracy/test_disaggregated_serving.py
  tests/integration/defs/accuracy/test_llm_api_pytorch.py
  tests/integration/test_lists/waives.txt


tests/integration/defs/accuracy/test_disaggregated_serving.py (2 additions & 0 deletions)

@@ -1091,11 +1091,13 @@ def test_auto_dtype(self, block_reuse, mocker):
             "max_attention_window": [128, 32768],
             "enable_block_reuse": block_reuse,
             "enable_partial_reuse": False,
+            "free_gpu_memory_fraction": 0.5,
         }
         gen_server_config["kv_cache_config"] = {
             "max_attention_window": [128, 32768],
             "enable_block_reuse": block_reuse,
             "enable_partial_reuse": False,
+            "free_gpu_memory_fraction": 0.5,
         }
         disaggregated_server_config = {
             "hostname": "localhost",

tests/integration/defs/accuracy/test_llm_api_pytorch.py (5 additions & 0 deletions)

@@ -4369,6 +4369,11 @@ def test_eagle3_4gpus(self, moe_backend, one_model, overlap_scheduler,
                 "https://nvbugs/5636916: Remaining Hopper Eagle Accuracy Issue for only TP=4"
             )
 
+        if not one_model and overlap_scheduler:
+            pytest.skip(
+                "https://nvbugs/5745152: two_model + overlap_scheduler can sometimes time out."
+            )
+
         MAX_OUTPUT_LEN = 128179
         MAX_INPUT_LEN = 32768
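This change skips the two_model + overlap_scheduler parameter combination at runtime, citing the timeout tracked in https://nvbugs/5745152, instead of waiving the whole test. A minimal sketch of the pattern, assuming a simplified, hypothetical test body (the real test takes more parameters and runs an accuracy check):

import pytest

@pytest.mark.parametrize("overlap_scheduler", [True, False])
@pytest.mark.parametrize("one_model", [True, False])
def test_eagle3_sketch(one_model, overlap_scheduler):
    # Skip only the known-bad combination; all others still run.
    if not one_model and overlap_scheduler:
        pytest.skip(
            "https://nvbugs/5745152: two_model + overlap_scheduler can sometimes time out."
        )
    # ... the remaining parameter combinations would run the Eagle3 check here ...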

tests/integration/test_lists/waives.txt (0 additions & 2 deletions)

@@ -320,8 +320,6 @@ accuracy/test_llm_api_pytorch.py::TestLlama3_1NemotronNano8Bv1::test_fp8_prequan
 accuracy/test_llm_api_pytorch.py::TestNemotronH_47B_Base::test_auto_dtype[tp8ep4-cuda_graph=True] SKIP (https://nvbugs/5640697)
 accuracy/test_llm_api_pytorch.py::TestNemotronH_47B_Base::test_reasoning_fp8_prequantized[tp8ep8-cuda_graph=True] SKIP (https://nvbugs/5640697)
 accuracy/test_llm_api_pytorch.py::TestQwQ_32B::test_auto_dtype_tp4 SKIP (https://nvbugs/5640697)
-accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[True] SKIP (https://nvbugs/5644632)
-accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[False] SKIP (https://nvbugs/5644632)
 test_e2e.py::test_ptp_quickstart_multimodal[mistral-small-3.1-24b-instruct-Mistral-Small-3.1-24B-Instruct-2503-image-True] SKIP (https://nvbugs/5648560)
 test_e2e.py::test_ptp_quickstart_multimodal[mistral-small-3.1-24b-instruct-Mistral-Small-3.1-24B-Instruct-2503-image-False] SKIP (https://nvbugs/5648560)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_multi_gpus[latency_trtllmgen_adp_lmtp] SKIP (https://nvbugs/5629136)
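With the fixes above in place, the two TestGPTOSS::test_auto_dtype waives are removed so those tests run again. Each waive line follows the pattern "<test-id> SKIP (<bug-url>)"; a minimal sketch of parsing such a list, assuming a hypothetical standalone parser rather than the project's actual test-list tooling:

import re

WAIVE_RE = re.compile(r"^(?P<test>\S+)\s+SKIP\s+\((?P<reason>[^)]+)\)\s*$")

def parse_waives(text: str) -> dict:
    """Map waived test IDs to their skip reasons (bug URLs)."""
    waived = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments in this hypothetical format.
        if not line or line.startswith("#"):
            continue
        match = WAIVE_RE.match(line)
        if match:
            waived[match.group("test")] = match.group("reason")
    return waived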
