Commit 595f780

[https://nvbugs/5624367][fix] Fix disagg GPT-OSS test (#8870)
Signed-off-by: Chuang Zhu <[email protected]>
1 parent 1ce8358 commit 595f780

5 files changed: +7 -5 lines changed

tests/integration/defs/accuracy/test_disaggregated_serving.py

Lines changed: 1 addition & 1 deletion
@@ -921,7 +921,7 @@ def test_auto_dtype(self, block_reuse, mocker):
         with launch_disaggregated_llm(disaggregated_server_config,
                                       ctx_server_config, gen_server_config,
                                       self.MODEL_PATH) as llm:
-            model_name = "GPT-OSS/MXFP4"
+            model_name = "GPT-OSS/120B-MXFP4"
             task = GSM8K(model_name)
             task.evaluate(llm,
                           extra_evaluator_kwargs=self.extra_evaluator_kwargs)

tests/integration/test_lists/qa/llm_function_core.txt

Lines changed: 2 additions & 0 deletions
@@ -516,6 +516,8 @@ accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_guided_decoding
 accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_guided_decoding[llguidance-mtp_nextn=2]
 accuracy/test_disaggregated_serving.py::TestGemma3_1BInstruct::test_auto_dtype[False]
 accuracy/test_disaggregated_serving.py::TestGemma3_1BInstruct::test_auto_dtype[True]
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[True]
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[False]
 accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_w4a8_mxfp4[fp8-latency]
 accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_w4a8_mxfp4[mxfp8-latency]
 accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_w4a8_mxfp4[fp8-latency-CUTLASS]

tests/integration/test_lists/qa/llm_function_core_sanity.txt

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ accuracy/test_disaggregated_serving.py::TestLlama4ScoutInstruct::test_auto_dtype
 accuracy/test_disaggregated_serving.py::TestLlama4ScoutInstruct::test_auto_dtype[True]
 accuracy/test_disaggregated_serving.py::TestQwen3_30B_A3B::test_mixed_ctx_gen_model[ctxpp2gentp2]
 accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_nixl_backend
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[True]
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[False]
 accuracy/test_llm_api_pytorch.py::TestBielik11BInstruct::test_auto_dtype
 accuracy/test_llm_api_pytorch.py::TestBielik11BInstruct::test_fp8
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_multi_gpus[latency_trtllmgen]

tests/integration/test_lists/qa/llm_function_nim.txt

Lines changed: 2 additions & 2 deletions
@@ -382,11 +382,11 @@ accuracy/test_llm_api_pytorch.py::TestNemotronUltra::test_fp8_prequantized[tp8-c
 accuracy/test_llm_api_pytorch.py::TestQwQ_32B::test_auto_dtype_tp4
 accuracy/test_llm_api_pytorch.py::TestCodestral_22B_V01::test_auto_dtype
 accuracy/test_llm_api_pytorch.py::TestKimiK2::test_fp8_blockscale[latency]
-
 accuracy/test_llm_api_pytorch_multimodal.py::TestQwen2_VL_7B::test_auto_dtype
 accuracy/test_llm_api_pytorch_multimodal.py::TestQwen2_5_VL_7B::test_auto_dtype
 accuracy/test_llm_api_pytorch_multimodal.py::TestLlava_V1_6_Mistral_7B::test_auto_dtype
-
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[True]
+accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[False]
 test_e2e.py::test_openai_chat_harmony
 test_e2e.py::test_ptp_quickstart_advanced_multi_gpus[Nemotron-Ultra-253B-nemotron-nas/Llama-3_1-Nemotron-Ultra-253B-v1-8]
 test_e2e.py::test_ptp_quickstart_advanced[Nemotron4_4B-BF16-nemotron/Minitron-4B-Base]

tests/integration/test_lists/waives.txt

Lines changed: 0 additions & 2 deletions
@@ -373,8 +373,6 @@ triton_server/test_triton_rcca.py::test_rcca_bug_4934893[Temperature:0.5-TOP_P:0
 unittest/_torch/thop/parallel/test_fp8_rowwise_linear.py::test_fp8_rowwise_linear[dtype0] SKIP (https://nvbugs/5619396)
 unittest/_torch/thop/parallel/test_fp8_rowwise_linear.py::test_fp8_rowwise_linear[dtype1] SKIP (https://nvbugs/5619396)
 accuracy/test_disaggregated_serving.py::TestQwen3_30B_A3B::test_mixed_ctx_gen_model[ctxpp2gentp2] SKIP (https://nvbugs/5582258)
-accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[True] SKIP (https://nvbugs/5624367)
-accuracy/test_disaggregated_serving.py::TestGPTOSS::test_auto_dtype[False] SKIP (https://nvbugs/5624367)
 disaggregated/test_disaggregated.py::test_disaggregated_benchmark_on_diff_backends[llama-v3-8b-hf] SKIP (https://nvbugs/5587574)
 triton_server/test_triton_llm.py::test_llava[False-1---False-True-False-0-128-enableDecoupleMode-inflight_fused_batching-disableTrtOverlap-0.7-max_utilization---1-1-1-False-tensorrt_llm_bls] SKIP (https://nvbugs/5434308)
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_multi_gpus[throughput_tp8] SKIP (https://nvbugs/5629910)
