
Commit ab4b996

[TRTLLM-7287][test] add multimodal chunked_prefill cases (#8011)

Authored by ruodil and LarryXFly.

Signed-off-by: Ruodi Lu <[email protected]>
Co-authored-by: Ruodi Lu <[email protected]>
Co-authored-by: Larry Xu <[email protected]>

1 parent 4545700, commit ab4b996

File tree: 2 files changed (+26, -0 lines)

tests/integration/defs/perf/pytorch_model_config.py

Lines changed: 20 additions & 0 deletions

```diff
@@ -221,6 +221,26 @@ def get_model_yaml_config(model_label: str,
             'stream_interval': 10,
             'num_postprocess_workers': 4
         }
+    },
+    # Phi-4-multimodal-instruct with chunked prefill and kv_cache_reuse
+    {
+        'patterns': [
+            'phi_4_multimodal_instruct-bench-pytorch-bfloat16-maxbs:48-maxnt:256-input_output_len:500,2000-con:250',
+            'phi_4_multimodal_instruct-bench-pytorch-bfloat16-maxbs:128-maxnt:512-input_output_len:1000,1000-con:250'
+        ],
+        'config': {
+            'enable_chunked_prefill': True,
+        }
+    },
+    # Mistral-Small-3.1-24B-Instruct-2503 with chunked prefill and kv_cache_reuse
+    {
+        'patterns': [
+            'mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:48-maxnt:256-input_output_len:1000,2000-reqs:500-con:200',
+            'mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:128-maxnt:512-input_output_len:1000,2000-reqs:500-con:200'
+        ],
+        'config': {
+            'enable_chunked_prefill': True,
+        }
     }
 ]
```

tests/integration/test_lists/qa/llm_perf_core.yml

Lines changed: 6 additions & 0 deletions

```diff
@@ -39,6 +39,9 @@ llm_perf_core:
   - perf/test_perf.py::test_perf[phi_4_multimodal_instruct-bench-pytorch-bfloat16-input_output_len:1000,1000-con:250]
   - perf/test_perf.py::test_perf[phi_4_multimodal_instruct-bench-pytorch-bfloat16-input_output_len:128,128]
   - perf/test_perf.py::test_perf[phi_4_multimodal_instruct-bench-pytorch-bfloat16-input_output_len:512,32]
+  # Phi-4-multimodal-instruct with chunked prefill and kv_cache_reuse
+  - perf/test_perf.py::test_perf[phi_4_multimodal_instruct-bench-pytorch-bfloat16-maxbs:48-maxnt:256-input_output_len:500,2000-con:250]
+  - perf/test_perf.py::test_perf[phi_4_multimodal_instruct-bench-pytorch-bfloat16-maxbs:128-maxnt:512-input_output_len:1000,1000-con:250]
   # Bielik-11B-v2.2-Instruct
   - perf/test_perf.py::test_perf[bielik_11b_v2.2_instruct-bench-pytorch-bfloat16-input_output_len:128,128]
   - perf/test_perf.py::test_perf[bielik_11b_v2.2_instruct-bench-pytorch-bfloat16-input_output_len:512,32]
@@ -52,6 +55,9 @@ llm_perf_core:
   #Mistral-Small-3.1-24B-Instruct-2503
   - perf/test_perf.py::test_perf[mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:1-input_output_len:1000,2000-reqs:8-con:1]
   - perf/test_perf.py::test_perf[mistral_small_v3.1_24b-bench-pytorch-bfloat16-input_output_len:1000,2000-reqs:500-con:200]
+  # Mistral-Small-3.1-24B-Instruct-2503 with chunked prefill and kv_cache_reuse
+  - perf/test_perf.py::test_perf[mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:48-maxnt:256-input_output_len:1000,2000-reqs:500-con:200]
+  - perf/test_perf.py::test_perf[mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:128-maxnt:512-input_output_len:1000,2000-reqs:500-con:200]

   # Test list validation
   - test_list_validation.py::test_list_validation
```
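The bracketed part of each test ID is a dash-separated benchmark label whose options use a `key:value` form (e.g. `maxbs:48`, `input_output_len:1000,2000`). A hedged sketch of how such a label could be decomposed — `parse_label` is an illustrative helper, not a function from the test suite:

```python
# Hypothetical parser for the benchmark-label grammar seen in these test IDs.
# Segments without a colon (model name, backend, dtype) are treated as flags;
# "key:value" segments become options. Illustrative only.
def parse_label(label: str) -> dict:
    options = {}
    for segment in label.split('-'):
        if ':' in segment:
            key, value = segment.split(':', 1)
            options[key] = value
    return options


opts = parse_label(
    'mistral_small_v3.1_24b-bench-pytorch-bfloat16-maxbs:48-maxnt:256'
    '-input_output_len:1000,2000-reqs:500-con:200')
```

This splitting works here because option values use commas, never dashes, so `split('-')` keeps each `key:value` pair intact.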
