
Commit 101f208

Fix benchmark_multimodal (microsoft#1714)
(1) When I run `python benchmark_multimodal.py -i /sunghcho_data/onnx_models/whisper-tiny-en/cuda/cuda-fp16/ -au /home/jiafa/accuracy/open_asr_leaderboard/whisper/data/20090202-0900-PLENARY-9-en_20090202-17\:20\:18_2.wav -m 448`, the code ` inputs = processor(prompt, images=image, audios=audio)` has core dump because `strings.size()==0` for `auto shape = std::array<int64_t, 2>{static_cast<int64_t>(strings.size()), static_cast<int64_t>(encoded.size() / strings.size())};` in `model.cpp`. This is because ` WhisperProcessor::Process` only goes through `EncodeBatch` whereas `payload.prompts={}` when we set up a single audio there. So for single audio case, we capsulate into `prompts` and then process. (2) The code refactoring causes `params.set_inputs(inputs)` no longer works.
1 parent e22739f commit 101f208
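The crash in (1) is an integer division by zero: with no prompts, `strings.size()` is 0 when the batch shape is computed. A minimal Python sketch of the failure mode and of the fix follows; `batch_shape` is a hypothetical stand-in for the shape computation in `model.cpp` (in C++ the integer division by zero is undefined behavior and core-dumps rather than raising):

```python
# Sketch of the shape computation in model.cpp, translated to Python.
# C++: {strings.size(), encoded.size() / strings.size()} -- dividing by
# zero core-dumps there; Python raises ZeroDivisionError instead.

def batch_shape(strings, encoded):
    """Hypothetical mirror of the std::array<int64_t, 2> shape in model.cpp."""
    return (len(strings), len(encoded) // len(strings))

# Before the fix: a lone audio input leaves payload.prompts empty.
try:
    batch_shape([], [])
except ZeroDivisionError as e:
    print(f"single-audio case crashes: {e}")

# After the fix: the single prompt is wrapped into a batch of one.
prompts = ["What is the meaning of life?"]
encoded = [0] * 12          # hypothetical token ids, for illustration only
print(batch_shape(prompts, encoded))  # → (1, 12)
```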

File tree

1 file changed (+6 −5 lines changed)


benchmark/python/benchmark_multimodal.py

Lines changed: 6 additions & 5 deletions

@@ -157,18 +157,19 @@ def run_benchmark(args, model, processor, image, audio, generation_length, max_l
     main_prompt = "What is the meaning of life?"
     prompt = f'{user_prompt}{main_prompt}{prompt_suffix}{assistant_prompt}'
 
-    inputs = processor(prompt, images=image, audios=audio)
-    prompt_length = inputs['input_ids'].shape[1]
+    prompts = [prompt]
+    inputs = processor(prompts, images=image, audios=audio)
+    prompt_length = inputs['input_ids'].shape()[1]
     if args.verbose: print(f"Prompt used: {prompt}")
 
     params = og.GeneratorParams(model)
-    params.set_inputs(inputs)
     do_sample = args.top_k > 1 or (args.top_p != 1.0 and args.top_p > 0.0)
     params.set_search_options(do_sample=do_sample, top_k=args.top_k, top_p=args.top_p, temperature=temperature, max_length=max_length, min_length=max_length)
 
     if args.verbose: print("Processed inputs, running warmup runs...")
     for _ in tqdm(range(args.warmup)):
         generator = og.Generator(model, params)
+        generator.set_inputs(inputs)
         i = 1
         while not generator.is_done() and i < generation_length:
             generator.generate_next_token()
@@ -188,18 +189,18 @@ def run_benchmark(args, model, processor, image, audio, generation_length, max_l
 
         # Measure prompt and image processing
         process_start_time = time.perf_counter()
-        inputs = processor(prompt, images=image, audios=audio)
+        inputs = processor(prompts, images=image, audios=audio)
         process_end_time = time.perf_counter()
         process_times.append(process_end_time - process_start_time)
 
         # Prepare run
         params = og.GeneratorParams(model)
-        params.set_inputs(inputs)
         params.set_search_options(do_sample=do_sample, top_k=args.top_k, top_p=args.top_p, temperature=temperature, max_length=max_length, min_length=max_length)
 
         # Measure prompt processing
         prompt_start_time = time.perf_counter()
         generator = og.Generator(model, params)
+        generator.set_inputs(inputs)
         prompt_end_time = time.perf_counter()
         prompt_times.append(prompt_end_time - prompt_start_time)
 