(1) When I run `python benchmark_multimodal.py -i /sunghcho_data/onnx_models/whisper-tiny-en/cuda/cuda-fp16/ -au /home/jiafa/accuracy/open_asr_leaderboard/whisper/data/20090202-0900-PLENARY-9-en_20090202-17\:20\:18_2.wav -m 448`, the call `inputs = processor(prompt, images=image, audios=audio)` core dumps: `strings.size() == 0` when `model.cpp` computes `auto shape = std::array<int64_t, 2>{static_cast<int64_t>(strings.size()), static_cast<int64_t>(encoded.size() / strings.size())};`, so `encoded.size() / strings.size()` divides by zero. The root cause is that `WhisperProcessor::Process` always goes through `EncodeBatch`, but `payload.prompts = {}` when only a single audio is supplied. For the single-audio case we should wrap the prompt into `prompts` before processing; see the sketch below.
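For reference, here is a minimal sketch of the two changes I have in mind. The `Payload`, `NormalizePrompts`, and `BatchShape` names are made up for illustration and do not match the real `model.cpp` structures; only `prompts`, `strings`, and `encoded` come from the code quoted above.

```cpp
// Hypothetical sketch, not the actual onnxruntime-genai code:
// (a) wrap a lone prompt into `prompts` before EncodeBatch is called, and
// (b) guard the batch-shape computation against an empty prompt batch.

#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Assumed stand-in for the payload handed to WhisperProcessor::Process.
struct Payload {
  std::vector<std::string> prompts;  // empty when only a single audio is given
  std::string prompt;                // the single prompt from the Python call
};

// Ensure EncodeBatch always receives a batch of at least one string.
inline void NormalizePrompts(Payload& payload) {
  if (payload.prompts.empty())
    payload.prompts.push_back(payload.prompt);  // may be an empty string
}

// Defensive version of the shape computation that currently divides by zero.
inline std::array<int64_t, 2> BatchShape(const std::vector<std::string>& strings,
                                         const std::vector<int32_t>& encoded) {
  const auto batch = static_cast<int64_t>(strings.size());
  const auto per_row = batch == 0 ? 0 : static_cast<int64_t>(encoded.size()) / batch;
  return std::array<int64_t, 2>{batch, per_row};
}
```

With the prompts normalized this way, the batch dimension is never zero, so the division in the shape computation can no longer crash even if the guard in `BatchShape` is omitted.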
(2) After the code refactoring, `params.set_inputs(inputs)` no longer works.