Convert to ONNX #134
Replies: 11 comments 14 replies
-
More work on this topic.
TIME torch: 2.71
TIME onnx: 1.50
If the whole model can be processed without bottlenecks, we can achieve ~60% better performance on CPU.
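A minimal sketch of how such numbers can be measured, comparing the PyTorch encoder against an exported ONNX encoder on CPU (the `encoder.onnx` file name and the `mel` input name are assumptions for illustration, not from this thread):

```python
# Hedged timing sketch: PyTorch encoder vs. an exported ONNX encoder on CPU.
# "encoder.onnx" and the "mel" input name are illustrative assumptions.
import time

import onnxruntime as ort
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)  # dummy log-mel spectrogram

t0 = time.perf_counter()
with torch.no_grad():
    model.encoder(mel)
print(f"TIME torch: {time.perf_counter() - t0:.2f}")

sess = ort.InferenceSession("encoder.onnx", providers=["CPUExecutionProvider"])
t0 = time.perf_counter()
sess.run(None, {"mel": mel.numpy()})
print(f"TIME onnx: {time.perf_counter() - t0:.2f}")
```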
-
Cool.
-
How did you integrate it into the code? I think the decoder needs to be exported in two versions, with KV caching and without, and switched between at runtime; otherwise the performance will be poor. A sketch of that switching logic is below.
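The following is a hedged sketch of the two-decoder scheme, not code from an actual export in this thread; the file names and the `tokens`/`audio_features`/`kv_cache` tensor names are hypothetical:

```python
# Hedged sketch of switching between a cache-free and a cached decoder.
# File names and tensor names below are hypothetical.
import numpy as np
import onnxruntime as ort

dec_init = ort.InferenceSession("decoder_no_cache.onnx")    # first step
dec_step = ort.InferenceSession("decoder_with_cache.onnx")  # later steps

def greedy_decode(audio_features, sot_tokens, eot, max_len=224):
    tokens = list(sot_tokens)
    # Step 1: no cache exists yet, so run the cache-free decoder on the prefix.
    logits, kv_cache = dec_init.run(None, {
        "tokens": np.array([tokens], dtype=np.int64),
        "audio_features": audio_features,
    })
    tokens.append(int(logits[0, -1].argmax()))
    # Steps 2..n: feed only the newest token plus the running kv_cache.
    while tokens[-1] != eot and len(tokens) < max_len:
        logits, kv_cache = dec_step.run(None, {
            "tokens": np.array([tokens[-1:]], dtype=np.int64),
            "audio_features": audio_features,
            "kv_cache": kv_cache,
        })
        tokens.append(int(logits[0, -1].argmax()))
    return tokens
```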
-
@ArtyomZemlyak That looks great! We are interested in deploying Whisper on our confidential AI inference engine (https://github.com/mithril-security/blindai).
-
Yes, I am also very interested in this. First of all, thank you for sharing. If possible, could you share the converted ONNX model, or explain in detail how to convert it?
-
We converted the Whisper model to ONNX. The kv_cache size was fixed at the maximum length when exporting.

Export

ONNX Inference
You can run the converted ONNX model using ONNX Runtime. The model will download automatically.
If you use the ailia SDK instead of ONNX Runtime, you can get 2.31 times faster inference on macOS.
Our benchmark on an M1 Pro Max, 40 sec wave file, beam_size = 1.
Our ONNX file sizes.
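For the encoder half, the export step can be a plain `torch.onnx.export`; a minimal sketch under that assumption (this is not the actual ailia export script, and the decoder additionally needs a wrapper that turns kv_cache into a fixed-shape tensor input, which is omitted here):

```python
# Hedged export sketch for the encoder only; the decoder needs an extra
# wrapper so that kv_cache becomes a static-shape tensor fixed at the
# maximum length, which is not shown here.
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)  # fixed-shape dummy log-mel input

torch.onnx.export(
    model.encoder,
    mel,
    "encoder.onnx",
    input_names=["mel"],
    output_names=["audio_features"],
    opset_version=13,
)
```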
-
Please forgive the naive question, but is it not possible to do the conversion to ONNX using:
-
I suggest that you have a look at sherpa-onnx: we are supporting whisper there. At present, you can use the code from the above PR to export whisper models to ONNX and use the exported models with onnxruntime in Python for speech recognition. We are also adding C++ support to sherpa-onnx.
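Running an exported model with onnxruntime in Python looks roughly like this (a sketch only; the file name and input layout are assumptions, so see the PR for the real export and inference code):

```python
# Hedged sketch of plain onnxruntime inference with an exported encoder;
# "whisper-encoder.onnx" and its input layout are assumptions.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("whisper-encoder.onnx",
                            providers=["CPUExecutionProvider"])
mel = np.zeros((1, 80, 3000), dtype=np.float32)  # dummy log-mel frames
(audio_features,) = sess.run(None, {sess.get_inputs()[0].name: mel})
print(audio_features.shape)
```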
-
Hugging Face also provides an ONNX export, but it is limited to their framework: https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model
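Per that guide, the export can be driven from Python; a sketch based on the documented Optimum API (check the linked guide for the current signature):

```python
# Hedged sketch using the Optimum ONNX exporter from the linked guide.
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

# export=True converts the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-tiny", export=True)
model.save_pretrained("whisper-tiny-onnx")
```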
-
Is there anyone here who can convert this model to RKNN?
-
I would like to go through the repo with the Whisper PyTorch-to-ONNX conversion code; it would be very helpful.
-
Hi! Awesome model!
We are looking to improve performance on the CPU. To do this, we are trying to convert the tiny and large models into the ONNX format.
Converted:
First try results:
The size of the model weight files has increased by 1.5-2 times:
tiny: 74 MB -> 226 MB
large: 3 GB -> 6 GB
Performance has become more than 10 times worse:
tiny: 9 s -> 95 s
large: 30 s -> >30 min (and I stopped it)
At least for the tiny model, the recognition results also became much worse compared to torch.
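For context, a naive whole-model export (a guess at the approach, since the conversion snippet is not shown above) would look like the sketch below. Exporting this way leaves the decoder without a kv_cache input, so each step re-attends over the whole prefix, which is consistent with the slowdown reported; the size growth is likely the fp16 checkpoint being exported as fp32 weights, plus tied embeddings being duplicated in the graph.

```python
# Hedged guess at a naive first-try export (not this thread's actual
# script): the whole model is traced in one piece, so the decoder has
# no kv_cache input and recomputes attention over the full prefix at
# every step.
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)                 # dummy log-mel input
tokens = torch.zeros(1, 1, dtype=torch.long)   # dummy token prefix

torch.onnx.export(
    model,
    (mel, tokens),
    "whisper-tiny.onnx",
    input_names=["mel", "tokens"],
    output_names=["logits"],
    opset_version=13,
)
```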