Convert to ONNX #134
Replies: 11 comments 14 replies
-
More work on this topic.
TIME torch: 2.71
TIME onnx: 1.50
If the whole model can be processed without bottlenecks, we can achieve ~60% better performance on CPU.
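A minimal sketch of how such numbers can be measured, comparing the PyTorch encoder against an exported ONNX encoder on CPU (the `encoder.onnx` file name and the `mel` input name are assumptions for illustration, not from this thread):

```python
# Hedged timing sketch: PyTorch encoder vs. an exported ONNX encoder on CPU.
# "encoder.onnx" and the "mel" input name are illustrative assumptions.
import time

import onnxruntime as ort
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)  # dummy log-mel spectrogram

t0 = time.perf_counter()
with torch.no_grad():
    model.encoder(mel)
print(f"TIME torch: {time.perf_counter() - t0:.2f}")

sess = ort.InferenceSession("encoder.onnx", providers=["CPUExecutionProvider"])
t0 = time.perf_counter()
sess.run(None, {"mel": mel.numpy()})
print(f"TIME onnx: {time.perf_counter() - t0:.2f}")
```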
-
Cool.
-
How did you integrate it into the code? I think the decoder needs to be exported in two versions, with KV caching and without, and switched between at runtime; otherwise the performance will be poor. A sketch of that switching logic is below.
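The following is a hedged sketch of the two-decoder scheme, not code from an actual export in this thread; the file names and the `tokens`/`audio_features`/`kv_cache` tensor names are hypothetical:

```python
# Hedged sketch of switching between a cache-free and a cached decoder.
# File names and tensor names below are hypothetical.
import numpy as np
import onnxruntime as ort

dec_init = ort.InferenceSession("decoder_no_cache.onnx")    # first step
dec_step = ort.InferenceSession("decoder_with_cache.onnx")  # later steps

def greedy_decode(audio_features, sot_tokens, eot, max_len=224):
    tokens = list(sot_tokens)
    # Step 1: no cache exists yet, so run the cache-free decoder on the prefix.
    logits, kv_cache = dec_init.run(None, {
        "tokens": np.array([tokens], dtype=np.int64),
        "audio_features": audio_features,
    })
    tokens.append(int(logits[0, -1].argmax()))
    # Steps 2..n: feed only the newest token plus the running kv_cache.
    while tokens[-1] != eot and len(tokens) < max_len:
        logits, kv_cache = dec_step.run(None, {
            "tokens": np.array([tokens[-1:]], dtype=np.int64),
            "audio_features": audio_features,
            "kv_cache": kv_cache,
        })
        tokens.append(int(logits[0, -1].argmax()))
    return tokens
```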
-
@ArtyomZemlyak That looks great! We are interested in deploying Whisper on our confidential AI inference engine (https://github.com/mithril-security/blindai).
-
Yes, I am also very interested in this. First of all, thank you for sharing. If possible, could you share the converted ONNX model, or explain in detail how to convert it?
-
We converted the Whisper model to ONNX. The kv_cache size was fixed at the maximum length when exporting.

Export

ONNX Inference
You can run the converted ONNX model using ONNX Runtime. The model will download automatically.
If you use the ailia SDK instead of ONNX Runtime, you can get 2.31 times faster inference on macOS.
Our benchmark on an M1 Pro Max, 40 sec wave file, beam_size = 1.
Our ONNX file sizes.
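For the encoder half, the export step can be a plain `torch.onnx.export`; a minimal sketch under that assumption (this is not the actual ailia export script, and the decoder additionally needs a wrapper that turns kv_cache into a fixed-shape tensor input, which is omitted here):

```python
# Hedged export sketch for the encoder only; the decoder needs an extra
# wrapper so that kv_cache becomes a static-shape tensor fixed at the
# maximum length, which is not shown here.
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)  # fixed-shape dummy log-mel input

torch.onnx.export(
    model.encoder,
    mel,
    "encoder.onnx",
    input_names=["mel"],
    output_names=["audio_features"],
    opset_version=13,
)
```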
-
Please forgive the naive question, but is it not possible to do the conversion to ONNX using:
-
I suggest that you have a look at sherpa-onnx: we are supporting whisper there. At present, you can use the code from the above PR to export whisper models to ONNX and use the exported models with onnxruntime in Python for speech recognition. We are also adding C++ support to sherpa-onnx.
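Running an exported model with onnxruntime in Python looks roughly like this (a sketch only; the file name and input layout are assumptions, so see the PR for the real export and inference code):

```python
# Hedged sketch of plain onnxruntime inference with an exported encoder;
# "whisper-encoder.onnx" and its input layout are assumptions.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("whisper-encoder.onnx",
                            providers=["CPUExecutionProvider"])
mel = np.zeros((1, 80, 3000), dtype=np.float32)  # dummy log-mel frames
(audio_features,) = sess.run(None, {sess.get_inputs()[0].name: mel})
print(audio_features.shape)
```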
-
Hugging Face also provides an ONNX export, but it is limited to their framework: https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model
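Per that guide, the export can be driven from Python; a sketch based on the documented Optimum API (check the linked guide for the current signature):

```python
# Hedged sketch using the Optimum ONNX exporter from the linked guide.
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

# export=True converts the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-tiny", export=True)
model.save_pretrained("whisper-tiny-onnx")
```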
-
Is there anyone here who can convert this model to RKNN?
-
I would like to go through the repo with the Whisper PyTorch-to-ONNX conversion code; it would be very helpful.
-
Hi! Awesome model!
We are looking to improve performance on the CPU. To do this, we are trying to convert the tiny and large models into the ONNX format.
Converted:
First try results:
The size of the model weight files has increased by 1.5-2 times:
tiny: 74 MB -> 226 MB
large: 3 GB -> 6 GB
Performance has become more than 10 times worse:
tiny: 9 s -> 95 s
large: 30 s -> >30 min (and I stopped it)
At least for the tiny model, the recognition results also became much worse compared to torch.
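For context, a naive whole-model export (a guess at the approach, since the conversion snippet is not shown above) would look like the sketch below. Exporting this way leaves the decoder without a kv_cache input, so each step re-attends over the whole prefix, which is consistent with the slowdown reported; the size growth is likely the fp16 checkpoint being exported as fp32 weights, plus tied embeddings being duplicated in the graph.

```python
# Hedged guess at a naive first-try export (not this thread's actual
# script): the whole model is traced in one piece, so the decoder has
# no kv_cache input and recomputes attention over the full prefix at
# every step.
import torch
import whisper

model = whisper.load_model("tiny", device="cpu").eval()
mel = torch.zeros(1, 80, 3000)                 # dummy log-mel input
tokens = torch.zeros(1, 1, dtype=torch.long)   # dummy token prefix

torch.onnx.export(
    model,
    (mel, tokens),
    "whisper-tiny.onnx",
    input_names=["mel", "tokens"],
    output_names=["logits"],
    opset_version=13,
)
```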