how to solve low inference speed of my model #1815
Unanswered
Topology2333 asked this question in Q&A
Inquiry about Inference Time and Custom Tokenizer in Whisper-Small
Dear Whisper community,
I hope this message finds you well. As a beginner in the field, I am working on leveraging Whisper's robust alignment capabilities to fine-tune the model for low-resource languages. My approach involves training a language-to-IPA model to address specific phonetic requirements. However, I am encountering challenges, particularly with support for the voiceless lateral fricative ɬ.
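For concreteness, here is a minimal sketch of the kind of tokenizer extension I have in mind. I am assuming the Hugging Face transformers API here; treat it as illustrative rather than my exact setup:

```python
# A sketch of extending the tokenizer with the IPA symbol, assuming the
# Hugging Face transformers API; names and checkpoints are illustrative.
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Whisper's byte-level BPE would otherwise split ɬ into byte pieces, so
# register it as a single token before fine-tuning on IPA targets.
num_added = processor.tokenizer.add_tokens(["ɬ"])
if num_added > 0:
    # The decoder embedding matrix must grow to match the new vocabulary.
    model.resize_token_embeddings(len(processor.tokenizer))
```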
Issue Description:
After fine-tuning the model with a custom IPA tokenizer, I noticed a significant increase in the time required to predict a single sentence: approximately 7 to 8 minutes. Strangely, when I used the default Whisper tokenizer, prediction took only a few seconds, but the results mixed multiple languages, which is unexpected.
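My unconfirmed guess is that swapping in the custom tokenizer desynchronized the end-of-text token id, so decoding runs to the model's hard length limit instead of stopping early, and that the mixed-language output with the default tokenizer comes from Whisper's automatic language detection. A small self-contained sketch of the checks I mean, again assuming the Hugging Face transformers API:

```python
# A hedged diagnostic sketch, assuming the Hugging Face transformers API.
# The random features only exercise the generation path for timing; "en"
# is a placeholder language code, not the one I am actually targeting.
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# whisper-small expects log-mel features of shape (batch, 80 mels, 3000 frames).
input_features = torch.randn(1, 80, 3000)

# If these two ids disagree after swapping in a custom tokenizer, generate()
# never sees its stop token and decodes until the hard length limit.
print("tokenizer eos:", processor.tokenizer.eos_token_id)
print("model eos:", model.generation_config.eos_token_id)
model.generation_config.eos_token_id = processor.tokenizer.eos_token_id

# Pinning language and task disables per-utterance language detection,
# which could explain the mixed-language output with the default tokenizer.
forced = processor.get_decoder_prompt_ids(language="en", task="transcribe")
with torch.no_grad():
    ids = model.generate(input_features,
                         forced_decoder_ids=forced,
                         max_new_tokens=128)  # cap output length as a safety net
print(processor.batch_decode(ids, skip_special_tokens=True))
```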
Details:
Questions:
Additionally, I am in the process of building a language-to-IPA model due to specific phonetic requirements; this is the context that led to the issues above. Given Whisper's strengths in alignment, I believe it is a powerful fit for my intended workflow: speech to IPA, then IPA to low-resource-language text (sketched below). Any advice or insights on optimizing the model to handle phonetic nuances, particularly the fricative ɬ, would be immensely valuable, especially given my beginner status.
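Roughly, the workflow I am aiming for looks like the following sketch; both model names are hypothetical placeholders, since the second stage is still being built:

```python
# A rough sketch of the intended two-stage pipeline, again assuming the
# Hugging Face transformers API. "./whisper-small-ipa" stands in for my
# fine-tuned speech-to-IPA checkpoint, and "my-org/ipa-to-text" for the
# IPA-to-text model, which does not exist yet; both names are hypothetical.
from transformers import pipeline

speech_to_ipa = pipeline("automatic-speech-recognition", model="./whisper-small-ipa")
ipa_to_text = pipeline("text2text-generation", model="my-org/ipa-to-text")

ipa = speech_to_ipa("utterance.wav")["text"]   # IPA string, e.g. containing ɬ
text = ipa_to_text(ipa)[0]["generated_text"]   # low-resource-language text
```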
Thank you for your time and assistance.
Best regards,
Topology2333