Multi languages. #2391
Hello everyone, I'm testing Whisper using an audio file with multiple languages and would like to know if it's possible to have it return the detected language for each segment.
Replies: 2 comments 5 replies
Whisper doesn't support audio with multiple languages.
I'm running into an issue where I speak a single language, but the final output I get from Whisper contains multiple languages, such as Hindi plus English. How can I solve this? I'm using an inference provider like Grok.
@guilhermeasena32 While it is true that Whisper can sometimes transcribe multiple languages in the same audio, it won't do this reliably. Whisper was trained on monolingual audio files for a range of separate languages, but some multilingual audio files were probably included and incorrectly labelled as single-language audio. As a result, Whisper can sometimes produce a fully multilingual transcript for multilingual audio while still labelling the whole file as a single language.
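For anyone who still wants a rough per-segment language label, one workaround is to run Whisper's language detection on fixed windows yourself rather than relying on the single label for the whole file. This is only a sketch, not a built-in Whisper feature: the 30-second window size, the merging of adjacent windows, and the helper names (`split_windows`, `label_windows`, `make_whisper_detector`) are my own assumptions. The Whisper calls follow the language-detection example in the openai-whisper README and are imported lazily, so the chunking logic can be tried without the package installed.

```python
from typing import Callable, List, Tuple

SAMPLE_RATE = 16_000      # sample rate whisper.load_audio resamples to
WINDOW_SECONDS = 30       # assumption: one detection per 30 s window


def split_windows(n_samples: int,
                  window: int = SAMPLE_RATE * WINDOW_SECONDS) -> List[Tuple[int, int]]:
    """Return (start, end) sample offsets covering the audio in fixed windows."""
    return [(s, min(s + window, n_samples)) for s in range(0, n_samples, window)]


def label_windows(audio, detect: Callable[[object], str]) -> List[Tuple[float, float, str]]:
    """Detect one language per window and merge adjacent windows with the same label.

    Returns (start_seconds, end_seconds, language_code) tuples.
    """
    merged: List[Tuple[float, float, str]] = []
    for start, end in split_windows(len(audio)):
        lang = detect(audio[start:end])
        t0, t1 = start / SAMPLE_RATE, end / SAMPLE_RATE
        if merged and merged[-1][2] == lang:
            # Same language as the previous window: extend that span.
            merged[-1] = (merged[-1][0], t1, lang)
        else:
            merged.append((t0, t1, lang))
    return merged


def make_whisper_detector(model_name: str = "base") -> Callable[[object], str]:
    """Build a detector from openai-whisper (requires the package and ffmpeg)."""
    import whisper  # lazy import so the helpers above work without it

    model = whisper.load_model(model_name)

    def detect(chunk) -> str:
        chunk = whisper.pad_or_trim(chunk)  # pad or cut to exactly 30 s
        mel = whisper.log_mel_spectrogram(chunk, n_mels=model.dims.n_mels).to(model.device)
        _, probs = model.detect_language(mel)
        return max(probs, key=probs.get)  # most probable language code

    return detect


# Usage (hypothetical file name):
#   import whisper
#   audio = whisper.load_audio("meeting.mp3")
#   print(label_windows(audio, make_whisper_detector()))
```

Keep in mind that a language switch inside a window cannot be localized more finely than the window itself, and a short final window is zero-padded before detection, so its label is less trustworthy.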
Even if you are OK with the incorrect labelling, the problem is that Whisper's training data just didn't include enough examples of multiple languages in the same audio file, which is …