Problems with Transcribing Multilingual Audio #1367

RohitMidha23 · 2023-05-19T07:01:49Z

I have a video in which the speaker talks in Gujarati mixed with English. For the transcript, I don't mind whether the output is entirely in English or entirely in Gujarati.

I'm using the following snippet to transcribe the audio:

import whisper

model = whisper.load_model("large-v2")
result = model.transcribe("input.mp3", verbose=True)
print(result["text"])

Some sample output:

Detected language: Gujarati
[00:00.000 --> 00:07.000]  गुडि पढ़वाति, वचण-अमरुत नु वान्चन,
[05:51.500 --> 05:55.500]  पण परिवर्तन चेतन नी अवस्था मा,

[05:55.500 --> 05:57.000]  To make it simple,
[05:57.000 --> 05:59.500]  Chit shuddhi thaye j,

[07:24.500 --> 07:28.500]  And at a young age, he was running around with a dog.
[07:39.000 --> 07:41.000]  I had an incident just now,

I've included only the relevant output, but I'm happy to share the full output if required.

Issues

I noticed two major issues with this output that I need help with:

It starts by outputting Gujarati. Around the 5-minute mark it encounters the first English sentence, and from then on it also outputs Gujarati in romanized (Latin-script) form, e.g. Chit shuddhi thaye j
Around the 7-minute mark, it starts translating the speech into English instead of transcribing it, which is by far the weirdest behavior I've observed.

I have tried forcing the language to English, but even then the model translates instead of transcribing.
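For reference, here is a minimal sketch of how the language and task can be pinned explicitly with the openai-whisper `transcribe` call. The helper names `build_decode_options` and `transcribe_file` are hypothetical, and setting `condition_on_previous_text=False` is only an assumption about what might reduce drift between segments, not a confirmed fix for this audio.

```python
def build_decode_options(language="gu", task="transcribe"):
    """Collect keyword arguments for whisper's transcribe() (hypothetical helper)."""
    return {
        "language": language,   # skip auto-detection and pin the language
        "task": task,           # "transcribe" rather than "translate"
        # Assumption: not feeding earlier text back in may limit drift
        # into translation or romanized output after a language switch.
        "condition_on_previous_text": False,
        "verbose": True,
    }


def transcribe_file(path, model_name="large-v2", **overrides):
    """Transcribe one file with the options above (hypothetical helper)."""
    import whisper  # openai-whisper, imported lazily

    model = whisper.load_model(model_name)
    options = {**build_decode_options(), **overrides}
    return model.transcribe(path, **options)
```

Calling `transcribe_file("input.mp3")["text"]` would then return the transcript with Gujarati forced; passing `language="en"` as an override corresponds to the English attempt described above.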

Any help would be greatly appreciated.

phineas-pta · 2023-05-19T21:37:31Z

Unfortunately, Whisper doesn't support audio that mixes multiple languages.

0 replies

SagarP-GPU · 2024-12-04T06:59:19Z

Pass this to the model's generate call (this is the Hugging Face Transformers API, not the openai-whisper package used in the question) - forced_decoder_ids=processor.get_decoder_prompt_ids(language="gu", task="transcribe")
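A rough sketch of where that argument fits, assuming the Hugging Face Transformers Whisper classes and `librosa` for audio loading. The helpers `chunk_bounds` and `transcribe_long_audio` are hypothetical, the checkpoint name `openai/whisper-large-v2` is an assumption, and the naive 30-second split is only for illustration because the processor works on 30-second windows. Depending on the installed Transformers version, `forced_decoder_ids` may emit a deprecation warning in favor of passing `language` and `task` to `generate` directly.

```python
def chunk_bounds(n_samples, sampling_rate=16000, chunk_seconds=30):
    """Return (start, end) sample indices for consecutive fixed-length chunks."""
    step = sampling_rate * chunk_seconds
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]


def transcribe_long_audio(path, model_name="openai/whisper-large-v2",
                          language="gu", task="transcribe"):
    """Transcribe a file chunk by chunk with a forced language and task."""
    import librosa  # assumption: used here only to load and resample audio
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_name)
    model = WhisperForConditionalGeneration.from_pretrained(model_name)

    # Whisper expects 16 kHz mono audio.
    audio, _ = librosa.load(path, sr=16000)

    # Prompt tokens that pin the language and the task for every chunk.
    forced_decoder_ids = processor.get_decoder_prompt_ids(
        language=language, task=task
    )

    texts = []
    # Naive split: words at chunk boundaries may be cut off.
    for start, end in chunk_bounds(len(audio)):
        inputs = processor(audio[start:end], sampling_rate=16000,
                           return_tensors="pt")
        predicted_ids = model.generate(
            inputs.input_features, forced_decoder_ids=forced_decoder_ids
        )
        texts.append(
            processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
        )
    return " ".join(texts)
```

Something like `transcribe_long_audio("input.mp3")` would then force Gujarati transcription for every chunk rather than relying on per-file language detection.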

0 replies
