-
Hi, I started using Whisper today with the setup below, and it turned a two-hour Japanese meeting recording into English almost perfectly (I was surprised by the quality of the translation). However, that was a mistake on my part: I needed the transcript in Japanese, so I changed the call from result = model.transcribe(audio_path) to result = model.transcribe(audio_path, language="Japanese"). This returned a transcription that covered only the first hour, and no matter how many times I reran it with the language set to Japanese, it always dropped out somewhere around the one-hour mark. When I went back to Japanese-to-English, it was fine again and gave me the same near-perfect translation. Does anyone know what is happening, and how I can modify the code so that it transcribes the audio in Japanese instead of translating it?
Replies: 2 comments 1 reply
-
There isn't enough information here to say for certain what is happening. Are you running this on a local machine, or could you be running into a quota or usage limitation on a cloud service?
-
Note to others who come across this thread: it seems that Whisper, in its current state, has a limit on how long it can transcribe for languages other than English, as outlined by other users who are working on active projects:
For now, my solution is to break my audio up into 15-minute chunks, transcribe them individually, and concatenate the results back together.
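
For anyone who wants to try the same workaround, here is a minimal sketch of the chunking approach, not the exact code I ran. It assumes the openai-whisper Python package; the helper names chunk_bounds and transcribe_in_chunks, the "medium" model choice, and the file name in the usage note are placeholders I made up for illustration. It cuts at fixed 15-minute boundaries, so a word that straddles a boundary may be garbled.

```python
from typing import List, Tuple

SAMPLE_RATE = 16_000  # whisper.load_audio resamples audio to 16 kHz mono


def chunk_bounds(n_samples: int, chunk_seconds: int = 15 * 60,
                 sample_rate: int = SAMPLE_RATE) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices that cover the audio in fixed-size chunks."""
    step = chunk_seconds * sample_rate
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]


def transcribe_in_chunks(audio_path: str, model_name: str = "medium",
                         language: str = "ja") -> str:
    """Hypothetical helper: transcribe each chunk separately and join the text."""
    import whisper  # imported lazily so chunk_bounds works without the package

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_path)  # float32 NumPy array at 16 kHz

    texts = []
    for start, end in chunk_bounds(len(audio)):
        # task="transcribe" keeps the output in the spoken language;
        # task="translate" would produce English instead.
        result = model.transcribe(audio[start:end], language=language,
                                  task="transcribe")
        texts.append(result["text"].strip())
    return "\n".join(texts)
```

Calling transcribe_in_chunks("meeting.mp3") would then return the Japanese text with one line per 15-minute chunk. Passing task="transcribe" explicitly is only a precaution here; I have not confirmed that it changes the behavior described in the original post.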