word_timestamps parameter results in out-of-sequence output with large-v3 #2024
Unanswered
AutomationAdam asked this question in Q&A
Replies: 0 comments
I'm running the simple code below, which produces good results with large-v2 and better results with large-v3.
But when I use the `word_timestamps=True` parameter with large-v3, words and sentence fragments start to appear out of sequence, usually beginning about halfway through the transcript. I've tested with several 2-minute spoken mp3 files, all with clean audio. Do I need to do something differently for large-v3, or could this be a bug?
```python
import whisper

audio = './test_1.mp3'
model = whisper.load_model("large-v2")  # or "large-v3"

# Transcribe with per-word timestamps enabled
result = model.transcribe(
    audio=audio,
    language='en',
    word_timestamps=True,
    task="transcribe"
)
print(result["text"])
```
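To pin down where the ordering breaks, a small check along these lines might help. It is only a sketch: the helper name `find_out_of_order_words` is made up for illustration, and it assumes that with `word_timestamps=True` each entry in `result["segments"]` carries a `"words"` list whose items have `"word"` and `"start"` keys. It reports every word whose start time falls before the latest start time seen so far.

```python
def find_out_of_order_words(result):
    """List words whose start time is earlier than the latest start seen so far.

    Assumes the layout result["segments"][i]["words"][j] with "word" and
    "start" keys. Returns tuples of
    (segment_index, word, start, latest_start_seen_before).
    """
    problems = []
    latest_start = float("-inf")  # running maximum of word start times
    for seg_index, segment in enumerate(result.get("segments", [])):
        for word in segment.get("words", []):
            start = word["start"]
            if start < latest_start:
                problems.append((seg_index, word["word"], start, latest_start))
            latest_start = max(latest_start, start)
    return problems
```

Calling `find_out_of_order_words(result)` on the output of both models would show whether the word-level timestamps themselves go backwards with large-v3, or whether only the assembled text is affected.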