using the large model within python #1354
I installed Whisper and everything works from the command line and within a Python script. However, when I run the transcription from the command line, I get much better results (as expected): there are words in the audio that are only transcribed correctly that way. What I haven't had any luck with is using the large model from within a Python script. I have tried a few variations of loading the model ("large" and "large-v2") and calling transcribe() on the file.

I'm able to get the file transcribed, but not with the same accuracy as the CLI results, and it runs much faster, so for those two reasons I don't believe it's actually using the large (or large-v2) model. Any suggestions would be appreciated, as this has been a wall I haven't been able to get past for about a week now. And, thank you. Brad
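In simplified form, what I'm trying amounts to roughly the following (the actual scripts vary the model name and options a bit, and the exact CLI flags aren't shown here):

```python
import whisper

# Command line run that gives the good results (roughly):
#   whisper 20230428.mp3 --model large

# Simplified version of what I am doing in Python:
model = whisper.load_model("large")        # also tried "large-v2"
result = model.transcribe("20230428.mp3")
print(result["text"])
```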
Replies: 1 comment 1 reply
This is likely because the command line invocation uses different default decoding settings for the transcription: the CLI defaults to beam search (`--beam_size 5 --best_of 5`), while `transcribe()` falls back to greedy decoding when no options are passed. You can just load the "large" model; it is the same as "large-v2". Try changing the transcription line in your 3rd code block to: `model.transcribe("20230428.mp3", beam_size=5, best_of=5)`
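Put together, a minimal version of the script would look roughly like this:

```python
import whisper

model = whisper.load_model("large")   # same checkpoint as "large-v2"

# beam_size=5 and best_of=5 mirror the CLI defaults; beam search is slower
# than the greedy decoding used when these aren't passed, which matches the
# "more accurate but slower" behaviour seen on the command line.
result = model.transcribe("20230428.mp3", beam_size=5, best_of=5)
print(result["text"])
```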