Whisper can't do large gaps between spoken sections #1278
Unanswered
fingertrouble
asked this question in
Q&A
-
It's the classic Whisper hallucination issue. You can try running it with |
-
I got around this with my whisper continuous dictation & remote control tool by using The gist of it is here, although listening to the mic,
-
Whisper is brilliant, and getting better I think (or learning more about how I say things), but one thing is a real faff: it does weird things if you have long gaps between spoken words. I am a music podcaster, so I will have gaps between the spoken bits.
Interestingly, it can also pick up lyrics and singing, but if I have long instrumental sections, it will quite often lose track and skip whole spoken-word sections, repeating a word like this:
01:19:15.600 --> 01:19:17.600
Jingle
01:19:37.600 --> 01:19:47.600
Jingle
01:19:51.600 --> 01:19:53.600
Jingle
01:20:07.600 --> 01:20:17.600
Jingle
01:49:42.600 --> 01:49:45.600
Ooh, ooh, ooh
01:49:45.600 --> 01:49:48.600
Ooh, ooh, ooh
01:49:48.600 --> 01:49:52.600
Ooh, ooh, ooh
01:49:53.600 --> 01:49:56.600
Ooh, ooh, ooh
01:49:59.600 --> 01:50:02.600
Ooh, ooh, ooh
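For anyone wanting to spot these loops without reading the whole file, here is a minimal sketch that flags runs of identical captions in a transcript laid out like the one above. It assumes each cue is a timing line followed by its caption text; `parse_cues`, `find_repeated_runs` and the `min_repeats` threshold are hypothetical names for illustration, not part of Whisper.

```python
import re

# Matches a cue timing line such as "01:19:15.600 --> 01:19:17.600".
TIMING = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})$")


def parse_cues(text):
    """Return a list of (start, end, caption) tuples from timing/caption lines."""
    cues = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = TIMING.match(line)
        if match:
            # A timing line opens a new cue with an empty caption.
            current = [match.group(1), match.group(2), ""]
            cues.append(current)
        elif current is not None and line:
            # Any other non-blank line is caption text for the open cue.
            current[2] = (current[2] + " " + line).strip()
    return [tuple(cue) for cue in cues]


def find_repeated_runs(cues, min_repeats=3):
    """Return (caption, first_start, last_end, count) for each run of
    consecutive cues whose caption text is identical."""
    runs = []
    i = 0
    while i < len(cues):
        j = i
        while j + 1 < len(cues) and cues[j + 1][2] == cues[i][2]:
            j += 1
        count = j - i + 1
        if count >= min_repeats:
            runs.append((cues[i][2], cues[i][0], cues[j][1], count))
        i = j + 1
    return runs
```

A run reported here only marks a suspicious stretch. A sung chorus can legitimately repeat, so the audio still needs checking by ear.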
It does this whether I work on the original podcast (with music sections/beds) or just export the speech with no music behind it.
What I have ended up doing is exporting all the spoken bits separately with small gaps between chunks, but that means any timings are wrong, and it's a faff to edit and export a second podcast just for transcription.
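The wrong timings from that workaround could in principle be mapped back to the original timeline. This is a rough sketch, not something Whisper provides. It assumes you know where each spoken chunk starts and ends in the original podcast (in seconds), and that the chunks were exported in order with a fixed gap between them. `build_offsets`, `to_original_time` and the `gap` value are hypothetical names for illustration.

```python
from bisect import bisect_right


def build_offsets(chunks, gap):
    """Build a lookup table for the compacted export.

    chunks: list of (orig_start, orig_end) in seconds, in playback order.
    gap: seconds of silence placed between chunks in the compacted export
         (an assumption about how the export was made).
    Returns a list of (compact_start, orig_start, duration) rows.
    """
    table = []
    compact_pos = 0.0
    for orig_start, orig_end in chunks:
        duration = orig_end - orig_start
        table.append((compact_pos, orig_start, duration))
        compact_pos += duration + gap
    return table


def to_original_time(t, table):
    """Map a time in the compacted export back to the original timeline."""
    starts = [row[0] for row in table]
    # Find the last chunk that starts at or before t.
    idx = max(bisect_right(starts, t) - 1, 0)
    compact_start, orig_start, duration = table[idx]
    # Times that land in the inserted gap are clamped to the end of the chunk.
    offset = min(max(t - compact_start, 0.0), duration)
    return orig_start + offset


# Example: two one-minute spoken chunks, the second starting at 5:00 in the
# original, exported back to back with a 2 second gap.
table = build_offsets([(0.0, 60.0), (300.0, 360.0)], gap=2.0)
original_seconds = to_original_time(72.5, table)
```

Each cue's start and end from the compacted transcript would be passed through `to_original_time` before the file is written out again. It still means marking the chunk boundaries by hand, so it only removes part of the faff.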
With the gaps kept that short, it doesn't seem to lose whole parts. I'd love to have transcription in the player, like some players allow, for accessibility, but that's not possible with the way Whisper works.
Is there a setting I've missed for it to not lose track? Or is this a bug?
Using Whisper via Homebrew (formerly via pip install, same issue; I recently upgraded to the latest build via Homebrew, which didn't fix this) on an M1 Pro MacBook, Ventura 13.3.1, Python 3.11.3. The Python install is also via Homebrew, cos Macs still use 2.7 for some odd reason.