-
It's pretty obviously at least partially web videos from a major site. During extended silences I often see "Thanks for watching" or "Please like and subscribe". This is likely because a lot of videos on sites like that include those phrases in their subtitles at the very end, even when nobody actually says the words, so the model has learned to toss them in when it hears silence.
-
I've also seen "Amara.org" show up in some moments of silence.
-
This makes sense. But how do we stop this from happening? I'd really like to find all of these "silence" statements and keep them out of the transcriptions. Even ChatGPT (paid version) is affected by this bug, since it uses Whisper under the hood.
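One possible workaround is to post-filter the output. The following is only a rough sketch, not anything from the Whisper project itself: it assumes segments shaped like the `result["segments"]` list returned by the Python `transcribe()` call (dicts with a `"text"` key), and the names `SILENCE_PHRASES`, `normalize`, `filter_segments`, and the `max_words` cutoff are all made up for illustration. The phrase list contains only the strings reported in this thread, so it would need extending with whatever you observe.

```python
import re

# Phrases reported in this thread; extend with whatever shows up in your output.
SILENCE_PHRASES = [
    "thanks for watching",
    "please like and subscribe",
    "amara.org",
]


def normalize(text):
    """Lowercase and drop punctuation (except dots) so 'Thanks for watching!' matches."""
    return re.sub(r"[^\w\s.]", "", text.lower()).strip()


def filter_segments(segments, max_words=8):
    """Drop short segments that contain a known silence phrase.

    The max_words cutoff (an arbitrary guess) is there so that longer,
    genuine speech that happens to include one of the phrases is kept.
    """
    kept = []
    for seg in segments:
        text = normalize(seg["text"])
        is_short = len(text.split()) <= max_words
        if is_short and any(phrase in text for phrase in SILENCE_PHRASES):
            continue  # looks like a silence artifact, skip it
        kept.append(seg)
    return kept


# Example with hand-written segments (not real model output).
segments = [
    {"text": " Hello everyone, welcome back."},
    {"text": " Thanks for watching!"},
    {"text": " Subtitles by the Amara.org community"},
]
cleaned = filter_segments(segments)
```

This only catches phrases you already know about, and it will also drop a speaker who really does end with a short "Thanks for watching", so treat it as a starting point rather than a fix.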
-
Thank you for this amazing project!
I was wondering about the 680k hours of audio data. After reading through the paper and blog post, I don't think I saw any mention of where the data came from (other than the "from the web" phrase in the blog post). Are you able to say more about this?
I don't mean to sound like I'm looking a gift horse in the mouth, I'm just super curious about this. 😄