When I compare Whisper and other STT models on the TED dataset, I get weird results #1127
SeunghyunSEO
started this conversation in
General
Replies: 0 comments
Hello, I am an ASR researcher and I attempted to compare the Whisper model with other models on the TED dataset.
While transcribing some samples using Whisper, I encountered some unexpected results, such as the following transcription:
It appears that Whisper output the names of both the interviewer and the interviewee: "Dan Short" is the male interviewee and "Vanessa Karnes" is the female interviewer.
However, when using another model, this did not occur.
I am curious whether the model uses some other features to infer the speakers in a dialogue, or whether this is simply because the labels were not preprocessed during training. Could you please provide some insight on this matter?
Thank you.