Replies: 1 comment
-
Hi, I don't know what the issue is, but it seems to get stuck like this quite randomly.
-
Today I tested the v2 model against v1 and found that in many cases v2 just spits out this text: " Feliratok az Amara.org közösségétől" (Hungarian for "Subtitles by the Amara.org community"). I suspect the training data contains a lot of invalid subtitles scraped from amara.org. I mostly tested the models with music; here are a few examples:
https://youtu.be/1BI54w6T_Uo
https://youtu.be/M5CwqYQNRcY
I think it would be beneficial to search the training data for this string and clean out the matching entries.
Otherwise, I've seen a large improvement in Hungarian whenever Whisper was willing to produce a real transcript.
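Until the training data is cleaned, one possible workaround is to filter the string out of the output after transcription. Below is a minimal sketch, not anything from the Whisper codebase: the helper names are hypothetical, the sample segments are made up for illustration, and it assumes each segment is a dict with a "text" key.

```python
# The credit line reported above; the model emits it with a leading space.
HALLUCINATED_CREDIT = "Feliratok az Amara.org közösségétől"


def is_credit_line(text, marker=HALLUCINATED_CREDIT):
    """Return True if the text contains the subtitle credit (case-insensitive)."""
    return marker.casefold() in text.casefold()


def drop_credit_segments(segments, marker=HALLUCINATED_CREDIT):
    """Return a new list without the segments that contain the credit line.

    Assumption: every segment is a dict with a "text" key.
    """
    return [seg for seg in segments if not is_credit_line(seg["text"], marker)]


# Made-up sample data, only to show the filter in action.
segments = [
    {"start": 0.0, "end": 4.0, "text": " Feliratok az Amara.org közösségétől"},
    {"start": 4.0, "end": 8.0, "text": " Szia, hogy vagy?"},
]
cleaned = drop_credit_segments(segments)
print(cleaned)
```

The same check could be run over training transcripts to find the affected samples, though this only hides the symptom on the output side.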