Most accurate model for English? #1540
Replies: 4 comments 10 replies
-
Go to the README page and click on the "Paper" link. Section 4.3 answers your question including a diagram:
-
The left part of the diagram shows that the English-only model can outperform the multilingual one.
On Fri, Jul 21, 2023 at 11:44 PM ryanheise ***@***.***> wrote:
The models available for download, based on my understanding, have been
trained.
ALL models (by definition) have been trained. I'm not sure if you think
that there are other kinds of models that have not been trained, but that
doesn't really make sense. An untrained model wouldn't really be a model of
anything, and so the training of it is part and parcel of models. Or if you
think that the paper is maybe talking about secret models that they haven't
made available for download as opposed to the ones that ARE available for
download and that these are somehow different, then no, the paper
explicitly mentions the models that they have published and which are
available for download. The paper tells you that the larger
multilingual/multitask models outperform the English-only models, and so it
does answer your question.
-
I was confused. Apologies. That said, I did read the paper after you quoted it, particularly that section.
-
English WER:
So, for English: page 11 of the Whisper paper (PDF) has this graph and table showing the stats. The way I read @ryanheise's graph/reference (Section 4.3 + Figure 9) is: IF you have English audio, then:
A tipping point happened at
I'll repost the relevant section here. It explains the technical details behind WHY the larger model, when fed more data, began outperforming the single-language English-only version.
-
Medium English only model or large multi-language model?
"The .en models for English-only applications tend to perform better."
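If the paper and the README seem to pull in different directions for your case, one way to settle it is to measure word error rate (WER) on a few of your own recordings. Below is a minimal sketch of WER as word-level edit distance divided by the number of reference words. The transcripts and the `model_a` / `model_b` labels are made up for illustration and are not real model output. The plain lowercasing is also an assumption standing in for proper text normalization, so the numbers will not match the paper's exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical data: replace with your own ground truth and each model's output.
reference = "the quick brown fox jumps over the lazy dog"
candidates = {
    "model_a": "the quick brown fox jumps over the lazy dog",
    "model_b": "the quick brown box jumps over lazy dog",
}

for name, hypothesis in candidates.items():
    print(f"{name}: WER = {word_error_rate(reference, hypothesis):.3f}")
```

Transcribe the same files with each candidate model (for example medium.en and large), put the outputs in `candidates`, and the lower WER wins for your audio. A handful of short clips will be noisy, so use more data before trusting a small difference.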