Difference in performance of POS tags between small and large models of Portuguese #8633

ricardojosehlima · 2021-07-07T19:53:36Z

I am using spaCy 3.0.6. As shown at https://spacy.io/models/pt, the small model has a POS_ACC of 0.96 while the large model has 0.97. Is this due solely to the size of the models, or are there other differences between them for this task?
I have seen cases where the small model performs better than the large one. One example is the pair "Não vou falar com ele nem que ele peça" ("I won't talk to him even if he asks") and "Não vou falar com ele nem que ele me peça" ("I won't talk to him even if he asks me"). The only difference between the two sentences is the pronoun "me". The small model is unaffected by it and correctly tags "peça" as a verb in both cases. The large model, however, wrongly tags "peça" as a noun once the pronoun is inserted.
Is it expected that a model scoring 0.97 makes errors that a model scoring 0.96 does not? Intuitively, the answer should be 'no', since one improves on the other.
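For anyone who wants to reproduce the comparison described above, here is a minimal sketch. It assumes spaCy v3 with the `pt_core_news_sm` and `pt_core_news_lg` packages installed; the helper names (`pos_of`, `compare_models`, `load_pipelines`) are made up for illustration and are not part of spaCy.

```python
SENTENCES = [
    "Não vou falar com ele nem que ele peça",
    "Não vou falar com ele nem que ele me peça",
]


def pos_of(nlp, sentence, word):
    """Return the coarse POS tag(s) the pipeline assigns to `word` in `sentence`."""
    return [tok.pos_ for tok in nlp(sentence) if tok.text == word]


def compare_models(pipelines, sentences=SENTENCES, word="peça"):
    """Map each sentence to {pipeline name: tags assigned to `word`}."""
    return {
        sentence: {name: pos_of(nlp, sentence, word) for name, nlp in pipelines.items()}
        for sentence in sentences
    }


def load_pipelines(names=("pt_core_news_sm", "pt_core_news_lg")):
    """Load the Portuguese pipelines (assumes both packages are installed)."""
    import spacy  # imported here so the helpers above work without spaCy

    return {name: spacy.load(name) for name in names}
```

Calling `compare_models(load_pipelines())` returns one entry per sentence, so the tags from both pipelines can be read side by side. The exact tags may differ between model versions.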

polm · 2021-07-08T04:20:28Z

Is this due solely to the size of the models or is there any other differences between them in this task?

Besides being larger, medium and large models typically incorporate word vectors that aren't used in the small models, which can improve accuracy. The architecture is otherwise the same.
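One way to see the word-vector difference for yourself is to look at the size of each pipeline's vector table. This is a rough sketch: `vector_table_shape` and `report` are hypothetical helper names, and it assumes the `pt_core_news_sm` and `pt_core_news_lg` packages are installed.

```python
def vector_table_shape(nlp):
    """Return (rows, width) of the pipeline's static word-vector table."""
    return tuple(nlp.vocab.vectors.shape)


def report(names=("pt_core_news_sm", "pt_core_news_lg")):
    """Print the vector table shape for each installed pipeline package."""
    import spacy  # imported here so vector_table_shape works without spaCy

    for name in names:
        print(name, vector_table_shape(spacy.load(name)))
```

A table with zero rows would suggest that the package ships no static vectors, which is what I would expect for the small model.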

Is it expected that a model that performs 0.97 while other performs 0.96 to have errors that the other doesn't have? Intuitively, the answer should be 'no', as one is increasing on the other.

A large model is not a small model with some of the errors fixed. Large models are expected to make fewer errors on average, but the distribution of those errors will be different, so there will be cases where the small model is right and the large model is wrong. If there is a strong pattern to those cases, or they seem very frequent, that could indicate an issue, but the fact that such cases exist is neither surprising nor a cause for concern. Also see #3052 for notes on model errors in general.

It might be helpful to think of models like students. If you have a student with high grades (Alice) and a student with low grades (Bob), the student with high grades will make fewer mistakes, but there's no guarantee that on any particular question it's impossible for Bob to get it right and Alice to get it wrong. Grades, like accuracy scores, are aggregate statistics.
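To make the "aggregate statistics" point concrete, here is a toy sketch. The gold tags and predictions are made up for illustration and do not come from any real model; the helper names are hypothetical.

```python
# Made-up gold tags and predictions for ten tokens (illustrative only).
gold  = ["VERB", "NOUN", "VERB", "ADJ", "NOUN", "VERB", "NOUN", "ADJ", "VERB", "NOUN"]
small = ["VERB", "VERB", "VERB", "ADJ", "NOUN", "VERB", "ADJ",  "ADJ", "VERB", "NOUN"]
large = ["VERB", "NOUN", "NOUN", "ADJ", "NOUN", "VERB", "NOUN", "ADJ", "VERB", "NOUN"]


def accuracy(pred, gold):
    """Fraction of positions where the prediction matches the gold tag."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def error_positions(pred, gold):
    """Set of positions where the prediction differs from the gold tag."""
    return {i for i, (p, g) in enumerate(zip(pred, gold)) if p != g}


# Positions the "large" predictions get wrong but the "small" ones get right.
only_large_wrong = error_positions(large, gold) - error_positions(small, gold)

print(accuracy(small, gold), accuracy(large, gold))  # 0.8 0.9
print(only_large_wrong)  # {2}
```

The "large" predictions score higher overall, yet position 2 is an error that the "small" predictions do not make. The better score says nothing about which individual items are right.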

1 reply

ricardojosehlima Jul 8, 2021
Author

Thanks a lot for the reply! I am new to the field and still experimenting with its structures and trying to learn more about it. What you said about errors makes total sense, and it changed my mind: whenever I see model A perform better than model B, I won't jump to the conclusion that all of B's errors are fixed by A. I'll be testing both models in more detail soon. I found an item in the Discussions FAQ suggesting that in some cases small may be better than large. We'll see how that plays out for POS tags in Portuguese.
