(Categorical) Evaluation not what I expect #10085
-
Hi all, apologies if this ends up being a simple mistake, but I've been looking into it for a few days now and losing my mind. spaCy's evaluation is different from my own evaluation, specifically w.r.t. micro precision. If I use `Language.evaluate(examples)` I get ~70% micro precision.
If I load everything into a dataset and check whether the label is the same as the prediction, I get ~32%.
Have I failed to understand micro precision? Do I need more arguments on `evaluate` to make sure it only takes credit for the top categorical prediction? I was not surprised to see 70%; I am surprised to see ~32%. If anything I expected this to overfit, since I'm evaluating on the training data here. Should I have specified a different evaluation metric for training somehow? Am I losing touch with reality? I know this isn't fully reproducible without the data. I can hear my grad school advisor in my head telling me to start over with a dummy set... but I'm asking the internet instead. If nothing else, I would love to hear about a better way to get to evaluation from DocBins. Thanks for your help.
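For anyone landing here with the same confusion: the "check if the label is the same as the prediction" comparison amounts to top-1 accuracy over the highest-scoring category. A minimal sketch of that hand-rolled check, using plain dicts that stand in for spaCy's `doc.cats` (label → score) — the data and function names here are made up for illustration:

```python
def top_label(cats):
    """Return the label with the highest predicted score (the argmax)."""
    return max(cats, key=cats.get)

def top1_precision(predictions, golds):
    """Fraction of docs whose highest-scoring label matches the gold label."""
    correct = sum(top_label(pred) == gold for pred, gold in zip(predictions, golds))
    return correct / len(golds)

# Dicts shaped like doc.cats for a two-label, mutually exclusive textcat
preds = [
    {"POS": 0.9, "NEG": 0.1},
    {"POS": 0.4, "NEG": 0.6},
    {"POS": 0.7, "NEG": 0.3},
]
golds = ["POS", "NEG", "NEG"]

print(top1_precision(preds, golds))  # 2 of 3 correct
```

If this number disagrees wildly with what `Language.evaluate` reports, a likely culprit (as it turned out below) is that the two code paths are not actually scoring the same documents.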
-
Gosh dang it... at least asking for help publicly solved my problem, like those people who smoke cigarettes to get a bus to show up. As you may have guessed, the data in the doc bin was not what I remembered it being. The 'text' in the data I was evaluating was much longer than the examples in the docbins that were used for training. I would still love feedback on the mangled way I'm manipulating these datasets. Perhaps there is an easy-ish way to compare the text/examples in one dataset to those in a pandas DataFrame? Or an especially good way to move data from pandas into a DocBin? I would love to be better at this intersection...
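One way to do both at once — a sketch assuming spaCy and pandas are installed, with hypothetical `text`/`label` column names: build the DocBin straight from the DataFrame, then round-trip it and diff the texts as sets, which would have surfaced the mismatch above immediately.

```python
import pandas as pd
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")  # tokenizer only; no trained pipeline needed

# Toy stand-in for the real DataFrame
df = pd.DataFrame({
    "text": ["great product", "terrible service"],
    "label": ["POS", "NEG"],
})

labels = df["label"].unique()
doc_bin = DocBin()
for text, label in zip(df["text"], df["label"]):
    doc = nlp.make_doc(text)
    # One-hot cats dict, assuming mutually exclusive labels
    doc.cats = {lab: float(lab == label) for lab in labels}
    doc_bin.add(doc)

# Round-trip and compare texts against the DataFrame
bin_texts = {doc.text for doc in doc_bin.get_docs(nlp.vocab)}
df_texts = set(df["text"])
print(sorted(df_texts - bin_texts))  # texts in the frame but missing from the bin
print(sorted(bin_texts - df_texts))  # texts in the bin but not in the frame
```

Set differences on the raw text are a cheap sanity check before training or evaluating; `doc_bin.to_disk("train.spacy")` then gives you a file usable directly with `spacy train`.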