How to debug a model's predictions? #8003
Replies: 2 comments
-
To really understand the types of (wrong) predictions your model makes on your data, you probably want to perform a manual error analysis. Ideally, you'd first annotate some unseen data manually, so you have a gold standard. Then you take the same data, run it through your model, and start inspecting the cases where the predictions differ from the annotations. From there you can dissect, for each case, what went wrong and why. For instance, some time ago I did an evaluation of an NEL use case by first manually annotating some data with Prodigy, and then comparing those annotations to the predictions. The errors would largely fall into a few recurring categories.
This error analysis will give you an idea of the realistic upper bound of your system, as well as clues about which parts/steps of the pipeline to improve.
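As a rough illustration of that comparison step, here is a minimal sketch. It assumes your gold data is available as spaCy Doc objects with manually annotated entities (e.g. exported from Prodigy); the model name and the helper function are placeholders, not part of spaCy's API:

```python
import spacy

# Minimal sketch: compare gold entity annotations against model predictions.
# "my_custom_model" and "gold_docs" are placeholders for your own trained
# pipeline and your manually annotated Doc objects.

def inspect_disagreements(nlp, gold_docs):
    """Print the cases where the predicted entities differ from the gold ones."""
    for gold in gold_docs:
        pred = nlp(gold.text)
        # for NEL, you could also include e.kb_id_ in these tuples
        gold_ents = {(e.start_char, e.end_char, e.label_) for e in gold.ents}
        pred_ents = {(e.start_char, e.end_char, e.label_) for e in pred.ents}
        if gold_ents != pred_ents:
            print(gold.text)
            print("  missing from prediction:", gold_ents - pred_ents)
            print("  spurious in prediction: ", pred_ents - gold_ents)

nlp = spacy.load("my_custom_model")
# inspect_disagreements(nlp, gold_docs)
```

Each printed disagreement is a case you can then assign to an error category by hand, e.g. boundary errors vs. wrong labels vs. missed entities vs. wrong KB links.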
-
Nothing beats manual error analysis, but automatic evaluation of certain kinds of errors can also be helpful. There was a great paper last year about a system called CheckList for automatically testing NLP models; this article is a nice summary of it. An example of what you can do is swap named entities for different named entities and see how often that changes the prediction, or check what adding "not" to a sentence does. spaCy doesn't have a built-in feature for this, but the dependency parse and related features make it easy to implement.
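As a sketch of one such invariance test, the snippet below swaps each PERSON entity for a different name and checks whether the pipeline still predicts a PERSON there. The pipeline ("en_core_web_sm"), the replacement name, and the helper functions are illustrative choices, not a fixed recipe:

```python
import spacy

# CheckList-style invariance test: replace a PERSON entity with a different
# name and check that the prediction stays stable.
nlp = spacy.load("en_core_web_sm")

def swap_entity(doc, ent, new_text):
    """Return the doc's text with the given entity span replaced by new_text."""
    return doc.text[:ent.start_char] + new_text + doc.text[ent.end_char:]

def check_person_invariance(texts, replacement="Maria Garcia"):
    failures = []
    for text in texts:
        doc = nlp(text)
        for ent in doc.ents:
            if ent.label_ != "PERSON":
                continue
            perturbed = nlp(swap_entity(doc, ent, replacement))
            persons = [e.text for e in perturbed.ents if e.label_ == "PERSON"]
            # if the swapped-in name is no longer recognised as PERSON,
            # record the original and perturbed sentences for inspection
            if replacement not in persons:
                failures.append((text, perturbed.text))
    return failures

print(check_person_invariance(["Alice Smith visited Paris last week."]))
```

The same pattern works for the negation test: use the dependency parse to find the main verb, insert "not" (or a negated auxiliary), re-run the pipeline, and check whether the relevant prediction changes the way you'd expect.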
-
I'm using nlp.evaluate() to evaluate a trained custom entity linker on a dataset. It's great, but I've been wondering: is there a good way to monitor the model's predictions to learn more about the different types of mistakes it makes? How do you monitor the behaviour of your NLP pipeline?
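For context, this is roughly how I'm calling it; a minimal sketch, assuming spaCy v3, where the model name and the example annotations are placeholders for my real pipeline and dev set:

```python
import spacy
from spacy.training import Example

# Placeholder pipeline and dev data, just to show the shape of the call.
nlp = spacy.load("my_nel_model")

dev_data = [
    (
        "Douglas Adams wrote the book.",
        {
            "entities": [(0, 13, "PERSON")],
            "links": {(0, 13): {"Q42": 1.0}},
        },
    ),
]

examples = [
    Example.from_dict(nlp.make_doc(text), annots) for text, annots in dev_data
]

scores = nlp.evaluate(examples)
print(scores)  # aggregate metrics only, e.g. NEL precision/recall/F-score
```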
Thanks for your suggestions.