How do I recommend improvements in spacy models NER extraction? #11019

nitinthewiz · 2022-06-23T23:04:21Z


Hi,

I'm using en_core_web_md and I see a few errors in the named entities it extracts from a text.
For example:
"Binance" is labeled GPE when it should be ORG
"Coinbase" is labeled PERSON when it should be ORG

How can I suggest these improvements to the models? Is opening a discussion the best way?

Or should I collect these locally and train the model further using something like augmenty to get the correct labels?

I understand that entity recognition is only about 85% accurate in these models and 90% in the transformer model.

Answered by xxyzz


xxyzz · 2022-06-23T23:48:54Z


You could use the Entity Ruler and add it before the ner component:

ruler = nlp.add_pipe("entity_ruler", before="ner")
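To make that one-liner concrete, here is a minimal sketch of how the ruler could be given patterns for the two names from the question. The helper `build_pipeline` and its `model_name` parameter are made up for illustration. When no model name is passed it falls back to a blank English pipeline, so the snippet runs without downloading `en_core_web_md`; with a pretrained pipeline the ruler is placed before `ner`, and the entity recognizer should then keep the spans the ruler has already set and predict around them.

```python
import spacy


def build_pipeline(model_name=None):
    """Hypothetical helper: return a pipeline with an entity ruler for known names."""
    # Load a pretrained pipeline such as "en_core_web_md" if a name is given,
    # otherwise use a blank English pipeline (tokenizer only) for demonstration.
    nlp = spacy.load(model_name) if model_name else spacy.blank("en")

    if "ner" in nlp.pipe_names:
        # Run the ruler first so its entities are already set when ner runs.
        ruler = nlp.add_pipe("entity_ruler", before="ner")
    else:
        ruler = nlp.add_pipe("entity_ruler")

    # Exact-string patterns for the two entities mentioned in the question.
    ruler.add_patterns([
        {"label": "ORG", "pattern": "Binance"},
        {"label": "ORG", "pattern": "Coinbase"},
    ])
    return nlp


nlp = build_pipeline()
doc = nlp("Binance and Coinbase are cryptocurrency exchanges.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

With the pretrained model you would call `build_pipeline("en_core_web_md")` instead. String patterns like these are case-sensitive exact matches, so variants such as "binance" would need their own patterns.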

1 reply

nitinthewiz Jun 24, 2022
Author

Thank you for pointing me in this direction, which seems more native than using a third-party tool. I'd still like to know if we can contribute to improving the models, though.

polm · 2022-06-24T04:03:48Z


We don't collect training data from users to directly improve the models or to address specific issues. If the models are performing poorly for your use case or for the entities you care about, you should train your own model. The pretrained models are broadly useful and great for getting started, but our expectation is that for serious applications you will usually train your own model on your own data.
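If you go the route of training your own model, one way to start is to store your corrected examples in spaCy's binary training format. The following is only a rough sketch: the `CORRECTIONS` list and the `to_docbin` helper are hypothetical, the two sentences are invented examples, and a real training set would need far more data than this.

```python
import spacy
from spacy.tokens import DocBin

# Hypothetical corrected examples: (text, [(start_char, end_char, label), ...])
CORRECTIONS = [
    ("Binance halted withdrawals today.", [(0, 7, "ORG")]),
    ("She moved her funds to Coinbase.", [(23, 31, "ORG")]),
]


def to_docbin(examples):
    """Hypothetical helper: convert character-offset annotations into a DocBin."""
    nlp = spacy.blank("en")  # only the tokenizer is needed here
    db = DocBin()
    for text, spans in examples:
        doc = nlp.make_doc(text)
        ents = []
        for start, end, label in spans:
            # char_span returns None if the offsets do not align with token boundaries.
            span = doc.char_span(start, end, label=label)
            if span is None:
                raise ValueError(f"Misaligned span {start}-{end} in: {text!r}")
            ents.append(span)
        doc.ents = ents
        db.add(doc)
    return db


db = to_docbin(CORRECTIONS)
print(len(db))
```

The result can be written out with `db.to_disk("train.spacy")` (the file name is just an example) and referenced from a training config used with `python -m spacy train`.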

Also see #3052 about inaccurate predictions in general.

0 replies
