How do I recommend improvements to NER extraction in spaCy models? #11019
-
Hi, I'm using en_core_web_md and I see a few discrepancies in the entities it extracts from a text. How can I suggest improvements to the models? Is opening a discussion the best way? Or should I collect these cases locally and train the model further, using something like augmenty, to get the correct labels? I understand that entity recognition is only about 85% accurate in these models and about 90% in the transformer model.
Replies: 2 comments 1 reply
-
You could use the Entity Ruler and add it before the ner component: ruler = nlp.add_pipe("entity_ruler", before="ner")
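To make that concrete, here is a minimal sketch of adding patterns to the ruler. The helper name add_override_ruler, the example patterns, and the sample sentence are all made up for illustration, and a blank English pipeline stands in for en_core_web_md so the snippet runs without downloading a model. With the real model you would load it with spacy.load("en_core_web_md") and the ruler would be placed before ner.

```python
import spacy


def add_override_ruler(nlp, patterns):
    """Add an entity ruler with the given patterns (hypothetical helper)."""
    # Put the ruler before "ner" when the pipeline has one, so the ruler's
    # entities are set first and the statistical model predicts around them.
    if "ner" in nlp.pipe_names:
        ruler = nlp.add_pipe("entity_ruler", before="ner")
    else:
        ruler = nlp.add_pipe("entity_ruler")
    ruler.add_patterns(patterns)
    return ruler


# Stand-in pipeline; in practice: nlp = spacy.load("en_core_web_md")
nlp = spacy.blank("en")

# Illustrative patterns: one exact phrase, one token-level pattern.
patterns = [
    {"label": "ORG", "pattern": "Acme Widgets"},
    {"label": "PRODUCT", "pattern": [{"LOWER": "widget"}, {"LOWER": "pro"}]},
]
add_override_ruler(nlp, patterns)

doc = nlp("Acme Widgets released the Widget Pro today.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

This only helps for entities you can describe with patterns. For mistakes that depend on context, training on your own data is the more general fix.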
-
We don't collect training data from users to directly improve the models, or to address specific issues. If the models are performing poorly for your use case or entities you care about, you should train your own model. It's also our expectation that while the pretrained models are broadly useful and great for getting started, for serious applications you should usually be training your own model for your own data. Also see #3052 about inaccurate predictions in general.
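As a rough sketch of what collecting your own corrections for training could look like, the snippet below turns character-offset annotations into a DocBin, the binary format that spacy train reads. The CORRECTIONS list, the build_docbin helper, and the example sentence are hypothetical. A blank English pipeline is used only for tokenization.

```python
import spacy
from spacy.tokens import DocBin

# Hypothetical corrected annotations:
# (text, [(start_char, end_char, label), ...])
CORRECTIONS = [
    ("Acme Widgets opened an office in Berlin.", [(0, 12, "ORG"), (33, 39, "GPE")]),
]


def build_docbin(examples, nlp):
    """Convert (text, spans) pairs into a DocBin (hypothetical helper)."""
    db = DocBin()
    skipped = 0
    for text, spans in examples:
        doc = nlp.make_doc(text)  # tokenize only, no model predictions
        ents = []
        for start, end, label in spans:
            span = doc.char_span(start, end, label=label)
            if span is None:
                # Offsets do not line up with token boundaries; skip them.
                skipped += 1
                continue
            ents.append(span)
        doc.ents = ents
        db.add(doc)
    return db, skipped


nlp = spacy.blank("en")
db, skipped = build_docbin(CORRECTIONS, nlp)
print(len(db), "docs,", skipped, "spans skipped")
```

From there, db.to_disk("./train.spacy") writes a file that can be passed to python -m spacy train via --paths.train, together with a separate dev set and a config file. The file names here are placeholders.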