Does adding new entity types help improve the accuracy of existing entity types #8459
Replies: 1 comment
-
It could certainly help in some situations. spaCy only allows each token to have one entity label, so if your model is labelling some words as an entity when it shouldn't, giving those words a different entity label can reduce the confusion. Whether this works for your specific labels and data depends on many details, so the best way to find out is to train a model with the extra label and compare the results. Based on the description of your data, I'd say it's worth a shot.
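To make the "one label per token" point concrete, here is a rough sketch in plain Python. It is not spaCy code: the sentence, the character offsets, the `bio_tags` helper and the whitespace tokenization are all assumptions made for illustration. It converts character-offset annotations into per-token BIO tags and refuses to give a token two labels.

```python
# Hypothetical helper, not part of spaCy. Whitespace splitting is a
# simplification and is not how spaCy tokenizes text.
def bio_tags(text, spans):
    """Return (token, tag) pairs; raise if two spans claim the same token."""
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((word, start, end))
        pos = end

    tags = ["O"] * len(tokens)
    for span_start, span_end, label in spans:
        first = True
        for i, (word, start, end) in enumerate(tokens):
            if start >= span_start and end <= span_end:
                if tags[i] != "O":
                    # Each token can carry only one entity label.
                    raise ValueError(f"token {word!r} already has a label")
                tags[i] = ("B-" if first else "I-") + label
                first = False
    return [(word, tag) for (word, _, _), tag in zip(tokens, tags)]


# Made-up example sentence with (start_char, end_char, label) annotations.
text = "Jane Smith was appointed Chief Financial Officer"
job_only = [(25, 48, "JOB_TITLE")]
both = [(0, 10, "PERSON"), (25, 48, "JOB_TITLE")]

for token, tag in bio_tags(text, both):
    print(f"{token:10} {tag}")
```

With `job_only`, "Jane" and "Smith" are tagged `O`, just like "was". With `both`, they get an explicit `PERSON` tag, so they can no longer compete for `JOB_TITLE`. Whether that actually improves scores on your data is something only training and evaluating both variants will show.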
That kind of text replacement has a long history in NLP, but it is uncommon with modern methods and is generally neither necessary nor a good idea with spaCy. If you used this particular scheme, spaCy would learn things that are flagrantly untrue, like "human names are always six letters long and all capital letters". It's best to use data as you find it.
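A small plain-Python sketch of why the replacement scheme is misleading. The sentences, offsets and the `mask_spans` helper are made up for illustration, and the two surface features (length, all-caps) are just stand-ins for the kind of cues a model could pick up.

```python
# Hypothetical helper illustrating the replacement scheme; not a spaCy API.
def mask_spans(text, spans, placeholder="PERSON"):
    """Replace each (start, end) character span with a placeholder string."""
    out = []
    pos = 0
    for start, end in sorted(spans):
        out.append(text[pos:start])
        out.append(placeholder)
        pos = end
    out.append(text[pos:])
    return "".join(out)


def surface_features(token):
    """Two crude surface cues: token length and whether it is all caps."""
    return (len(token), token.isupper())


# Made-up sentences; each span covers one name token.
sentences = [
    ("Jane Smith was appointed Chief Financial Officer", [(0, 4), (5, 10)]),
    ("Bo Li is the new Head of Sales", [(0, 2), (3, 5)]),
]

original = {
    surface_features(text[start:end])
    for text, spans in sentences
    for start, end in spans
}
masked = {
    surface_features(token)
    for text, spans in sentences
    for token in mask_spans(text, spans).split()
    if token == "PERSON"
}

print(sorted(original))  # varied lengths, mixed case
print(sorted(masked))    # a single feature combination
```

After masking, every name looks identical on the surface (six letters, all capitals), which is exactly the false regularity described above.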
-
I have a model that does a pretty good job of identifying JOB_TITLES/DESIGNATIONS. As part of improving the model, I wanted to ask whether adding a new entity type called PERSON would help improve the accuracy of the JOB_TITLE entity type.
(The JOB_TITLES are often in the neighbourhood of names.)
P.S. I am not talking about using a PERSON entity model to pre-process the text, replace each PERSON with a special token, and then train the JOB_TITLE entity model. I am asking whether, within the same model, the accuracy of one entity type is improved by adding more entity types.