NER and Overlapped entities #10885
-
Hi all, as an example: is it possible to get this overlap using only the spaCy NER?
Replies: 2 comments 12 replies
-
It does sound like you're more interested in span categorisation than in named entity recognition. Both tasks involve finding substrings in text, but span categorisation allows the spans to overlap. It's supported as of spaCy 3.1 and you can read more about it here. There's an example project on GitHub that you might find useful as a starting point.
Training a model for span categorisation is very similar to training one for named entity recognition in spaCy: you'd still use the spacy train command from the terminal. You'll need to have training data prepared and a configuration file, which can be generated easily from the online interface in the docs. If you're unfamiliar with the spaCy command-line interface or the project structure, then you may appreciate this YouTube video.
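To make the "training data prepared" part a bit more concrete, here is a minimal sketch of how overlapping annotations could be stored for span categorisation. The sentence, the labels and the file name train.spacy are made up for illustration, and the spans key "sc" is assumed to match the one in your config.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin, Span

nlp = spacy.blank("en")  # tokenizer only, no trained components needed
doc = nlp("Bank of America opened a new office in New York City")

# Token indices: Bank(0) of(1) America(2) ... New(8) York(9) City(10)
doc.spans["sc"] = [
    Span(doc, 0, 3, label="ORG"),   # "Bank of America"
    Span(doc, 2, 3, label="GPE"),   # "America", nested inside the ORG span
    Span(doc, 8, 11, label="GPE"),  # "New York City"
]

# Serialise the annotated doc in the binary format that spacy train reads.
# The temp directory is only used here to keep the sketch self-contained;
# point this at whatever path your config expects.
out_path = Path(tempfile.gettempdir()) / "train.spacy"
doc_bin = DocBin()
doc_bin.add(doc)
doc_bin.to_disk(out_path)

# Read the file back to check that the overlapping spans survived.
reloaded = list(DocBin().from_disk(out_path).get_docs(nlp.vocab))
for span in reloaded[0].spans["sc"]:
    print(span.text, span.label_)
```

With a file like this for training and another one for evaluation, the usual spacy train call with a config that contains a spancat component should be able to pick them up.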
-
I thought one of the assumptions of your problem was that you had overlapping spans, in which case NER wouldn't work. Even if you were actually able to train an NER model on that data, you may be overwriting some of your relevant spans.
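A small sketch of what that looks like in practice, using a made-up sentence and labels: doc.ents refuses two spans that share a token, resolving the conflict with spacy.util.filter_spans silently drops one of them, and doc.spans keeps both.

```python
import spacy
from spacy.tokens import Span
from spacy.util import filter_spans

nlp = spacy.blank("en")
doc = nlp("Bank of America opened a new office")

org = Span(doc, 0, 3, label="ORG")  # "Bank of America"
gpe = Span(doc, 2, 3, label="GPE")  # "America", overlaps with the ORG span

# Entities must not share tokens, so assigning both raises a ValueError.
try:
    doc.ents = [org, gpe]
    ents_accept_overlap = True
except ValueError:
    ents_accept_overlap = False

# filter_spans prefers the longest span, so the nested GPE annotation is lost.
kept = filter_spans([org, gpe])
doc.ents = kept

# A span group has no such restriction and stores both annotations.
doc.spans["sc"] = [org, gpe]

print(ents_accept_overlap)
print([(ent.text, ent.label_) for ent in doc.ents])
print([(span.text, span.label_) for span in doc.spans["sc"]])
```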
You can access the label on each span in the same way you do for entities, like this: [span.label_ for span in doc.spans['sc']]
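For completeness, a self-contained version of that one-liner. The spans are set by hand here to stand in for model predictions, and the sentence and labels are hypothetical; with a trained pipeline, doc.spans['sc'] would be filled in by the span categoriser instead.

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("New York City is large")

# Stand-in for predictions: two overlapping spans under the "sc" key.
doc.spans["sc"] = [
    Span(doc, 0, 3, label="CITY"),   # "New York City"
    Span(doc, 0, 2, label="STATE"),  # "New York"
]

# Same access pattern as for doc.ents.
labels = [span.label_ for span in doc.spans["sc"]]

# Text and character offsets are available on each span as well.
details = [
    (span.text, span.label_, span.start_char, span.end_char)
    for span in doc.spans["sc"]
]

print(labels)
print(details)
```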
How much training data do you have? This might just be an artifact of this particular training run: if you don't have many examples, I wouldn't expect the model to pick up on the same information every time.