PoS Tags as additional features for training NER #9641
-
Hi,
Replies: 1 comment
-
This is technically possible, and I can understand why it sounds attractive, but I actually suspect it won't help. The main reason is that PROPN vs. NOUN is one of the places where the tagger makes the most mistakes, so the tags may not be accurate enough to really help. Another is that with the current default configs, the ner model already uses the exact same tok2vec features as the tagger, so it's already taking the same features into consideration. But I haven't tested this, and I could be wrong, especially for particular domains or datasets.

If you do want to try this, you should look at where the POS annotation is coming from in your pipeline. It might be from morpholo…

If you're sourcing the POS-related components from another pipeline like …

More background on annotating components: https://spacy.io/usage/training#annotating-components

Then you want to add POS to the attrs in the embedding config:

[components.ner.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY","POS"]
rows = [5000,2500,2500,2500,100,2500]
include_static_vectors = false
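One thing that is easy to get wrong when adding POS is keeping attrs and rows the same length, since each attribute gets its own embedding table. Here is a minimal sketch of a sanity check for the snippet from the reply, using only the Python standard library. It assumes the values in the section are JSON-like (as they are in this snippet); load_embed_settings and rows_per_attr are hypothetical helper names, not part of spaCy, and this is not how spaCy itself loads configs.

```python
import configparser
import json

# The embed section from the reply, copied as plain text.
CONFIG_SNIPPET = """
[components.ner.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY","POS"]
rows = [5000,2500,2500,2500,100,2500]
include_static_vectors = false
"""

SECTION = "components.ner.model.tok2vec.embed"


def load_embed_settings(text):
    """Hypothetical helper: parse the embed section into Python values."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Assumption: every value in this section is valid JSON.
    return {key: json.loads(value) for key, value in parser[SECTION].items()}


def rows_per_attr(settings):
    """Hypothetical helper: pair each attribute with its table size."""
    attrs, rows = settings["attrs"], settings["rows"]
    if len(attrs) != len(rows):
        raise ValueError(
            f"attrs has {len(attrs)} entries but rows has {len(rows)}"
        )
    return dict(zip(attrs, rows))


embed = load_embed_settings(CONFIG_SNIPPET)
table_sizes = rows_per_attr(embed)
print(table_sizes)
```

If you add POS to attrs but forget the extra entry in rows, the check raises a ValueError instead of leaving the mismatch to surface later during training setup.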