The use of seeding / patterns for improving a NER model #10900
Replies: 2 comments
-
I have what I believe is a very closely related question here: #10894, though that one is a bit more focused on the mechanics of incorporating labels generated by a rule-based component into the NER model training process.
-
You can use a rule-based annotator to create training data which you can then use to train a model; that process is called "weak supervision". You might look at skweak, which is a library for weak supervision in spaCy. There's no easy way to tag arbitrary spans as being "probably entities", the way you might use existence in a glossary as a feature in some older NER systems.
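For illustration, here is a minimal weak-supervision sketch along those lines, assuming a blank English pipeline and the PHENOTYPE pattern from the question (the example text is made up): an EntityRuler applies the patterns, and the resulting doc.ents are converted into spaCy training Examples that a statistical NER component could then be trained on.

import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]},
])

# Illustrative raw text, not from this thread.
texts = [
    "The mitochondrion was enlarged in the affected cells.",
]

examples = []
for text in texts:
    doc = nlp(text)  # the ruler sets doc.ents based on the patterns
    annotations = {
        "entities": [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
    }
    examples.append(Example.from_dict(nlp.make_doc(text), annotations))

# `examples` are now silver-standard training data: they can be fed to an NER
# component's update loop, or serialized (e.g. via DocBin) as a training corpus.

The point of the sketch is that the patterns never touch the NER model directly; they only produce annotations, which then become ordinary training data. Tools like skweak add ways to combine and denoise several such labelling functions before training.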
-
Hi all,
I'm currently tasked with improving an NER model by 'seeding' it with a list of patterns. For example:
{"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]}
The assumption was that adding patterns like this during training would give the model an 'idea' / examples of what the PHENOTYPE entity looks like. However, I'm finding no way to do this.
What I am finding instead is using patterns to match entities in the corpus you're going to annotate, and then manually accepting / rejecting the entities that were found by the simple pattern match.
In this context, seeding / providing a pattern file doesn't do anything directly to the model; it just improves the annotation workflow.
Am I correct that it is not possible to improve a model by seeding alone during training, and that it's just a tool for easier data annotation?
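For reference, the pre-annotation workflow I described looks roughly like this (a sketch, with made-up corpus sentences; the PHENOTYPE pattern is the one above): the patterns are run over raw text and the matched spans are surfaced as candidates for a human to accept or reject.

import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]},
])

# Illustrative corpus, not real data.
corpus = [
    "Each mitochondrion showed abnormal cristae.",
]

for doc in nlp.pipe(corpus):
    for ent in doc.ents:
        # A human annotator would accept or reject each candidate span here
        # before it becomes part of the gold training data.
        print(ent.text, ent.label_, ent.start_char, ent.end_char)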