The use of seeding / patterns for improving a NER model #10900
Replies: 2 comments
-
I have what I believe is a very closely related question here: #10894, though that one is a bit more focused on the mechanics of incorporating labels generated by a rule-based component into the NER model training process.
-
You can use a rule-based annotator to create training data which you can then use to train a model; that process is called "weak supervision". You might look at skweak, which is a library for weak supervision in spaCy. There's no easy way to tag arbitrary spans as being "probably entities", the way you might use existence in a glossary as a feature in some older NER systems.
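For illustration, here is a minimal weak-supervision sketch along those lines, assuming a blank English pipeline and the PHENOTYPE pattern from the question (the example text is made up): an EntityRuler applies the patterns, and the resulting doc.ents are converted into spaCy training Examples that a statistical NER component could then be trained on.

import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]},
])

# Illustrative raw text, not from this thread.
texts = [
    "The mitochondrion was enlarged in the affected cells.",
]

examples = []
for text in texts:
    doc = nlp(text)  # the ruler sets doc.ents based on the patterns
    annotations = {
        "entities": [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
    }
    examples.append(Example.from_dict(nlp.make_doc(text), annotations))

# `examples` are now silver-standard training data: they can be fed to an NER
# component's update loop, or serialized (e.g. via DocBin) as a training corpus.

The point of the sketch is that the patterns never touch the NER model directly; they only produce annotations, which then become ordinary training data. Tools like skweak add ways to combine and denoise several such labelling functions before training.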
-
Hi all,
I'm currently tasked with improving an NER model by 'seeding' it with a list of patterns. For example:
{"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]}
The assumption was that adding patterns like this during training would give the model an 'idea' / examples of what the PHENOTYPE entity looks like. However, I'm finding no way to do this.
What I am finding instead is using patterns to match entities in the corpus you're going to annotate, and then manually accepting / rejecting the entities that were found by the simple pattern match.
In this context, seeding / providing a pattern file doesn't do anything directly to the model; it just improves the annotation workflow.
Am I correct that it is not possible to improve a model by seeding alone during training, and that it's just a tool for easier data annotation?
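For reference, the pre-annotation workflow I described looks roughly like this (a sketch, with made-up corpus sentences; the PHENOTYPE pattern is the one above): the patterns are run over raw text and the matched spans are surfaced as candidates for a human to accept or reject.

import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PHENOTYPE", "pattern": [{"LOWER": "mitochondrion"}]},
])

# Illustrative corpus, not real data.
corpus = [
    "Each mitochondrion showed abnormal cristae.",
]

for doc in nlp.pipe(corpus):
    for ent in doc.ents:
        # A human annotator would accept or reject each candidate span here
        # before it becomes part of the gold training data.
        print(ent.text, ent.label_, ent.start_char, ent.end_char)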