Best practices for NER training with new entity types #8288

michelemarzollo · 2021-06-04T16:41:21Z

michelemarzollo
Jun 4, 2021

Hello,

I have some questions about training a NER model. Although these questions are not totally new, I couldn’t find satisfactory answers in the previous discussions, and I think they could be useful also for other people.

If I want to fine-tune a pre-trained NER pipeline, is it better to fine-tune only the NER component, or to also update the transformer’s weights? And if I train the NER component from scratch is it the same?
Let’s assume I have a manually annotated corpus with a subset of the entity types of the pretrained Spacy models, plus some other entity types. Is it better to fine-tune the existing NER component, or to train it from scratch?
Do you have indications on the number of entities per type that are required to train the model to recognize new entities and achieve good accuracies?

Any suggestion will be appreciated.

Thank you!

polm · 2021-06-05T09:45:01Z

polm
Jun 5, 2021

As a note, when you ask "is it better to X or Y?" type questions, sometimes we can give you an answer based on our experience, but often the best answer is "try both and see what works". Modern machine learning is highly empirical, which means people try stuff until it works, and without getting really into your requirements and data it's hard to make general recommendations.

That said I will try to give some advice.

If I want to fine-tune a pre-trained NER pipeline, is it better to fine-tune only the NER component, or to also update the transformer’s weights? And if I train the NER component from scratch is it the same?

If you have enough data it's usually better to train the tok2vec/transformer as well. Where the borderline for "enough data" is is unclear, but for transformers you need more data. If you have a lot of data there's generally not much benefit to fine tuning.

Let’s assume I have a manually annotated corpus with a subset of the entity types of the pretrained Spacy models, plus some other entity types. Is it better to fine-tune the existing NER component, or to train it from scratch?

Do you care about the entity types that are not in your data? If not then I would skip fine tuning. Fine tuning is possible in the scenario you describe but will be complicated if you want to avoid forgetting the entities you aren't using.

This depends a lot on the quantity of your data and how much it is like the data for the pretrained models.

Do you have indications on the number of entities per type that are required to train the model to recognize new entities and achieve good accuracies?

I wouldn't try training a model with fewer than several hundred examples, though how much is enough depends a lot on the specifics of the entities.

1 reply

michelemarzollo Jun 8, 2021
Author

Thank you very much for your suggestions! I will surely make experiments, but starting from the ideas of somebody who is knowledgeable is really useful!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Best practices for NER training with new entity types #8288

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment 1 reply

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Uh oh!

Best practices for NER training with new entity types #8288

Uh oh!

michelemarzollo Jun 4, 2021

Replies: 1 comment · 1 reply

Uh oh!

polm Jun 5, 2021

Uh oh!

michelemarzollo Jun 8, 2021 Author

michelemarzollo
Jun 4, 2021

Replies: 1 comment 1 reply

polm
Jun 5, 2021

michelemarzollo Jun 8, 2021
Author