How to use the components of the training pipeline with an NER-annotated dataset? #10882
-
My training and test datasets were annotated for NER, so they contain only sentences with entities. However, I would like to know whether I can still use other components of the training pipeline (e.g., tagger, morphologizer, trainable_lemmatizer, and others) in my evaluation. When I try, I get the error:

> [E143] Labels for component 'morphologizer' not initialized. This can be fixed by calling add_label, or by providing a representative batch of examples to the component's […]

Is there any way to configure a default "initialization" for these components, or won't I be able to do this with my NER-annotated documents? To create my .spacy file I am using this conversion function (just so you can understand what annotation I am using) that I wrote to convert the previous spaCy annotation format to the new one:
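The original conversion function was not included here, but a minimal sketch of such a conversion looks like the following, assuming v2-style `(text, {"entities": [(start, end, label)]})` tuples (the sample text and offsets are illustrative, not from the original post):

```python
import spacy
from spacy.tokens import DocBin

# Hypothetical v2-style NER annotations: (text, {"entities": [(start, end, label)]})
TRAIN_DATA = [
    ("Apple is looking at buying U.K. startup for $1 billion",
     {"entities": [(0, 5, "ORG"), (27, 31, "GPE")]}),
]

nlp = spacy.blank("en")
db = DocBin()
for text, annotations in TRAIN_DATA:
    doc = nlp.make_doc(text)
    ents = []
    for start, end, label in annotations["entities"]:
        span = doc.char_span(start, end, label=label)
        if span is not None:  # skip spans that don't align with token boundaries
            ents.append(span)
    doc.ents = ents
    db.add(doc)
db.to_disk("./train.spacy")  # the .spacy file used for training/evaluation
```

Note that such a file contains only entity annotations, which is why components like the morphologizer have nothing to initialize from.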
Is there a way to use some sort of default initialization that enables other training pipeline components to run when evaluating these NER-annotated datasets? Many thanks!
-
Sorry, I don't understand what you want to do here. In evaluation the training pipeline is run as usual and compared to gold data (there is no separate "evaluation pipeline"). Also, if you are evaluating performance on NER data, then changing the tagger, for example, won't affect your evaluation at all, since only NER scores matter. Do you want to build a pipeline for making predictions that includes pretrained components as well as the component you just trained? If so, you can do that by sourcing components. Maybe you could explain what you're trying to do from the start? I notice you've had several connected issues recently; it might help us to have some more perspective on your project.
I'm not sure from your sample code if this describes your situation, but if your JSON data is the spaCy v2 format then …
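For reference, sourcing components is done in the training config. A minimal sketch, assuming a pretrained pipeline such as `en_core_web_sm` for the tagger and a locally trained NER model (the paths and component names are illustrative):

```ini
[nlp]
lang = "en"
pipeline = ["tagger", "ner"]

[components.tagger]
# reuse the tagger from an installed pretrained pipeline
source = "en_core_web_sm"

[components.ner]
# reuse the NER component you trained yourself
source = "./my_ner_model"

[training]
# don't update the sourced tagger during training
frozen_components = ["tagger"]
```

Sourced components come with their weights and labels already initialized, which avoids errors like E143 for components you don't intend to train.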