Precision on spaCy pipeline and the possibility to use the same base models independently #10787
For details on preparing training data, see the training data section of the docs. For POS tags and dependency annotations in particular, it might be easiest to convert from a CoNLL-U file; see here.

You can add multiple copies of the same component to a pipeline; each instance just needs its own name. See the double NER project for an example of two NER components. However, using two part-of-speech taggers or dependency parsers won't really work, since there's only one place in a Doc object to put POS or dependency annotations.

Is the goal to compare the output of your model with the pretrained spaCy models? If so, it might make sense to just have two pipelines. Also, can you clarify why you're training a custom tagger/parser? It's not something that's required very often, so extra background on your problem might help us understand your goals better.
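A minimal sketch of the "multiple copies of the same component" idea, assuming spaCy v3: each `add_pipe` call with the same factory just needs a distinct `name`. (This uses a blank pipeline and NER for illustration; the second component here is untrained and would still need its own training.)

```python
import spacy

# Start from a blank English pipeline (no model download needed).
nlp = spacy.blank("en")

# Two instances of the same "ner" factory, each under its own name.
nlp.add_pipe("ner")                     # default name: "ner"
nlp.add_pipe("ner", name="ner_custom")  # second instance, distinct name

print(nlp.pipe_names)  # ['ner', 'ner_custom']
```

To reuse an already-trained component from another pipeline instead of a blank one, `nlp.add_pipe("ner", name="ner_custom", source=other_nlp)` copies it over under the new name.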
Hello,
I have a question about the spaCy pipeline and the right way to implement my idea. I'm trying to add a custom spaCy tagger, train it on custom data, and use it in a pipeline alongside the pretrained tagger of the model I'm using. From what I understand, adding this custom tagger to an existing pipeline would replace its existing tagger. Is there a way to add multiple instances of the same spaCy component to the same pipeline, in this fashion: ["original spaCy tagger", "custom tagger"]?
I have the same question for the base DependencyParser model.
If that's possible, could you point me to the way to do this and to correctly build the spaCy Doc objects for training (especially, how to point to the correct Doc attribute to store my custom tags)?
Thanks a lot!