Merging of two different pipelines that use transformers #6366
-
Hello, I have run into some trouble that didn't occur in spaCy 2.3.0. I have trained two different pipelines, one for tagging and parsing and another for tagging and NER. When I try to add the parsing component from the first pipeline to the other, the precision of the component drops drastically, I suppose because the transformer model has been fine-tuned to a different set of weights than the one the parsing component was trained with. I have tried to add the transformer component from the second model to the first, but that just drops the accuracy of the NER component. Is there any way to merge these components and still preserve the original accuracy?
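Roughly what I am doing, for illustration (the pipeline paths below are just placeholders for my two trained pipelines):
import spacy

# two separately trained transformer pipelines (placeholder paths)
nlp_parse = spacy.load("./model_tagger_parser")  # transformer + tagger + parser
nlp_ner = spacy.load("./model_tagger_ner")       # transformer + tagger + ner

# sourcing the parser into the NER pipeline: the parser now receives embeddings
# from a transformer fine-tuned for tagging/NER, not for parsing, and its scores drop
nlp_ner.add_pipe("parser", source=nlp_parse)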
-
What exactly are the configurations of your pipelines, i.e. are you using Tok2Vec or Transformer listeners? In spaCy 2.3, each component would have its own Tok2Vec layer, so there would be no multi-task learning and no interference. In spaCy 3, you have the possibility to define a Tok2Vec layer or transformer only once in the pipeline, and then use a listener to fetch its outputs in the different components. This is documented here. My guess is that somehow, after adding a component from another pipeline, that component ends up "listening" to the wrong transformer/tok2vec component. This is why its accuracy drops.
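For example, in any trained transformer pipeline you can see this wiring in the config (the model name below is just an example):
import spacy

nlp = spacy.load("en_core_web_trf")
print(nlp.pipe_names)
# e.g. ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

# the parser has no embedding layer of its own: its tok2vec is a listener
# that fetches the output of the shared "transformer" component
print(nlp.config["components"]["parser"]["model"]["tok2vec"]["@architectures"])
# e.g. 'spacy-transformers.TransformerListener.v1'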
-
Hello Sofie, I'm using TransformerListener, and I'm combining the pipelines by using add_pipe with source pointing to the other pipeline. I suppose that I have to change the name of the transformer component from the parser pipeline so that it matches the listener's upstream_name. I also suppose that I need to add this parameter to the components.parser.model.tok2vec section of the config.cfg file, and that I do not need to retrain the model. Another possible solution, if I understood correctly, could be not to use TransformerListener at all and just to train independent components? I have started training a pipeline with only the parser component, without the listener, using an independent Tok2VecTransformer.
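Something like this is what I have in mind for the config change (the component name parser_transformer is just a placeholder, and this assumes a spacy-transformers version where TransformerListener accepts an upstream argument):
# excerpt of config.cfg: make the parser listen to a specifically named
# transformer component instead of any transformer ("*")
[components.parser.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
upstream = "parser_transformer"

[components.parser.model.tok2vec.pooling]
@layers = "reduce_mean.v1"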
-
Yes - I think you should be able to make this work with a little hacking to avoid retraining. First you add the transformer from the other pipeline and give it a new name with nlp.add_pipe(transformer, name="other_transformer", source=...). Then you fetch the component from the other pipeline that is trained on other_transformer, let's say it's the parser component. I think some hack like this should work:
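A sketch of what I mean (the exact nesting of the layers differs a bit between versions, so this walks the parser's model tree to find the listener rather than hard-coding the layers[0] index):
# fetch the sourced parser and point its TransformerListener at the renamed
# transformer component
parser = nlp.get_pipe("parser")
for node in parser.model.walk():
    if hasattr(node, "upstream_name"):  # the TransformerListener layer
        node.upstream_name = "other_transformer"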
Because layers[0] should be the TransformerListener, if I'm not mistaken. You might also have to reset the listeners of the corresponding components and call nlp._link_components() again so that the listeners get hooked up to the right upstream component. Ideally, yes, this would have been set correctly in the config files before training, but you'd need this PR: explosion/spacy-transformers#230. And yes - an entirely different solution is to have a separate Tok2Vec or Transformer per component, so that there is no listener and no shared embedding layer at all.
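In the config, that fully independent setup would look roughly like this (the transformer name and span settings below are just illustrative):
# excerpt: the parser embeds text with its own transformer instead of
# listening to a shared "transformer" component
[components.parser.model.tok2vec]
@architectures = "spacy-transformers.Tok2VecTransformer.v1"
name = "bert-base-multilingual-cased"
grad_factor = 1.0

[components.parser.model.tok2vec.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96

[components.parser.model.tok2vec.pooling]
@layers = "reduce_mean.v1"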
-
Thanks Sofie for the help. I have applied the PR, but I get some errors; I guess the config checker needs to be updated too.
Removing the upstream attribute from TransformerListener now produces an error as well.
Also, after applying the hack from your snippet I get an error saying there is no attribute upstream_name.
-
Hm, I added a unit test to the PR and it seems to work just fine, cf this edit which shows how to change the config. The config checker uses the function declarations, so in principle I think the PR should be fine (but I could be wrong, as I don't have enough information to reproduce your specific use-case).

It's hard to say much about your first error, or about what's weird in your second error, without the full tracebacks and the exact config you're running.

With respect to your final error, "no attribute upstream_name": can you double check the type of the object you are setting that attribute on? It should be the TransformerListener layer itself.

Anyway. Like I mentioned before, all of this is a bit hacky, and it's not ideal to be changing the functions and configs between training and prediction. I was hoping to help you avoid retraining, but it looks like things have gotten more complex, and retraining (either with the fix from the PR, or with entirely independent Tok2Vec/Transformer components) will be the best option by far. Then you should be able to combine the components as you originally described.
-
Dear Sofie, thank you very much. I have changed the configuration; I was adding it to the wrong place. I have loaded the configuration, added the components, and the accuracy is unchanged :-) All the best,
-
So, just in case others might be having this problem as well:
# combine them into one pipeline:
import spacy
nlp = spacy.load("da_core_news_trf", exclude="ner")
nlp_ner = spacy.load("da_dacy_small_ner_fine_grained")
nlp.add_pipe(factory_name="transformer", name="ner-transformer", source=nlp_ner)
comp = nlp.add_pipe(factory_name="ner", source=nlp_ner)
# make sure that it listens to the correct component
comp.tok2vec.layers[0].layers[0].upstream_name = "ner-transformer"
nlp._link_components() # unsure if this is needed?
doc = nlp("Ord som Aarhus og kl. 07:30 bliver i denne tekst annoteret")
# check that everything works as intended:
for ent in doc.ents:
print(ent)
print(ent.label_)