Issue with loading transformers into pipelines #10613
-
I am experimenting with adding transformers to spaCy models and have run into an issue with loading them. Initially I wanted to load a transformer model that has been finetuned on a specific task and see whether the knowledge gained there would be of any use when training it on standard tasks. However, using the code from the docs leads to both the transformer itself and its tokenizer not being loaded (i.e. they remain `None`).

Upon some further digging, I've found that the same issue is present even in the most basic case, i.e. loading the transformer into a blank English model using the default config. It is hard for me to debug this further, as the stack trace goes into spacy-transformers and then thinc, but for some reason the `hf_model` is initialized as follows:

Is my procedure wrong? If so, what would be the correct one?

How to reproduce the behaviour
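Roughly the following, a minimal sketch of the docs example (assuming spacy-transformers is installed; `DEFAULT_CONFIG` is imported as in the spaCy API docs):

```python
import spacy
from spacy_transformers.pipeline_component import DEFAULT_CONFIG

# Add the transformer component to a blank pipeline with its
# default configuration.
nlp = spacy.blank("en")
trf = nlp.add_pipe("transformer", config=DEFAULT_CONFIG["transformer"])

# At this point both the HF model and its tokenizer are still unset
# (they remain None), which is the behaviour described above.
```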
Your Environment
-
You're just missing the initialize step that actually loads the transformer model based on the config:

```python
import spacy
from spacy_transformers.pipeline_component import DEFAULT_CONFIG

nlp = spacy.blank("en")
trf = nlp.add_pipe("transformer", config=DEFAULT_CONFIG["transformer"])
nlp.initialize()
```

To be honest, most of the tokenizer and transformer config settings should actually have been placed in `[initialize]` rather than `[components]`, but we released the first versions of `transformer` with this in `[components]`, and it would be confusing for users if it changed now.

Related docs on the initialization step: https://spacy.io/usage/training#initialization
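As a quick sanity check, once the pipeline has been initialized it should produce transformer output on processed docs. A sketch using the `doc._.trf_data` extension that spacy-transformers registers (exact tensor contents depend on the model):

```python
# After nlp.initialize(), the HF model and tokenizer are loaded,
# so the component can actually run on text.
doc = nlp("spaCy with a transformer backbone.")

# spacy-transformers stores its output on the doc; before
# initialization this would fail because the model was still None.
print([t.shape for t in doc._.trf_data.tensors])
```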
-
There is now an FAQ post about this process: #10768