Fine-tune a transformer on mps? #12228
-
spaCy itself supports training on Apple Silicon GPUs. The issue was that, until very recently, Torch still had some bugs that led to crashes during training. There is no fundamental difference here between spaCy and Hugging Face transformers, because both ultimately run the same Torch ops. These issues may be resolved now, and it may depend on the model (since there is some variety in the ops used by models). For example, I just tested fine-tuning of our German transformer pipeline, which uses German BERT, and it works fine, even without CPU fallbacks. So, I'd encourage you to just try and see if it works. If you are running into issues with a particular model, let us know.
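As a rough starting point, here is a minimal check of what your locally installed Torch build supports before kicking off a run. The environment variable and the `spacy train` invocation in the comments are assumptions about a typical setup, so adjust them to your own config:

```python
import torch

# Minimal sanity check: is the MPS (Metal) backend compiled into this torch
# build, and is an Apple Silicon GPU actually available at runtime?
print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

# If an op turns out to be unsupported on MPS, PyTorch can fall back to the
# CPU for that op. A typical (assumed) invocation for a training run:
#   export PYTORCH_ENABLE_MPS_FALLBACK=1
#   python -m spacy train config.cfg --gpu-id 0
```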
-
Hello!
I've been using the tok2vec ensemble as the embedding layer for a multi-label (non-exclusive) textcat model. It's now much faster to train on M-series CPUs, so thanks for thinc-apple-ops! I have naturally become curious about the potential for improved model performance via transformers.
My early experiments incorporating distilroberta-base are very promising, and I'd like to experiment with other models, but I don't have access to a GPU other than the M2 GPU. I understand that currently only trf inference is supported on Apple GPUs through spaCy, but one can now train using Apple GPUs through PyTorch; at least, most of the ops are supported and some rely on CPU fallback.
Is the best course of action to fine-tune a trf model on my data using Apple GPUs through Hugging Face (with a classification head, which would be discarded when the model is sourced in the spaCy config), wrap it, and freeze it to train the textcat component on CPU? Are there other setups you might recommend?
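In case it clarifies the question, this is roughly what I have in mind for the Hugging Face side. It's only a hypothetical sketch with placeholder data, label count, output paths, and hyperparameters, assuming a recent transformers version that picks up the MPS device automatically when it's available:

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilroberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=5,  # placeholder label count
    problem_type="multi_label_classification",  # non-exclusive labels
)

print("MPS available:", torch.backends.mps.is_available())

# Tiny placeholder dataset; labels are multi-hot float vectors.
train_dataset = Dataset.from_dict(
    {
        "text": ["example document one", "example document two"],
        "labels": [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]],
    }
).map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="finetune-output",
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # enables padded batching via the default collator
)
trainer.train()

# Keep only the base transformer (the classification head is discarded) so
# the fine-tuned weights can be referenced from the spaCy config and frozen
# while training the textcat component.
model.base_model.save_pretrained("distilroberta-finetuned")
tokenizer.save_pretrained("distilroberta-finetuned")
```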
Since this isn't urgent, I'm happy to wait for Apple GPU training support directly via spaCy, though I understand that could be a ways off. If it's not on your roadmap at all, I could investigate another supported solution like the one above. I'd like to stay in spaCy's ecosystem as much as possible since the projects and config system are so nice!
Thanks for all you do!