Inference speed between distilled and parent model is almost the same #11257
probavee asked this question in Help: Model Advice
Hello!

I trained distilCamemBERT using spaCy's project system for parsing and tagging, hoping for a lighter and faster model, but there is almost no speed difference compared with `fr_core_news_trf`, which is based on CamemBERT. However, the DistilBERT paper suggests a 2x speedup. I ran some benchmarks on this text using different approaches.
My environment:

- Jupyter notebook served with JupyterLab in a Docker container
- GPU: Tesla P100-PCIE-16GB
- CUDA: 11.6
- spaCy 3.4.1

distilcamembert pipeline: `["transformer", "tagger", "morphologizer", "trainable_lemmatizer", "parser"]`
[benchmark plot omitted] (I don't know why the latency spikes: Google Cloud infrastructure, spaCy, or the notebook.)

Then for bigger documents: [benchmark plot omitted]
So I wanted to know if someone has clues about why it isn't twice as fast, and where those spikes come from.
Also, is the transformer component's complexity expected to be quadratic?
Finally, can the training config influence inference speed?
Thank you!
Replies: 1 comment

For a more accurate comparison, train with the … The speed of the other components in the pipeline is similar no matter which transformer model is used, so you won't see a 2x difference in the whole pipeline. If you want to test just the …
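One way to time just the transformer, rather than the whole pipeline, is to temporarily disable every other component with `nlp.select_pipes`; a minimal sketch, assuming the component names from the question (model and texts are placeholders):

```python
import time

import spacy

spacy.require_gpu()
nlp = spacy.load("fr_core_news_trf")  # or the distilled pipeline
texts = ["Un exemple de phrase à analyser pour mesurer la vitesse."] * 200

# Run only the transformer; tagger, morphologizer, lemmatizer, and parser
# are disabled inside this block and restored on exit.
with nlp.select_pipes(enable=["transformer"]):
    list(nlp.pipe(texts[:10]))  # warm-up
    start = time.perf_counter()
    list(nlp.pipe(texts, batch_size=32))
    print(f"transformer only: {time.perf_counter() - start:.2f}s")
```

With the other components out of the way, any speed difference between the distilled and parent transformer should be much more visible.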