Description
I was immensely impressed by the 7.3x speedup demonstrated in the t5 tutorial (though I was only able to reproduce a 4.0x speedup on my machine, which is still good).
However, the same approach only speeds up BART by 1.6x.
I have determined this is not due to the difference in model size: t5-base, which is larger than bart-base, is sped up 3.8x.
Steps to reproduce
Use the t5 notebook but replace:
't5-small' with 'facebook/bart-base'
optimize_model(model.encoder) with optimize_model(model.model.encoder)
optimize_model(model.decoder) with optimize_model(model.model.decoder)
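For context on how I compare the two models, the speedup numbers above come from a simple wall-clock comparison of the generation call before and after optimization. Here is a minimal, self-contained sketch of that kind of measurement; the helper names and the dummy workloads are illustrative stand-ins, not code from the notebook:

```python
import time

def median_latency(fn, warmup=3, runs=10):
    """Call fn a few times and return the median wall-clock latency in seconds."""
    for _ in range(warmup):
        fn()  # warmup iterations are discarded (compilation, caches, etc.)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

def speedup(baseline_fn, optimized_fn):
    """Speedup ratio = baseline median latency / optimized median latency."""
    return median_latency(baseline_fn) / median_latency(optimized_fn)

# Illustrative only: stand-ins for the baseline and optimized forward passes.
slow = lambda: time.sleep(0.02)
fast = lambda: time.sleep(0.005)
print(f"{speedup(slow, fast):.1f}x")
```

In the real notebook the two callables would be the model's generate call before and after optimization, run on the same inputs.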
Expected Behavior
Speedup comparable to that seen in T5.
Actual Behavior
A comparatively small 1.6x speedup.
Your environment
The Docker container from the README.