Following our new generalized model approach (#64), we need to instantiate an object for Hugging Face transformer models for computer vision tasks.