Hi there! Just want to quickly congratulate you on all the effort put into this project!
Could you share some information on what setup you used for the training of the transformer model?
- how many GPUs, and for how long
- how many steps
- what batch size
It would be helpful to have this information to better understand the cost of training models.