Thanks for your great work!
I plan to train an EfficientViT classification model on the ImageNet-22k dataset to get a better backbone initialization for open-vocabulary object detection.
I know that the released classification models are currently all pretrained on the ImageNet-1k dataset.
But I'm wondering how the training hyperparameters should be set for ImageNet-22k pretraining. Also, should I initialize from the 1k-pretrained model or train from scratch, and which one is preferred?
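To make the second question concrete, here is a minimal sketch (plain PyTorch; the checkpoint path, function name, and the assumption that the model already has a fresh 22k-way head are all hypothetical) of what I mean by initializing 22k pretraining from the released 1k checkpoint:

```python
import torch

# Minimal sketch (names are hypothetical): initialize ImageNet-22k pretraining
# from the released ImageNet-1k checkpoint, dropping the 1000-class head.

NUM_CLASSES_22K = 21841  # assumed ImageNet-22k (fall11) class count

def load_1k_backbone(model: torch.nn.Module, ckpt_path: str) -> torch.nn.Module:
    """Copy ImageNet-1k weights into `model` (which is assumed to already have
    a freshly initialized 22k-way classifier), skipping mismatched tensors."""
    state_dict = torch.load(ckpt_path, map_location="cpu")
    if "state_dict" in state_dict:  # unwrap if the checkpoint is nested
        state_dict = state_dict["state_dict"]
    model_state = model.state_dict()
    # Keep only tensors whose shapes still match (this drops the 1000-way head).
    filtered = {
        k: v for k, v in state_dict.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    skipped = set(model_state) - set(filtered)
    print(f"Loaded {len(filtered)} tensors from the 1k checkpoint; "
          f"{len(skipped)} tensors (e.g. the head) keep random init.")
    model.load_state_dict(filtered, strict=False)
    return model
```

Versus this, I could also just train the whole model from scratch on ImageNet-22k.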
Hoping for your reply, @han-cai! Thanks!!!