I want to fine-tune the model on a dataset with over 1,000 categories to recognize fine-grained classes. However, when I include all of the categories in label_list, the input exceeds BERT's maximum sequence length of 512 tokens during processing.
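To make the problem concrete, here is a rough back-of-the-envelope sketch of why the full label list cannot fit in a single BERT input. The per-label token counts are assumptions for illustration, not measurements from my tokenizer:

```python
# Rough arithmetic showing why 1,000+ category names overflow one BERT input.
# All per-label numbers below are assumed averages, not measured values.
MAX_LEN = 512            # BERT's maximum sequence length
num_labels = 1000        # my dataset has over 1,000 fine-grained categories
tokens_per_label = 3     # assumed average subword tokens per category name
sep_tokens = 1           # assumed one separator token between labels

label_prompt_tokens = num_labels * (tokens_per_label + sep_tokens)
print(label_prompt_tokens)            # 4000
print(label_prompt_tokens > MAX_LEN)  # True -- the label list alone overflows
```

Even with optimistic assumptions, the label list alone is several times the 512-token budget, before any of the actual input text is added.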
How should I address this problem? Should I switch to a BERT variant with a longer maximum sequence length, or are there other methods that can support this many categories? Thank you for your assistance!