@Chtholly17 @yunfeixie233 @cezhou3 @HeartyHaven Thanks for your great work and open-source datasets. They've really helped a lot.
Could you please share the pre-training and fine-tuning loss curves? We've been customizing the architecture and pre-training on your datasets, but our loss looks a little odd: it stops decreasing after about 0.2-0.3 epochs.