Hello,
Based on the details provided in the paper, would a 6M-parameter model be too small for 4 million data samples? Could you share any insights on the relationship between model size and data quantity?
Thank you!