We are trying to use your pipeline to train a smaller model, such as Qwen3-1.7B. Is it appropriate to keep your hyperparameter settings and use a batch size of 128, or should we choose a smaller batch size, such as 64?