Thanks for your remarkable open-source work. The self-train part of the README says:
> Regarding Llama3-8B-Instruct and Mistral-7B: MetaMATH, we use the default repo of [[MAmmoTH]](https://github.com/TIGER-AI-Lab/MAmmoTH) to train the policy model and evaluate.
> Regarding SciGLM-6B, we use the default repo of [[SciGLM]](https://github.com/THUDM/SciGLM) to train the policy model and evaluate.
Since the collected SFT data and models are available, why did you choose these two repos for SFT instead of writing training code like PRM/train_mistral.py in your own repo? Is this just for convenience, or do the corresponding repos have performance advantages for specific models/datasets?
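
For concreteness, by "writing training code" I mean something like the minimal SFT sketch below, using HuggingFace `transformers`. The model name, toy data, and hyperparameters are placeholders I made up for illustration, not the actual setup of PRM/train_mistral.py or the MAmmoTH/SciGLM repos.

```python
# Minimal SFT sketch (illustration only): fine-tune a causal LM on
# prompt+answer text, in the same spirit as a standalone train script.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder policy model; the repos above use their own checkpoints.
model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Toy SFT data: each example is a prompt concatenated with its target answer.
examples = [
    {"text": "Question: 2 + 2 = ?\nAnswer: 4"},
    {"text": "Question: 3 * 3 = ?\nAnswer: 9"},
]
dataset = Dataset.from_list(examples).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-out",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Writing something like this directly in your repo would keep the whole pipeline self-contained, which is why I'm curious about the choice to defer to the external repos.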