Thanks for your work! How many GPUs, and how much GPU memory, did you use to train the RM and PPO? I noticed in the code that you don't use LoRA by default. Could you provide the LoRA settings?