Question about parameter settings #44

@AragornHorse

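I ran my tests on 8xH100 with Qwen2.5-7B-base: 1 GPU holds the reference model, 4 GPUs do rollout, and 3 GPUs do training.
The original parameters cause OOM, so these are the main settings I am using now: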
beta = 0.04
all_steps = 1000
Q_batch_size = 4
num_pre_Q = 8
train_batch_size = 2
gen_update_steps = 16
save_steps = 200
compute_gen_logps = True
clip_param = 0.2
gradient_accumulation_steps = 4
lr = 1e-6
max_tokens = 700
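
To make the scale of these settings easier to discuss, here is a rough sketch of the batch sizes they imply. It rests on assumptions that the post does not confirm: that `Q_batch_size` is the number of questions per rollout batch, that `num_pre_Q` is the number of sampled answers per question, and that the 3 training GPUs run data-parallel with `train_batch_size` samples per micro-batch. `num_train_gpus` and the two derived names are hypothetical and used only for illustration.

```python
# Settings copied from the post above.
Q_batch_size = 4
num_pre_Q = 8
train_batch_size = 2
gradient_accumulation_steps = 4

# Assumption: 3 of the 8 GPUs train in data-parallel, as described in the post.
num_train_gpus = 3

# Assumption: a rollout batch holds Q_batch_size questions,
# each with num_pre_Q sampled answers.
samples_per_rollout_batch = Q_batch_size * num_pre_Q

# Assumption: one optimizer step consumes train_batch_size samples per
# micro-batch, accumulated gradient_accumulation_steps times on each GPU.
samples_per_optimizer_step = (
    train_batch_size * gradient_accumulation_steps * num_train_gpus
)

print(samples_per_rollout_batch)   # 32
print(samples_per_optimizer_step)  # 24
```

If these assumptions hold, every optimizer step sees at most 24 samples, which may be worth comparing against the effective batch size of the original configuration.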

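I spent about 1 hour training for 800 steps on gsm, and the gsm accuracy measured with evalscope only rose from 0.08 to 0.14, which falls short of the results shown in this repository's README. Is this caused by my parameter settings? If so, is there a way to improve them?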
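
For reference, the figures in the question can be turned into a per-step time and an accuracy gain. This is plain arithmetic on the reported numbers; "about 1 hour" is treated as exactly 3600 seconds, so the per-step time is only approximate.

```python
# Numbers reported in the question.
steps = 800
wall_clock_seconds = 60 * 60  # "about 1 hour", so this is approximate
acc_before, acc_after = 0.08, 0.14

seconds_per_step = wall_clock_seconds / steps
absolute_gain = acc_after - acc_before
relative_gain = absolute_gain / acc_before

print(f"{seconds_per_step:.1f} s/step")  # 4.5 s/step
print(f"{absolute_gain:+.2f} absolute")  # +0.06 absolute
print(f"{relative_gain:.0%} relative")   # 75% relative
```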
