What are the evaluation parameters for the Qwen3-1.7b thinking model on the math-500 dataset? #1805
Unanswered
Nothing-is-important
asked this question in
Q&A
Replies: 0 comments
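I use the template {problem}\nPlease reason step by step, and put your final answer within \boxed{}.
With max_out_len=32768, top_k=0.6, top_p=0.95, min_p=0, do_sample=True, temperature=0.6, and 64 generated responses, I get 91.24, which still falls short of the 93.4 reported on the official site.
Is the official metric pass@64 or accuracy? What evaluation parameters were used?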
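To make the two candidate metrics concrete, here is a minimal sketch of how each could be computed from per-problem correct counts. It assumes the 64 responses are sampled per problem, uses the standard unbiased pass@k estimator, and runs on made-up toy counts; the function names are hypothetical, and it says nothing about which metric the official number actually uses.

```python
from math import comb


def mean_accuracy(correct_counts, n):
    """Average per-sample accuracy: fraction of correct responses, averaged over problems."""
    return sum(c / n for c in correct_counts) / len(correct_counts)


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem with n samples, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts, n, k):
    """pass@k averaged over all problems."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical toy data: number of correct responses out of 64 for four problems.
n = 64
correct_counts = [64, 60, 32, 0]

acc = mean_accuracy(correct_counts, n)
p64 = mean_pass_at_k(correct_counts, n, k=64)
p1 = mean_pass_at_k(correct_counts, n, k=1)

print(f"mean accuracy over 64 samples: {acc:.4f}")
print(f"pass@64: {p64:.4f}")
print(f"pass@1:  {p1:.4f}")
```

With k=1 the estimator reduces to c/n, so pass@1 averaged over 64 samples is the same as mean accuracy, while pass@64 only asks whether at least one of the 64 responses is correct and is therefore never lower.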