Description
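Specific model with the bug
DeepSeek-R1-Distill-Qwen3-8B
Tutorial for the affected model
DeepSeek-R1-Distill-Qwen3-8B GRPO fine-tuning tutorial
Tutorial owner
郭宣伯
Bug description
The tutorial is very hard to read: the logic and the code are disorganized. Please consider reorganizing it.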
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md
Steps to reproduce
The tutorial is very hard to read: the logic and the code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md
Expected behavior
The tutorial is very hard to read: the logic and the code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md
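Environment information
The tutorial is very hard to read: the logic and the code are disorganized. Please consider reorganizing it.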
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md
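Additional information
The tutorial is very hard to read: the logic and the code are disorganized. Please consider reorganizing it.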
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md
Verification
- This issue hasn't been reported before