[Bug] DeepSeek-R1-Distill-Qwen3-8B GRPO fine-tuning tutorial: poor readability, disorganized logic and code, please reorganize #476

@yzw3270978316

Description

Affected model

DeepSeek-R1-Distill-Qwen3-8B

Affected tutorial

DeepSeek-R1-Distill-Qwen3-8B GRPO fine-tuning tutorial

Tutorial owner

郭宣伯

Bug description

The tutorial is hard to read: its logic and code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md

Steps to reproduce

The tutorial is hard to read: its logic and code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md

Expected behavior

The tutorial is hard to read: its logic and code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md

Environment information

The tutorial is hard to read: its logic and code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md

Additional information

The tutorial is hard to read: its logic and code are disorganized. Please consider reorganizing it.
https://github.com/datawhalechina/self-llm/blob/master/models/DeepSeek-R1-Distill-Qwen/05-DeepSeek-R1-0528-Qwen3-8B-GRPO%E5%8F%8Aswanlab%E5%8F%AF%E8%A7%86%E5%8C%96.md

Verification

  • This issue hasn't been reported before

Metadata

Labels

bug (Something isn't working)
