How to Enable requires_grad=True for SFT Model Parameters in open-r1 GRPO Training? #642

tztechno · 2025-05-15T07:56:26Z


I’m experimenting with the open-r1 repo and tried the following flow:

1. Perform SFT (Supervised Fine-Tuning) on a base model such as Qwen2.5.
2. Run GRPO using the fine-tuned model to further improve performance.

However, when I ran GRPO, I observed no learning effect at all.
Upon investigation, I found that all model parameters had requires_grad=False after SFT:

for name, param in model.named_parameters():
    print(f"{name}: {param.requires_grad}")
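Printing every parameter gets noisy on a model this size, so a summary may be easier to read. The following is a minimal sketch, not part of open-r1: `summarize_trainable` is a hypothetical helper name, and it relies only on `named_parameters()`, `requires_grad`, and `numel()`, which PyTorch modules and parameters provide.

```python
def summarize_trainable(model):
    """Return (trainable_count, frozen_count, frozen_names) for a model.

    Hypothetical helper for illustration. `model` is assumed to behave like a
    torch.nn.Module, i.e. it exposes named_parameters() yielding (name, param)
    pairs where each param has .requires_grad and .numel().
    """
    trainable = 0
    frozen = 0
    frozen_names = []
    for name, param in model.named_parameters():
        if param.requires_grad:
            # Parameter will receive gradients during backward()
            trainable += param.numel()
        else:
            # Parameter is frozen; remember its name for inspection
            frozen += param.numel()
            frozen_names.append(name)
    return trainable, frozen, frozen_names
```

Calling `summarize_trainable(model)` right after loading the SFT checkpoint, and again on the model object the trainer actually holds, should show whether the parameters are frozen at load time or at some later step.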

I attempted to manually set requires_grad=True, but that didn't solve the issue. I suspect this might be related to how the model is passed to GRPOTrainer or how it is initialized internally.
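For reference, the manual attempt can be sketched as below. This is an illustration under assumptions rather than a confirmed fix: `enable_grads` is a hypothetical helper name, and it assumes `model` is a PyTorch module that is modified before it is handed to the trainer.

```python
def enable_grads(model):
    """Set requires_grad=True on frozen floating-point parameters.

    Hypothetical helper for illustration. Returns the names of the parameters
    that were switched on, so the result can be checked afterwards.
    """
    changed = []
    for name, param in model.named_parameters():
        # Integer tensors (for example quantized weights) cannot require
        # gradients in PyTorch, so only floating-point parameters are touched.
        if param.is_floating_point() and not param.requires_grad:
            param.requires_grad = True
            changed.append(name)
    return changed
```

If the returned list is non-empty but the flags read False again once training starts, something between this call and the training loop is presumably reloading or re-freezing the model, which would be consistent with the suspicion about initialization above.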

My question is:
How can I correctly configure the model so that requires_grad=True for its parameters during GRPO training in open-r1?
Any advice, or a pointer to a working example or the relevant part of the codebase, would be greatly appreciated!

Thanks in advance.

Replies: 0 comments

Select a reply

Uh oh!

How to Enable requires_grad=True for SFT Model Parameters in open-r1 GRPO Training? #642

Uh oh!

tztechno May 15, 2025

Replies: 0 comments

tztechno
May 15, 2025