Replies: 1 comment
-
I found that the authors' custom settings are published as a modified version of Open R1.
-
According to the recently published CPPO paper and its GitHub repository:
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
https://arxiv.org/abs/2503.22342
https://github.com/lzhxmu/CPPO
it seems this could be supported by adding three parameters to GRPOConfig, following the authors' training script:
https://github.com/lzhxmu/CPPO/blob/main/scripts/CPPO.sh
metric='smallest'
pruning=0.5
allocation=True
Currently, none of these are available as options in GRPOConfig.
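For illustration only, here is a minimal sketch of what such options might look like if the keyword names from scripts/CPPO.sh were adopted as-is. CPPOConfig is hypothetical, not an existing TRL class, and the meaning of each field is my reading of the paper:

```python
from dataclasses import dataclass

from trl import GRPOConfig


@dataclass
class CPPOConfig(GRPOConfig):
    """Hypothetical extension of TRL's GRPOConfig. The three fields below
    mirror the flags in the authors' scripts/CPPO.sh and do not exist in
    current TRL releases."""

    metric: str = "smallest"  # pruning criterion (assumed: drop completions with the smallest |advantage|)
    pruning: float = 0.5      # pruning rate (assumed: fraction of completions dropped per group)
    allocation: bool = True   # dynamic allocation (assumed: refill freed slots with new prompts)


config = CPPOConfig(output_dir="cppo-run")
```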
In v0.16.0, the scale_rewards option was added in response to the Dr. GRPO paper, and I was impressed by how quickly it was supported.
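For reference, that option is set directly on GRPOConfig (the output_dir value here is just a placeholder):

```python
from trl import GRPOConfig

# scale_rewards is available in TRL >= 0.16.0; setting it to False
# disables scaling rewards by the per-group std, the Dr. GRPO setting.
config = GRPOConfig(output_dir="grpo-run", scale_rewards=False)
```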
Are there any plans to add parameters for CPPO as well?