CGPO

Official implementation of CGPO (ICLR 2026): Boosting Multi-Domain Reasoning of LLMs via Curvature-Guided Policy Optimization.

Note: This repository is under active development. The core codebase has been uploaded; documentation, training scripts, and reproducibility instructions are still being organized and will be added soon.

News

2026: CGPO was accepted to ICLR 2026 (The Fourteenth International Conference on Learning Representations).

Citation

If you find this repository useful, please cite:

@inproceedings{liang2026boosting,
  title={Boosting Multi-Domain Reasoning of LLMs via Curvature-Guided Policy Optimization},
  author={Liang, Xize and Yang, Lin and Wang, Jie and Liu, Rui and Lu, Yang and Zeng, Jinliang and Chen, Hanzhu and Li, Dong and Hao, Jianye},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

Folders and files

SandboxFusion
evaluation
scripts/model_merger
utils
verl
.gitignore
LICENSE
README.md
compute_score.py
data_preprocess.py
environment.yaml
general-reasoner-requirements.txt
input.txt
main_ppo.py
temp.py
verifier.py
