feat(grpo_trainer.py): Variational Sequence-Level Soft Policy Optimization (VESPO) #5199

Open
casinca wants to merge 8 commits into huggingface:main from casinca:VESPO

Commits

Commits on Feb 27, 2026

Commits on Feb 28, 2026

Commits on Mar 1, 2026

Commits on Mar 2, 2026

Commits on Mar 5, 2026