New options for preference tuning: rpo alpha, logprobs normalization, reference-free, simpo gamma #327

Merged
timofeev1995 merged 13 commits into main from egor/dpo-improvements
Jun 16, 2025

Commits

Commits on Jun 12, 2025

Commits on Jun 13, 2025