New options for preference tuning: rpo alpha, logprobs normalization, reference-free, simpo gamma #327
Merged
timofeev1995 merged 13 commits into main on Jun 16, 2025
Commits
Commits on Jun 12, 2025
- 9 commits
Commits on Jun 13, 2025
- 4 commits