ORPO (Or DPO?) #1350
exdownloader started this conversation in General
I've seen a few discussions about DPO for sd-scripts, specifically this and this.
However, from what I can tell, there hasn't been further movement on either.
ORPO is related to DPO, and some even consider it superior, since it folds preference alignment directly into the standard fine-tuning objective and does not need a frozen reference model.
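For concreteness, here is a minimal sketch of the ORPO objective as it was proposed for language models (Hong et al., 2024). The function name, the `lam` default, and the use of average per-token log-probabilities are my own illustrative assumptions, not taken from any fork; how best to adapt the odds-ratio term to diffusion denoising losses is exactly the open question for sd-scripts.

```python
import torch
import torch.nn.functional as F

def orpo_loss(logp_w, logp_l, nll_w, lam=0.1):
    """Illustrative ORPO sketch (not from any sd-scripts fork).

    logp_w / logp_l: average per-token log-likelihoods of the
    preferred and rejected responses under the model being trained.
    nll_w: the ordinary SFT negative log-likelihood on the preferred
    response. Note that no reference model appears anywhere.
    """
    # log odds(y) = log p - log(1 - p), kept in log space for stability
    log_odds_w = logp_w - torch.log1p(-torch.exp(logp_w))
    log_odds_l = logp_l - torch.log1p(-torch.exp(logp_l))
    # Push the odds of the preferred response above the rejected one
    ratio_loss = -F.logsigmoid(log_odds_w - log_odds_l)
    return (nll_w + lam * ratio_loss).mean()
```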
I was recently browsing the various forks of sd-scripts and found the following repo, which appears to be under active development.
However, the branch doesn't work for me, erroring out with the following:

I think any kind of preference training would be interesting to explore, and I'd be happy to see this kind of feature in sd-scripts. However, I haven't been able to reach the developer of that fork, so I'm raising awareness here in case there is a chance to gain traction.
After speaking with other AI/ML researchers and developers, I have been informed that regular DPO training is "easy" to implement.
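To give a sense of why it's called easy: in the diffusion setting, Diffusion-DPO (Wallace et al., 2023) reduces to a few lines on top of the usual denoising MSE. A rough sketch, assuming the per-sample denoising errors for a preferred/rejected image pair have already been computed under both the trained UNet and a frozen reference copy; the function name and the `beta` default are illustrative, not taken from any existing implementation.

```python
import torch.nn.functional as F

def diffusion_dpo_loss(model_err_w, model_err_l, ref_err_w, ref_err_l, beta=2500.0):
    """Illustrative Diffusion-DPO sketch.

    Each *_err tensor holds the per-sample MSE denoising error
    ||eps - eps_theta(x_t, t, c)||^2 for the preferred (w) and
    rejected (l) images, under the trained model and a frozen
    reference copy of it.
    """
    # Reward the model for beating the reference at denoising the
    # preferred image by a larger margin than on the rejected one.
    model_diff = model_err_w - model_err_l
    ref_diff = ref_err_w - ref_err_l
    return -F.logsigmoid(-beta * (model_diff - ref_diff)).mean()
```

Wiring that into sd-scripts (paired winner/loser datasets, keeping a second frozen UNet in memory, etc.) is the real engineering work, which is presumably what the fork above attempts.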