Intuitive understanding of the algorithm? #27

@ZeratuuLL

Hey authors! I find your KTO paper quite interesting and would like to explore applying it in my work. I am here to get a better intuitive understanding of the algorithm, especially how it compares with alignment methods such as PPO or DPO. I may be wrong or may have missed some key points in the paper, and I would appreciate it if you could point them out!

Here are some of my questions:

  1. Why that specific form of r_\theta? I didn't find any discussion of the relationship between human utility and the preference probability for a pair of sentences (the Bradley-Terry-style model). To me, the formula for r_\theta appears out of thin air in Definition 3.4, and a natural question is whether a different formulation of r_\theta would give better results. Although the paper explains how this definition compares to classic prospect theory, I find it hard to understand why it should be defined in nats like this.
  2. Why does a biased KL divergence estimate work? It is hard to see that the estimate is "good". The experiments show empirically that it works, but what does that mean? Does it mean the estimate is not actually very noisy, or that the existence of the baseline matters more than its value?
  3. How does KTO work intuitively? The 6th page has a paragraph beginning "Intuitively, KTO works as follows", but does that explanation really hold given that the KL estimate is noisy and has no gradient flowing through it? The loss does not punish a large KL at all, and a positive KL will push the model to favor an even larger r_\theta. This should only make the problem that "the model increases the reward of a desirable example in a blunt manner" even worse.
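
To make question 3 concrete, here is a minimal numeric sketch of the desirable-example term. It assumes the value has the form lambda_D * sigmoid(beta * (r_\theta - z_0)) with loss lambda_D minus that value, and that the KL baseline z_0 is treated as a constant (no gradient flow), as described above. The function names, the default beta = 0.1, and the sample values of r_\theta and z_0 are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def sigmoid(t: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def desirable_value(r: float, z0: float, beta: float = 0.1, lam_d: float = 1.0) -> float:
    """Assumed value of a desirable example: lam_d * sigmoid(beta * (r - z0)).

    r  : implied reward r_theta (log-ratio of policy to reference), in nats
    z0 : KL baseline, treated as a plain number (no gradient flows through it)
    """
    return lam_d * sigmoid(beta * (r - z0))


def desirable_loss_grad_wrt_r(r: float, z0: float, beta: float = 0.1, lam_d: float = 1.0) -> float:
    """Derivative of the assumed loss (lam_d - value) with respect to r only.

    Because z0 is held constant, the only gradient path is through r:
    d/dr [lam_d - lam_d * sigmoid(beta * (r - z0))] = -lam_d * beta * s * (1 - s).
    """
    s = sigmoid(beta * (r - z0))
    return -lam_d * beta * s * (1.0 - s)


# Compare the gradient on r for two hypothetical baselines.
for z0 in (0.0, 5.0):
    for r in (-20.0, 0.0, 5.0, 20.0):
        g = desirable_loss_grad_wrt_r(r, z0)
        print(f"z0={z0:4.1f}  r={r:6.1f}  value={desirable_value(r, z0):.4f}  dloss/dr={g:.5f}")
```

Under these assumptions the gradient on r_\theta is always negative (the loss decreases as r_\theta grows), is largest in magnitude at r_\theta = z_0, and shrinks as r_\theta moves far from z_0 in either direction. A larger z_0 therefore shifts the point where the sigmoid saturates to a larger r_\theta, but nothing in this term directly penalizes z_0 itself, which is the behavior the question is asking about.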

Thanks for reading, and I look forward to hearing back!
