Commit dffd8e6

[Algorithm] DPO
ghstack-source-id: 88af210
Pull-Request: #3427
1 parent 01413ca commit dffd8e6

5 files changed: +1390 -1 lines changed

docs/source/reference/llms.rst

Lines changed: 23 additions & 0 deletions
@@ -517,6 +517,8 @@ SFT
 
     SFTLoss
     SFTLossOutput
+    sft_loss
+    minor_sft_loss
 
 .. currentmodule:: torchrl.data.llm
 
@@ -525,3 +527,24 @@ SFT
     :template: rl_template.rst
 
     TopKRewardSelector
+
+DPO
+~~~
+
+.. currentmodule:: torchrl.objectives.llm
+
+.. autosummary::
+    :toctree: generated/
+    :template: rl_template.rst
+
+    DPOLoss
+    DPOLossOutput
+    dpo_loss
+
+.. currentmodule:: torchrl.data.llm
+
+.. autosummary::
+    :toctree: generated/
+    :template: rl_template.rst
+
+    AcceptanceRewardSelector
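
For readers unfamiliar with the objective behind the new DPO entries, here is a minimal pure-Python sketch of the standard Direct Preference Optimization loss for a single preference pair. It is not the code added by this commit and does not reproduce the signature of ``dpo_loss`` or ``DPOLoss``; the function name ``dpo_loss_sketch``, its argument names, and the default ``beta`` are hypothetical and chosen only for illustration.

```python
import math


def dpo_loss_sketch(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed default, not taken from the commit
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs.

    Hypothetical helper for illustration only; not the torchrl API.
    """
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Scaled preference margin.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == softplus(-margin), written in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Policy identical to the reference: the margin is zero and the loss is log(2).
baseline = dpo_loss_sketch(-1.0, -1.0, -1.0, -1.0)
# Policy favors the chosen response more than the reference does: lower loss.
improved = dpo_loss_sketch(-1.0, -3.0, -2.0, -2.0)
```

The loss falls below ``log(2)`` only when the policy raises the chosen response's log-probability relative to the reference by more than it raises the rejected one.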
