- Reinforcement learning from human feedback (RLHF) has shown great potential in improving natural language processing (NLP) tasks. In NLP, the use of human feedback can help to capture the nuances of language and better align the agent's behavior with the user's expectations.

<figure markdown>
<figcaption>PPO model trained with RLHF outperforming the SFT and base models, as reported by OpenAI. Source: [2]</figcaption>
</figure>
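
In this setup the policy is fine-tuned with PPO against a learned reward model while being kept close to the SFT model. Below is a minimal, illustrative sketch of how the per-token reward that PPO optimizes is often assembled: the scalar reward-model score is credited at the end of the response, and a KL penalty against the SFT reference discourages the policy from drifting too far. The function name, tensor shapes, and the `kl_coef` value are assumptions for illustration, not OpenAI's implementation.

```python
import torch

def shaped_rewards(rm_score: torch.Tensor,
                   policy_logprobs: torch.Tensor,
                   ref_logprobs: torch.Tensor,
                   kl_coef: float = 0.1) -> torch.Tensor:
    """Assemble the per-token reward that PPO optimizes in a typical RLHF setup.

    rm_score:        (batch,) scalar reward-model score for each full response
    policy_logprobs: (batch, seq_len) log-probs of the sampled tokens under the policy
    ref_logprobs:    (batch, seq_len) log-probs of the same tokens under the SFT reference
    """
    # Per-token KL penalty keeps the policy close to the SFT reference model.
    rewards = -kl_coef * (policy_logprobs - ref_logprobs)
    # The scalar reward-model score is credited to the last token of each response.
    rewards[:, -1] += rm_score
    return rewards
```

PPO then maximizes these shaped rewards with its usual clipped surrogate objective; crediting the score only at the final token is a common choice rather than a fixed part of the method.
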
### Summarization

- One of the first examples of applying RLHF in NLP was proposed in [1] to improve summarization using human feedback. Summarization aims to generate summaries that capture the most important information from a longer text. In RLHF, human feedback is used to judge the quality of candidate summaries and to guide the agent towards more informative and concise ones, preferences that automatic metrics such as ROUGE largely fail to capture.
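
To sketch how such comparisons become a training signal, the reward model in [1] is fit to pairs of summaries where annotators picked a preferred one, using a pairwise logistic loss. The function and variable names below are illustrative placeholders, not taken from the paper's code.

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_scores: torch.Tensor,
                    rejected_scores: torch.Tensor) -> torch.Tensor:
    """Pairwise reward-model loss: -log sigmoid(r_chosen - r_rejected).

    chosen_scores / rejected_scores: (batch,) scalar scores the reward model
    assigns to the human-preferred and the rejected summary of the same post.
    """
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Usage sketch: `reward_model` is any network mapping (post, summary) to a scalar.
# loss = preference_loss(reward_model(posts, preferred), reward_model(posts, rejected))
```

The trained reward model then replaces ROUGE as the optimization target when fine-tuning the summarization policy with PPO.
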
## References

[1] [Learning to summarize from human feedback](https://arxiv.org/abs/2009.01325)

[2] [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)