Commit 3ebf667

added instructgpt image in rlhf page
1 parent 2c7b2be commit 3ebf667

File tree

1 file changed: +9 -1 lines changed

docs/reinforcement_learning/rlhf.md

Lines changed: 9 additions & 1 deletion
@@ -56,6 +56,12 @@ Using human feedback in reinforcement learning has several benefits, but also pr
 
 - Reinforcement learning from human feedback (RLHF) has shown great potential in improving natural language processing (NLP) tasks. In NLP, human feedback can help capture the nuances of language and better align the agent's behavior with the user's expectations.
 
+<figure markdown>
+![](../imgs/rl_rlhf_instructgpt.png)
+<figcaption>PPO model trained with RLHF outperforming the SFT and base models, by OpenAI. Source: [2]</figcaption>
+</figure>
+
+
 ### Summarization
 
 - One of the first examples of utilizing RLHF in NLP was proposed in [1] to improve summarization using human feedback. Summarization aims to generate summaries that capture the most important information from a longer text. In RLHF, human feedback can be used to evaluate the quality of summaries and guide the agent towards more informative and concise summaries. This is difficult to capture with metrics like ROUGE, which miss human preferences.
@@ -86,4 +92,6 @@ Using human feedback in reinforcement learning has several benefits, but also pr
 
 ## References
 
-[1] [Learning to summarize from human feedback](https://arxiv.org/abs/2009.01325)
+[1] [Learning to summarize from human feedback](https://arxiv.org/abs/2009.01325)
+
+[2] [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
