Add TRL example notebook to RLHF docs (#26346)

sergiopaniego · web-flow · commit 883b42896a9e · 2025-10-07T11:31:28.000Z
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
diff --git a/docs/training/rlhf.md b/docs/training/rlhf.md
@@ -12,4 +12,5 @@ See the following basic examples to get started if you don't want to use an exis
 
 See the following notebooks showing how to use vLLM for GRPO:
 
+- [Efficient Online Training with GRPO and vLLM in TRL](https://huggingface.co/learn/cookbook/grpo_vllm_online_training)
 - [Qwen-3 4B GRPO using Unsloth + vLLM](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(4B)-GRPO.ipynb)
