Commit 152ea5c

Merge pull request #319 from behroozazarkhalili/add-grpo-advanced-reward-notebook
Add TRL GRPO Reasoning with Advanced Reward notebook
2 parents c8589a8 + b711c5a commit 152ea5c

3 files changed: +883 -1 lines changed

notebooks/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -78,6 +78,8 @@
   title: Scaling Test-Time Compute for Longer Thinking in LLMs
 - local: fine_tuning_llm_grpo_trl
   title: Post training an LLM for reasoning with GRPO in TRL
+- local: trl_grpo_reasoning_advanced_reward
+  title: TRL GRPO Reasoning with Advanced Reward
 - local: medical_rag_and_reasoning
   title: HuatuoGPT-o1 Medical RAG and Reasoning
 - local: fine_tune_chatbot_docs_synthetic

notebooks/en/index.md

Lines changed: 1 addition & 1 deletion
@@ -8,9 +8,9 @@ applications and solving various machine learning tasks using open-source tools
 Check out the recently added notebooks:
 
 - [Post training an VLM for reasoning with GRPO using TRL](fine_tuning_vlm_grpo_trl)
+- [TRL GRPO Reasoning with Advanced Reward](trl_grpo_reasoning_advanced_reward)
 - [Fine-Tuning a Vision Language Model with TRL using MPO](fine_tuning_vlm_mpo)
 - [Fine tuning a VLM for Object Detection Grounding using TRL](fine_tuning_vlm_object_detection_grounding)
-- [Hyperparameter Optimization with Optuna and Transformers](optuna_hpo_with_transformers)
 - [Fine-tuning T5 for Automatic GitHub Tag Generation with PEFT](finetune_t5_for_search_tag_generation)
 
 You can also check out the notebooks in the cookbook's [GitHub repo](https://github.com/huggingface/cookbook).
