Commit b7e2a37

Add updated disclaimer to rft healthbench
1 parent ab40061 commit b7e2a37

2 files changed: +5 -1 lines changed


examples/fine-tuned_qa/reinforcement_finetuning_healthbench.ipynb

Lines changed: 4 additions & 0 deletions
@@ -7,8 +7,12 @@
   "source": [
    "# Reinforcement Fine-Tuning with the OpenAI API for Conversational Reasoning\n",
    "\n",
+   "*This guide is for developers and ML practitioners who have some experience with OpenAIʼs APIs and wish to use their fine-tuned models for research or other appropriate uses. OpenAI’s services are not intended for the personalized treatment or diagnosis of any medical condition and are subject to our [applicable terms](https://openai.com/policies/).*\n",
+   "\n",
    "This notebook demonstrates how to use OpenAI's reinforcement fine-tuning (RFT) to improve a model's conversational reasoning capabilities (specifically asking questions to gain additional context and reduce uncertainty). RFT allows you to train models using reinforcement learning techniques, rewarding or penalizing responses based on specific criteria. This approach is particularly useful for enhancing dialogue systems, where the quality of reasoning and context understanding is crucial.\n",
    "\n",
+   "For a deep dive into the Reinforcement Fine-Tuning API and how to write effective graders, see [Exploring Model Graders for Reinforcement Fine-Tuning](https://cookbook.openai.com/examples/reinforcement_fine_tuning).\n",
+   "\n",
    "### HealthBench\n",
    "\n",
    "This cookbook evaluates and improves model performance on a focused subset of [HealthBench](https://openai.com/index/healthbench/), a benchmark suite for medical QA. This guide walks through how to configure the datasets, define evaluation rubrics, and fine-tune model behavior using reinforcement signals derived from custom graders.\n",
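
The notebook text in this hunk describes "rewarding or penalizing responses based on specific criteria" through rubrics and custom graders. The sketch below is not part of this commit and does not use the Reinforcement Fine-Tuning API. It is a minimal, hypothetical illustration of how a points-based rubric could be turned into a reward in [0, 1]. The names `Criterion`, `grade`, and `RUBRIC` are invented for this example, and the keyword matching is an assumed stand-in for the model-based graders the linked guide covers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One hypothetical rubric item: positive points reward, negative points penalize."""
    description: str
    keywords: tuple[str, ...]
    points: float


def grade(response: str, rubric: list[Criterion]) -> float:
    """Return a reward in [0, 1] for a response under a points-based rubric.

    A criterion counts as met if any of its keywords appears in the response
    (a crude assumption; a real grader would likely use a model to judge this).
    """
    text = response.lower()
    earned = sum(c.points for c in rubric if any(k in text for k in c.keywords))
    max_points = sum(c.points for c in rubric if c.points > 0)
    if max_points == 0:
        return 0.0
    # Clip so that penalties cannot push the reward below zero.
    return min(1.0, max(0.0, earned / max_points))


# Illustrative rubric for "ask questions to gain context" behavior.
RUBRIC = [
    Criterion("Asks a clarifying question", ("?",), 5.0),
    Criterion("Asks about symptom duration", ("how long", "when did"), 3.0),
    Criterion("States a diagnosis without context", ("you definitely have",), -4.0),
]

if __name__ == "__main__":
    print(grade("How long have you had the headache?", RUBRIC))
    print(grade("You definitely have a migraine.", RUBRIC))
```

Normalizing by the total of positive points and clipping keeps the reward bounded, so a response that only triggers penalties scores zero instead of a negative value.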

registry.yaml

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
     - fine-tuning
     - reinforcement-learning-graders

-- title: Reinforcement Fine-tuning with the OpenAI API
+- title: Reinforcement Fine-Tuning for Conversational Reasoning with the OpenAI API
   path: examples/fine-tuned_qa/reinforcement_finetuning_healthbench.ipynb
   date: 2025-05-21
   authors:
