Merge pull request #5108 from mrbullwinkle/patch-47

prmerger-automator[bot] · web-flow · commit 238b80903ff6 · 2025-05-21T15:57:21.000Z
[Azure OpenAI] Update reinforcement-fine-tuning (Preview)
diff --git a/articles/ai-services/openai/how-to/reinforcement-fine-tuning.md b/articles/ai-services/openai/how-to/reinforcement-fine-tuning.md
@@ -1,5 +1,5 @@
 ---
-title: 'Customize o4-mini model with Azure OpenAI and reinforcement fine-tuning'
+title: 'Customize o4-mini model with Azure OpenAI and reinforcement fine-tuning (Preview)'
 description: Learn how to use reinforcement fine-tuning with Azure OpenAI
 manager: nitinme
 ms.service: azure-ai-openai
@@ -10,7 +10,7 @@ author: mrbullwinkle
 ms.author: mbullwin
 ---
 
-# Reinforcement fine-tuning (RFT) with Azure OpenAI o4-mini
+# Reinforcement fine-tuning (RFT) with Azure OpenAI o4-mini (Preview)
 
 Reinforcement fine-tuning (RFT) is a technique for improving reasoning models like o4-mini by training them through a reward-based process, rather than relying only on labeled data. By using feedback or "rewards" to guide learning, RFT helps models develop better reasoning and problem-solving skills, especially in cases where labeled examples are limited or complex behaviors are desired.
 
@@ -404,4 +404,4 @@ We also provide a grader check API that you can use to check the validity of you
 
 Aim for a few hundred examples initially and consider scaling up to around 1,000 examples if necessary. The dataset should be balanced, in terms of classes predicted, to avoid bias and ensure generalization.
 
-For the prompts, make sure to provide clear and detailed instructions, including specifying the response format and any constraints on the outputs (e.g. minimum length for explanations, only respond with true/false etc.)
+For the prompts, make sure to provide clear and detailed instructions, including specifying the response format and any constraints on the outputs (e.g. minimum length for explanations, only respond with true/false etc.)
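
The article text in the last hunk recommends a dataset that is balanced across the predicted classes. A minimal sketch of how that could be checked before submitting a job follows. It assumes the expected labels have already been extracted into a plain Python list. The helper names `class_balance` and `is_balanced` and the `tolerance` threshold are hypothetical, chosen for illustration, and are not part of any Azure OpenAI API.

```python
from collections import Counter


def class_balance(labels):
    """Return each label's share of the dataset as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def is_balanced(labels, tolerance=0.1):
    """Return True if every class share is within `tolerance` of a uniform split.

    `tolerance` is an illustrative threshold (an assumption), not a documented limit.
    """
    shares = class_balance(labels)
    if not shares:
        return False
    uniform = 1 / len(shares)
    return all(abs(share - uniform) <= tolerance for share in shares.values())


if __name__ == "__main__":
    # Hypothetical labels for a true/false task with a 2:1 skew.
    example_labels = ["true"] * 200 + ["false"] * 100
    for label, share in class_balance(example_labels).items():
        print(f"{label}: {share:.2%}")
    print("balanced:", is_balanced(example_labels))
```

A check like this only counts labels; it says nothing about prompt quality or grader validity, which still need separate review.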