Commit 7e0c88d

Merge pull request #3311 from MicrosoftDocs/main
3/4/2025 11:00 AM IST Publish
2 parents bf9270b + b411489 commit 7e0c88d

36 files changed: +1152 −968 lines

articles/ai-services/openai/how-to/fine-tuning-deploy.md

Lines changed: 470 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 74 additions & 0 deletions
---
title: 'Direct preference optimization'
titleSuffix: Azure OpenAI
description: Learn how to use the direct preference optimization technique to fine-tune Azure OpenAI models.
#services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.custom: build-2023, build-2023-dataai, devx-track-python, references_regions
ms.topic: how-to
ms.date: 02/24/2025
author: mrbullwinkle
ms.author: mbullwin
---

# Direct preference optimization (preview)

Direct preference optimization (DPO) is an alignment technique for large language models, used to adjust model weights based on human preferences. It differs from reinforcement learning from human feedback (RLHF) in that it doesn't require fitting a reward model and uses simpler binary preference data for training. DPO is computationally lighter weight and faster than RLHF, while being equally effective at alignment.

## Why is DPO useful?

DPO is especially useful in scenarios where there's no clear-cut correct answer, and subjective elements like tone, style, or specific content preferences are important. This approach also enables the model to learn from both positive examples (what's considered correct or ideal) and negative examples (what's less desired or incorrect).

DPO can make it easier for customers to generate high-quality training datasets. While many customers struggle to generate sufficiently large datasets for supervised fine-tuning, they often already have preference data collected from user logs, A/B tests, or smaller manual annotation efforts.

## Direct preference optimization dataset format

Direct preference optimization files have a different format than supervised fine-tuning files. Customers provide a "conversation" containing the system message and the initial user message, and then "completions" with paired preference data. Only two completions can be provided per example.

There are three top-level fields: `input`, `preferred_output`, and `non_preferred_output`.

- Each element in `preferred_output`/`non_preferred_output` must contain at least one assistant message.
- Each element in `preferred_output`/`non_preferred_output` can only have roles in (assistant, tool).

```json
{
  "input": {
    "messages": [{"role": "system", "content": ...}],
    "tools": [...],
    "parallel_tool_calls": true
  },
  "preferred_output": [{"role": "assistant", "content": ...}],
  "non_preferred_output": [{"role": "assistant", "content": ...}]
}
```

Training datasets must be in `jsonl` format:

```jsonl
{"input": {"messages": [{"role": "system", "content": "You are a chatbot assistant. Given a user question with multiple choice answers, provide the correct answer."}, {"role": "user", "content": "Question: Janette conducts an investigation to see which foods make her feel more fatigued. She eats one of four different foods each day at the same time for four days and then records how she feels. She asks her friend Carmen to do the same investigation to see if she gets similar results. Which would make the investigation most difficult to replicate? Answer choices: A: measuring the amount of fatigue, B: making sure the same foods are eaten, C: recording observations in the same chart, D: making sure the foods are at the same temperature"}]}, "preferred_output": [{"role": "assistant", "content": "A: Measuring The Amount Of Fatigue"}], "non_preferred_output": [{"role": "assistant", "content": "D: making sure the foods are at the same temperature"}]}
```
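
Before uploading, it can help to sanity-check each line of your dataset against the rules above. The following is a minimal validation sketch, not official tooling; the file name `dpo_training.jsonl` is a placeholder:

```python
import json

REQUIRED_KEYS = {"input", "preferred_output", "non_preferred_output"}
ALLOWED_ROLES = {"assistant", "tool"}

def validate_dpo_file(path: str) -> None:
    """Check that every line is valid JSON and follows the DPO format rules."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            example = json.loads(line)  # raises an error on malformed JSON
            missing = REQUIRED_KEYS - example.keys()
            if missing:
                raise ValueError(f"Line {line_no}: missing fields {missing}")
            for field in ("preferred_output", "non_preferred_output"):
                roles = {message["role"] for message in example[field]}
                if "assistant" not in roles:
                    raise ValueError(f"Line {line_no}: {field} needs an assistant message")
                if not roles <= ALLOWED_ROLES:
                    raise ValueError(f"Line {line_no}: {field} has disallowed roles {roles - ALLOWED_ROLES}")

validate_dpo_file("dpo_training.jsonl")
```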

## Direct preference optimization model support

- `gpt-4o-2024-08-06` supports direct preference optimization in its respective fine-tuning regions. The latest region availability is updated on the [models page](../concepts/models.md#fine-tuning-models).

You can use preference fine-tuning with base models, as well as with models that have already been fine-tuned using supervised fine-tuning, as long as they're a supported model and version.

## How to use direct preference optimization fine-tuning

:::image type="content" border="true" source="/azure/ai-services/openai/media/fine-tuning/preference-optimization.gif" alt-text="GIF of preference optimization fine-tuning steps.":::

1. Prepare `jsonl` datasets in the [preference format](#direct-preference-optimization-dataset-format).
2. Select the model, and then select **Direct Preference Optimization** as the customization method.
3. Upload your training and validation datasets. Preview as needed.
4. Select hyperparameters; defaults are recommended for initial experimentation.
5. Review the selections and create a fine-tuning job. You can also create the job programmatically, as shown in the sketch after this list.
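
If you prefer to script the job instead of using the portal, the following sketch shows how a DPO job might be created with the OpenAI Python library. The `method` field with `"type": "dpo"` follows the OpenAI fine-tuning API; the endpoint, key, API version, and `beta` value are placeholder assumptions to adapt to your resource and region:

```python
from openai import AzureOpenAI

# Placeholder values: substitute your own resource endpoint, key, and API version.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key="YOUR-API-KEY",
    api_version="2024-10-21",
)

# Upload the preference-formatted training file.
training_file = client.files.create(
    file=open("dpo_training.jsonl", "rb"), purpose="fine-tune"
)

# Create the fine-tuning job using the DPO method.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-2024-08-06",
    training_file=training_file.id,
    method={
        "type": "dpo",
        "dpo": {"hyperparameters": {"beta": 0.1}},  # beta weights the preference signal
    },
)
print(job.id, job.status)
```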

## Next steps

- Explore the fine-tuning capabilities in the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).
- Review fine-tuning [model regional availability](../concepts/models.md#fine-tuning-models).
- Learn more about [Azure OpenAI quotas](../quotas-limits.md).

Lines changed: 60 additions & 0 deletions

---
title: 'Safety evaluation for fine-tuning (preview)'
titleSuffix: Azure OpenAI
description: Learn how the safety evaluation works for Azure OpenAI fine-tuning.
#services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.custom: build-2023, build-2023-dataai, devx-track-python, references_regions
ms.topic: how-to
ms.date: 02/24/2025
author: mrbullwinkle
ms.author: mbullwin
---

# Safety evaluation for fine-tuning (preview)

GPT-4o, GPT-4o-mini, and GPT-4 are our most advanced models that can be fine-tuned to your needs. As with Azure OpenAI models generally, the advanced capabilities of fine-tuned models come with increased responsible AI challenges related to harmful content, manipulation, human-like behavior, privacy issues, and more. Learn more about risks, capabilities, and limitations in the [Overview of Responsible AI practices](/legal/cognitive-services/openai/overview?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext) and [Transparency Note](/legal/cognitive-services/openai/transparency-note?context=%2Fazure%2Fcognitive-services%2Fopenai%2Fcontext%2Fcontext&tabs=text). To help mitigate the risks associated with advanced fine-tuned models, we have implemented additional evaluation steps to help detect and prevent harmful content in the training and outputs of fine-tuned models. These steps are grounded in the [Microsoft Responsible AI Standard](https://www.microsoft.com/ai/responsible-ai) and [Azure OpenAI Service content filtering](/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cpython-new).

- Evaluations are conducted in dedicated, customer-specific, private workspaces.
- Evaluation endpoints are in the same geography as the Azure OpenAI resource.
- Training data isn't stored in connection with performing evaluations; only the final model assessment (deployable or not deployable) is persisted.

GPT-4o, GPT-4o-mini, and GPT-4 fine-tuned model evaluation filters are set to predefined thresholds and can't be modified by customers; they aren't tied to any custom content filtering configuration you might have created.

## Data evaluation

Before training starts, your data is evaluated for potentially harmful content (violence, sexual, hate and fairness, self-harm; see the [category definitions](/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cpython-new#risk-categories)). If harmful content is detected above the specified severity level, your training job fails, and you receive a message informing you of the categories of failure.

**Sample message:**

```output
The provided training data failed RAI checks for harm types: [hate_fairness, self_harm, violence]. Please fix the data and try again.
```

Your training data is evaluated automatically within your data import job as part of providing the fine-tuning capability.

If the fine-tuning job fails due to the detection of harmful content in training data, you won't be charged.
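
In code, a failed data evaluation surfaces as a failed fine-tuning job. A minimal sketch for inspecting the failure reason, assuming the same placeholder client values as before and a hypothetical job ID:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key="YOUR-API-KEY",
    api_version="2024-10-21",
)

# Retrieve the job and print its status plus error details, if any.
job = client.fine_tuning.jobs.retrieve("ftjob-YOUR-JOB-ID")
print(job.status)  # for example, "failed"
if job.error:
    print(job.error.code, job.error.message)  # the message may list the harm categories
```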

## Model evaluation

After training completes, but before the fine-tuned model is available for deployment, the resulting model is evaluated for potentially harmful responses using Azure's built-in [risk and safety metrics](/azure/ai-studio/concepts/evaluation-metrics-built-in?tabs=warning#risk-and-safety-metrics). Using the same approach to testing that we use for the base large language models, our evaluation capability simulates a conversation with your fine-tuned model to assess the potential to output harmful content, again using the specified harmful content [categories](/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cpython-new#risk-categories) (violence, sexual, hate and fairness, self-harm).

If a model is found to generate output containing content detected as harmful at above an acceptable rate, you'll be informed that your model isn't available for deployment, with information about the specific categories of harm detected:

**Sample message:**

```output
This model is unable to be deployed. Model evaluation identified that this fine tuned model scores above acceptable thresholds for [Violence, Self Harm]. Please review your training data set and resubmit the job.
```

:::image type="content" source="../media/fine-tuning/failure.png" alt-text="Screenshot of a failed fine-tuning job due to safety evaluation." lightbox="../media/fine-tuning/failure.png":::

As with data evaluation, the model is evaluated automatically within your fine-tuning job as part of providing the fine-tuning capability. Only the resulting assessment (deployable or not deployable) is logged by the service. If deployment of the fine-tuned model fails due to the detection of harmful content in model outputs, you won't be charged for the training run.

## Next steps

- Explore the fine-tuning capabilities in the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).
- Review fine-tuning [model regional availability](../concepts/models.md#fine-tuning-models).
- Learn more about [Azure OpenAI quotas](../quotas-limits.md).

Lines changed: 80 additions & 0 deletions

---
title: 'Troubleshooting for Azure OpenAI fine-tuning'
titleSuffix: Azure OpenAI
description: Learn how to troubleshoot Azure OpenAI Service fine-tuning.
#services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.custom: build-2023, build-2023-dataai, devx-track-python, references_regions
ms.topic: how-to
ms.date: 02/24/2025
author: mrbullwinkle
ms.author: mbullwin
---

# Troubleshooting for Azure OpenAI fine-tuning

## How do I enable fine-tuning?

To successfully access fine-tuning, you need the **Cognitive Services OpenAI Contributor** role assigned. Even someone with high-level Service Administrator permissions still needs this role explicitly assigned to access fine-tuning. For more information, review the [role-based access control guidance](/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-contributor).

## Why did my upload fail?

If your file upload fails in the Azure AI Foundry portal, you can view the error message under **Data files**. Hover over **error** (under the status column) and an explanation of the failure is displayed.

:::image type="content" source="../media/fine-tuning/error.png" alt-text="Screenshot of fine-tuning error message." lightbox="../media/fine-tuning/error.png":::

## My fine-tuned model doesn't seem to have improved

- **Missing system message:** You need to provide a system message when you fine-tune; provide that same system message when you use the fine-tuned model. If you provide a different system message, you might see different results than what you fine-tuned for.

- **Not enough data:** While 10 is the minimum for the pipeline to run, you need hundreds to thousands of data points to teach the model a new skill. Too few data points risk overfitting and poor generalization: your fine-tuned model might perform well on the training data but poorly on other data, because it has memorized the training examples instead of learning patterns. For best results, plan to prepare a dataset with hundreds or thousands of data points.

- **Bad data:** A poorly curated or unrepresentative dataset produces a low-quality model. Your model might learn inaccurate or biased patterns from your dataset. For example, if you're training a chatbot for customer service but only provide training data for one scenario (such as item returns), it won't know how to respond to other scenarios. Or, if your training data is bad (contains incorrect responses), your model will learn to provide incorrect results.

## Fine-tuning with vision

**What to do if your images get skipped**

Your images can get skipped for the following reasons:

- They contain CAPTCHAs
- They contain people
- They contain faces

Remove these images. For now, we can't fine-tune models with images containing these entities.

**Common issues**

|Issue| Reason/Solution|
|:----|:-----|
|**Images skipped**| Images can get skipped for the following reasons: they contain CAPTCHAs, people, or faces.<br><br> Remove the image. For now, we can't fine-tune models with images containing these entities.|
|**Inaccessible URL**| Check that the image URL is publicly accessible.|
|**Image too large**| Check that your images fall within our dataset size limits.|
|**Invalid image format**| Check that your images fall within our dataset format requirements.|

**How to upload large files**

Your training files might get quite large. You can upload files up to 8 GB in multiple parts using the [Uploads API](/rest/api/azureopenai/upload-file?view=rest-azureopenai-2024-10-21&preserve-view=true), as opposed to the Files API, which only allows uploads of up to 512 MB.
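
A minimal sketch of the multipart flow with the OpenAI Python library, assuming placeholder endpoint, key, and file name, and an arbitrarily chosen 64 MB part size:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",
    api_key="YOUR-API-KEY",
    api_version="2024-10-21",
)

path = "large_training.jsonl"

# 1. Create the upload, declaring the total size and purpose up front.
upload = client.uploads.create(
    purpose="fine-tune",
    filename=os.path.basename(path),
    bytes=os.path.getsize(path),
    mime_type="text/jsonl",
)

# 2. Send the file in parts.
part_ids = []
chunk_size = 64 * 1024 * 1024  # 64 MB per part (assumed choice)
with open(path, "rb") as f:
    while chunk := f.read(chunk_size):
        part = client.uploads.parts.create(upload_id=upload.id, data=chunk)
        part_ids.append(part.id)

# 3. Complete the upload; the resulting file ID can be used for fine-tuning jobs.
completed = client.uploads.complete(upload_id=upload.id, part_ids=part_ids)
print(completed.file.id)
```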

**Reducing training cost**

If you set the `detail` parameter for an image to `low`, the image is resized to 512 by 512 pixels and is represented by only 85 tokens regardless of its size. This reduces the cost of training.

```json
{
    "type": "image_url",
    "image_url": {
        "url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png",
        "detail": "low"
    }
}
```

**Other considerations for vision fine-tuning**

To control the fidelity of image understanding, set the `detail` parameter of `image_url` to `low`, `high`, or `auto` for each image. This also affects the number of tokens per image that the model sees during training, and so affects the cost of training.

Lines changed: 78 additions & 0 deletions

---
title: 'Vision customization'
titleSuffix: Azure OpenAI
description: Learn how to fine-tune a model with image inputs.
#services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.custom: build-2023, build-2023-dataai, devx-track-python, references_regions
ms.topic: how-to
ms.date: 02/24/2025
author: mrbullwinkle
ms.author: mbullwin
---

# Vision fine-tuning

Fine-tuning is also possible with images in your JSONL files. Just as you can send one or many image inputs to chat completions, you can include those same message types within your training data. Images can be provided either as publicly accessible URLs or as data URIs containing [base64-encoded images](/azure/ai-services/openai/how-to/gpt-with-vision?tabs=rest#call-the-chat-completion-apis).

## Image dataset requirements

- Your training file can contain a maximum of 50,000 examples that contain images (not including text examples).
- Each example can have at most 64 images.
- Each image can be at most 10 MB.

## Format

Images must be:

- JPEG
- PNG
- WEBP

Images must be in RGB or RGBA image mode.

You can't include images as output from messages with the assistant role.

As with all fine-tuning, your training file requires at least 10 examples.

### Example file format

```json
{
    "messages": [
        { "role": "system", "content": "You are a helpful AI assistant." },
        { "role": "user", "content": "Describe the image?" },
        { "role": "user", "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"
                }
            }
        ]},
        { "role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape." }
    ]
}
```
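
If your images aren't publicly accessible, you can embed them in your training examples as base64-encoded data URIs instead. A minimal sketch; the helper name and local file name are assumptions:

```python
import base64
import json

def image_message(path: str, mime: str = "image/png") -> dict:
    """Build a user message that embeds a local image as a base64 data URI."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            }
        ],
    }

example = {
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Describe the image?"},
        image_message("seattle.png"),
        {"role": "assistant", "content": "A watercolor painting of the Seattle skyline."},
    ]
}
print(json.dumps(example))  # one line of your training jsonl file
```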

## Content moderation policy

We scan your images before training to ensure that they comply with our usage policy and [Transparency Note](/legal/cognitive-services/openai/transparency-note?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext&tabs=text). This might introduce latency in file validation before fine-tuning begins.

Images containing the following will be excluded from your dataset and not used for training:

- People
- Faces
- CAPTCHAs

> [!IMPORTANT]
> For the vision fine-tuning face screening process, we screen for faces and people so that those images are skipped when training the model. The screening capability uses face detection **WITHOUT** face identification, which means we don't create facial templates or measure specific facial geometry, and the technology used to screen for faces is incapable of uniquely identifying individuals. To learn more about data and privacy for Face, see [Data and privacy for Face - Azure AI services | Microsoft Learn](/legal/cognitive-services/computer-vision/imageanalysis-data-privacy-security?context=%2Fazure%2Fai-services%2Fcomputer-vision%2Fcontext%2Fcontext).

## Next steps

- [Deploy a fine-tuned model](fine-tuning-deploy.md).
- Review fine-tuning [model regional availability](../concepts/models.md#fine-tuning-models).
- Learn more about [Azure OpenAI quotas](../quotas-limits.md).
