Commit f5fc558

update
1 parent f6aff59 commit f5fc558

6 files changed

+107
-5
lines changed

articles/ai-services/openai/how-to/fine-tuning.md

Lines changed: 84 additions & 0 deletions
@@ -47,6 +47,41 @@ We use LoRA, or low rank approximation, to fine-tune models in a way that reduce
4747

4848
::: zone-end
4949

50+
## Vision fine-tuning
51+
52+
Fine-tuning is also possible with images in your JSONL files. Just as you can send one or many image inputs to chat completions, you can include those same message types within your training data. Images can be provided either as publicly accessible URLs or as data URIs containing [base64-encoded images](/azure/ai-services/openai/how-to/gpt-with-vision?tabs=rest#call-the-chat-completion-apis).
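As a minimal sketch of the data URI option, the following Python builds a base64 data URI from raw image bytes and wraps it in the `image_url` content-part shape used elsewhere in this article. The helper names `to_data_uri` and `image_part` are hypothetical and not part of any SDK.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return a base64 data URI for raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(url_or_data_uri: str) -> dict:
    """Wrap a public URL or a data URI in an image_url content part."""
    return {"type": "image_url", "image_url": {"url": url_or_data_uri}}


# Stand-in data: the eight-byte signature that starts every PNG file.
uri = to_data_uri(b"\x89PNG\r\n\x1a\n")
part = image_part(uri)
print(part["image_url"]["url"])
```

In practice you would read the bytes from a real image file and pass the matching MIME type, for example `image/jpeg` or `image/webp`.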
53+
54+
### Image dataset requirements
55+
56+
- Your training file can contain a maximum of 50,000 examples that contain images (not including text examples).
57+
- Each example can have at most 64 images.
58+
- Each image can be at most 10 MB.
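The limits above can be checked locally before you upload a file. The following is a rough sketch rather than the service's own validation: the function names are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes, which the limits above don't specify.

```python
import json

MAX_IMAGE_EXAMPLES = 50_000          # examples that contain images
MAX_IMAGES_PER_EXAMPLE = 64
MAX_IMAGE_BYTES = 10 * 1024 * 1024   # 10 MB (assumed binary megabytes)


def count_images(example: dict) -> int:
    """Count image_url content parts across all messages in one example."""
    total = 0
    for message in example.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):  # multimodal content is a list of parts
            total += sum(
                1 for part in content
                if isinstance(part, dict) and part.get("type") == "image_url"
            )
    return total


def image_too_large(image_bytes: bytes) -> bool:
    """Return True if a single image exceeds the per-image size limit."""
    return len(image_bytes) > MAX_IMAGE_BYTES


def check_limits(jsonl_lines) -> list:
    """Return human-readable problems found in an iterable of JSONL lines."""
    problems = []
    image_examples = 0
    for number, line in enumerate(jsonl_lines, start=1):
        if not line.strip():
            continue
        images = count_images(json.loads(line))
        if images:
            image_examples += 1
        if images > MAX_IMAGES_PER_EXAMPLE:
            problems.append(
                f"line {number}: {images} images (max {MAX_IMAGES_PER_EXAMPLE})"
            )
    if image_examples > MAX_IMAGE_EXAMPLES:
        problems.append(
            f"{image_examples} image examples (max {MAX_IMAGE_EXAMPLES})"
        )
    return problems
```

To use it, pass an open file handle, for example `check_limits(open("train.jsonl", encoding="utf-8-sig"))`; the file name is a placeholder.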
59+
60+
### Format
61+
62+
Images must be:
63+
64+
- JPEG
65+
- PNG
66+
- WEBP
67+
68+
Images must be in the RGB or RGBA image mode.
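One way to catch unsupported file types early is to inspect the leading bytes of each image. The sketch below recognizes the standard JPEG, PNG, and WEBP signatures using only the standard library; `sniff_image_format` is a hypothetical helper name. It doesn't check the image mode, which would need an image library.

```python
def sniff_image_format(data: bytes):
    """Return 'JPEG', 'PNG', 'WEBP', or None based on the file signature."""
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    # WEBP files are RIFF containers: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    return None


def is_supported_format(data: bytes) -> bool:
    """Return True if the bytes look like one of the supported formats."""
    return sniff_image_format(data) is not None
```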
69+
70+
You cannot include images as output from messages with the assistant role.
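A quick local check for this rule might look like the following sketch, which flags assistant messages whose content includes an image part. The function name is hypothetical.

```python
def assistant_messages_with_images(example: dict) -> list:
    """Return the indexes of assistant messages that contain image parts."""
    flagged = []
    for index, message in enumerate(example.get("messages", [])):
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        if any(
            isinstance(part, dict) and part.get("type") == "image_url"
            for part in content
        ):
            flagged.append(index)
    return flagged
```

An empty list means the example contains no assistant-side images.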
71+
72+
### Content moderation policy
73+
74+
We scan your images before training to ensure that they comply with the usage policy described in the [Transparency Note](/legal/cognitive-services/openai/transparency-note?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext&tabs=text). This scan might add latency to file validation before fine-tuning begins.
75+
76+
Images containing the following will be excluded from your dataset and not used for training:
77+
78+
- People
79+
- Faces
80+
- CAPTCHAs
81+
82+
> [!IMPORTANT]
83+
> Face screening for vision fine-tuning: We screen for faces and people so that those images are skipped when training the model. The screening capability uses face detection **WITHOUT** face identification, which means we don't create facial templates or measure specific facial geometry, and the technology used to screen for faces is incapable of uniquely identifying individuals. To learn more about data and privacy for Face, see [Data and privacy for Face - Azure AI services | Microsoft Learn](/legal/cognitive-services/computer-vision/imageanalysis-data-privacy-security?context=%2Fazure%2Fai-services%2Fcomputer-vision%2Fcontext%2Fcontext).
84+
5085
## Troubleshooting
5186

5287
### How do I enable fine-tuning? Create a custom model is greyed out in Azure OpenAI Studio?
@@ -67,6 +102,55 @@ If your file upload fails in Azure OpenAI Studio, you can view the error message
67102

68103
- **Bad data:** A poorly curated or unrepresentative dataset will produce a low-quality model. Your model may learn inaccurate or biased patterns from your dataset. For example, if you are training a chatbot for customer service, but only provide training data for one scenario (e.g. item returns) it will not know how to respond to other scenarios. Or, if your training data is bad (contains incorrect responses), your model will learn to provide incorrect results.
69104

105+
### Fine-tuning with vision
106+
107+
**What to do if your images get skipped?**
108+
109+
Your images can get skipped for the following reasons:
110+
111+
- contains CAPTCHAs
112+
- contains people
113+
- contains faces
114+
115+
Remove the image. For now, we cannot fine-tune models with images containing these entities.
116+
117+
**Common Issues**
118+
119+
|Issue| Reason/Solution|
120+
|:----|:-----|
121+
|**Images skipped**| Images can get skipped for the following reasons: contains CAPTCHAs, people, or faces.<br><br> Remove the image. For now, we cannot fine-tune models with images containing these entities.|
122+
|**Inaccessible URL**| Check that the image URL is publicly accessible.|
123+
|**Image too large**| Check that each image is within the dataset size limit of 10 MB per image.|
124+
|**Invalid image format**| Check that your images are in a supported format (JPEG, PNG, or WEBP) and use the RGB or RGBA image mode.|
125+
126+
**How to upload large files**
127+
128+
Your training files might get quite large. You can upload files up to 8 GB in multiple parts using the [Uploads API](/rest/api/azureopenai/upload-file?view=rest-azureopenai-2024-10-21) as opposed to the Files API, which only allows file uploads of up to 512 MB.
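The size thresholds above can be turned into a small pre-upload decision. The sketch below only picks an API and computes byte ranges for a multipart upload; it makes no service calls. The function names and the `part_size` value are hypothetical, and it assumes binary units and treats a file of exactly 512 MB as acceptable for the Files API.

```python
FILES_API_LIMIT = 512 * 1024 * 1024          # 512 MB (assumed binary units)
UPLOADS_API_LIMIT = 8 * 1024 * 1024 * 1024   # 8 GB (assumed binary units)


def choose_upload_api(size_bytes: int) -> str:
    """Return 'files' or 'uploads' depending on the training file size."""
    if size_bytes <= FILES_API_LIMIT:
        return "files"
    if size_bytes <= UPLOADS_API_LIMIT:
        return "uploads"
    raise ValueError("File exceeds the 8 GB limit for multipart uploads.")


def part_ranges(size_bytes: int, part_size: int):
    """Yield (start, end) byte offsets that split a file into parts."""
    if part_size <= 0:
        raise ValueError("part_size must be positive.")
    for start in range(0, size_bytes, part_size):
        yield start, min(start + part_size, size_bytes)
```

Each `(start, end)` pair can then be read with `file.seek(start)` and `file.read(end - start)` and sent as one part; check the Uploads API reference for the actual per-part size limit.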
129+
130+
**Reducing training cost**
131+
132+
If you set the `detail` parameter for an image to `low`, the image is resized to 512 by 512 pixels and is represented by only 85 tokens regardless of its size. This reduces the cost of training. See here for more information.
133+
134+
```json
{
    "type": "image_url",
    "image_url": {
        "url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png",
        "detail": "low"
    }
}
```
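To apply the setting across an existing dataset, a sketch like the following sets `detail` to `low` on every image part of a training example and estimates the resulting image token count from the 85-tokens-per-image figure above. The helper names are hypothetical, and the estimate covers image tokens only, not text.

```python
import copy

LOW_DETAIL_TOKENS_PER_IMAGE = 85  # figure quoted in the text above


def with_low_detail(example: dict) -> dict:
    """Return a copy of the example with detail='low' on every image part."""
    updated = copy.deepcopy(example)
    for message in updated.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image_url":
                part["image_url"]["detail"] = "low"
    return updated


def low_detail_image_tokens(image_count: int) -> int:
    """Estimate image tokens when every image uses low detail."""
    return image_count * LOW_DETAIL_TOKENS_PER_IMAGE
```

For example, a dataset with 1,000 low-detail images would contribute roughly `low_detail_image_tokens(1000)`, or 85,000, image tokens per pass over the data.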
149+
150+
**Other considerations for vision fine-tuning**
151+
152+
To control the fidelity of image understanding, set the `detail` parameter of `image_url` to `low`, `high`, or `auto` for each image. This setting also changes the number of tokens per image that the model sees during training, and therefore the cost of training.
153+
70154
## Next steps
71155

72156
- Explore the fine-tuning capabilities in the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).

articles/ai-services/openai/how-to/manage-costs.md

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ In practice, for this type of completion call, the token input/output wouldn't b
4242

4343
Azure OpenAI fine-tuned models are charged based on three factors:
4444

45-
- Training hours
45+
- [Number of tokens](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/pricing-update-token-based-billing-for-fine-tuning-training-%F0%9F%8E%89/4164465) in your training file
4646
- Hosting hours
4747
- Inference per 1,000 tokens
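The three factors combine into a simple sum. The sketch below is illustrative arithmetic only: every rate is a placeholder you would replace with current pricing, and billing training per 1,000 tokens is an assumption made here for symmetry with inference.

```python
def estimate_fine_tuning_cost(
    training_tokens: int,
    hosting_hours: float,
    inference_tokens: int,
    training_rate_per_1k: float,    # placeholder rate, not real pricing
    hosting_rate_per_hour: float,   # placeholder rate, not real pricing
    inference_rate_per_1k: float,   # placeholder rate, not real pricing
) -> dict:
    """Return a cost breakdown for training, hosting, and inference."""
    training = training_tokens / 1000 * training_rate_per_1k
    hosting = hosting_hours * hosting_rate_per_hour
    inference = inference_tokens / 1000 * inference_rate_per_1k
    return {
        "training": training,
        "hosting": hosting,
        "inference": inference,
        "total": training + hosting + inference,
    }
```

Hosting accrues for as long as the fine-tuned model is deployed, so it can dominate the total for models that see little traffic.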
4848

articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md

Lines changed: 7 additions & 1 deletion
@@ -80,7 +80,7 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-061
8080
{"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
8181
```
8282

83-
## Multi-turn chat file format
83+
### Multi-turn chat file format
8484

8585
Multiple turns of a conversation in a single line of your JSONL training file are also supported. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently, `weight` can be set to 0 or 1.
8686

@@ -90,6 +90,12 @@ Multiple turns of a conversation in a single line of your jsonl training file is
9090
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
9191
```
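The `weight` rule can be checked before upload. The following sketch flags any `weight` value other than 0 or 1 and lists the assistant messages that would still be trained on. The function names are hypothetical, and flagging `weight` on non-assistant messages is an assumption based on the description above.

```python
def check_weights(example: dict) -> list:
    """Return problems with weight values in one training example."""
    problems = []
    for index, message in enumerate(example.get("messages", [])):
        if "weight" not in message:
            continue
        if message.get("role") != "assistant":
            # Assumption: weight is only meaningful on assistant messages.
            problems.append(f"message {index}: weight on a non-assistant message")
        elif message["weight"] not in (0, 1):
            problems.append(f"message {index}: weight must be 0 or 1")
    return problems


def trained_assistant_messages(example: dict) -> list:
    """Return the content of assistant messages that are not skipped."""
    return [
        message["content"]
        for message in example.get("messages", [])
        if message.get("role") == "assistant" and message.get("weight", 1) != 0
    ]
```

For the Marv example above, only the second assistant reply is returned by `trained_assistant_messages`, because the first has `weight` set to 0.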
9292

93+
### Chat completions with vision
94+
95+
```json
96+
{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape."}]}
97+
```
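If you generate vision examples programmatically, a sketch like the following produces one JSONL line in the same shape as the example above. The function name and its parameters are hypothetical.

```python
import json


def vision_example(question: str, image_url: str, answer: str) -> str:
    """Return one JSONL line pairing a text-and-image prompt with an answer."""
    example = {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            # Assistant output is plain text; images aren't allowed here.
            {"role": "assistant", "content": answer},
        ]
    }
    # json.dumps escapes embedded newlines, so the result stays on one line.
    return json.dumps(example, ensure_ascii=False)
```

Write one returned string per line of your training file; the `image_url` argument can be a public URL or a base64 data URI.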
98+
9399
In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
94100

95101
### Create your training and validation datasets

articles/ai-services/openai/includes/fine-tuning-python.md

Lines changed: 6 additions & 0 deletions
@@ -86,6 +86,12 @@ Multiple turns of a conversation in a single line of your jsonl training file is
8686
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
8787
```
8888

89+
### Chat completions with vision
90+
91+
```json
92+
{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape."}]}
93+
```
94+
8995
In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
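In Python, the `utf-8-sig` codec writes the byte-order mark for you. The sketch below writes examples as JSONL with a BOM and rejects files at or above the size limit. The function names are hypothetical, and 512 MB is assumed to mean 512 × 1024 × 1024 bytes.

```python
import json
import os
import tempfile

MAX_FILE_BYTES = 512 * 1024 * 1024  # 512 MB (assumed binary units)


def write_training_file(path: str, examples: list) -> int:
    """Write examples as UTF-8 JSONL with a BOM; return the size in bytes."""
    # "utf-8-sig" emits the byte-order mark before the first character.
    with open(path, "w", encoding="utf-8-sig", newline="\n") as handle:
        for example in examples:
            handle.write(json.dumps(example, ensure_ascii=False) + "\n")
    size = os.path.getsize(path)
    if size >= MAX_FILE_BYTES:
        raise ValueError("Training file must be less than 512 MB.")
    return size


def starts_with_bom(path: str) -> bool:
    """Return True if the file begins with the UTF-8 byte-order mark."""
    with open(path, "rb") as handle:
        return handle.read(3) == b"\xef\xbb\xbf"


# Demonstration with a throwaway file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
write_training_file(demo_path, [{"messages": [{"role": "user", "content": "hi"}]}])
print(starts_with_bom(demo_path))
```

When reading such a file back in Python, open it with `encoding="utf-8-sig"` so the BOM isn't treated as part of the first JSON line.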
9096

9197
### Create your training and validation datasets

articles/ai-services/openai/includes/fine-tuning-rest.md

Lines changed: 6 additions & 0 deletions
@@ -85,6 +85,12 @@ Multiple turns of a conversation in a single line of your jsonl training file is
8585
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
8686
```
8787

88+
### Chat completions with vision
89+
90+
```json
91+
{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape."}]}
92+
```
93+
8894
In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
8995

9096

articles/ai-services/openai/includes/fine-tuning-studio.md

Lines changed: 3 additions & 3 deletions
@@ -76,7 +76,7 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-4o-mini` (20
7676
{"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
7777
```
7878

79-
## Multi-turn chat file format Azure OpenAI
79+
### Multi-turn chat file format Azure OpenAI
8080

8181
Multiple turns of a conversation in a single line of your JSONL training file are also supported. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently, `weight` can be set to 0 or 1.
8282

@@ -86,10 +86,10 @@ Multiple turns of a conversation in a single line of your jsonl training file is
8686
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
8787
```
8888

89-
## Chat completions with vision
89+
### Chat completions with vision
9090

9191
```json
92-
{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image depicts a serene landscape featuring a wooden pathway that winds through lush green grass and vegetation. The sky above is partly cloudy with a blue hue, suggesting a pleasant day. The pathway leads into the distance, surrounded by various trees and bushes on either side. This scene evokes a sense of tranquility and connection with nature."}]}
92+
{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape."}]}
9393
```
9494

9595
In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
