MicrosoftDocs
diff --git a/articles/ai-services/openai/how-to/fine-tuning.md b/articles/ai-services/openai/how-to/fine-tuning.md
Lines changed: 86 additions & 2 deletions
diff --git a/articles/ai-services/openai/how-to/manage-costs.md b/articles/ai-services/openai/how-to/manage-costs.md
Lines changed: 10 additions & 9 deletions
diff --git a/articles/ai-services/openai/includes/fine-tune-models.md b/articles/ai-services/openai/includes/fine-tune-models.md
Lines changed: 2 additions & 2 deletions
diff --git a/articles/ai-services/openai/includes/fine-tune.md b/articles/ai-services/openai/includes/fine-tune.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md b/articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md
Lines changed: 7 additions & 1 deletion
@@ -24,7 +24,7 @@ Azure OpenAI Service lets you tailor our models to your personal datasets by usi
 
 In contrast to few-shot learning, fine tuning improves the model by training on many more examples than can fit in a prompt, letting you achieve better results on a wide number of tasks. Because fine tuning adjusts the base model’s weights to improve performance on the specific task, you won’t have to include as many examples or instructions in your prompt. This means less text sent and fewer tokens processed on every API call, potentially saving cost, and improving request latency.
 
-We use LoRA, or low rank approximation, to fine-tune models in a way that reduces their complexity without significantly affecting their performance. This method works by approximating the original high-rank matrix with a lower rank one, thus only fine-tuning a smaller subset of "important" parameters during the supervised training phase, making the model more manageable and efficient. For users, this makes training faster and more affordable than other techniques.
+We use LoRA, or low-rank adaptation, to fine-tune models in a way that reduces their complexity without significantly affecting their performance. This method works by approximating the original high-rank matrix with a lower rank one, thus only fine-tuning a smaller subset of *important* parameters during the supervised training phase, making the model more manageable and efficient. For users, this makes training faster and more affordable than other techniques.
 
 > [!NOTE]
 > Azure OpenAI currently only supports text-to-text fine-tuning for all supported models including GPT-4o mini.
@@ -47,9 +47,44 @@ We use LoRA, or low rank approximation, to fine-tune models in a way that reduce
 
 ::: zone-end
 
+## Vision fine-tuning
+
+Fine-tuning is also possible with images in your JSONL files. Just as you can send one or many image inputs to chat completions, you can include those same message types within your training data. Images can be provided either as publicly accessible URLs or data URIs containing [base64 encoded images](/azure/ai-services/openai/how-to/gpt-with-vision?tabs=rest#call-the-chat-completion-apis). 
+
+### Image dataset requirements
+
+- Your training file can contain a maximum of 50,000 examples that contain images (not including text examples).
+- Each example can have at most 64 images.
+- Each image can be at most 10 MB.
+
+### Format
+
+Images must be:
+
+- JPEG
+- PNG
+- WEBP
+
+Images must be in the RGB or RGBA image mode.
+
+You cannot include images as output from messages with the assistant role.
+
+### Content moderation policy
+
+We scan your images before training to ensure that they comply with the usage policy described in our [Transparency Note](/legal/cognitive-services/openai/transparency-note?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext&tabs=text). This scan may add latency to file validation before fine-tuning begins.
+
+Images containing the following will be excluded from your dataset and not used for training:
+
+- People
+- Faces
+- CAPTCHAs
+
+> [!IMPORTANT]
+> About the face screening process for vision fine-tuning: we screen for faces and people so that those images are skipped when training the model. The screening capability uses face detection **WITHOUT** face identification, which means we don't create facial templates or measure specific facial geometry, and the technology used to screen for faces is incapable of uniquely identifying individuals. To learn more about data and privacy for Face, see [Data and privacy for Face - Azure AI services | Microsoft Learn](/legal/cognitive-services/computer-vision/imageanalysis-data-privacy-security?context=%2Fazure%2Fai-services%2Fcomputer-vision%2Fcontext%2Fcontext).
+
 ## Troubleshooting
 
-### How do I enable fine-tuning? Create a custom model is greyed out in Azure OpenAI Studio?
+### How do I enable fine-tuning?
 
 To access fine-tuning, you need the **Cognitive Services OpenAI Contributor** role assigned. Even someone with high-level Service Administrator permissions still needs this role explicitly assigned in order to access fine-tuning. For more information, review the [role-based access control guidance](/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-contributor).
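
The image dataset requirements listed under "Vision fine-tuning" in this change can be checked locally before upload. The following is a minimal sketch rather than an official tool: `check_example`, `MAX_IMAGES_PER_EXAMPLE`, and `ALLOWED_EXTENSIONS` are hypothetical names, the limits are copied from the text above, and the format check only inspects the URL's file extension (it doesn't download or decode the image, and it skips data URIs).

```python
import json
from urllib.parse import urlparse

MAX_IMAGES_PER_EXAMPLE = 64  # "Each example can have at most 64 images."
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")


def check_example(line: str) -> list[str]:
    """Return a list of problems found in one JSONL training line."""
    problems = []
    example = json.loads(line)
    image_count = 0
    for message in example.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-text message, nothing image-related to check
        for part in content:
            if part.get("type") != "image_url":
                continue
            image_count += 1
            if message.get("role") == "assistant":
                problems.append("assistant messages cannot contain images")
            url = part["image_url"]["url"]
            if url.startswith("data:"):
                continue  # data URI: verifying the format would need decoding
            path = urlparse(url).path.lower()
            if not path.endswith(ALLOWED_EXTENSIONS):
                problems.append(f"unsupported image format: {url}")
    if image_count > MAX_IMAGES_PER_EXAMPLE:
        problems.append(
            f"{image_count} images exceeds the limit of {MAX_IMAGES_PER_EXAMPLE}"
        )
    return problems
```

Running `check_example` over each line of a training file and reporting any non-empty result catches these issues before the service-side validation does. The 10 MB per-image limit and the RGB/RGBA mode requirement aren't covered here because they require fetching and decoding the image.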
 
@@ -67,6 +102,55 @@ If your file upload fails in Azure OpenAI Studio, you can view the error message
 
 - **Bad data:** A poorly curated or unrepresentative dataset will produce a low-quality model. Your model may learn inaccurate or biased patterns from your dataset. For example, if you are training a chatbot for customer service, but only provide training data for one scenario (e.g. item returns) it will not know how to respond to other scenarios. Or, if your training data is bad (contains incorrect responses), your model will learn to provide incorrect results.
 
+### Fine-tuning with vision
+
+**What to do if your images get skipped**
+
+Your images can get skipped for the following reasons:
+
+- contains CAPTCHAs
+- contains people
+- contains faces
+
+Remove the image. For now, we cannot fine-tune models with images containing these entities.
+
+**Common issues**
+
+|Issue| Reason/Solution|
+|:----|:-----|
+|**Images skipped**| Images can get skipped for the following reasons: contains CAPTCHAs, people, or faces.<br><br> Remove the image. For now, we cannot fine-tune models with images containing these entities.|
+|**Inaccessible URL**| Check that the image URL is publicly accessible.|
+|**Image too large**| Check that each image is at most 10 MB and that each example contains at most 64 images.|
+|**Invalid image format**| Check that your images are JPEG, PNG, or WEBP files in the RGB or RGBA image mode.|
+
+**How to upload large files**
+
+Your training files might get quite large. You can upload files up to 8 GB in multiple parts using the [Uploads API](/rest/api/azureopenai/upload-file?view=rest-azureopenai-2024-10-21&preserve-view=true) as opposed to the Files API, which only allows file uploads of up to 512 MB.
+
+**Reducing training cost**
+
+If you set the `detail` parameter for an image to `low`, the image is resized to 512 by 512 pixels and is represented by only 85 tokens regardless of its size. This reduces the cost of training.
+
+```json
+{
+    "type": "image_url",
+    "image_url": {
+        "url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png",
+        "detail": "low"
+    }
+}
+```
+
+**Other considerations for vision fine-tuning**
+
+To control the fidelity of image understanding, set the detail parameter of `image_url` to `low`, `high`, or `auto` for each image. This will also affect the number of tokens per image that the model sees during training time and will affect the cost of training.
+
 ## Next steps
 
 - Explore the fine-tuning capabilities in the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).
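
The `detail` setting described under "Reducing training cost" in this change can be applied when the training lines are generated. The sketch below is illustrative only: `build_vision_example` and `low_detail_image_tokens` are hypothetical helpers, and the 85-token figure is the one quoted above for `detail` set to `low`. Token counts for `high` and `auto` aren't stated in the text, so they aren't estimated here.

```python
import json

LOW_DETAIL_TOKENS_PER_IMAGE = 85  # figure quoted in "Reducing training cost"


def build_vision_example(question: str, image_urls: list[str], answer: str,
                         detail: str = "low") -> str:
    """Build one JSONL training line with text plus images at a given detail."""
    if detail not in ("low", "high", "auto"):
        raise ValueError("detail must be 'low', 'high', or 'auto'")
    # The user message carries the text part followed by one part per image.
    content = [{"type": "text", "text": question}]
    for url in image_urls:
        content.append(
            {"type": "image_url", "image_url": {"url": url, "detail": detail}}
        )
    messages = [
        {"role": "user", "content": content},
        {"role": "assistant", "content": answer},  # text only, no images
    ]
    return json.dumps({"messages": messages})


def low_detail_image_tokens(image_count: int) -> int:
    """Image tokens for an example when every image uses detail=low."""
    return image_count * LOW_DETAIL_TOKENS_PER_IMAGE
```

For instance, an example with three low-detail images contributes `low_detail_image_tokens(3)` image tokens, in addition to the tokens for its text.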
 
@@ -18,7 +18,7 @@ You can also review forecasted costs and identify spending trends to identify ar
 
 ## Prerequisites
 
-Cost analysis in Cost Management supports most Azure account types, but not all of them. To view the full list of supported account types, see [Understand Cost Management data](/azure/cost-management-billing/costs/understand-cost-mgt-data?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn). To view cost data, you need at least read access for an Azure account. For information about assigning access to Azure Cost Management data, see [Assign access to data](/azure/cost-management/assign-access-acm-data?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
+Cost analysis in Cost Management supports most Azure account types, but not all of them. To view the full list of supported account types, see [Understand Cost Management data](/azure/cost-management-billing/costs/understand-cost-mgt-data?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn). To view cost data, you need at least read access for an Azure account. For information about assigning access to Microsoft Cost Management data, see [Assign access to data](/azure/cost-management/assign-access-acm-data?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
 
 ## Estimate costs before using Azure OpenAI
 
@@ -28,25 +28,26 @@ Use the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculato
 
 Azure OpenAI Service runs on Azure infrastructure that accrues costs when you deploy new resources. There could be other infrastructure costs that might accrue. The following sections describe how you're charged for Azure OpenAI Service.
 
-### Base series and Codex series models
+### Model inference chat completions
 
-Azure OpenAI base series and Codex series models are charged per 1,000 tokens. Costs vary depending on which model series you choose: Ada, Babbage, Curie, Davinci, or Code-Cushman.
+Azure OpenAI chat completions model inference is [charged per 1,000 tokens with different rates](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) depending on model and [deployment type](./deployment-types.md).
 
 Azure OpenAI models understand and process text by breaking it down into tokens. For reference, each token is roughly four characters for typical English text.
 
 Token costs are for both input and output. For example, suppose you have a 1,000 token JavaScript code sample that you ask an Azure OpenAI model to convert to Python. You would be charged approximately 1,000 tokens for the initial input request sent, and 1,000 more tokens for the output that is received in response for a total of 2,000 tokens.
 
 In practice, for this type of completion call, the token input/output wouldn't be perfectly 1:1. A conversion from one programming language to another could result in a longer or shorter output depending on many factors. One such factor is the value assigned to the `max_tokens` parameter.
 
-### Base Series and Codex series fine-tuned models
+### Fine-tuned models
 
-Azure OpenAI fine-tuned models are charged based on three factors:
+Azure OpenAI fine-tuning training is charged based on the [number of tokens in your training file](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/pricing-update-token-based-billing-for-fine-tuning-training-%F0%9F%8E%89/4164465). For the latest prices, see the [official pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
+
+Once your fine-tuned model is deployed, you're also charged based on:
 
-- Training hours
 - Hosting hours
-- Inference per 1,000 tokens
+- Inference per 1,000 tokens (broken down by input usage and output usage)
 
-The hosting hours cost is important to be aware of since after a fine-tuned model is deployed, it continues to incur an hourly cost regardless of whether you're actively using it. Monitor fine-tuned model costs closely.
+The hosting hours cost is important to be aware of since after a fine-tuned model is deployed, it continues to incur an hourly cost regardless of whether you're actively using it. Monitor deployed fine-tuned model costs closely.
 
 > [!IMPORTANT]
 > After you deploy a customized model, if at any time the deployment remains inactive for greater than fifteen (15) days,
@@ -130,7 +131,7 @@ You can also [export your cost data](/azure/cost-management-billing/costs/tutori
 
 ## Next steps
 
-- Learn [how to optimize your cloud investment with Azure Cost Management](/azure/cost-management-billing/costs/cost-mgt-best-practices?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
+- Learn [how to optimize your cloud investment with Microsoft Cost Management](/azure/cost-management-billing/costs/cost-mgt-best-practices?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
 - Learn more about managing costs with [cost analysis](/azure/cost-management-billing/costs/quick-acm-cost-analysis?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
 - Learn about how to [prevent unexpected costs](/azure/cost-management-billing/understand/analyze-unexpected-charges?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn).
 - Take the [Cost Management](/training/paths/control-spending-manage-bills?WT.mc_id=costmanagementcontent_docsacmhorizontal_-inproduct-learn) guided learning course.
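
The billing model described in this article's change (inference charged per 1,000 tokens with separate input and output usage, plus hourly hosting for deployed fine-tuned models) can be sketched as simple arithmetic. The function names and every rate below are placeholders chosen for illustration, not real prices; the pricing page has the current values.

```python
def estimate_inference_cost(input_tokens: int, output_tokens: int,
                            input_rate_per_1k: float,
                            output_rate_per_1k: float) -> float:
    """Cost of one call when input and output are billed per 1,000 tokens."""
    return (input_tokens / 1000) * input_rate_per_1k + \
           (output_tokens / 1000) * output_rate_per_1k


def estimate_hosting_cost(hours_deployed: float, hourly_rate: float) -> float:
    """Hosting cost accrues for every deployed hour, whether or not it's used."""
    return hours_deployed * hourly_rate


# The JavaScript-to-Python example above: 1,000 tokens in, 1,000 tokens out.
# The rates 0.5 and 1.5 are made-up placeholders.
call_cost = estimate_inference_cost(1000, 1000, 0.5, 1.5)
```

Keeping the hosting term separate makes it visible that an idle deployment still costs `hourly_rate` for each hour it stays deployed.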
@@ -23,7 +23,7 @@ manager: nitinme
 | `gpt-35-turbo` (1106) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | Input: 16,385<br> Output: 4,096 |  Sep 2021|
 | `gpt-35-turbo` (0125)  | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 16,385 | Sep 2021 |
 | `gpt-4` (0613) <sup>**1**</sup> | North Central US <br> Sweden Central | 8192 | Sep 2021 |
-| `gpt-4o-mini` <sup>**1**</sup> (2024-07-18) | North Central US <br> Sweden Central | Input: 128,000 <br> Output: 16,384  <br> Training example context length: 64,536 | Oct 2023 |
-| `gpt-4o` <sup>**1**</sup> (2024-08-06) | East US2 <br> North Central US <br> Sweden Central | Input: 128,000 <br> Output: 16,384  <br> Training example context length: 64,536 | Oct 2023 | 
+| `gpt-4o-mini` (2024-07-18) | North Central US <br> Sweden Central | Input: 128,000 <br> Output: 16,384  <br> Training example context length: 64,536 | Oct 2023 |
+| `gpt-4o` (2024-08-06) | East US2 <br> North Central US <br> Sweden Central | Input: 128,000 <br> Output: 16,384  <br> Training example context length: 64,536 | Oct 2023 | 
 
 **<sup>1</sup>** GPT-4 is currently in public preview.
@@ -22,4 +22,4 @@ keywords: ChatGPT
 > As described in [Azure OpenAI Service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/),
 > each customized (fine-tuned) model that's deployed incurs an hourly hosting cost regardless of whether completions
 > or chat completions calls are being made to the model. To learn more about planning and managing costs with Azure OpenAI,
-> refer to the guidance in [Plan to manage costs for Azure OpenAI Service](../how-to/manage-costs.md#base-series-and-codex-series-fine-tuned-models).
+> refer to the guidance in [Plan to manage costs for Azure OpenAI Service](../how-to/manage-costs.md#fine-tuned-models).
@@ -80,7 +80,7 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-061
 {"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
 ```
 
-## Multi-turn chat file format
+### Multi-turn chat file format
 
 Multiple turns of a conversation in a single line of your JSONL training file are also supported. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently `weight` can be set to 0 or 1.
 
@@ -90,6 +90,12 @@ Multiple turns of a conversation in a single line of your jsonl training file is
 {"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
 ```
 
+### Chat completions with vision
+
+```json
+{"messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/MicrosoftDocs/azure-ai-docs/main/articles/ai-services/openai/media/how-to/generated-seattle.png"}}]}, {"role": "assistant", "content": "The image appears to be a watercolor painting of a city skyline, featuring tall buildings and a recognizable structure often associated with Seattle, like the Space Needle. The artwork uses soft colors and brushstrokes to create a somewhat abstract and artistic representation of the cityscape."}]}
+```
+
 In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
 
 ### Create your training and validation datasets
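
The file requirements stated above (JSONL, UTF-8 with a byte-order mark, under 512 MB, and `weight` limited to 0 or 1) can be enforced when the file is written. This is a minimal sketch under stated assumptions: `write_training_file` is a hypothetical helper, and 512 MB is interpreted here as 512 × 1024 × 1024 bytes, which the text doesn't specify.

```python
import json
import os

MAX_FILE_BYTES = 512 * 1024 * 1024  # assumption: MB means 2**20 bytes


def write_training_file(path: str, examples: list[dict]) -> int:
    """Write examples as JSONL in UTF-8 with a BOM; return the size in bytes."""
    for example in examples:
        for message in example["messages"]:
            if "weight" in message and message["weight"] not in (0, 1):
                raise ValueError("weight must be 0 or 1")
    # The "utf-8-sig" codec writes the byte-order mark at the start of the file.
    with open(path, "w", encoding="utf-8-sig", newline="\n") as handle:
        for example in examples:
            # One JSON object per line; json.dumps never emits raw newlines.
            handle.write(json.dumps(example, ensure_ascii=False) + "\n")
    size = os.path.getsize(path)
    if size >= MAX_FILE_BYTES:
        raise ValueError(f"{path} is {size} bytes; the limit is 512 MB")
    return size
```

Each dictionary in `examples` is expected to follow the `{"messages": [...]}` shape shown in the samples above.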