
Commit 5c68218

Merge pull request #2770 from mrbullwinkle/mrb_02_06_2025_models_update
[Azure OpenAI] Model retirement updates
2 parents: e9b0206 + ea10891 · commit 5c68218

File tree: 6 files changed (+5 −245 lines)

articles/ai-services/openai/concepts/models.md

Lines changed: 0 additions & 1 deletion

@@ -320,7 +320,6 @@ These models can only be used with Embedding API requests.
 
 | Model ID | Max Request (characters) |
 | --- | :---: |
-| dalle2 (preview) | 1,000 |
 | dall-e-3 | 4,000 |
 
 # [Audio](#tab/standard-audio)

articles/ai-services/openai/includes/fine-tune-models.md

Lines changed: 1 addition & 3 deletions

@@ -6,7 +6,7 @@ author: mrbullwinkle
 ms.author: mbullwin
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 10/31/2024
+ms.date: 02/06/2025
 manager: nitinme
 ---
 

@@ -17,8 +17,6 @@ manager: nitinme
 
 | Model ID | Fine-tuning regions | Max request (tokens) | Training Data (up to) |
 | --- | --- | :---: | :---: |
-| `babbage-002` | North Central US <br> Sweden Central <br> Switzerland West | 16,384 | Sep 2021 |
-| `davinci-002` | North Central US <br> Sweden Central <br> Switzerland West | 16,384 | Sep 2021 |
 | `gpt-35-turbo` (0613) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 4,096 | Sep 2021 |
 | `gpt-35-turbo` (1106) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | Input: 16,385<br> Output: 4,096 | Sep 2021|
 | `gpt-35-turbo` (0125) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 16,385 | Sep 2021 |

articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md

Lines changed: 0 additions & 54 deletions

@@ -28,8 +28,6 @@ ms.custom: include, build-2024
 
 The following models support fine-tuning:
 
-- `babbage-002`
-- `davinci-002`
 - `gpt-35-turbo` (0613)
 - `gpt-35-turbo` (1106)
 - `gpt-35-turbo` (0125)

@@ -64,10 +62,6 @@ Take a moment to review the fine-tuning workflow for using Azure AI Foundry:
 
 Your training data and validation data sets consist of input and output examples for how you would like the model to perform.
 
-Different model types require a different format of training data.
-
-# [chat completion models](#tab/turbo)
-
 The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document. For `gpt-35-turbo-0613` the fine-tuning dataset must be formatted in the conversational format that is used by the [Chat completions](../how-to/chatgpt.md) API.
 
 If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-0613` model please refer to the [Azure OpenAI fine-tuning tutorial.](../tutorials/fine-tune.md)

@@ -104,54 +98,6 @@ The more training examples you have, the better. Fine tuning jobs will not proce
 
 In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data, without first pruning the dataset for only the highest quality examples you could end up with a model that performs much worse than expected.
 
-# [babbage-002/davinci-002](#tab/completionfinetuning)
-
-The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document in which each line represents a single prompt-completion pair. The OpenAI command-line interface (CLI) includes [a data preparation tool](#openai-cli-data-preparation-tool) that validates, gives suggestions, and reformats your training data into a JSONL file ready for fine-tuning.
-
-```json
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-```
-
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
-
-### Create your training and validation datasets
-
-Designing your prompts and completions for fine-tuning is different from designing your prompts for use with any of [our GPT-3 base models](../concepts/legacy-models.md#gpt-3-models). Prompts for completion calls often use either detailed instructions or few-shot learning techniques, and consist of multiple examples. For fine-tuning, each training example should consist of a single input prompt and its desired completion output. You don't need to give detailed instructions or multiple completion examples for the same prompt.
-
-The more training examples you have, the better. The minimum number of training examples is 10, but such a small number of examples is often not enough to noticeably influence model responses. OpenAI states it's best practice to have at least 50 high quality training examples. However, it is entirely possible to have a use case that might require 1,000's of high quality training examples to be successful.
-
-In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data, without first pruning the dataset for only the highest quality examples you could end up with a model that performs much worse than expected.
-
-### OpenAI CLI data preparation tool
-
-OpenAI's CLI data preparation tool was developed for the previous generation of fine-tuning models to assist with many of the data preparation steps. This tool will only work for data preparation for models that work with the completion API like `babbage-002` and `davinci-002`. The tool validates, gives suggestions, and reformats your data into a JSONL file ready for fine-tuning.
-
-To install the OpenAI CLI, run the following Python command:
-
-```console
-pip install openai==0.28.1
-```
-
-To analyze your training data with the data preparation tool, run the following Python command. Replace the _\<LOCAL_FILE>_ argument with the full path and file name of the training data file to analyze:
-
-```console
-openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
-```
-
-This tool accepts files in the following data formats, if they contain a prompt and a completion column/key:
-
-- Comma-separated values (CSV)
-- Tab-separated values (TSV)
-- Microsoft Excel workbook (XLSX)
-- JavaScript Object Notation (JSON)
-- JSON Lines (JSONL)
-
-After it guides you through the process of implementing suggested changes, the tool reformats your training data and saves output into a JSONL file ready for fine-tuning.
-
----
-
 ## Create your fine-tuned model
 
 To fine-tune an Azure OpenAI model in an existing Azure AI Foundry project, follow these steps:
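The removed `babbage-002`/`davinci-002` tab above carried the prompt-completion JSONL sample; the conversational format that the surviving text points to for `gpt-35-turbo` fine-tuning isn't shown in this diff. A minimal sketch of a JSONL training file in that conversational (Chat Completions style) layout, written with Python's standard library; the roles and field names here follow the Chat Completions message structure and are illustrative rather than taken from this commit:

```python
import json

# Illustrative training examples in the conversational (Chat Completions) format.
# Each line of the JSONL file is one complete example conversation.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "<prompt text>"},
            {"role": "assistant", "content": "<ideal generated text>"},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "<prompt text>"},
            {"role": "assistant", "content": "<ideal generated text>"},
        ]
    },
]

# Write one JSON object per line (JSONL), UTF-8 encoded.
with open("training_set.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```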

articles/ai-services/openai/includes/fine-tuning-python.md

Lines changed: 3 additions & 57 deletions

@@ -26,8 +26,6 @@ ms.author: mbullwin
 
 The following models support fine-tuning:
 
-- `babbage-002`
-- `davinci-002`
 - `gpt-35-turbo` (0613)
 - `gpt-35-turbo` (1106)
 - `gpt-35-turbo` (0125)

@@ -60,10 +58,6 @@ Take a moment to review the fine-tuning workflow for using the Python SDK with A
 
 Your training data and validation data sets consist of input and output examples for how you would like the model to perform.
 
-Different model types require a different format of training data.
-
-# [chat completion models](#tab/turbo)
-
 The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document. For `gpt-35-turbo-0613` the fine-tuning dataset must be formatted in the conversational format that is used by the [Chat completions](../how-to/chatgpt.md) API.
 
 If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-0613` please refer to the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md)

@@ -100,54 +94,6 @@ The more training examples you have, the better. Fine tuning jobs will not proce
 
 In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data, without first pruning the dataset for only the highest quality examples you could end up with a model that performs much worse than expected.
 
-# [babbage-002/davinci-002](#tab/completionfinetuning)
-
-The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document in which each line represents a single prompt-completion pair. The OpenAI command-line interface (CLI) includes [a data preparation tool](#openai-cli-data-preparation-tool) that validates, gives suggestions, and reformats your training data into a JSONL file ready for fine-tuning.
-
-```json
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-```
-
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
-
-### Create your training and validation datasets
-
-Designing your prompts and completions for fine-tuning is different from designing your prompts for use with any of [our GPT-3 base models](../concepts/legacy-models.md#gpt-3-models). Prompts for completion calls often use either detailed instructions or few-shot learning techniques, and consist of multiple examples. For fine-tuning, each training example should consist of a single input prompt and its desired completion output. You don't need to give detailed instructions or multiple completion examples for the same prompt.
-
-The more training examples you have, the better. Fine tuning jobs will not proceed without at least 10 training examples, but such a small number is not enough to noticeably influence model responses. It is best practice to provide hundreds, if not thousands, of training examples to be successful.
-
-In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data, without first pruning the dataset for only the highest quality examples you could end up with a model that performs much worse than expected.
-
-### OpenAI CLI data preparation tool
-
-OpenAI's CLI data preparation tool was developed for the previous generation of fine-tuning models to assist with many of the data preparation steps. This tool will only work for data preparation for models that work with the completion API like `babbage-002` and `davinci-002`. The tool validates, gives suggestions, and reformats your data into a JSONL file ready for fine-tuning.
-
-To install the OpenAI CLI, run the following Python command:
-
-```console
-pip install openai==0.28.1
-```
-
-To analyze your training data with the data preparation tool, run the following Python command. Replace the _\<LOCAL_FILE>_ argument with the full path and file name of the training data file to analyze:
-
-```console
-openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
-```
-
-This tool accepts files in the following data formats, if they contain a prompt and a completion column/key:
-
-- Comma-separated values (CSV)
-- Tab-separated values (TSV)
-- Microsoft Excel workbook (XLSX)
-- JavaScript Object Notation (JSON)
-- JSON Lines (JSONL)
-
-After it guides you through the process of implementing suggested changes, the tool reformats your training data and saves output into a JSONL file ready for fine-tuning.
-
----
-
 ## Upload your training data
 
 The next step is to either choose existing prepared training data or upload new prepared training data to use when customizing your model. After you prepare your training data, you can upload your files to the service. There are two ways to upload training data:

@@ -209,7 +155,7 @@ import os
 openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
 openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
 openai.api_type = 'azure'
-openai.api_version = '2024-02-01' # This API version or later is required to access fine-tuning for turbo/babbage-002/davinci-002
+openai.api_version = '2024-02-01' # This API version or later is required
 
 training_file_name = 'training_set.jsonl'
 validation_file_name = 'validation_set.jsonl'

@@ -302,7 +248,7 @@ from openai import AzureOpenAI
 client = AzureOpenAI(
     azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
     api_key=os.getenv("AZURE_OPENAI_API_KEY"),
-    api_version="2024-02-01" # This API version or later is required to access fine-tuning for turbo/babbage-002/davinci-002
+    api_version="2024-02-01" # This API version or later is required
 )
 
 client.fine_tuning.jobs.create(

@@ -580,7 +526,7 @@ az cognitiveservices account deployment create
 
 ## Use a deployed customized model
 
-After your custom model deploys, you can use it like any other deployed model. You can use the **Playgrounds** in [Azure AI Foundry](https://ai.azure.com) to experiment with your new deployment. You can continue to use the same parameters with your custom model, such as `temperature` and `max_tokens`, as you can with other deployed models. For fine-tuned `babbage-002` and `davinci-002` models you will use the Completions playground and the Completions API. For fine-tuned `gpt-35-turbo-0613` models you will use the Chat playground and the Chat completion API.
+After your custom model deploys, you can use it like any other deployed model. You can use the **Chat Playground** in [Azure AI Foundry](https://ai.azure.com) to experiment with your new deployment. You can continue to use the same parameters with your custom model, such as `temperature` and `max_tokens`, as you can with other deployed models.
 
 
 # [OpenAI Python 1.x](#tab/python-new)

articles/ai-services/openai/includes/fine-tuning-rest.md

Lines changed: 1 addition & 66 deletions

@@ -25,8 +25,6 @@ ms.author: mbullwin
 
 The following models support fine-tuning:
 
-- `babbage-002`
-- `davinci-002`
 - `gpt-35-turbo` (0613)
 - `gpt-35-turbo` (1106)
 - `gpt-35-turbo` (0125)

@@ -59,10 +57,6 @@ Take a moment to review the fine-tuning workflow for using the REST APIS and Pyt
 
 Your training data and validation data sets consist of input and output examples for how you would like the model to perform.
 
-Different model types require a different format of training data.
-
-# [chat completion models](#tab/turbo)
-
 The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document. For `gpt-35-turbo-0613` and other related models, the fine-tuning dataset must be formatted in the conversational format that is used by the [Chat completions](../how-to/chatgpt.md) API.
 
 If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-0613` please refer to the [Azure OpenAI fine-tuning tutorial.](../tutorials/fine-tune.md)

@@ -100,71 +94,12 @@ The more training examples you have, the better. Fine tuning jobs will not proce
 
 In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data without first pruning the dataset for only the highest quality examples, you could end up with a model that performs much worse than expected.
 
-# [babbage-002/davinci-002](#tab/completionfinetuning)
-
-The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document in which each line represents a single prompt-completion pair. The OpenAI command-line interface (CLI) includes [a data preparation tool](#openai-cli-data-preparation-tool) that validates, gives suggestions, and reformats your training data into a JSONL file ready for fine-tuning.
-
-```json
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
-```
-
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
-
-### Create your training and validation datasets
-
-Designing your prompts and completions for fine-tuning is different from designing your prompts for use with any of [our GPT-3 base models](../concepts/legacy-models.md#gpt-3-models). Prompts for completion calls often use either detailed instructions or few-shot learning techniques, and consist of multiple examples. For fine-tuning, each training example should consist of a single input prompt and its desired completion output. You don't need to give detailed instructions or multiple completion examples for the same prompt.
-
-The more training examples you have, the better. Fine tuning jobs will not proceed without at least 10 training examples, but such a small number is not enough to noticeably influence model responses. It is best practice to provide hundreds, if not thousands, of training examples to be successful.
-
-In general, doubling the dataset size can lead to a linear increase in model quality. But keep in mind, low quality examples can negatively impact performance. If you train the model on a large amount of internal data without first pruning the dataset for only the highest quality examples, you could end up with a model that performs much worse than expected.
-
-### OpenAI CLI data preparation tool
-
-OpenAI's CLI data preparation tool was developed for the previous generation of fine-tuning models to assist with many of the data preparation steps. This tool will only work for data preparation for models that work with the completion API like `babbage-002` and `davinci-002`. The tool validates, gives suggestions, and reformats your data into a JSONL file ready for fine-tuning.
-
-To install the OpenAI CLI, run the following Python command:
-
-```console
-pip install openai==0.28.1
-```
-
-To analyze your training data with the data preparation tool, run the following Python command. Replace the _\<LOCAL_FILE>_ argument with the full path and file name of the training data file to analyze:
-
-```console
-openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
-```
-
-This tool accepts files in the following data formats, if they contain a prompt and a completion column/key:
-
-- Comma-separated values (CSV)
-- Tab-separated values (TSV)
-- Microsoft Excel workbook (XLSX)
-- JavaScript Object Notation (JSON)
-- JSON Lines (JSONL)
-
-After it guides you through the process of implementing suggested changes, the tool reformats your training data and saves output into a JSONL file ready for fine-tuning.
-
----
-
 ### Select the base model
 
 The first step in creating a custom model is to choose a base model. The **Base model** pane lets you choose a base model to use for your custom model. Your choice influences both the performance and the cost of your model.
 
 Select the base model from the **Base model type** dropdown, and then select **Next** to continue.
 
-You can create a custom model from one of the following available base models:
-
-- `babbage-002`
-- `davinci-002`
-- `gpt-35-turbo` (0613)
-- `gpt-35-turbo` (1106)
-- `gpt-35-turbo` (0125)
-- `gpt-4` (0613)
-- `gpt-4o` (2024-08-06)
-- `gpt-4o-mini` (2023-07-18)
-
 Or you can fine tune a previously fine-tuned model, formatted as base-model.ft-{jobid}.
 
 :::image type="content" source="../media/fine-tuning/models.png" alt-text="Screenshot of model options with a custom fine-tuned model." lightbox="../media/fine-tuning/models.png":::

@@ -373,7 +308,7 @@ az cognitiveservices account deployment create
 
 ## Use a deployed customized model
 
-After your custom model deploys, you can use it like any other deployed model. You can use the **Playgrounds** in [Azure AI Foundry](https://ai.azure.com) to experiment with your new deployment. You can continue to use the same parameters with your custom model, such as `temperature` and `max_tokens`, as you can with other deployed models. For fine-tuned `babbage-002` and `davinci-002` models you'll use the Completions playground and the Completions API. For fine-tuned `gpt-35-turbo-0613` models you'll use the Chat playground and the Chat completion API.
+After your custom model deploys, you can use it like any other deployed model. You can use the **Chat Playgrounds** in [Azure AI Foundry](https://ai.azure.com) to experiment with your new deployment. You can continue to use the same parameters with your custom model, such as `temperature` and `max_tokens`, as you can with other deployed models.
 
 ```bash
 curl $AZURE_OPENAI_ENDPOINT/openai/deployments/<deployment_name>/chat/completions?api-version=2023-05-15 \
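The `curl` context line above is cut off by the diff, but it shows the REST shape for calling a deployed custom model: a deployment-scoped `chat/completions` URL with an `api-version` query parameter. A rough Python equivalent under those assumptions, using key-based authentication via the `api-key` header (the deployment name and header are assumptions, not visible in the truncated snippet):

```python
import os
import requests

endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")   # for example, https://<resource>.openai.azure.com
api_key = os.getenv("AZURE_OPENAI_API_KEY")
deployment = "my-custom-deployment"             # deployment name of the fine-tuned model

url = f"{endpoint}/openai/deployments/{deployment}/chat/completions"
params = {"api-version": "2023-05-15"}          # version shown in the curl snippet above
headers = {"api-key": api_key, "Content-Type": "application/json"}
body = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
    "max_tokens": 100,
}

# Send the chat completion request to the deployed custom model and print the reply.
response = requests.post(url, params=params, headers=headers, json=body)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```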
