@@ -237,15 +239,15 @@ The following Embeddings models are available with [Azure Government](/azure/azu
 `babbage-002` and `davinci-002` are not trained to follow instructions. Querying these base models should only be done as a point of reference to a fine-tuned version to evaluate the progress of your training.

-`gpt-35-turbo-0613` - fine-tuning of this model is limited to a subset of regions, and is not available in every region the base model is available.
+`gpt-35-turbo` - fine-tuning of this model is limited to a subset of regions, and is not available in every region the base model is available.

 | Model ID | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
 | --- | --- | :---: | :---: |
-|`babbage-002`| North Central US <br> Sweden Central | 16,384 | Sep 2021 |
-|`davinci-002`| North Central US <br> Sweden Central | 16,384 | Sep 2021 |
-|`gpt-35-turbo` (0613) | East US2 <br> North Central US <br> Sweden Central | 4,096 | Sep 2021 |
-|`gpt-35-turbo` (1106) | East US2 <br> North Central US <br> Sweden Central | Input: 16,385<br> Output: 4,096 | Sep 2021|
-|`gpt-35-turbo` (0125) | East US2 <br> North Central US <br> Sweden Central | 16,385 | Sep 2021 |
+|`babbage-002`| North Central US <br> Sweden Central <br> Switzerland West | 16,384 | Sep 2021 |
+|`davinci-002`| North Central US <br> Sweden Central <br> Switzerland West | 16,384 | Sep 2021 |
+|`gpt-35-turbo` (0613) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 4,096 | Sep 2021 |
+|`gpt-35-turbo` (1106) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | Input: 16,385<br> Output: 4,096 | Sep 2021|
+|`gpt-35-turbo` (0125) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West| 16,385 | Sep 2021 |
articles/ai-services/openai/includes/fine-tuning-python.md: 13 additions & 3 deletions
@@ -75,7 +75,17 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-061
 {"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
 ```

-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+### Multi-turn chat file format
+
+A single line of your JSONL training file can also contain multiple turns of a conversation. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently, `weight` can be set to 0 or 1.
+
+```json
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "William Shakespeare", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
+```
+
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.

 ### Create your training and validation datasets
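The multi-turn format added in this hunk can be checked before upload. The following is a minimal sketch, not part of the PR: the helper names `check_training_line` and `trained_replies` are hypothetical, and two behaviors are assumptions rather than documented rules. The first is that a `weight` key on a non-assistant message is treated as a problem. The second is that an assistant message without `weight` is trained on.

```python
import json

# Per the text in this hunk, `weight` can currently be set to 0 or 1.
ALLOWED_WEIGHTS = (0, 1)


def check_training_line(line):
    """Return a list of problems found in one JSONL training line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]
    problems = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {index}: not a JSON object")
            continue
        if "weight" not in message:
            continue
        weight = message["weight"]
        if message.get("role") != "assistant":
            # Assumption: `weight` is only meaningful on assistant messages.
            problems.append(f"message {index}: 'weight' on a non-assistant message")
        elif type(weight) is not int or weight not in ALLOWED_WEIGHTS:
            problems.append(f"message {index}: 'weight' must be 0 or 1, got {weight!r}")
    return problems


def trained_replies(line):
    """Return the assistant replies that are not skipped with weight 0."""
    messages = json.loads(line)["messages"]
    # Assumption: a missing `weight` means the reply is trained on.
    return [
        m["content"]
        for m in messages
        if m.get("role") == "assistant" and m.get("weight", 1) != 0
    ]


# First example line from the hunk, rebuilt as a Python dict.
example = json.dumps({"messages": [
    {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."},
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "Paris", "weight": 0},
    {"role": "user", "content": "Can you be more sarcastic?"},
    {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1},
]})

print(check_training_line(example))
print(trained_replies(example))
```

Running the check line by line over a training file is one way to catch a bad `weight` value before a job is submitted. Whether the service itself rejects such values is not stated in this diff.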
@@ -93,7 +103,7 @@ The training and validation data you use **must** be formatted as a JSON Lines (
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.

 ### Create your training and validation datasets
@@ -263,7 +273,7 @@ print(response)
 ---

-You can also pass additional optional parameters like hyperparameters to take greater control of the fine-tuning process. For initial training we recommend using the automatic defaults that are present without specifying these parameters.
+You can also pass additional optional parameters like hyperparameters to take greater control of the fine-tuning process. For initial training we recommend using the automatic defaults that are present without specifying these parameters.

 The current supported hyperparameters for fine-tuning are:
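The list itself falls outside this hunk. As a rough sketch of how optional hyperparameters might be passed from Python, the helper below builds the keyword arguments for a fine-tuning job and leaves out any hyperparameter that was not set, so the automatic defaults apply. The helper name `build_job_request` and the file ID are hypothetical. `n_epochs`, `batch_size`, and `learning_rate_multiplier` are hyperparameter names from the OpenAI fine-tuning API, and it is an assumption that all three are accepted here, since this diff does not show the supported list.

```python
def build_job_request(training_file_id, model, n_epochs=None,
                      batch_size=None, learning_rate_multiplier=None):
    """Build keyword arguments for a fine-tuning job, omitting unset hyperparameters."""
    request = {"training_file": training_file_id, "model": model}
    candidates = {
        "n_epochs": n_epochs,
        "batch_size": batch_size,
        "learning_rate_multiplier": learning_rate_multiplier,
    }
    # Only include values the caller set, so everything else stays on the defaults.
    hyperparameters = {name: value for name, value in candidates.items() if value is not None}
    if hyperparameters:
        request["hyperparameters"] = hyperparameters
    return request


# "file-abc123" is a placeholder ID, not a real uploaded file.
request = build_job_request("file-abc123", "gpt-35-turbo-0613", n_epochs=2)
print(request)

# With a configured client from the `openai` package, the request could then be
# submitted along these lines (not executed here):
#   response = client.fine_tuning.jobs.create(**request)
```

Calling the helper without any hyperparameter arguments produces a request with no `hyperparameters` key at all, which matches the recommendation above to start with the automatic defaults.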
articles/ai-services/openai/includes/fine-tuning-rest.md: 14 additions & 3 deletions
@@ -57,7 +57,7 @@ Different model types require a different format of training data.
 # [chat completion models](#tab/turbo)

-The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document. For `gpt-35-turbo-0613` the fine-tuning dataset must be formatted in the conversational format that is used by the [Chat completions](../how-to/chatgpt.md) API.
+The training and validation data you use **must** be formatted as a JSON Lines (JSONL) document. For `gpt-35-turbo-0613` and other related models, the fine-tuning dataset must be formatted in the conversational format that is used by the [Chat completions](../how-to/chatgpt.md) API.

 If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-0613` model, please refer to the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).
@@ -69,7 +69,18 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-061
 {"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
 ```

-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+### Multi-turn chat file format
+
+A single line of your JSONL training file can also contain multiple turns of a conversation. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently, `weight` can be set to 0 or 1.
+
+```json
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "William Shakespeare", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
+```
+
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
+

 ### Create your training and validation datasets
@@ -87,7 +98,7 @@ The training and validation data you use **must** be formatted as a JSON Lines (
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.
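The encoding and size requirements can be met and verified locally before a file is uploaded. The sketch below is one possible approach, not something taken from the PR: the helper names are hypothetical, and it assumes "512 MB" means 512 × 1024 × 1024 bytes, which the text does not specify. It relies on Python's `utf-8-sig` codec, which writes the UTF-8 byte-order mark at the start of the file.

```python
import codecs
import json
import os

# Assumption: "512 MB" is read as 512 MiB; the documentation text does not say.
MAX_BYTES = 512 * 1024 * 1024


def write_jsonl_with_bom(path, records):
    """Write one JSON object per line, UTF-8 encoded with a leading BOM."""
    # The "utf-8-sig" codec emits the byte-order mark once, at the start of the file.
    with open(path, "w", encoding="utf-8-sig", newline="\n") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def check_training_file(path):
    """Return a list of problems with the file's size or encoding marker."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is not less than 512 MB")
    with open(path, "rb") as f:
        if f.read(3) != codecs.BOM_UTF8:
            problems.append("file does not start with a UTF-8 BOM")
    return problems
```

A file written with `write_jsonl_with_bom` should pass `check_training_file`, while a file saved by an editor as plain UTF-8 without a BOM would be flagged.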
articles/ai-services/openai/includes/fine-tuning-studio.md: 14 additions & 10 deletions
@@ -71,7 +71,17 @@ If you would like a step-by-step walk-through of fine-tuning a `gpt-35-turbo-061
 {"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "I'm having trouble connecting my Xbox to the Wi-Fi."}, {"role": "assistant", "content": "No worries, let's go through the network settings on your Xbox. Can you please tell me what happens when you try to connect it to the Wi-Fi?"}]}
 ```

-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+## Multi-turn chat file format
+
+A single line of your JSONL training file can also contain multiple turns of a conversation. To skip fine-tuning on specific assistant messages, add the optional `weight` key-value pair. Currently, `weight` can be set to 0 or 1.
+
+```json
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "William Shakespeare", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?", "weight": 1}]}
+{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "384,400 kilometers", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters.", "weight": 1}]}
+```
+
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.

 ### Create your training and validation datasets
@@ -89,7 +99,7 @@ The training and validation data you use **must** be formatted as a JSON Lines (
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 100 MB in size.
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 512 MB in size.

 ### Create your training and validation datasets
@@ -226,7 +236,7 @@ The **Validation data** pane displays any existing, previously uploaded training
 For large data files, we recommend that you import from an Azure Blob store. Large files can become unstable when uploaded through multipart forms because the requests are atomic and can't be retried or resumed.

 > [!NOTE]
-> Similar to training data files, validation data files must be formatted as JSONL files, encoded in UTF-8 with a byte-order mark (BOM). The file must be less than 100 MB in size.
+> Similar to training data files, validation data files must be formatted as JSONL files, encoded in UTF-8 with a byte-order mark (BOM). The file must be less than 512 MB in size.

 #### Upload validation data from local file
@@ -267,13 +277,7 @@ The **Create custom model** wizard shows the hyperparameters for training your f
 :::image type="content" source="../media/fine-tuning/studio-advanced-options.png" alt-text="Screenshot of the Advanced options pane for the Create custom model wizard, with default options selected." lightbox="../media/fine-tuning/studio-advanced-options.png":::

-Select **Default** to use the default values for the fine-tuning job, or select **Advanced** to display and edit the hyperparameter values.
-
-The **Advanced** option lets you configure the following hyperparameter:
-
-| Parameter name | Description |
-| --- | --- |
-|**Number of epochs**| The number of epochs to use for training the model. An epoch refers to one full cycle through the training dataset. |
+Select **Default** to use the default values for the fine-tuning job, or select **Advanced** to display and edit the hyperparameter values. When defaults are selected, we determine the correct value algorithmically based on your training data.

 After you configure the advanced options, select **Next** to [review your choices and train your fine-tuned model](#review-your-choices-and-train-your-model).
articles/ai-services/openai/whats-new.md: 17 additions & 3 deletions
@@ -10,24 +10,38 @@ ms.custom:
 - ignite-2023
 - references_regions
 ms.topic: whats-new
-ms.date: 04/02/2024
+ms.date: 04/18/2024
 recommendations: false
 ---

 # What's new in Azure OpenAI Service

 ## April 2024

-### Fine-tuning is now supported in East US 2
+### Fine-tuning is now supported in two new regions: East US 2 and Switzerland West

-Fine-tuning is now available in East US 2 with support for:
+Fine-tuning is now available with support for:

+### East US 2
+
+-`gpt-35-turbo` (0613)
+-`gpt-35-turbo` (1106)
+-`gpt-35-turbo` (0125)
+
+### Switzerland West
+
+-`babbage-002`
+-`davinci-002`
 -`gpt-35-turbo` (0613)
 -`gpt-35-turbo` (1106)
 -`gpt-35-turbo` (0125)

 Check the [models page](concepts/models.md#fine-tuning-models) for the latest information on model availability and fine-tuning support in each region.

+### Multi-turn chat training examples
+
+Fine-tuning now supports [multi-turn chat training examples](./how-to/fine-tuning.md#multi-turn-chat-file-format).
+
 ### GPT-4 (0125) is available for Azure OpenAI On Your Data

 You can now use the GPT-4 (0125) model in [available regions](./concepts/models.md#public-cloud-regions) with Azure OpenAI On Your Data.