articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md
Optionally, configure parameters for your fine-tuning job. The following are available:

| Name | Type | Description |
| --- | --- | --- |
|`batch_size`|integer | The batch size to use for training. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets. The default value as well as the maximum value for this property are specific to a base model. A larger batch size means that model parameters are updated less frequently, but with lower variance. When set to -1, batch_size is calculated as 0.2% of the number of examples in the training set, with a maximum of 256. |
|`learning_rate_multiplier`| number | The learning rate multiplier to use for training. The fine-tuning learning rate is the original learning rate used for pre-training multiplied by this value. Larger learning rates tend to perform better with larger batch sizes. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results. A smaller learning rate may be useful to avoid overfitting. |
|`n_epochs`| integer | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset. If set to -1, the number of epochs is determined dynamically based on the input data. |
|`seed`| integer | The seed controls the reproducibility of the job. Passing in the same seed and job parameters should produce the same results, but may differ in rare cases. If a seed isn't specified, one will be generated for you. |

You can choose to leave the default configuration or customize the values to your preference. After you finish making your configurations, select **Next**.
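If you prefer to set these values programmatically instead of in the studio UI, a rough sketch with the OpenAI Python 1.x client is shown below. The `hyperparameters` object and `seed` parameter come from the fine-tuning API; the client, file IDs, and values here are placeholders for illustration, not part of the studio flow.

```python
# Hypothetical sketch: creating a fine-tuning job with explicit hyperparameters.
# Assumes `client` is a configured AzureOpenAI client and that training/validation
# files have already been uploaded; the IDs and values below are placeholders.
response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="gpt-35-turbo-0613",
    hyperparameters={
        "batch_size": -1,                 # -1: computed as ~0.2% of the training set, capped at 256
        "learning_rate_multiplier": 0.1,  # try values in the 0.02 to 0.2 range
        "n_epochs": -1,                   # -1: determined dynamically from the input data
    },
    seed=105,  # fixed seed for reproducibility
)
```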
The result file is a CSV file that contains a header row and a row for each training step performed by the fine-tuning job:

| Column | Description |
| --- | --- |
|`step`| The number of the training step. A training step represents a single pass, forward and backward, on a batch of training data. |
|`train_loss`| The loss for the training batch. |
|`train_mean_token_accuracy`| The percentage of tokens in the training batch correctly predicted by the model.<br>For example, if the batch size is set to 3 and your data contains completions `[[1, 2], [0, 5], [4, 2]]`, this value is set to 0.83 (5 of 6) if the model predicted `[[1, 1], [0, 5], [4, 2]]`. |
|`valid_loss`| The loss for the validation batch. |
|`validation_mean_token_accuracy`| The percentage of tokens in the validation batch correctly predicted by the model.<br>For example, if the batch size is set to 3 and your data contains completions `[[1, 2], [0, 5], [4, 2]]`, this value is set to 0.83 (5 of 6) if the model predicted `[[1, 1], [0, 5], [4, 2]]`. |
|`full_valid_loss`| The validation loss calculated at the end of each epoch. When training goes well, loss should decrease. |
|`full_valid_mean_token_accuracy`| The validation mean token accuracy calculated at the end of each epoch. When training is going well, token accuracy should increase. |
You can also view the data in your results.csv file as plots in Azure AI Studio under the **Metrics** tab of your fine-tuned model. Select the link for your trained model, and you will see three charts: loss, mean token accuracy, and token accuracy. If you provided validation data, both datasets will appear on the same plot.
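You can also retrieve the results file programmatically once the job has finished. The following is a minimal sketch with the OpenAI Python 1.x client, assuming a configured `client` (see the client setup that follows) and the `job_id` of a completed job:

```python
# Hypothetical sketch: download results.csv for a completed fine-tuning job.
job = client.fine_tuning.jobs.retrieve(job_id)   # job_id of a completed job
result_file_id = job.result_files[0]             # ID of the results.csv file

content = client.files.content(result_file_id)   # binary file content
with open("results.csv", "wb") as f:
    f.write(content.read())
```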
api_version="2024-02-01"# This API version or later is required to access fine-tuning for turbo/babbage-002/davinci-002
169
+
api_version="2024-05-01-preview"# This API version or later is required to access seed/events/checkpoint capabilities
170
170
)
171
171
172
172
training_file_name ='training_set.jsonl'
The following Python code shows an example of how to create a new fine-tune job:
# [OpenAI Python 1.x](#tab/python-new)
In this example, we're also passing the `seed` parameter. The seed controls the reproducibility of the job. Passing in the same seed and job parameters should produce the same results, but may differ in rare cases. If a seed isn't specified, one will be generated for you.
```python
response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="gpt-35-turbo-0613",  # Enter base model name. Note that in Azure OpenAI the model name contains dashes and can't contain dot/period characters.
    seed=105  # The seed parameter controls reproducibility of the fine-tuning job. If no seed is specified, one will be generated automatically.
)

job_id = response.id
```
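After the job is created, you can poll it with the same client. A brief sketch; the `retrieve` call returns the full job object, including its `status`:

```python
# Check the status of the fine-tuning job.
response = client.fine_tuning.jobs.retrieve(job_id)
print(response.model_dump_json(indent=2))
print("Job status:", response.status)
```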
---
## List fine-tuning events
To examine the individual fine-tuning events that were generated during training:
# [OpenAI Python 1.x](#tab/python-new)
You may need to upgrade your OpenAI client library to the latest version with `pip install openai --upgrade` to run this command.
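A minimal sketch of listing the most recent events with the 1.x client, assuming the `job_id` from the job created earlier:

```python
# List the ten most recent events for the fine-tuning job.
response = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10)
print(response.model_dump_json(indent=2))
```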
# [OpenAI Python 0.28.1](#tab/python)

This command isn't available in the 0.28.1 OpenAI Python library. Upgrade to the latest release.
---
## Checkpoints
When each training epoch completes, a checkpoint is generated. A checkpoint is a fully functional version of a model that can both be deployed and used as the target model for subsequent fine-tuning jobs. Checkpoints can be particularly useful because they provide a snapshot of your model before overfitting has occurred. When a fine-tuning job completes, you'll have the three most recent versions of the model available to deploy. The final epoch is represented by your fine-tuned model; the previous two epochs are available as checkpoints.
You can run the list checkpoints command to retrieve the list of checkpoints associated with an individual fine-tuning job:
# [OpenAI Python 1.x](#tab/python-new)
You may need to upgrade your OpenAI client library to the latest version with `pip install openai --upgrade` to run this command.
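A minimal sketch of listing checkpoints with the 1.x client, again assuming the `job_id` from the earlier job:

```python
# List the checkpoints associated with a fine-tuning job.
response = client.fine_tuning.jobs.checkpoints.list(job_id)
print(response.model_dump_json(indent=2))
```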
# [OpenAI Python 0.28.1](#tab/python)

This command isn't available in the 0.28.1 OpenAI Python library. Upgrade to the latest release.
---
## Deploy a customized model
When the fine-tuning job succeeds, the value of the `fine_tuned_model` variable in the response body is set to the name of your customized model. Your model is now also available for discovery from the [list Models API](/rest/api/azureopenai/models/list). However, you can't issue completion calls against your customized model until it's deployed. You must deploy your customized model to make it available for use with completion calls.
Unlike the previous SDK commands, deployment must be done using the control plane API.

| Variable | Definition |
| --- | --- |
| resource_group | The resource group name for your Azure OpenAI resource |
| resource_name | The Azure OpenAI resource name |
| model_deployment_name | The custom name for your new fine-tuned model deployment. This is the name that will be referenced in your code when making chat completion calls. |
| fine_tuned_model | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83`. You will need to add that value to the deploy_data JSON. Alternatively, you can deploy a checkpoint by passing the checkpoint ID, which appears in the format `ftchkpt-e559c011ecc04fc68eaa339d8227d02d`. |

```python
import json
```
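As a rough, hypothetical sketch, a control plane deployment request can look like the following with the `requests` library. The request path follows the Azure Resource Manager pattern for Cognitive Services accounts, while the `api-version`, request body shape, and placeholder values are assumptions for illustration, not the article's exact example:

```python
# Hypothetical sketch of a control plane deployment request. The ARM request path,
# api-version, and request body below are assumptions; replace the placeholder
# values with your own before running.
import json
import requests

token = "<azure-ad-token>"                     # for example, from `az account get-access-token`
subscription = "<subscription-id>"
resource_group = "<resource-group-name>"
resource_name = "<azure-openai-resource-name>"
model_deployment_name = "gpt-35-turbo-ft"      # custom deployment name used in later chat completion calls

deploy_params = {"api-version": "2023-05-01"}  # assumed control plane API version
deploy_headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

deploy_data = {
    "sku": {"name": "standard", "capacity": 1},
    "properties": {
        "model": {
            "format": "OpenAI",
            # fine_tuned_model from your job results, or a checkpoint ID such as ftchkpt-...
            "name": "gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83",
            "version": "1",
        }
    },
}

request_url = (
    f"https://management.azure.com/subscriptions/{subscription}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices"
    f"/accounts/{resource_name}/deployments/{model_deployment_name}"
)

response = requests.put(
    request_url, params=deploy_params, headers=deploy_headers, data=json.dumps(deploy_data)
)
print(response.status_code, response.json())
```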
The result file is a CSV file that contains a header row and a row for each training step performed by the fine-tuning job:

| Column | Description |
| --- | --- |
|`step`| The number of the training step. A training step represents a single pass, forward and backward, on a batch of training data. |
|`train_loss`| The loss for the training batch. |
|`train_mean_token_accuracy`| The percentage of tokens in the training batch correctly predicted by the model.<br>For example, if the batch size is set to 3 and your data contains completions `[[1, 2], [0, 5], [4, 2]]`, this value is set to 0.83 (5 of 6) if the model predicted `[[1, 1], [0, 5], [4, 2]]`. |
|`valid_loss`| The loss for the validation batch. |
|`validation_mean_token_accuracy`| The percentage of tokens in the validation batch correctly predicted by the model.<br>For example, if the batch size is set to 3 and your data contains completions `[[1, 2], [0, 5], [4, 2]]`, this value is set to 0.83 (5 of 6) if the model predicted `[[1, 1], [0, 5], [4, 2]]`. |
|`full_valid_loss`| The validation loss calculated at the end of each epoch. When training goes well, loss should decrease. |
|`full_valid_mean_token_accuracy`| The validation mean token accuracy calculated at the end of each epoch. When training is going well, token accuracy should increase. |
You can also view the data in your results.csv file as plots in Azure OpenAI Studio. Select the link for your trained model, and you will see three charts: loss, mean token accuracy, and token accuracy. If you provided validation data, both datasets will appear on the same plot.
Look for your loss to decrease over time, and your accuracy to increase. If you see a divergence between your training and validation data, that can indicate that you're overfitting. Try training with fewer epochs, or a smaller learning rate multiplier.
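As a quick way to check for this divergence outside the studio, the following sketch plots training and validation loss from a locally downloaded results.csv with pandas and matplotlib. The column names come from the table above; the assumption that non-epoch rows leave `full_valid_loss` empty is illustrative:

```python
# Hypothetical sketch: plot training vs. validation loss to spot divergence (overfitting).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("results.csv")

plt.plot(df["step"], df["train_loss"], label="train_loss")

# full_valid_loss is only reported at the end of each epoch, so drop the empty rows first.
epoch_rows = df.dropna(subset=["full_valid_loss"])
plt.plot(epoch_rows["step"], epoch_rows["full_valid_loss"], label="full_valid_loss")

plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```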
## Clean up your deployments, customized models, and training files