
Commit 1e9b88f: review edits
1 parent ab660a6

File tree

3 files changed: +70 -71 lines changed


articles/ai-services/openai/includes/fine-tuning-python.md

Lines changed: 23 additions & 23 deletions
@@ -7,7 +7,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: openai
 ms.topic: include
-ms.date: 08/25/2023
+ms.date: 08/28/2023
 author: ChrisHMSFT
 ms.author: chrhoder
 keywords:
@@ -50,19 +50,19 @@ Here's an example of the training data format:
 {"prompt": "<prompt text>", "completion": "<ideal generated text>"}
 ```
 
-In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM), and the file must be less than 200 MB in size. For more information about formatting your training data, see [Learn how to prepare your dataset for fine-tuning](../how-to/prepare-dataset.md).
+In addition to the JSONL format, training and validation data files must be encoded in UTF-8 and include a byte-order mark (BOM). The file must be less than 200 MB in size. For more information about formatting your training data, see [Learn how to prepare your dataset for fine-tuning](../how-to/prepare-dataset.md).
 
 ### Create your training and validation datasets
 
-Designing your prompts and completions for fine-tuning is different from designing your prompts for use with any of [our GPT-3 base models](../concepts/legacy-models.md#gpt-3-models). Prompts for completion calls often use either detailed instructions or few-shot learning techniques, and consist of multiple examples. For fine-tuning, we recommend that each training example consists of a single input prompt and its desired completion output. You don't need to give detailed instructions or multiple completion examples for the same prompt.
+Designing your prompts and completions for fine-tuning is different from designing your prompts for use with any of [our GPT-3 base models](../concepts/legacy-models.md#gpt-3-models). Prompts for completion calls often use either detailed instructions or few-shot learning techniques, and consist of multiple examples. For fine-tuning, each training example should consist of a single input prompt and its desired completion output. You don't need to give detailed instructions or multiple completion examples for the same prompt.
 
-The more training examples you have, the better. We recommend having at least 200 training examples. In general, doubling the dataset size leads to a linear increase in model quality.
+The more training examples you have, the better. It's a best practice to have at least 200 training examples. In general, doubling the dataset size leads to a linear increase in model quality.
 
 For more information about preparing training data for various tasks, see [Learn how to prepare your dataset for fine-tuning](../how-to/prepare-dataset.md).
 
 ### Use the OpenAI CLI data preparation tool
 
-We recommend using OpenAI's command-line interface (CLI) to assist with many of the data preparation steps. OpenAI has developed a tool that validates, gives suggestions, and reformats your data into a JSONL file ready for fine-tuning.
+We recommend that you use OpenAI's CLI to assist with many of the data preparation steps. OpenAI has developed a tool that validates, gives suggestions, and reformats your data into a JSONL file ready for fine-tuning.
 
 To install the OpenAI CLI, run the following Python command:
 
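The encoding requirement in the hunk above is easy to get wrong. A minimal sketch of writing compliant training data, using a hypothetical `write_jsonl` helper (not part of the article's sample code); the `utf-8-sig` codec prepends the byte-order mark the service expects:

```python
import json
import os

def write_jsonl(examples, path):
    """Write prompt/completion pairs as JSONL, UTF-8 with a BOM."""
    # 'utf-8-sig' prepends the byte-order mark (BOM) required by the service.
    with open(path, "w", encoding="utf-8-sig") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")

examples = [
    {"prompt": "When I go to the store, I want an", "completion": "apple."},
    {"prompt": "When I go to work, I want a", "completion": "coffee."},
]
write_jsonl(examples, "training.jsonl")

# Sanity checks: BOM present, and the file is under the 200 MB limit.
with open("training.jsonl", "rb") as f:
    assert f.read(3) == b"\xef\xbb\xbf"
assert os.path.getsize("training.jsonl") < 200 * 1024 * 1024
```

Reading the file back with `encoding="utf-8-sig"` strips the BOM again, so downstream tooling sees plain JSON lines.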
@@ -88,28 +88,28 @@ The tool guides you through suggested changes for your training data. It reforma
 
 ## Select a base model
 
-The first step in creating a customized model is to choose a base model. The choice influences both the performance and the cost of your model. You can create a customized model from one of the following available base models:
+The first step in creating a customized model is to choose a base model. The choice influences both the performance and the cost of your model.
+
+You can create a customized model from one of the following available base models:
 - `ada`
 - `babbage`
 - `curie`
-- `code-cushman-001` __\*__
-- `davinci` __\*__
-
-__\*__ This model is currently unavailable for new customers.
+- `code-cushman-001` (Currently unavailable for new customers)
+- `davinci` (Currently unavailable for new customers)
 
-You can use the [Models API](/rest/api/cognitiveservices/azureopenaistable/models/list) to identify which models are fine-tunable. For more information about our base models, see [Models](../concepts/models.md).
+You can use the [Models API](/rest/api/cognitiveservices/azureopenaistable/models/list) to identify which models are fine-tunable. For more information about our base models, see [Azure OpenAI Service models](../concepts/models.md).
 
 ## Upload your training data
 
-The next step is to either choose existing prepared training data or upload new prepared training data to use when customizing your model. After you prepare your training data, you can upload your files to the service. We offer two ways to upload training data:
+The next step is to either choose existing prepared training data or upload new prepared training data to use when customizing your model. After you prepare your training data, you can upload your files to the service. There are two ways to upload training data:
 
 - [From a local file](/rest/api/cognitiveservices/azureopenaistable/files/upload)
 - [Import from an Azure Blob store or other web location](/rest/api/cognitiveservices/azureopenaistable/files/import)
 
-For large data files, we recommend you import from an Azure Blob store. Large files can become unstable when uploaded through multipart forms because the requests are atomic and can't be retried or resumed. For more information about Azure Blob storage, see [What is Azure Blob storage?](../../../storage/blobs/storage-blobs-overview.md)
+For large data files, we recommend that you import from an Azure Blob store. Large files can become unstable when uploaded through multipart forms because the requests are atomic and can't be retried or resumed. For more information about Azure Blob storage, see [What is Azure Blob storage?](../../../storage/blobs/storage-blobs-overview.md)
 
 > [!NOTE]
-> Training data files must be formatted as JSONL files, encoded in UTF-8 with a byte-order mark (BOM), and less than 200 MB in size.
+> Training data files must be formatted as JSONL files, encoded in UTF-8 with a byte-order mark (BOM). The file must be less than 200 MB in size.
 
 The following Python example locally creates sample training and validation dataset files, then uploads the local files by using the Python SDK, and retrieves the returned file IDs. Make sure to save the IDs returned by the example because you need them for the fine-tuning training job creation.
 
@@ -152,7 +152,7 @@ with open(training_file_name, 'w') as training_file:
 
 # Copy the validation dataset file from the training dataset file.
 # Typically, your training data and validation data should be mutually exclusive.
-# For the purposes of this example, we use the same data.
+# For the purposes of this example, you use the same data.
 print(f'Copying the training file to the validation file')
 shutil.copy(training_file_name, validation_file_name)
 
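The comment in the hunk above notes that training and validation data should normally be mutually exclusive; the sample copies the file only for brevity. A minimal sketch of a real split (the `split_examples` helper is hypothetical, not from the article):

```python
import random

def split_examples(examples, validation_fraction=0.2, seed=42):
    """Shuffle and split examples into mutually exclusive train/validation sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

examples = [
    {"prompt": f"Example {i}", "completion": f"Answer {i}."} for i in range(10)
]
train, validation = split_examples(examples)

# The two sets are disjoint and together cover every example.
assert len(train) == 8 and len(validation) == 2
```

Each split can then be written out as its own JSONL file in place of the `shutil.copy` call.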
@@ -206,11 +206,11 @@ status = resp["status"]
 print(f'Fine-tuning model with job ID: {job_id}.')
 ```
 
-You can either use default values for the hyperparameters of the fine-tune job, or you can adjust those hyperparameters for your customization needs. In the previous Python example, we set the `n_epochs` hyperparameter to 1, indicating that we want just one full cycle through the training data. For more information about these hyperparameters, see the [Create a Fine tune job](/rest/api/cognitiveservices/azureopenaistable/fine-tunes/create) section of the [REST API](/rest/api/cognitiveservices/azureopenaistable/fine-tunes) documentation.
+You can either use default values for the hyperparameters of the fine-tune job, or you can adjust those hyperparameters for your customization needs. In this example, you set the `n_epochs` hyperparameter to 1, indicating that you want just one full cycle through the training data. For more information about these hyperparameters, see [Create a Fine tune job](/rest/api/cognitiveservices/azureopenaistable/fine-tunes/create).
 
 ## Check the status of your customized model
 
-After you start a fine-tune job, it can take some time to complete. Your job might be queued behind other jobs on our system, and training your model can take minutes or hours depending on the model and dataset size. The following Python example checks the status of your fine-tune job by retrieving information about your job by using the job ID returned from the previous example:
+After you start a fine-tune job, it can take some time to complete. Your job might be queued behind other jobs on the system. Training your model can take minutes or hours depending on the model and dataset size. The following Python example checks the status of your fine-tune job by retrieving information about your job by using the job ID returned from the previous example:
 
 ```python
 # Get the status of our fine-tune job.
@@ -236,12 +236,12 @@ print(f'Found {len(result)} fine-tune jobs.')
 
 ## Deploy a customized model
 
-When the fine-tune job succeeds, the value of the `fine_tuned_model` variable in the response body of the `FineTune.retrieve()`` method is set to the name of your customized model. Your model is now also available for discovery from the [list Models API](/rest/api/cognitiveservices/azureopenaistable/models/list). However, you can't issue completion calls to your customized model until your customized model is deployed. You must deploy your customized model to make it available for use with completion calls.
+When the fine-tune job succeeds, the value of the `fine_tuned_model` variable in the response body of the `FineTune.retrieve()` method is set to the name of your customized model. Your model is now also available for discovery from the [list Models API](/rest/api/cognitiveservices/azureopenaistable/models/list). However, you can't issue completion calls to your customized model until your customized model is deployed. You must deploy your customized model to make it available for use with completion calls.
 
 [!INCLUDE [Fine-tuning deletion](fine-tune.md)]
 
 > [!NOTE]
-> As with all applications, we require a review process prior to going live.
+> As with all applications, Microsoft requires a review process for your custom model before it's available live.
 
 You can use either [Azure OpenAI](#deploy-a-model-with-azure-openai) or the [Azure CLI](#deploy-a-model-with-azure-cli) to deploy your customized model.
 
@@ -269,7 +269,7 @@ deployment_id = result["id"]
 
 ### Deploy a model with Azure CLI
 
-The following example shows how to use the Azure CLI to deploy your customized model. With the Azure CLI, you must specify a name for the deployment of your customized model. For more information about how to use the Azure CLI to deploy customized models, see [az cognitiveservices account deployment](/cli/azure/cognitiveservices/account/deployment) in the [Azure CLI documentation](/cli/azure).
+The following example shows how to use the Azure CLI to deploy your customized model. With the Azure CLI, you must specify a name for the deployment of your customized model. For more information about how to use the Azure CLI to deploy customized models, see [az cognitiveservices account deployment](/cli/azure/cognitiveservices/account/deployment).
 
 To run this Azure CLI command in a console window, you must replace the following _\<placeholders>_ with the corresponding values for your customized model:
 
@@ -284,8 +284,8 @@ To run this Azure CLI command in a console window, you must replace the followin
 ```console
 az cognitiveservices account deployment create
 --subscription <YOUR_AZURE_SUBSCRIPTION>
--g <YOUR_RESOURCE_GROUP>
--n <YOUR_RESOURCE_NAME>
+--resource-group <YOUR_RESOURCE_GROUP>
+--name <YOUR_RESOURCE_NAME>
 --deployment-name <YOUR_DEPLOYMENT_NAME>
 --model-name <YOUR_FINE_TUNED_MODEL_ID>
 --model-version "1"
@@ -307,7 +307,7 @@ print(f'"{start_phrase} {text}"')
 
 ## Analyze your customized model
 
-Azure OpenAI attaches a result file named _results.csv_ to each fine-tune job after it's complete. You can use the result file to analyze the training and validation performance of your customized model. The file ID for the result file is listed for each customized model, and you can use the Python SDK to retrieve the file ID and download the result file for analysis.
+Azure OpenAI attaches a result file named _results.csv_ to each fine-tune job after it completes. You can use the result file to analyze the training and validation performance of your customized model. The file ID for the result file is listed for each customized model, and you can use the Python SDK to retrieve the file ID and download the result file for analysis.
 
 The following Python example retrieves the file ID of the first result file attached to the fine-tune job for your customized model, and then uses the Python SDK to download the file to your working directory for analysis.
 
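Once downloaded, the result file can be inspected with the standard `csv` module. The sketch below uses a synthetic stand-in for _results.csv_; the column names (`step`, `training_loss`, `training_token_accuracy`) are assumptions about the usual result-file layout, so check the header of your own file:

```python
import csv
import io

# Synthetic stand-in for a downloaded results.csv. Real files come from the
# fine-tune job, and the exact columns may differ (assumed names here).
results_csv = """step,training_loss,training_token_accuracy
1,2.31,0.41
2,1.87,0.55
3,1.52,0.63
"""

# Parse each row into a dict keyed by the header columns.
rows = list(csv.DictReader(io.StringIO(results_csv)))
final = rows[-1]
print(f"Final step {final['step']}: "
      f"loss={final['training_loss']}, "
      f"token accuracy={final['training_token_accuracy']}")
```

For a real run, replace `io.StringIO(results_csv)` with `open('results.csv', newline='')` on the downloaded file; a steadily decreasing `training_loss` across steps is the usual sign of healthy training.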