Commit 6e51253

Merge pull request #4687 from voutilad/ptum-global-ga
Drop preview tags for Global and PTU-M FT.
2 parents 388be50 + 3fc5317 commit 6e51253

1 file changed: +7 -61 lines changed

articles/ai-services/openai/how-to/fine-tuning-deploy.md

Lines changed: 7 additions & 61 deletions
@@ -366,7 +366,9 @@ Azure OpenAI fine-tuning supports the following deployment types.
 |GPT-35-Turbo-1106-finetune|East US2, North Central US, Sweden Central, Switzerland West|
 |GPT-35-Turbo-0125-finetune|East US2, North Central US, Sweden Central, Switzerland West|

-### Global Standard (preview)
+### Global Standard
+
+[Global standard](./deployment-types.md#global-standard) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.

 | Models | Region |
 |--|--|
@@ -375,72 +377,16 @@ Azure OpenAI fine-tuning supports the following deployment types.
 |GPT-4o-finetune|East US2, North Central US, and Sweden Central|
 |GPT-4o-mini-finetune|East US2, North Central US, and Sweden Central|

-[Global standard](./deployment-types.md#global-standard) fine-tuned deployments offer [cost savings](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/), but custom model weights may temporarily be stored outside the geography of your Azure OpenAI resource.
-
 :::image type="content" source="../media/fine-tuning/global-standard.png" alt-text="Screenshot of the global standard deployment user experience with a fine-tuned model." lightbox="../media/fine-tuning/global-standard.png":::

-### Provisioned Managed (preview)
+### Provisioned Managed

 | Models | Region |
 |--|--|
-|GPT-4o-finetune|North Central US, Switzerland West|
-|GPT-4o-mini-finetune|North Central US, Switzerland West|
-
-- `gpt-4o-mini-2024-07-18`
-- `gpt-4o-2024-08-06`
-
-[Provisioned managed](./deployment-types.md#provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for fine-tuned deployments. As part of public preview, provisioned managed deployments may be created regionally via the data-plane [REST API](../reference.md#data-plane-inference) version `2024-10-01` or newer. See below for examples.
-
-#### Creating a Provisioned Managed deployment
-
-To create a new deployment, make an HTTP PUT call via the [Deployments - Create or Update REST API](/rest/api/aiservices/accountmanagement/deployments/create-or-update?view=rest-aiservices-accountmanagement-2024-10-01&tabs=HTTP&preserve-view=true). The approach is similar to performing [cross region deployment](#cross-region-deployment) with the following exceptions:
-
-- You must provide a `sku` name of `ProvisionedManaged`.
-- The capacity must be declared in PTUs.
-- The `api-version` must be `2024-10-01` or newer.
-- The HTTP method should be `PUT`.
-
-For example, to deploy a gpt-4o-mini model:
-
-```bash
-curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2024-10-01" \
--H "Authorization: Bearer <TOKEN>" \
--H "Content-Type: application/json" \
--d '{
-"sku": {"name": "ProvisionedManaged", "capacity": 25},
-"properties": {
-"model": {
-"format": "OpenAI",
-"name": "gpt-4omini-ft-model-name",
-"version": "1",
-"source": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/{SourceResourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{SourceAOAIAccountName}"
-}
-}
-}'
-```
-
-#### Scaling a fine-tuned model on Provisioned Managed
+|GPT-4o-finetune|North Central US, Sweden Central|
+|GPT-4o-mini-finetune|North Central US, Sweden Central|

-To scale a fine-tuned provision managed deployment to increase or decrease PTU capacity, perform the same `PUT` REST API call as you did when [creating the deployment](#creating-a-provisioned-managed-deployment) and provide an updated `capacity` value for the `sku`. Keep in mind, provisioned deployments must scale in [minimum increments](../how-to/provisioned-throughput-onboarding.md#how-much-throughput-per-ptu-you-get-for-each-model).
-
-For example, to scale the model deployed in the previous section from 25 to 40 PTU, make another `PUT` call and increase the capacity:
-
-```bash
-curl -X PUT "https://management.azure.com/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CognitiveServices/accounts/<RESOURCE_NAME>/deployments/<MODEL_DEPLOYMENT_NAME>api-version=2024-10-01" \
--H "Authorization: Bearer <TOKEN>" \
--H "Content-Type: application/json" \
--d '{
-"sku": {"name": "ProvisionedManaged", "capacity": 40},
-"properties": {
-"model": {
-"format": "OpenAI",
-"name": "gpt-4omini-ft-model-name",
-"version": "1",
-"source": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/{SourceResourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{SourceAOAIAccountName}"
-}
-}
-}'
-```
+[Provisioned managed](./deployment-types.md#provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.

 ## Clean up your deployment
