Commit 099793c

Merge pull request #2541 from sydneemayers/docs-editor/model-versions-1738026192
Update PTU Model Migration Guidance
2 parents 53530b2 + 49fd72e commit 099793c

4 files changed: +92 -0 lines changed

articles/ai-services/openai/concepts/model-retirements.md

Lines changed: 2 additions & 0 deletions
@@ -80,6 +80,8 @@ For more information on the model evaluation process, see the [Getting started w

For information on the model upgrade process, see [How to upgrade to a new model or version](./model-versions.md).

For more information on how to manage model upgrades and migrations for provisioned deployments, see [Managing models on provisioned deployment types](../how-to/working-with-models.md#managing-models-on-provisioned-deployment-types).

## Current models

> [!NOTE]

articles/ai-services/openai/concepts/model-versions.md

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ Azure OpenAI Service is committed to providing the best generative AI models for

We want to make it easy for customers to stay up to date as models improve. Customers can choose to start with a particular version and to automatically update as new versions are released.

> [!NOTE]
> The following upgrade guidance only applies to Standard deployment types. For guidance on updating or migrating provisioned deployment types, review the [model management documentation](../how-to/working-with-models.md).

When you deploy a model you can choose an update policy, which can include the following options:

* Deployments set to **Auto-update to default** automatically update to use the new default version.

articles/ai-services/openai/how-to/working-with-models.md

Lines changed: 87 additions & 0 deletions
@@ -26,6 +26,9 @@ Azure OpenAI now supports automatic updates for select model deployments. On mod

You can learn more about Azure OpenAI model versions and how they work in the [Azure OpenAI model versions](../concepts/model-versions.md) article.

> [!NOTE]
> Automatic model updates are only supported for Standard deployment types. For more information on how to manage model updates and migrations on provisioned deployment types, refer to the section on [managing models on provisioned deployment types](./working-with-models.md#managing-models-on-provisioned-deployment-types).

### Auto update to default

When you set your deployment to **Auto-update to default**, your model deployment is automatically updated within two weeks of a change in the default version. For a preview version, it updates automatically when a new preview version is available starting two weeks after the new preview version is released.
@@ -280,6 +283,90 @@ curl -X PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-0
  "etag": "\"GUID\""
}
```
## Managing models on provisioned deployment types
Provisioned deployments support distinct model management practices that are intended to give you the greatest control over when and how you migrate between model versions and model families. Currently, there are two approaches to managing models on provisioned deployments: (1) in-place migrations and (2) multi-deployment migrations.

### Prerequisites
- Validate that the target model version or model family is supported for your existing deployment type. Migrations can only occur between provisioned deployments of the same deployment type. For more information on deployment types, review the [deployment type documentation](./deployment-types.md).
- Validate capacity availability for your target model version or model family prior to attempting a migration. For more information on determining capacity availability, review the [capacity transparency documentation](../concepts/provisioned-throughput.md#capacity-transparency).
- For multi-deployment migrations, validate that you have sufficient quota to support multiple deployments simultaneously. For more information on how to validate quota for each provisioned deployment type, review the [provisioned quota documentation](../concepts/provisioned-throughput.md#quota).

### In-place migrations for provisioned deployments
In-place migrations allow you to maintain the same provisioned deployment name and size while changing the model version or model family assigned to that deployment. With in-place migrations, Azure OpenAI Service migrates any existing traffic between model versions or model families over a 20-30 minute window. Throughout the migration window, your provisioned deployment displays an "updating" provisioning state, and you can continue to use the deployment as you normally would. Once the in-place migration is complete, the provisioning state changes to "succeeded", indicating that all traffic has been migrated over to the target model version or model family.

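As a rough sketch of how you might wait for an in-place migration to finish, the script below polls a deployment until its provisioning state reaches `Succeeded`. It assumes the Azure CLI is installed and signed in. The function names, the `STATE_CMD` and `POLL_INTERVAL_SECONDS` variables, and the 60-second default interval are illustrative choices, not part of the service guidance.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the provisioning state of one deployment.
# Relies on `az cognitiveservices account deployment show`; assumes `az login` was run.
get_provisioning_state() {
  az cognitiveservices account deployment show \
    --resource-group "$1" \
    --name "$2" \
    --deployment-name "$3" \
    --query "properties.provisioningState" \
    --output tsv
}

# Poll until the deployment reports Succeeded (return 0) or Failed/Canceled (return 1).
# STATE_CMD can point at another state source, for example a stub while testing.
wait_for_migration() {
  local interval="${POLL_INTERVAL_SECONDS:-60}"
  local state_cmd="${STATE_CMD:-get_provisioning_state}"
  local state
  while true; do
    state="$("$state_cmd" "$@")" || return 1
    echo "provisioningState: $state"
    case "$state" in
      Succeeded) return 0 ;;
      Failed|Canceled) return 1 ;;
      *) sleep "$interval" ;;   # "Updating" or any other transitional state
    esac
  done
}
```

A call such as `wait_for_migration resource-group-temp docs-openai-test-001 gpt-4o-ptu-deployment` would then block until the state changes. Given the 20-30 minute window described above, a polling interval of a minute or more is likely sufficient.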
#### In-place migration: model version update
In-place migrations that update an existing provisioned deployment to a new model version within the same model family are supported through Azure AI Foundry, REST API, and Azure CLI. To perform a model version update in Azure AI Foundry, select **Deployments**, and then, under the deployment name column, select the name of the provisioned deployment you would like to migrate.

Selecting a deployment name opens the **Properties** for the model deployment. From this view, select the **edit** button, which opens the **Update deployment** dialogue box. Select the model version dropdown to set a new model version for the provisioned deployment. As noted, the provisioning state changes to "updating" during the migration and returns to "succeeded" once the migration is complete.

![Screenshot of update deployment dialogue box with the model version field selector opened to show model version options available for selection.](media/working-with-models/provisioned-deployment-model-version-update.png)

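For the Azure CLI route, a model version update might look roughly like the sketch below. It assumes that re-running `az cognitiveservices account deployment create` against an existing deployment name updates that deployment in place, and that your CLI version accepts the provisioned SKU name; verify both before relying on it. The function name, the `DRY_RUN` switch, and the example model version are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: update an existing provisioned deployment to a new model version.
# Arguments: resource group, account, deployment, model name, model version, SKU name, capacity.
# Assumption: issuing `deployment create` for an existing deployment name updates it in place.
# With DRY_RUN=1 the command is printed instead of executed.
update_model_version() {
  ${DRY_RUN:+echo} az cognitiveservices account deployment create \
    --resource-group "$1" \
    --name "$2" \
    --deployment-name "$3" \
    --model-format OpenAI \
    --model-name "$4" \
    --model-version "$5" \
    --sku-name "$6" \
    --sku-capacity "$7"
}
```

For example, `DRY_RUN=1 update_model_version resource-group-temp docs-openai-test-001 gpt-4o-ptu-deployment gpt-4o 2024-08-06 GlobalProvisionedManaged 100` prints the command so you can review it before running it for real. Keeping the deployment name, SKU, and capacity unchanged mirrors the in-place behavior described above.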
#### In-place migration: model family change
In-place migrations that update an existing provisioned deployment to a new model family are supported through REST API and Azure CLI. To perform an in-place migration targeting a model family change, use the example request below as a guide. In the request, update the model name and model version to those of the target model you are migrating to.

```Bash
# PUT to the existing deployment name, supplying the target model name and version.
curl -X PUT "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/resource-group-temp/providers/Microsoft.CognitiveServices/accounts/docs-openai-test-001/deployments/gpt-4o-ptu-deployment?api-version=2024-10-01" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_AUTH_TOKEN" \
  -d '{"sku":{"name":"GlobalProvisionedManaged","capacity":100},"properties": {"model": {"format": "OpenAI","name": "gpt-4o-mini","version": "2024-07-18"}}}'
```
#### Example response

```json
{
  "id": "/subscriptions/{subscription-id}/resourceGroups/resource-group-temp/providers/Microsoft.CognitiveServices/accounts/docs-openai-test-001/deployments/gpt-4o-ptu-deployment",
  "type": "Microsoft.CognitiveServices/accounts/deployments",
  "name": "gpt-4o-ptu-deployment",
  "sku": {
    "name": "GlobalProvisionedManaged",
    "capacity": 100
  },
  "properties": {
    "model": {
      "format": "OpenAI",
      "name": "gpt-4o-mini",
      "version": "2024-07-18"
    },
    "versionUpgradeOption": "OnceCurrentVersionExpired",
    "currentCapacity": 100,
    "capabilities": {
      "area": "EUR",
      "chatCompletion": "true",
      "jsonObjectResponse": "true",
      "maxContextToken": "128000",
      "maxOutputToken": "16834",
      "assistants": "true"
    },
    "provisioningState": "Updating",
    "rateLimits": [
      {
        "key": "request",
        "renewalPeriod": 10,
        "count": 300
      }
    ]
  },
  "systemData": {
    "createdBy": "[email protected]",
    "createdByType": "User",
    "createdAt": "2025-01-28T02:57:15.8951706Z",
    "lastModifiedBy": "[email protected]",
    "lastModifiedByType": "User",
    "lastModifiedAt": "2025-01-29T15:35:53.082912Z"
  },
  "etag": "\"GUID\""
}
```

> [!NOTE]
> There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account?view=azure-cli-latest#az-account-get-access-token&preserve-view=true). You can use this token as your temporary authorization token for API testing.

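One way to wire that token into the request is sketched below. It assumes the Azure CLI is signed in. The helper names `get_arm_token` and `auth_header` are made up for illustration.

```bash
#!/usr/bin/env bash
# Fetch a short-lived access token for Azure Resource Manager with the Azure CLI.
get_arm_token() {
  az account get-access-token --query accessToken --output tsv
}

# Format a token as the Authorization header value used by the curl request.
auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Example usage (not executed here):
#   curl -X PUT "<deployment URL>" -H "$(auth_header "$(get_arm_token)")" ...
```

Because the token is temporary, fetching it immediately before each request avoids reusing an expired value.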
### Multi-deployment migrations for provisioned deployments
Multi-deployment migrations give you greater control over the model migration process. With multi-deployment migrations, you decide how quickly to migrate your existing traffic to the target model version or model family on a new provisioned deployment. The process to migrate to a new model version or model family using the multi-deployment migration approach is as follows:
- Create a new provisioned deployment. For this new deployment, you can choose to maintain the same provisioned deployment type as your existing deployment or select a new deployment type if desired.
- Transition traffic from the existing provisioned deployment to the newly created provisioned deployment with your target model version or model family until all traffic is offloaded from the original deployment.
- Once traffic is migrated over to the new deployment, validate that no inference requests are being processed on the previous provisioned deployment by confirming that the Azure OpenAI Requests metric shows no API calls within 5-10 minutes of the inference traffic being migrated to the new deployment. For more information on this metric, see the [Monitor Azure OpenAI documentation](https://aka.ms/aoai/docs/monitor-azure-openai).
- Once you confirm that no inference calls have been made, delete the original provisioned deployment.
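
The steps above could be scripted along these lines with the Azure CLI. This is a sketch under several assumptions: the metric ID `AzureOpenAIRequests` and the `ModelDeploymentName` dimension are taken to correspond to the Azure OpenAI Requests metric, all function names and the `DRY_RUN` switch are hypothetical, and shifting traffic (step 2) happens in your own application or gateway, so it is not shown.

```bash
#!/usr/bin/env bash
# With DRY_RUN=1 the create/delete commands are printed instead of executed.

# Step 1: create the new provisioned deployment for the target model.
# Arguments: resource group, account, deployment, model name, model version, SKU name, capacity.
create_target_deployment() {
  ${DRY_RUN:+echo} az cognitiveservices account deployment create \
    --resource-group "$1" --name "$2" --deployment-name "$3" \
    --model-format OpenAI --model-name "$4" --model-version "$5" \
    --sku-name "$6" --sku-capacity "$7"
}

# Step 3a: list per-minute request totals for one deployment over the last 10 minutes.
# Arguments: Azure OpenAI resource ID, deployment name. Metric and dimension names are assumptions.
recent_request_counts() {
  az monitor metrics list \
    --resource "$1" \
    --metric AzureOpenAIRequests \
    --filter "ModelDeploymentName eq '$2'" \
    --interval PT1M --aggregation Total --offset 10m \
    --query "value[0].timeseries[].data[].total" --output tsv
}

# Step 3b: read one count per line on stdin; succeed only if they sum to zero.
# Empty input and non-numeric values (for example "None") count as zero.
is_idle() {
  awk '{ total += $1 } END { exit (total == 0 ? 0 : 1) }'
}

# Step 4: delete the original deployment once it is idle.
# Arguments: resource group, account, deployment.
delete_original_deployment() {
  ${DRY_RUN:+echo} az cognitiveservices account deployment delete \
    --resource-group "$1" --name "$2" --deployment-name "$3"
}
```

A cautious sequence would be `recent_request_counts "$RESOURCE_ID" old-deployment | is_idle && delete_original_deployment resource-group-temp docs-openai-test-001 old-deployment`, so the delete only runs when no recent requests were recorded.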

## Next steps
