@@ -342,7 +342,7 @@ curl -X PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-0
 {
 "key": "request",
 "renewalPeriod": 10,
-"count": 120
+"count": 300
 }
 ]
 },
@@ -363,7 +363,10 @@ curl -X PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-0
 
 ### Multi-deployment migrations for provisioned deployments
 Multi-deployment migrations allow you to have greater control over the model migration process. With multi-deployment migrations, you can dictate how quickly you would like to migrate your existing traffic to the target model version or model family on a new provisioned deployment. The process to migrate to a new model version or model family using the multi-deployment migration approach is as follows:
-- Create a new provisioned deployment. For this new deployment, you can choose to maintain the same provisioned deployment type as your existing provisioned deployment or select a new deployment type if de
+- Create a new provisioned deployment. For this new deployment, you can choose to maintain the same provisioned deployment type as your existing deployment or select a new deployment type if desired.
+- Transition traffic from the existing provisioned deployment to the newly created provisioned deployment with your target model version or model family until all traffic is offloaded from the original deployment.
+- Once traffic is migrated over to the new deployment, validate that there are no inference requests being processed on the previous provisioned deployment by ensuring the Azure OpenAI Requests metric does not show any API calls made within 5-10 minutes of the inference traffic being migrated over to the new deployment. For more information on this metric, [see the Monitor Azure OpenAI documentation](https://aka.ms/aoai/docs/monitor-azure-openai).
+- Once you confirm that no inference calls have been made, delete the original provisioned deployment.
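
The migration steps added in this hunk can be sketched client-side roughly as follows. This is a minimal illustration and not part of the change. `MigrationRouter`, `safe_to_delete`, and the deployment names are hypothetical. The per-minute request counts are assumed to have been read by the caller from the Azure OpenAI Requests metric, and the sketch makes no Azure API calls itself.

```python
import random
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MigrationRouter:
    """Hypothetical client-side router that splits traffic between two deployments."""

    old_deployment: str
    new_deployment: str
    new_share: float = 0.0  # fraction of requests routed to the new deployment

    def shift(self, step: float) -> float:
        """Move `step` more of the traffic to the new deployment, capped at 100%."""
        self.new_share = min(1.0, self.new_share + step)
        return self.new_share

    def pick(self, rng: random.Random) -> str:
        """Choose the deployment name to use for one request."""
        if rng.random() < self.new_share:
            return self.new_deployment
        return self.old_deployment


def safe_to_delete(requests_per_minute: Sequence[int], quiet_minutes: int = 10) -> bool:
    """Return True when the last `quiet_minutes` samples for the old deployment are all zero.

    `requests_per_minute` is assumed to hold one request count per minute,
    oldest first, taken from the Requests metric of the original deployment.
    """
    recent = list(requests_per_minute)[-quiet_minutes:]
    return len(recent) == quiet_minutes and all(count == 0 for count in recent)


if __name__ == "__main__":
    router = MigrationRouter("old-ptu-deployment", "new-ptu-deployment")
    rng = random.Random(0)
    for _ in range(4):
        share = router.shift(0.25)  # ramp in 25% increments
        print(f"new deployment share: {share:.0%}, sample route: {router.pick(rng)}")
    # Ten consecutive zero-request minutes after the cutover.
    print("safe to delete:", safe_to_delete([12, 3] + [0] * 10))
```

The 10-minute default follows the upper end of the 5-10 minute window in the text. A shorter window, or a different ramp step, may suit a given workload better.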