articles/ai-services/openai/how-to/fine-tuning-deploy.md
Lines changed: 3 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -17,7 +17,7 @@ Once your model is fine-tuned, you can deploy the model and can use it in your o
17
17
18
18
When you deploy the model, you make the model available for inferencing, and that incurs an hourly hosting charge. Fine-tuned models, however, can be stored in Azure AI Foundry at no cost until you're ready to use them.
19
19
20
-
Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Managed** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](./deployment-types.md).
20
+
Azure OpenAI provides choices of deployment types for fine-tuned models on the hosting structure that fits different business and usage patterns: **Standard**, **Global Standard** (preview) and **Provisioned Throughput** (preview). Learn more about [deployment types for fine-tuned models](#deployment-types) and the [concepts of all deployment types](./deployment-types.md).
21
21
22
22
## Deploy your fine-tuned model
23
23
@@ -380,14 +380,14 @@ Azure OpenAI fine-tuning supports the following deployment types.
380
380
381
381
:::image type="content" source="../media/fine-tuning/global-standard.png" alt-text="Screenshot of the global standard deployment user experience with a fine-tuned model." lightbox="../media/fine-tuning/global-standard.png":::
382
382
383
-
### Provisioned Managed
383
+
### Provisioned Throughput
384
384
385
385
| Models | Region |
386
386
|--|--|
387
387
|GPT-4o-finetune|North Central US, Sweden Central|
388
388
|GPT-4o-mini-finetune|North Central US, Sweden Central|
389
389
390
-
[Provisioned managed](./deployment-types.md#provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
390
+
[Provisioned throughput](./deployment-types.md#provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota, you can deploy your fine-tuned model in supported regions.
articles/ai-services/openai/how-to/provisioned-get-started.md
Lines changed: 45 additions & 21 deletions
@@ -14,9 +14,9 @@ recommendations: false
14
14
15
15
# Get started using provisioned deployments on the Azure OpenAI in Azure AI Foundry Models
16
16
17
-
The following guide walks you through key steps in creating a provisioned deployment with your Azure OpenAI resource. For more details on the concepts discussed here, see:
17
+
The following guide walks you through key steps in creating a provisioned deployment with your Azure AI Foundry resource. For more details on the concepts discussed here, see:
18
+
* [Azure AI Foundry Provisioned Throughput Onboarding Guide](./provisioned-throughput-onboarding.md)
19
+
* [Azure AI Foundry Provisioned Throughput Concepts](../concepts/provisioned-throughput.md)
20
20
21
21
## Prerequisites
22
22
@@ -32,48 +32,60 @@ Creating a new deployment requires available (unused) quota to cover the desired
32
32
* Total PTU Quota = 500 PTUs
33
33
* Deployments:
34
34
* 100 PTUs: GPT-4o, 2024-05-13
35
-
* 100 PTUs: GPT-4, 0613
35
+
* 100 PTUs: DeepSeek-R1, 1
36
36
37
37
Then 200 PTUs of quota are considered used, and there are 300 PTUs available for use to create new deployments.
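The quota arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the calculation: `available_ptus` is a hypothetical helper, not part of any Azure SDK, and the deployment labels simply mirror the example above.

```python
# Hypothetical helper for illustration only; this is not an Azure API.
def available_ptus(total_quota: int, deployments: dict[str, int]) -> int:
    """Return the unused PTU quota, given the PTUs assigned to each deployment."""
    used = sum(deployments.values())
    if used > total_quota:
        raise ValueError("deployments use more PTUs than the total quota")
    return total_quota - used


# Values taken from the example: 500 PTUs of quota, two 100-PTU deployments.
deployments = {
    "GPT-4o, 2024-05-13": 100,
    "DeepSeek-R1, 1": 100,
}
remaining = available_ptus(500, deployments)
print(f"Used: {sum(deployments.values())} PTUs, available: {remaining} PTUs")
```

A new deployment fits only if its requested PTUs do not exceed the value this returns.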
38
38
39
-
A default amount of global, data zone, and regional provisioned quota is assigned to eligible subscriptions in several regions. You can view the quota available to you in a region by visiting the Quotas pane in [Azure AI Foundry portal](https://ai.azure.com/) and selecting the desired subscription and region. For example, the screenshot below shows a quota limit of 500 PTUs in West US for the selected subscription. Note that you might see lower values of available default quotas.
39
+
A default amount of global, data zone, and regional provisioned quota is assigned to eligible subscriptions in several regions. You can view the quota available to you in a region by visiting the Quotas pane in the [Azure AI Foundry portal](https://ai.azure.com/) and selecting the desired subscription and region. For example, the screenshot below shows a quota limit of 300 Global Provisioned Throughput PTUs in West US for the selected subscription. If 50 of these Global PTUs are in use, 250 PTUs remain available for deployments of the Global Provisioned Throughput type.
40
40
41
41
:::image type="content" source="../media/provisioned/available-quota.png" alt-text="A screenshot of the available quota in Azure AI Foundry portal." lightbox="../media/provisioned/available-quota.png":::
42
42
43
-
Additional quota can be requested by clicking the Request Quota link to the right of the “Usage/Limit” column. (This is off-screen in the screenshot above).
43
+
You can request additional quota by selecting the **Request Quota** button.
44
44
45
-
## Create an Azure OpenAI resource
45
+
## Create an Azure AI Foundry resource
46
46
47
-
Provisioned deployments are created via Azure OpenAI resource objects within Azure. You must have an Azure OpenAI resource in each region where you intend to create a deployment. Use the Azure portal to [create a resource](./create-resource.md) in a region with available quota, if required.
47
+
Provisioned deployments are created via Azure AI Foundry resource objects within Azure. You must have an Azure AI Foundry resource in each region where you intend to create a deployment. Use the Azure portal to [create a resource](./create-resource.md) in a region with available quota, if required.
48
48
49
49
> [!NOTE]
50
-
> Azure OpenAI resources can support multiple types of Azure OpenAI deployments at the same time. It is not necessary to dedicate new resources for your provisioned deployments.
50
+
> Azure AI Foundry resources can support multiple types of Azure AI Foundry deployments at the same time. It is not necessary to dedicate new resources for your provisioned deployments.
51
+
52
+
## Discover models with the provisioned deployment option
53
+
54
+
Once you have verified your quota, you can create a deployment. Navigate to the Azure AI Foundry model catalog to discover models that offer provisioned deployment options.
55
+
56
+
1. Sign in to the [Azure AI Foundry portal](https://ai.azure.com/).
57
+
2. Choose the subscription that was enabled for provisioned deployments and select the desired resource in a region where you have quota.
58
+
3. Filter by **Direct from Microsoft** in the model collections filter. These models are hosted and served directly by Azure and support the provisioned throughput deployment option.
59
+
4. Select the model that you want to deploy and check the model details in the model card.
60
+
51
61
52
62
## Create your provisioned deployment - capacity is available
53
63
54
-
Once you have verified your quota, you can create a deployment. To create a provisioned deployment, you can follow these steps; the choices described reflect the entries shown in the screenshot.
64
+
To create a provisioned deployment, you can follow these steps; the choices described reflect the entries shown in the screenshot.
55
65
56
66
:::image type="content" source="../media/provisioned/deployment-screen.png" alt-text="Screenshot of the Azure AI Foundry portal deployment page for a provisioned deployment." lightbox="../media/provisioned/deployment-screen.png":::
57
67
58
68
59
69
60
-
1. Sign into the [Azure AI Foundry portal](https://ai.azure.com).
61
-
1. Choose the subscription that was enabled for provisioned deployments & select the desired resource in a region where you have the quota.
62
-
1. Under **Management** in the left-nav select **Deployments**.
63
-
1. Select Create new deployment and configure the following fields. Expand the **advanced options** drop-down menu.
64
-
1. Fill out the values in each field. Here's an example:
70
+
1. Click **Use this model** and configure the following fields.
71
+
72
+
2. In the **Deployment type** drop-down, select **Global Provisioned Throughput**, **Data Zone Provisioned Throughput**, or **Regional Provisioned Throughput**, as required for your provisioned deployment.
73
+
74
+
3. Expand the **advanced options** drop-down menu.
75
+
76
+
4. Fill out the values in each field. Here's an example:
65
77
66
78
| Field | Description | Example |
67
79
|--|--|--|
68
80
| Select a model| Choose the specific model you wish to deploy. | GPT-4 |
69
81
| Model version | Choose the version of the model to deploy. | 0613 |
70
82
| Deployment Name | The deployment name is used in your code to call the model by using the client libraries and the REST APIs. | gpt-4|
71
83
| Content filter | Specify the filtering policy to apply to the deployment. Learn more on our [Content Filtering](../concepts/content-filter.md) how-to. | Default |
72
-
| Deployment Type |This impacts the throughput and performance. Choose Global Provisioned-Managed, DataZone Provisioned-Managed or Provisioned-Managed from the deployment dialog dropdown for your deployment | Provisioned-Managed|
84
+
| Deployment Type |This impacts the throughput and performance. Choose Global Provisioned Throughput, Data Zone Provisioned Throughput or Regional Provisioned Throughput from the deployment dialog dropdown for your deployment |Global Provisioned Throughput|
73
85
| Provisioned Throughput Units | Choose the amount of throughput you wish to include in the deployment. | 100 |
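As a rough sketch, the fields in this table can be captured as a small Python record and checked before you open the deployment dialog. Everything here is illustrative: `ProvisionedDeployment` and `validate` are hypothetical names, not part of any Azure SDK, and the 250-PTU available quota is an assumed value carried over from the quota example earlier in this guide.

```python
from dataclasses import dataclass

# Deployment type names as they appear in the deployment dialog drop-down.
DEPLOYMENT_TYPES = {
    "Global Provisioned Throughput",
    "Data Zone Provisioned Throughput",
    "Regional Provisioned Throughput",
}


@dataclass(frozen=True)
class ProvisionedDeployment:
    """Hypothetical record mirroring the fields in the table above."""
    model: str
    model_version: str
    deployment_name: str
    content_filter: str
    deployment_type: str
    ptus: int


def validate(deployment: ProvisionedDeployment, available_quota: int) -> list[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    if deployment.deployment_type not in DEPLOYMENT_TYPES:
        problems.append(f"unknown deployment type: {deployment.deployment_type}")
    if deployment.ptus <= 0:
        problems.append("PTUs must be a positive number")
    elif deployment.ptus > available_quota:
        problems.append(
            f"requested {deployment.ptus} PTUs but only {available_quota} are available"
        )
    return problems


# Example values from the table; 250 PTUs of available quota is an assumption.
example = ProvisionedDeployment(
    model="GPT-4",
    model_version="0613",
    deployment_name="gpt-4",
    content_filter="Default",
    deployment_type="Global Provisioned Throughput",
    ptus=100,
)
print(validate(example, available_quota=250))
```

The check covers quota only. Service capacity in the region can still limit the deployment, as described in the sections that follow.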
74
86
75
87
> [!NOTE]
76
-
> The deployment dialog contains a reminder that you can purchase an Azure Reservation for Azure OpenAI Provisioned to obtain a significant discount for a term commitment.
88
+
> The deployment dialog contains a reminder that you can purchase an Azure Reservation for Azure AI Foundry Provisioned Throughput to obtain a significant discount for a term commitment.
77
89
78
90
Once you have entered the deployment settings, click **Confirm Pricing** to continue. A pricing confirmation dialog will appear that will display the list price for the deployment, if you choose to pay for it on an hourly basis, with no Azure Reservation to provide a term discount.
79
91
@@ -113,20 +125,32 @@ In this event, the wizard in [Azure AI Foundry portal](https://ai.azure.com/) wi
113
125
Things to notice:
114
126
115
127
* A message displays showing how many PTUs you have in available quota, and how many can currently be deployed.
116
-
* If you select a number of PTUs greater than service capacity, a message will appear that provides options for you to obtain more capacity, and a button to allow you to select an alternate region. Clicking the "See other regions" button will display a dialog that shows a list of Azure OpenAI resources where you can create a deployment, along with the maximum sized deployment that can be created based on available quota and service capacity in each region.
128
+
* If you select a number of PTUs greater than service capacity, a message will appear that provides options for you to obtain more capacity, and a button to allow you to select an alternate region. Clicking the "See other regions" button will display a dialog that shows a list of Azure AI Foundry resources where you can create a deployment, along with the maximum sized deployment that can be created based on available quota and service capacity in each region.
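The "maximum sized deployment" shown for each region is bounded by both available quota and service capacity. A minimal sketch of that reasoning follows, assuming the limit is simply the smaller of the two values. The region names and numbers are made up for illustration, and `max_deployable` is a hypothetical helper rather than an Azure API.

```python
# Hypothetical helper; assumes the limit is the smaller of quota and capacity.
def max_deployable(quota_available: int, service_capacity: int) -> int:
    """Largest deployment (in PTUs) that both quota and capacity allow."""
    return max(0, min(quota_available, service_capacity))


# Made-up figures: region -> (available quota, current service capacity).
regions = {
    "West US": (300, 50),
    "Sweden Central": (200, 400),
    "North Central US": (0, 500),
}

requested = 100
limits = {name: max_deployable(q, c) for name, (q, c) in regions.items()}
candidates = [name for name, limit in limits.items() if limit >= requested]

print(limits)
print(candidates)
```

In this made-up data, only a region with enough of both quota and capacity can host the 100-PTU request, which is the comparison the dialog makes for you.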
117
129
118
130
:::image type="content" source="../media/provisioned/choose-different-resource.png" alt-text="Screenshot of the Azure AI Foundry portal deployment page for choosing a different resource and region." lightbox="../media/provisioned/choose-different-resource.png":::
119
131
120
132
Selecting a resource and clicking **Switch resource** will cause the deployment dialog to redisplay using the selected resource. You can then proceed to create your deployment in the new region.
121
133
134
+
## Create a new deployment or exchange models with your quota
135
+
136
+
If you still have quota available under the subscription and region, you can create new provisioned deployments for other models that are hosted and sold directly by Microsoft.
137
+
138
+
The steps are the same as in the example above. When you create a new deployment, the deployment widget shows the total available quota you can use. In the screenshot below, the available quota is 250 PTUs.
139
+
140
+
:::image type="content" source="../media/provisioned/deepseek-deployment.png" alt-text="Screenshot of the fungible PTU to deploy flagship models." lightbox="../media/provisioned/deepseek-deployment.png":::
141
+
142
+
After you deploy the new model, you can check quota usage in the [AI Foundry portal](https://ai.azure.com/managementCenter/quota?wsid=/subscriptions/6a6fff00-4464-4eab-a6b1-0b533c7202e0/resourceGroups/rg-fokikioluai/providers/Microsoft.CognitiveServices/accounts/ai-fokikioluai889906014325&tid=72f988bf-86f1-41af-91ab-2d7cd011db47#aoaiProvisionedManaged). You can manage your quota by either requesting more quota or deleting existing deployments to free up PTU quota for new provisioned deployments.
143
+
144
+
:::image type="content" source="../media/provisioned/fungible-quota.png" alt-text="Screenshot of the fungible PTU quota in quota page." lightbox="../media/provisioned/fungible-quota.png":::
145
+
122
146
## Optionally purchase a reservation
123
147
124
148
Following the creation of your deployment, you might want to purchase a term discount via an Azure Reservation. An Azure Reservation can provide a substantial discount on the hourly rate for users intending to use the deployment beyond a few days.
125
149
126
150
For more information on the purchase model and reservations, see:
127
-
*[Save costs with Microsoft Azure OpenAI provisioned reservations](/azure/cost-management-billing/reservations/azure-openai).
*[Guide for Azure OpenAI provisioned reservations](../concepts/provisioned-throughput.md)
151
+
* [Save costs with Microsoft Azure AI Foundry provisioned throughput reservations](/azure/cost-management-billing/reservations/azure-openai).
152
+
* [Azure AI Foundry provisioned throughput onboarding guide](./provisioned-throughput-onboarding.md)
153
+
* [Guide for Azure AI Foundry provisioned throughput reservations](../concepts/provisioned-throughput.md)
130
154
131
155
> [!IMPORTANT]
132
156
> Capacity availability for model deployments is dynamic and changes frequently across regions and models. To prevent you from purchasing a reservation for more PTUs than you can use, create deployments first, and then purchase the Azure Reservation to cover the PTUs you have deployed. This best practice will ensure that you can take full advantage of the reservation discount and prevent you from purchasing a term commitment that you cannot use.