**articles/ai-foundry/model-inference/how-to/github/create-model-deployments.md** (0 additions, 2 deletions)

To use it:
1. Get the Azure AI model's inference endpoint URL and keys from the **deployment page** or the **Overview** page. If you're using Microsoft Entra ID authentication, you don't need a key.
2. When constructing your request, set the `model` parameter to the name of the model deployment you created.
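To illustrate, the request body for a chat completions call can be sketched as plain JSON. This is a minimal sketch: the endpoint URL shape and the deployment name `my-deployment` are placeholder assumptions, so substitute the real values from your own deployment page.

```python
import json

# Placeholder endpoint; copy the real URL from the deployment or Overview page.
ENDPOINT = "https://<your-resource>.services.ai.azure.com/models/chat/completions"

def build_request_body(deployment_name: str, user_message: str) -> str:
    """Build the JSON body for a chat completions request.

    The `model` field carries the *deployment name* you created,
    which is what routes the request to that deployment.
    """
    body = {
        "model": deployment_name,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(body)

payload = build_request_body("my-deployment", "Say hello.")
```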
**articles/ai-foundry/model-inference/how-to/manage-costs.md** (13 additions, 13 deletions)
## Understand model inference billing model
Models deployed in Azure AI Services are charged per 1,000 tokens. Language models understand and process text by breaking it down into tokens. For reference, each token is roughly four characters for typical English text. Costs per token vary depending on which model series you choose. Models that can process images break down images in tokens too. The number of tokens per image depends on the model and the resolution of the input image.
Token costs are for both input and output. For example, suppose you have a 1,000 token JavaScript code sample that you ask a model to convert to Python. You would be charged approximately 1,000 tokens for the initial input request sent, and 1,000 more tokens for the output that is received in response for a total of 2,000 tokens.
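The input-plus-output arithmetic above can be sketched in a few lines. The per-1,000-token prices used here are made-up illustrative numbers, not actual rates; check the pricing page for your model series.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_per_1k_input: float, price_per_1k_output: float) -> float:
    """Estimate the charge for one request: both input and output tokens count."""
    return (input_tokens / 1000) * price_per_1k_input \
         + (output_tokens / 1000) * price_per_1k_output

# The JavaScript-to-Python example from the text: ~1,000 tokens in, ~1,000 out.
# 0.5 and 1.5 are hypothetical $/1K-token rates, for illustration only.
cost = estimate_cost(1000, 1000, 0.5, 1.5)
```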
To understand the breakdown of what makes up the cost, it can be helpful to use **Cost analysis** in the Azure portal:
1. Go to [Azure AI Foundry Portal](https://ai.azure.com).
2. In the upper right corner of the screen, select the name of your Azure AI Services resource or, if you're working on an AI project, the name of the project.
3. Select the name of the project. Azure portal opens in a new window.
5. By default, cost analysis is scoped to the selected resource group.
> [!IMPORTANT]
> It's important to scope *Cost Analysis* to the resource group where the Azure AI Services resource is deployed. Cost meters associated with some model providers, like Mistral AI or Cohere, are displayed under the resource group instead of under the Azure AI Services resource.
6. Modify **Group by** to **Meter**. You can now see that, for this particular resource group, the costs come from different model series.
:::image type="content" source="../media/manage-cost/cost-by-meter-1p.png" alt-text="Screenshot of the cost analysis dashboard scoped to the resource group where the Azure AI Services resource is deployed, highlighting the meters for Azure OpenAI and Microsoft's models. Cost is grouped by meter." lightbox="../media/manage-cost/cost-by-meter-1p.png":::
### Provider models
Models provided by another provider, like Mistral AI, Cohere, Meta AI, or AI21 Labs, are billed through Azure Marketplace. Unlike Microsoft billing meters, those entries are associated with the resource group where your Azure AI Services resource is deployed instead of with the Azure AI Services resource itself. You see entries under the **Service Name** *SaaS* accounting for inputs and outputs for each consumed model.
:::image type="content" source="../media/manage-cost/cost-by-meter-saas.png" alt-text="Screenshot of the cost analysis dashboard scoped to the resource group where the Azure AI Services resource is deployed, highlighting the meters for models billed through Azure Marketplace. Cost is grouped by meter." lightbox="../media/manage-cost/cost-by-meter-saas.png":::
### Using Azure Prepayment
You can pay for Azure OpenAI and Microsoft model charges with your Azure Prepayment credit. However, you can't use Azure Prepayment credit to pay for charges from other providers' models, because they're billed through Azure Marketplace.
### HTTP error response codes and billing status
If the service performs processing, you're charged even if the status code isn't successful (not 200). For example, a 400 error due to a content filter or input limit, or a 408 error due to a time-out.
If the service doesn't perform processing, you aren't charged. For example, a 401 error due to authentication or a 429 error due to exceeding the rate limit.
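The billed and unbilled examples above can be summarized in a short sketch. The rule of thumb encoded here (billed when processing happened) only mirrors the documented examples; it isn't an exhaustive list of status codes.

```python
def likely_billed(status_code: int) -> bool:
    """Rough heuristic from the examples in this article:
    200 (success), 400 (content filter/input limit), and 408 (time-out)
    involve processing and are billed; 401 (authentication) and
    429 (rate limit) involve no processing and aren't billed."""
    return status_code in {200, 400, 408}
```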
To understand the breakdown of what makes up that cost, it can be helpful to use **Cost analysis**:
1. Go to [Azure AI Foundry Portal](https://ai.azure.com).
2. In the upper right corner of the screen, select the name of your Azure AI Services resource or, if you're working on an AI project, the name of the project.
3. Select the name of the project. Azure portal opens in a new window.
4. Under **Cost Management**, select **Cost analysis**.
5. By default, cost analysis is scoped to the resource group you have selected.
6. Since we're seeing the cost of the whole resource group, it's useful to see the cost by resource. To do so, select **View** > **Cost by resource**.
:::image type="content" source="../media/manage-cost/cost-by-resource.png" alt-text="Screenshot of how to see the cost by each resource in the resource group." lightbox="../media/manage-cost/cost-by-resource.png":::
:::image type="content" source="../media/manage-cost/cost-by-resource-1p.png" alt-text="Screenshot of the cost analysis dashboard scoped to the resource group where the Azure AI Services resource is deployed, highlighting the meters for Azure OpenAI and Microsoft's models. Cost is grouped by resource." lightbox="../media/manage-cost/cost-by-resource-1p.png":::
9. Some providers' models are displayed as meters under Global resources. Notice that the word *Global* **isn't** related to the SKU of the model deployment (for instance, *Global standard*). If you have multiple Azure AI services resources, your bill contains one entry **for each model for each Azure AI services resource**. The resource meters have the format *[model-name]-[GUID]*, where *[GUID]* is a unique identifier associated with a given Azure AI Services resource. You see billing meters accounting for inputs and outputs for each model you consumed.
:::image type="content" source="../media/manage-cost/cost-by-resource-saas.png" alt-text="Screenshot of the cost analysis dashboard scoped to the resource group where the Azure AI Services resource is deployed, highlighting the meters for models billed through Azure Marketplace. Cost is grouped by resource." lightbox="../media/manage-cost/cost-by-resource-saas.png":::
It's important to understand scope when you evaluate costs associated with Azure AI Services. If your resources are part of the same resource group, you can scope Cost Analysis at that level to understand the effect on costs. If your resources are spread across multiple resource groups, you can scope to the subscription level.
**articles/ai-foundry/model-inference/how-to/quickstart-ai-project.md** (13 additions, 13 deletions)
# Configure your AI project to use Azure AI model inference
If you already have an AI project in an existing AI hub, models available via Models as a Service are deployed by default inside your project as stand-alone endpoints. Each model deployment has its own URI and credentials for access. Azure OpenAI models are deployed to an Azure AI Services resource or to an Azure OpenAI Service resource.
You can configure the AI project to connect to Azure AI model inference in Azure AI Services. Once configured, **deployments of Models as a Service models go to the connected Azure AI Services resource** instead of to the project itself, giving you a single endpoint and credential to access all the models deployed in Azure AI Foundry.
To complete this tutorial, you need:
* An Azure AI project and Azure AI Hub.
> [!TIP]
> When your AI hub is provisioned, an Azure AI Services resource is created with it, and the two resources are connected. To see which Azure AI Services resource is connected to your project, go to the [Azure AI Foundry portal](https://ai.azure.com) > **Management center** > **Connected resources**, and find the connections of type **AI Services**.
## Configure the project to use Azure AI model inference
To configure the project to use the Azure AI model inference capability in Azure AI Services:
1. Close the panel.
2. In the landing page of your project, identify the Azure AI Services resource connected to your project. Use the drop-down to change the resource you're connected to, if you need to.
3. If no resource is listed in the drop-down, your AI hub doesn't have an Azure AI Services resource connected to it. Create a new connection by following these steps:
1. In the lower left corner of the screen, select **Management center**.
For each model you want to deploy under Azure AI model inference, follow these steps:
1. Go to **Model catalog** section in [Azure AI Foundry portal](https://ai.azure.com/explore/models).
2. Scroll to the model you're interested in and select it.
:::image type="content" source="../media/add-model-deployments/models-search-and-deploy.gif" alt-text="An animation showing how to search models in the model catalog and select one for viewing its details." lightbox="../media/add-model-deployments/models-search-and-deploy.gif":::
3. You can review the details of the model in the model card.
4. Select **Deploy**.
5. For model providers that require additional contract terms, you're asked to accept those terms. In those cases, accept the terms by selecting **Subscribe and deploy**.
:::image type="content" source="../media/add-model-deployments/models-deploy-agree.png" alt-text="Screenshot showing how to agree to the terms and conditions of a Mistral-Large model." lightbox="../media/add-model-deployments/models-deploy-agree.png":::
6. You can configure the deployment settings at this time. By default, the deployment receives the name of the model you're deploying. The deployment name is used in the `model` parameter to route requests to this particular model deployment. This lets you configure specific names for your models when you attach specific configurations, for instance, `o1-preview-safe` for a model with a strict content safety filter.
7. We automatically select an Azure AI Services connection based on your project because you turned on the feature **Deploy models to Azure AI model inference service**. Use the **Customize** option to change the connection based on your needs. If you're deploying under the **Standard** deployment type, the models need to be available in the region of the Azure AI Services resource.
:::image type="content" source="../media/add-model-deployments/models-deploy-customize.png" alt-text="Screenshot showing how to customize the deployment if needed." lightbox="../media/add-model-deployments/models-deploy-customize.png":::
Use the parameter `model="<deployment-name>"` to route your request to this deployment.
## Move from Serverless API Endpoints to Azure AI model inference
Although you configured the project to use Azure AI model inference, existing model deployments continue to exist within the project as Serverless API Endpoints. Those deployments aren't moved for you, so you can progressively upgrade any existing code that references previous model deployments. To start moving the model deployments, we recommend the following workflow:
1. Recreate the model deployment in Azure AI model inference. This model deployment is accessible under the **Azure AI model inference endpoint**.
2. Upgrade your code to use the new endpoint.
3. Clean up the project by removing the Serverless API Endpoint.
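The before-and-after shape of this move can be sketched as configuration data. Both URL shapes below are illustrative assumptions, not values from this article; copy the real endpoints from your portal.

```python
# Before: each Serverless API Endpoint has its own URL, and the model is
# implied by the endpoint itself (hypothetical URL shape).
old_deployment = {
    "endpoint": "https://my-mistral-large.eastus2.inference.ai.azure.com",
    "model": None,  # no `model` parameter needed; the endpoint is per-model
}

# After: one shared Azure AI model inference endpoint; the `model` parameter
# (your deployment name) selects the deployment (hypothetical URL shape).
new_deployment = {
    "endpoint": "https://my-services.services.ai.azure.com/models",
    "model": "mistral-large",
}
```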
### Upgrade your code with the new endpoint
The following table summarizes the changes you have to introduce:
### Clean up existing Serverless API endpoints from your project
After you refactor your code, you might want to delete the existing Serverless API endpoints inside of the project (if any).
For each model deployed as Serverless API Endpoints, follow these steps:
4. Select the option **Delete**.
179
179
180
180
> [!WARNING]
> This operation can't be reverted. Ensure that the endpoint isn't currently used by any other user or piece of code.
5. Confirm the operation by selecting **Delete**.
6. If you created a **Serverless API connection** to this endpoint from other projects, such connections aren't removed and continue to point to the nonexistent endpoint. Delete any of those connections to avoid errors.
**articles/ai-foundry/model-inference/how-to/quickstart-github-models.md** (0 additions, 2 deletions)

To obtain the key and endpoint:
8. Once it's deployed, your model's API key and endpoint are shown on the **Overview** page. Use these values in your code to consume the model in your production environment.
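One way to wire those values into code is via environment variables, so secrets stay out of source control. This is a sketch: the variable names and the `api-key` header are conventional assumptions here, and with Microsoft Entra ID authentication you'd send a bearer token instead of a key.

```python
import os

# Suggested convention (assumed names): export these before running your app.
endpoint = os.environ.get("AZURE_AI_ENDPOINT", "")
api_key = os.environ.get("AZURE_AI_API_KEY", "")

def auth_headers(key: str) -> dict:
    """Headers for key-based authentication (header name is an assumption)."""
    return {"api-key": key, "Content-Type": "application/json"}

headers = auth_headers(api_key)
```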
At this point, the model you selected is ready to consume.