Commit 4a373d1

Update the AzureML article
1 parent 6de0936 commit 4a373d1

2 files changed: 40 additions and 23 deletions

articles/ai-studio/how-to/deploy-models-serverless.md

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -117,7 +117,7 @@ For non-Microsoft models offered through the Azure Marketplace, you can deploy t
117117
118118
# [AI Studio](#tab/azure-ai-studio)
119119
120-
1. On the model's **Details** page, select **Deploy** and then select **Serverless API with Azure AI Content Safety** to open the deployment wizard.
120+
1. On the model's **Details** page, select **Deploy** and then select **Serverless API with Azure AI Content Safety (preview)** to open the deployment wizard.
121121
122122
1. Select the project in which you want to deploy your models. To use the serverless API model deployment offering, your project must belong to one of the [regions that are supported for serverless deployment](deploy-models-serverless-availability.md) for the particular model.
123123
@@ -244,14 +244,14 @@ Once you've created a subscription for a non-Microsoft model, you can deploy the
244244
245245
The serverless API endpoint provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
246246
247-
In this article, you create an endpoint with the name **meta-llama3-8b-qwerty**.
247+
In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
248248
249249
1. Create the serverless endpoint
250250
251251
# [AI Studio](#tab/azure-ai-studio)
252252
253253
1. To deploy a Microsoft model that doesn't require subscribing to a model offering:
254-
1. Select **Deploy** and then select **Serverless API with Azure AI Content Safety** to open the deployment wizard.
254+
1. Select **Deploy** and then select **Serverless API with Azure AI Content Safety (preview)** to open the deployment wizard.
255255
1. Select the project in which you want to deploy your model. Notice that not all the regions are supported.
256256
257257
1. Alternatively, for a non-Microsoft model that requires a model subscription, if you've just subscribed your project to the model offer in the previous section, continue to select **Deploy**. Alternatively, select **Continue to deploy** (if your deployment wizard had the note *You already have an Azure Marketplace subscription for this project*).

articles/machine-learning/how-to-deploy-models-serverless.md

Lines changed: 37 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -6,8 +6,9 @@ manager: scottpolly
66
ms.service: machine-learning
77
ms.subservice: inferencing
88
ms.topic: how-to
9-
ms.date: 05/09/2024
10-
ms.reviewer: None
9+
ms.date: 07/19/2024
10+
ms.reviewer: fasantia
11+
reviewer: santiagxf
1112
ms.author: mopeakande
1213
author: msakande
1314
ms.custom: build-2024, serverless, devx-track-azurecli
@@ -17,7 +18,7 @@ ms.custom: build-2024, serverless, devx-track-azurecli
1718

1819
In this article, you learn how to deploy a model from the model catalog as a serverless API with pay-as-you-go token-based billing.
1920

20-
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
21+
[Certain models in the model catalog](concept-endpoint-serverless-availability.md) can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
2122

2223
## Prerequisites
2324

@@ -82,18 +83,15 @@ Certain models in the model catalog can be deployed as a serverless API with pay
8283
You can use any compatible web browser to [deploy ARM templates](../azure-resource-manager/templates/deploy-portal.md) in the Microsoft Azure portal or using any of the deployment tools. This tutorial uses the [Azure CLI](/cli/azure/).
8384
8485
85-
## Subscribe your workspace to the model offering
86-
87-
For models offered through the Azure Marketplace, you can deploy them to serverless API endpoints to consume their predictions. If it's your first time deploying the model in the workspace, you have to subscribe your workspace for the particular model offering from the Azure Marketplace. Each workspace has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.
88-
89-
> [!NOTE]
90-
> Models offered through the Azure Marketplace are available for deployment to serverless API endpoints in specific regions. Check [Region availability for models in Serverless API endpoints](concept-endpoint-serverless-availability.md) to verify which regions are available. If the one you need is not listed, you can deploy to a workspace in a supported region and then [consume serverless API endpoints from a different workspace](how-to-connect-models-serverless.md).
86+
## Find your model and model ID in the model catalog
9187
9288
1. Sign in to [Azure Machine Learning studio](https://ml.azure.com)
9389
94-
1. Ensure your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings).
90+
1. For models offered through the Azure Marketplace, ensure that your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings).
91+
92+
Models that are offered by non-Microsoft providers (for example, Llama and Mistral models) are billed through the Azure Marketplace. For such models, you're required to subscribe your project to the particular model offering. Models that are offered by Microsoft (for example, Phi-3 models) don't have this requirement, as billing is done differently. For details about billing for serverless deployment of models in the model catalog, see [Billing for serverless APIs](concept-model-catalog.md#pay-for-model-usage-in-maas).
9593
96-
1. Go to your workspace.
94+
1. Go to your workspace. To use the serverless API model deployment offering, your workspace must belong to one of the [regions that are supported for serverless deployment](concept-endpoint-serverless-availability.md) for the particular model you want to deploy.
9795
9896
1. Select **Model catalog** from the left sidebar and find the model card of the model you want to deploy. In this article, you select a **Meta-Llama-3-8B-Instruct** model.
9997
@@ -104,12 +102,20 @@ For models offered through the Azure Marketplace, you can deploy them to serverl
104102
105103
:::image type="content" source="media/how-to-deploy-models-serverless/model-card.png" alt-text="A screenshot showing a model's details page." lightbox="media/how-to-deploy-models-serverless/model-card.png":::
106104
105+
The next section covers the steps for subscribing your project to a model offering. You can skip this section and go to [Deploy the model to a serverless API endpoint](#deploy-the-model-to-a-serverless-api-endpoint), if you're deploying a Microsoft model.
106+
107+
## Subscribe your project to the model offering
108+
109+
For non-Microsoft models offered through the Azure Marketplace, you can deploy them to serverless API endpoints to consume their predictions. If it's your first time deploying the model in the project, you have to subscribe your workspace for the particular model offering from the Azure Marketplace. Each workspace has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.
110+
111+
> [!NOTE]
112+
> Models offered through the Azure Marketplace are available for deployment to serverless API endpoints in specific regions. Check [Region availability for models in serverless API endpoints](concept-endpoint-serverless-availability.md) to verify which models and regions are available. If the one you need is not listed, you can deploy to a workspace in a supported region and then [consume serverless API endpoints from a different workspace](how-to-connect-models-serverless.md).
107113
108114
1. Create the model's marketplace subscription. When you create a subscription, you accept the terms and conditions associated with the model offer.
109115
110116
# [Studio](#tab/azure-studio)
111117
112-
1. On the model's **Details** page, select **Deploy** and then select **Serverless API** to open the deployment wizard.
118+
1. On the model's **Details** page, select **Deploy** and then select **Serverless API with Azure AI Content Safety (preview)** to open the deployment wizard.
113119
114120
1. Select the checkbox to acknowledge the Microsoft purchase policy.
115121
@@ -194,7 +200,7 @@ For models offered through the Azure Marketplace, you can deploy them to serverl
194200
}
195201
```
196202
197-
1. Once you sign up the workspace for the particular Azure Marketplace offering, subsequent deployments of the same offering in the same workspace don't require subscribing again.
203+
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the same offering in the same workspace don't require subscribing again.
198204
199205
1. At any point, you can see the model offers to which your workspace is currently subscribed:
200206
@@ -236,15 +242,19 @@ For models offered through the Azure Marketplace, you can deploy them to serverl
236242
237243
## Deploy the model to a serverless API endpoint
238244
239-
Once you've created a model's subscription, you can deploy the associated model to a serverless API endpoint. The serverless API endpoint provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
245+
Once you've created a subscription for a non-Microsoft model, you can deploy the associated model to a serverless API endpoint. For Microsoft models (such as Phi-3 models), you don't need to create a subscription.
240246
241-
In this article, you create an endpoint with name **meta-llama3-8b-qwerty**.
247+
The serverless API endpoint provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
248+
249+
In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
242250
243251
1. Create the serverless endpoint
244252
245253
# [Studio](#tab/azure-studio)
246254
247-
1. From the previous wizard, select **Deploy** (if you've just subscribed the workspace to the model offer in the previous section), or select **Continue to deploy** (if your deployment wizard had the note *You already have an Azure Marketplace subscription for this workspace*).
255+
1. To deploy a Microsoft model that doesn't require subscribing to a model offering, select **Deploy** and then select **Serverless API with Azure AI Content Safety (preview)** to open the deployment wizard.
256+
257+
1. Alternatively, for a non-Microsoft model that requires a model subscription, if you've just subscribed your project to the model offer in the previous section, continue to select **Deploy**. Alternatively, select **Continue to deploy** (if your deployment wizard had the note *You already have an Azure Marketplace subscription for this workspace*).
248258
249259
:::image type="content" source="media/how-to-deploy-models-serverless/deploy-pay-as-you-go-subscribed-workspace.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-serverless/deploy-pay-as-you-go-subscribed-workspace.png":::
250260
@@ -422,11 +432,11 @@ In this article, you create an endpoint with name **meta-llama3-8b-qwerty**.
422432
> [!TIP]
423433
> If you're using prompt flow in the same workspace where the deployment was deployed, you still need to create the connection.
424434
425-
## Using the serverless API endpoint
435+
## Use the serverless API endpoint
426436
427437
Models deployed in Azure Machine Learning and Azure AI studio in Serverless API endpoints support the [Azure AI Model Inference API](reference-model-inference-api.md) that exposes a common set of capabilities for foundational models and that can be used by developers to consume predictions from a diverse set of models in a uniform and consistent way.
428438
429-
Read more about the [capabilities of this API](reference-model-inference-api.md#capabilities) and how [you can leverage it when building applications](reference-model-inference-api.md#getting-started).
439+
Read more about the [capabilities of this API](reference-model-inference-api.md#capabilities) and how [you can use it when building applications](reference-model-inference-api.md#getting-started).
430440
431441
## Delete endpoints and subscriptions
432442
@@ -501,15 +511,22 @@ az resource delete --name <resource-name>
501511

502512
## Cost and quota considerations for models deployed as serverless API endpoints
503513

504-
Models deployed as a serverless API endpoint are offered through the Azure Marketplace and integrated with Azure Machine Learning for use. You can find the Azure Marketplace pricing when deploying or fine-tuning the models.
514+
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
515+
516+
#### Cost for Microsoft models
517+
518+
You can find the pricing information on the __Pricing and terms__ tab of the deployment wizard when deploying Microsoft models (such as Phi-3 models) as serverless API endpoints.
519+
520+
#### Cost for non-Microsoft models
521+
522+
Non-Microsoft models deployed as serverless API endpoints are offered through the Azure Marketplace and integrated with Azure AI Studio for use. You can find the Azure Marketplace pricing when deploying or fine-tuning these models.
505523

506524
Each time a workspace subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently.
507525

508526
For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](../ai-studio/how-to/costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).
509527

510528
:::image type="content" source="media/how-to-deploy-models-serverless/costs-model-as-service-cost-details.png" alt-text="A screenshot showing different resources corresponding to different model offers and their associated meters." lightbox="media/how-to-deploy-models-serverless/costs-model-as-service-cost-details.png":::
511529

512-
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, we currently limit one deployment per model per workspace. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
513530

514531
## Permissions required to subscribe to model offerings
515532
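
Note on the quota paragraph that this commit moves to the top of the cost section: it states a per-deployment rate limit of 200,000 tokens per minute and 1,000 API requests per minute. The sketch below shows one way a client could stay under both limits before sending a request. It is an illustration only. The class name `SlidingWindowLimiter`, the 60-second sliding window, and the caller-supplied token estimate are assumptions, not part of the article or of any Azure SDK, and the service may measure its limits differently.

```python
import time
from collections import deque

# Limits quoted in the article text; they may change, so keep them configurable.
TOKENS_PER_MINUTE = 200_000
REQUESTS_PER_MINUTE = 1_000


class SlidingWindowLimiter:
    """Hypothetical client-side guard that tracks the trailing 60 seconds."""

    def __init__(self, max_tokens=TOKENS_PER_MINUTE,
                 max_requests=REQUESTS_PER_MINUTE,
                 window=60.0, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.max_requests = max_requests
        self.window = window
        self.clock = clock          # injectable so the limiter can be tested
        self.events = deque()       # (timestamp, tokens) for accepted requests
        self.tokens_in_window = 0

    def _evict(self, now):
        # Drop accepted requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both limits."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.max_requests:
            return False
        if self.tokens_in_window + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would estimate the token count of a request (prompt plus expected completion), call `try_acquire`, and wait and retry when it returns `False`. Because the article notes that only one deployment per model per workspace is currently allowed, a single limiter instance per endpoint is a reasonable starting point under these assumptions.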
