
Commit a0000b1

committed
implement peer review feedback and freshness
1 parent eef70b3 commit a0000b1

8 files changed

+158
-19
lines changed

articles/ai-foundry/how-to/deploy-models-serverless.md

Lines changed: 144 additions & 19 deletions
@@ -1,7 +1,7 @@
11
---
2-
title: Deploy models as standard deployments
2+
title: Deploy models as serverless API deployments
33
titleSuffix: Azure AI Foundry
4-
description: Learn to deploy models as standard deployments, using Azure AI Foundry.
4+
description: Learn to deploy models as serverless API deployments, using Azure AI Foundry.
55
manager: scottpolly
66
ms.service: azure-ai-foundry
77
ms.topic: how-to
@@ -11,39 +11,154 @@ author: msakande
1111
ms.reviewer: fasantia
1212
reviewer: santiagxf
1313
ms.custom: build-2024, serverless, devx-track-azurecli, ignite-2024
14+
zone_pivot_groups: azure-ai-serverless-deployment
1415
---
1516

16-
# Deploy models as standard deployments
17+
# Deploy models as serverless API deployments
1718

18-
In this article, you learn how to deploy a model from the model catalog as a standard deployment.
19+
In this article, you learn how to deploy a model from the model catalog as a serverless API deployment.
1920

2021
[!INCLUDE [models-preview](../includes/models-preview.md)]
2122

22-
[Certain models in the model catalog](deploy-models-serverless-availability.md) can be deployed as a standard deployment. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
23+
[Certain models in the model catalog](deploy-models-serverless-availability.md) can be deployed as a serverless API deployment. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
2324

24-
This article uses a Meta Llama model deployment for illustration. However, you can use the same steps to deploy any of the [models in the model catalog that are available for standard deployment](deploy-models-serverless-availability.md).
25+
<!-- This article uses a Meta Llama model deployment for illustration. However, you can use the same steps to deploy any of the [models in the model catalog that are available for serverless API deployment](deploy-models-serverless-availability.md). -->
2526

2627
## Prerequisites
2728

2829
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
2930

3031
- If you don't have one, [create a [!INCLUDE [hub](../includes/hub-project-name.md)]](create-projects.md?pivots=hub-project).
3132

32-
- Ensure that the **Deploy models to Azure AI model inference service** feature is turned off in the Azure AI Foundry portal. When this feature is on, standard deployments are not available for deployment when using the portal.
33+
- Ensure that the **Deploy models to Azure AI Foundry resources** feature is turned off in the Azure AI Foundry portal. When this feature is on, serverless API deployments are not available from the portal.
3334

34-
:::image type="content" source="../media/deploy-models-serverless/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn off the Deploy models to Azure AI model inference service feature in Azure AI Foundry portal." lightbox="../media/deploy-models-serverless/ai-project-inference-endpoint.gif":::
35+
:::image type="content" source="../media/deploy-models-serverless/foundry-resources-deployment-disabled.gif" alt-text="A screenshot of the Azure AI Foundry portal showing where to disable deployment to Azure AI Foundry resources." lightbox="../media/deploy-models-serverless/foundry-resources-deployment-disabled.gif":::
3536

3637
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-azure-ai-foundry.md).
3738

38-
- You need to install the following software to work with Azure AI Foundry:
39+
::: zone pivot="ai-foundry-portal"
40+
41+
- You can use any compatible web browser to navigate [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs).
42+
43+
## Find your model in the model catalog
44+
45+
[!INCLUDE [open-catalog](../includes/open-catalog.md)]
46+
47+
# [Models sold directly by Azure](#tab/azure-direct)
48+
49+
4. Select the model card of the model you want to deploy. In this article, you select the **DeepSeek-R1-0528** model.
50+
1. Select **Use this model** and view the *Pricing and terms* tab in the window that opens.
51+
1. Select **Agree and Proceed** to open the deployment wizard. Here, you can name the deployment and select the deployment type.
52+
:::image type="content" source="../media/deploy-models-serverless/deepseek-deployment-wizard.png" alt-text="Screenshot showing the deployment wizard for a model sold directly by Azure." lightbox="../media/deploy-models-serverless/deepseek-deployment-wizard.png":::
53+
54+
55+
# [Models from Partners and Community](#tab/partner-models)
56+
57+
> [!NOTE]
58+
> [Models from Partners and Community](../concepts/foundry-models-overview.md#models-from-partners-and-community) are offered through the Azure Marketplace. For these models, ensure that your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings), as you're required to subscribe your project to the particular model offering.
59+
60+
4. Select the model card of the model you want to deploy. In this article, you select the **AI21-Jamba-1.5-Large** model.
61+
62+
The next section covers the steps for subscribing your project to a model offering.
63+
64+
### Subscribe your project to the model offering
65+
66+
Serverless API deployments are available for both Microsoft and non-Microsoft models. For models from partners and community, for example the Gretel model, you must create a subscription before you can deploy them. If it's your first time deploying the model in the project, you have to subscribe your project to the particular model offering from Azure Marketplace. Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.
67+
68+
Furthermore, models offered through Azure Marketplace are available for serverless API deployment only in specific regions. Check [Model and region availability for serverless API deployment](deploy-models-serverless-availability.md) to verify which models and regions are available. If the one you need isn't listed, you can deploy to a project in a supported region and then [consume the serverless API deployment from a different project](deploy-models-serverless-connect.md).
69+
70+
1. Create the model's marketplace subscription. When you create a subscription, you accept the terms and conditions associated with the model offer.
3971

4072
# [Azure AI Foundry portal](#tab/azure-ai-studio)
4173

42-
You can use any compatible web browser to navigate [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs).
74+
1. On the model's **Details** page, select **Use this model** to open the Serverless API deployment window. The **Azure Marketplace Terms** link in this window provides more information about the terms of use, and the **Pricing and terms** tab provides pricing details for the selected model.
75+
76+
> [!TIP]
77+
> For models that can be deployed via either serverless API deployment or managed compute, a **Deployment options** window opens so that you can choose between the two. Select the serverless API deployment option.
78+
>
79+
> To use the serverless API deployment offering, your project must belong to one of the [regions that are supported for serverless deployment](deploy-models-serverless-availability.md) for the particular model.
80+
81+
1. If you've never deployed the model in your project before, you first have to subscribe to the model's offering in the Azure Marketplace. Select **Subscribe and Deploy** to open the deployment wizard.
82+
:::image type="content" source="../media/deploy-models-serverless/model-marketplace-subscription.png" alt-text="Screenshot showing where to subscribe a model to the Azure marketplace before deployment." lightbox="../media/deploy-models-serverless/model-marketplace-subscription.png":::
83+
84+
1. Alternatively, if you see the note *You already have an Azure Marketplace subscription for this project*, you don't need to subscribe again. Select **Continue to deploy** to open the deployment wizard.
85+
:::image type="content" source="../media/deploy-models-serverless/model-subscribed-to-marketplace.png" alt-text="Deployment page for a model that is already subscribed to Azure marketplace." lightbox="../media/deploy-models-serverless/model-subscribed-to-marketplace.png":::
86+
87+
1. (Optional) Once you subscribe a project to a particular Azure Marketplace offering, subsequent deployments of the same offering in the same project don't require subscribing again. At any point, you can see the model offers to which your project is currently subscribed:
88+
89+
1. Go to the [Azure portal](https://portal.azure.com).
90+
1. Navigate to the resource group where the project belongs.
91+
1. On the **Type** filter, select **SaaS**.
92+
1. You see all the offerings to which you're currently subscribed.
93+
1. Select any resource to see the details.
94+
95+
1. In the deployment wizard, name the deployment. The **Content filter (preview)** option is enabled by default. Leave the default setting for the service to detect harmful content such as hate, self-harm, sexual, and violent content. For more information about content filtering, see [Content filtering in Azure AI Foundry portal](../concepts/content-filtering.md).
96+
97+
---
98+
99+
## Deploy the model to a serverless API and use the deployment
100+
101+
The serverless API deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription. In this section, you create an endpoint for your model.
102+
103+
1. In the deployment wizard, select **Deploy**. Wait until the deployment is ready and you're redirected to the Deployments page.
104+
105+
1. To see the endpoints deployed to your project, in the **My assets** section of the left pane, select **Models + endpoints**.
106+
107+
1. The created endpoint uses key authentication for authorization. To get the keys associated with a given endpoint, follow these steps:
108+
1. Select the deployment, and note the endpoint's Target URI and Key.
109+
1. Use these credentials to call the deployment and generate predictions.
110+
111+
1. If you need to consume this deployment from a different project or hub, or you plan to use prompt flow to build intelligent applications, you need to create a connection to the serverless API deployment. To learn how to configure an existing serverless API deployment in a new project or hub, see [Consume a serverless API deployment from a different project or from prompt flow](deploy-models-serverless-connect.md).
112+
113+
> [!TIP]
114+
> If you're using prompt flow in the same project or hub where the model was deployed, you still need to create the connection.
115+
116+
### Use the serverless API deployment
117+
118+
Models deployed as serverless API deployments in Azure Machine Learning and Azure AI Foundry support the [Azure AI Foundry Models API](../../ai-foundry/model-inference/reference/reference-model-inference-api.md), which exposes a common set of capabilities for foundational models and lets developers consume predictions from a diverse set of models in a uniform and consistent way.
119+
120+
Read more about the [capabilities of this API](../../ai-foundry/model-inference/reference/reference-model-inference-api.md#capabilities) and how [you can use it when building applications](../../ai-foundry/model-inference/reference/reference-model-inference-api.md#getting-started).
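As an illustration, the following sketch calls a deployed endpoint through this API by using the `azure-ai-inference` Python package and key authentication. The package choice, endpoint URL format, and prompt are assumptions for this example; substitute the Target URI and Key from your deployment's details page.

```python
# pip install azure-ai-inference
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Replace these placeholders with the Target URI and Key from the deployment's details page.
client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-endpoint-key>"),
)

# Send a chat-completions request; adjust the parameters to suit the model you deployed.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Explain serverless API deployments in one sentence."),
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```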
121+
122+
123+
## Delete endpoints and subscriptions
124+
125+
[!INCLUDE [tip-left-pane](../includes/tip-left-pane.md)]
126+
127+
You can delete model subscriptions and endpoints. Deleting a model subscription causes any associated endpoint to become *Unhealthy* and unusable. If you prefer to clean up these resources programmatically, see the sketch that follows the steps below.
128+
129+
To delete a serverless API deployment:
130+
131+
1. Go to [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs).
132+
133+
1. Go to your project.
134+
135+
1. In the **My assets** section, select **Models + endpoints**.
136+
137+
1. Open the deployment you want to delete.
138+
139+
1. Select **Delete**.
140+
141+
142+
To delete the associated model subscription:
143+
144+
1. Go to the [Azure portal](https://portal.azure.com).
145+
146+
1. Navigate to the resource group where the project belongs.
147+
148+
1. On the **Type** filter, select **SaaS**.
149+
150+
1. Select the subscription you want to delete.
151+
152+
1. Select **Delete**.
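If you'd rather delete these resources with code instead of the portal, here's a minimal sketch using the Azure Machine Learning SDK for Python. It assumes the `azure-ai-ml` package is installed and that you supply the names used when the deployment was created; the names below are illustrative.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Point the client at the project (workspace) that owns the deployment.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<project-name>",
)

# Delete the serverless API deployment (endpoint).
ml_client.serverless_endpoints.begin_delete("<endpoint-name>").result()

# Delete the model's marketplace subscription (Models from Partners and Community only).
ml_client.marketplace_subscriptions.begin_delete("<marketplace-subscription-name>").result()
```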
43153

44-
# [Azure CLI](#tab/cli)
45154

46-
The [Azure CLI](/cli/azure/) and the [ml extension for Azure Machine Learning](/azure/machine-learning/how-to-configure-cli).
155+
156+
::: zone-end
157+
158+
159+
::: zone pivot="programming-language-cli"
160+
161+
- To work with Azure AI Foundry, install the [Azure CLI](/cli/azure/) and the [ml extension for Azure Machine Learning](/azure/machine-learning/how-to-configure-cli).
47162

48163
```azurecli
49164
az extension add -n ml
@@ -62,9 +177,12 @@ This article uses a Meta Llama model deployment for illustration. However, you c
62177
az configure --defaults workspace=<project-name> group=<resource-group> location=<location>
63178
```
64179
65-
# [Python SDK](#tab/python)
180+
::: zone-end
181+
66182
67-
Install the [Azure Machine Learning SDK for Python](https://aka.ms/sdk-v2-install).
183+
::: zone pivot="python-sdk"
184+
185+
- To work with Azure AI Foundry, install the [Azure Machine Learning SDK for Python](https://aka.ms/sdk-v2-install).
68186
69187
```python
70188
pip install -U azure-ai-ml
@@ -85,9 +203,12 @@ This article uses a Meta Llama model deployment for illustration. However, you c
85203
)
86204
```
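A rough sketch of the end-to-end flow with this SDK is shown below. The model ID, resource names, and marketplace-subscription step are illustrative assumptions: models sold directly by Azure skip the marketplace subscription, while Models from Partners and Community require it.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import MarketplaceSubscription, ServerlessEndpoint

# Connect to the project (workspace) where you want to create the deployment.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<project-name>",
)

# Illustrative model ID; use the ID of the model you picked from the catalog.
model_id = "azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct"

# For Models from Partners and Community, subscribe the project to the offer first.
ml_client.marketplace_subscriptions.begin_create_or_update(
    MarketplaceSubscription(model_id=model_id, name="meta-llama-3-8b-instruct")
).result()

# Create the serverless API deployment (endpoint) for the model.
ml_client.serverless_endpoints.begin_create_or_update(
    ServerlessEndpoint(name="my-serverless-endpoint", model_id=model_id)
).result()

# Retrieve the key used to call the endpoint.
keys = ml_client.serverless_endpoints.get_keys("my-serverless-endpoint")
print(keys.primary_key)
```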
87205
88-
# [Bicep](#tab/bicep)
206+
::: zone-end
207+
89208
90-
Install the Azure CLI as described at [Azure CLI](/cli/azure/).
209+
::: zone pivot="programming-language-bicep"
210+
211+
- To work with Azure AI Foundry, install the Azure CLI as described at [Azure CLI](/cli/azure/).
91212
92213
Configure the following environment variables according to your settings:
93214
@@ -96,11 +217,13 @@ This article uses a Meta Llama model deployment for illustration. However, you c
96217
LOCATION="eastus2"
97218
```
98219
99-
# [ARM](#tab/arm)
220+
::: zone-end
100221
101-
You can use any compatible web browser to [deploy ARM templates](/azure/azure-resource-manager/templates/deploy-portal) in the Microsoft Azure portal or use any of the deployment tools. This tutorial uses the [Azure CLI](/cli/azure/).
102222
103223
224+
225+
<!--
226+
104227
## Find your model and model ID in the model catalog
105228
106229
[!INCLUDE [open-catalog](../includes/open-catalog.md)]
@@ -651,7 +774,7 @@ You can use the resource management tools to manage the resources. The following
651774
az resource delete --name <resource-name>
652775
```
653776

654-
---
777+
--- -->
655778

656779
## Cost and quota considerations for models deployed as a standard deployment
657780

@@ -693,6 +816,8 @@ Azure role-based access controls (Azure RBAC) are used to grant access to operat
693816

694817
For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-azure-ai-foundry.md).
695818

819+
820+
696821
## Related content
697822

698823
* [Region availability for models as standard deployments](deploy-models-serverless-availability.md)
Five binary image files changed (224 KB, 32.7 KB, 104 KB, 213 KB, and 217 KB; previews not shown).

zone-pivots/zone-pivot-groups.yml

Lines changed: 14 additions & 0 deletions
@@ -670,6 +670,20 @@ groups:
670670
title: Azure CLI
671671
- id: programming-language-bicep
672672
title: Bicep
673+
674+
- id: azure-ai-serverless-deployment
675+
# Owner: mopeakande
676+
title: Programming languages
677+
prompt: Choose a tool or API
678+
pivots:
679+
- id: ai-foundry-portal
680+
title: Azure AI Foundry portal
681+
- id: programming-language-cli
682+
title: Azure CLI
683+
- id: python-sdk
684+
title: Python SDK
685+
- id: programming-language-bicep
686+
title: Bicep
673687
- id: azure-ai-model-catalog-sub-group-samples
674688
# Owner: mopeakande
675689
title: Programming languages
