[Certain models in the model catalog](deploy-models-serverless-availability.md) can be deployed as a serverless API deployment. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
<!-- This article uses a Meta Llama model deployment for illustration. However, you can use the same steps to deploy any of the [models in the model catalog that are available for serverless API deployment](deploy-models-serverless-availability.md). -->
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
- If you don't have one, [create a [!INCLUDE [hub](../includes/hub-project-name.md)]](create-projects.md?pivots=hub-project).
- Ensure that the **Deploy models to Azure AI Foundry resources** (preview) feature is turned off in the Azure AI Foundry portal. When this feature is on, serverless API deployments are not available from the portal.
:::image type="content" source="../media/deploy-models-serverless/foundry-resources-deployment-disabled.png" alt-text="A screenshot of the Azure AI Foundry portal showing where to disable deployment to Azure AI Foundry resources." lightbox="../media/deploy-models-serverless/foundry-resources-deployment-disabled.png":::
- Foundry [Models from Partners and Community](../model-inference/concepts/models.md#models-from-partners-and-community?context=/azure/ai-foundry/context/context) require access to Azure Marketplace, while Foundry [Models Sold Directly by Azure](../model-inference/concepts/models.md#models-sold-directly-by-azure?context=/azure/ai-foundry/context/context) don't have this requirement. Ensure you have the permissions required to subscribe to model offerings in Azure Marketplace.
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-azure-ai-foundry.md).
::: zone pivot="ai-foundry-portal"
# [Models sold directly by Azure](#tab/azure-direct)
4. Select the model card of the model you want to deploy. In this article, you select a **DeepSeek-R1** model.
1. Select **Use this model** to open the _Serverless API deployment_ window where you can view the *Pricing and terms* tab.
1. In the deployment wizard, name the deployment. The **Content filter (preview)** option is enabled by default. Leave the default setting for the service to detect harmful content such as hate, self-harm, sexual, and violent content. For more information about content filtering, see [Content filtering in Azure AI Foundry portal](../concepts/content-filtering.md).
:::image type="content" source="../media/deploy-models-serverless/deepseek-deployment-wizard.png" alt-text="Screenshot showing the deployment wizard for a model sold directly by Azure." lightbox="../media/deploy-models-serverless/deepseek-deployment-wizard.png":::
# [Models from Partners and Community](#tab/partner-models)
4. Select the model card of the model you want to deploy. In this article, you select the **AI21-Jamba-1.5-Large** model.
> [!NOTE]
> [Models from Partners and Community](../concepts/foundry-models-overview.md#models-from-partners-and-community) are offered through the Azure Marketplace. For these models, ensure that your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings), as you're required to subscribe your project to the particular model offering.
The next section covers the steps for subscribing your project to a model offering.
### Subscribe your project to the model offering
For models from partners and community, such as the AI21-Jamba-1.5-Large model, you must create a subscription before you can deploy them. The first time you deploy the model in a project, you have to subscribe the project to the particular model offering from the Azure Marketplace. Each project has its own subscription to the offering, which lets you control and monitor spending.
Furthermore, models offered through the Azure Marketplace are available for standard deployment only in specific regions. Check the [regions that are supported for serverless deployment](deploy-models-serverless-availability.md) to verify which regions are available for the particular model. If the region where your project is located isn't listed, you can deploy to a project in a supported region and then [consume the standard deployment from a different project](deploy-models-serverless-connect.md).
1. On the model's **Details** page, select **Use this model** to open the Serverless API deployment window. In the Serverless API deployment window, the **Azure Marketplace Terms** link provides more information about the terms of use. The **Pricing and terms** tab also provides pricing details for the selected model.
> [!TIP]
> For models that can be deployed via either serverless API deployment or [managed compute](deploy-models-managed.md), a **Deployment options** window opens, giving you the choice between serverless API deployment and deployment with managed compute. From there, select the serverless API deployment option.
1. If you've never deployed the model in your project before, you first have to subscribe to the model's offering in the Azure Marketplace. Select **Subscribe and Deploy** to open the deployment wizard.
:::image type="content" source="../media/deploy-models-serverless/model-marketplace-subscription.png" alt-text="Screenshot showing where to subscribe a model to the Azure marketplace before deployment." lightbox="../media/deploy-models-serverless/model-marketplace-subscription.png":::
1. Alternatively, if you see the note *You already have an Azure Marketplace subscription for this project*, you don't need to create the subscription since you already have one. Select **Continue to deploy** to open the deployment wizard.
:::image type="content" source="../media/deploy-models-serverless/model-subscribed-to-marketplace.png" alt-text="Deployment page for a model that is already subscribed to Azure marketplace." lightbox="../media/deploy-models-serverless/model-subscribed-to-marketplace.png":::
1. (Optional) Once you subscribe a project to a particular Azure Marketplace offering, subsequent deployments of the same offering in the same project don't require subscribing again. At any point, you can see the model offerings to which your project is currently subscribed:
---
## Deploy the model to a serverless API
In this section, you create an endpoint for your model.
1. In the deployment wizard, select **Deploy**. Wait until the deployment is ready and you're redirected to the Deployments page.
1. Select the deployment, and note the endpoint's Target URI and Key.
1. Use these credentials to call the deployment and generate predictions.
1. If you need to consume this deployment from a different project or hub, or you plan to use Prompt flow to build intelligent applications, you need to create a connection to the serverless API deployment. To learn how to configure an existing standard deployment on a new project or hub, see [Consume deployed standard deployment from a different project or from Prompt flow](deploy-models-serverless-connect.md).
> [!TIP]
> If you're using Prompt flow in the same project or hub where the deployment was deployed, you still need to create the connection.
## Use the standard deployment
Models deployed as standard deployments in Azure Machine Learning and Azure AI Foundry support the [Azure AI Foundry Models API](../../ai-foundry/model-inference/reference/reference-model-inference-api.md), which exposes a common set of capabilities for foundational models and lets developers consume predictions from a diverse set of models in a uniform and consistent way.
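For example, you can call a chat model on the deployment with the Azure AI Inference client library for Python. The following is a minimal sketch: it assumes the `azure-ai-inference` package is installed, and the endpoint URL and key are placeholders for the Target URI and Key values that you noted for your deployment.

```python
# pip install azure-ai-inference
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholders: replace with the Target URI and Key from your deployment.
endpoint = "https://<your-deployment>.<region>.models.ai.azure.com"
key = "<your-api-key>"

client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Send a simple chat completion request to the deployed model.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what a serverless API deployment is."),
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```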
## Delete endpoints and subscriptions

You can delete model subscriptions and endpoints.
To delete a standard deployment:
1. Go to the [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs).
1. Go to your project.
1. In the **My assets** section, select **Models + endpoints**.
1. Open the deployment you want to delete.
1. Select **Delete**.
To delete the associated model subscription:
1. Go to the [Azure portal](https://portal.azure.com).
1. Navigate to the resource group where the project belongs.
1. On the **Type** filter, select **SaaS**.
1. Select the subscription you want to delete.
1. Select **Delete**.
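If you prefer to locate these marketplace subscriptions programmatically before deleting them in the portal, a sketch along these lines may help. It assumes the subscription appears in the resource group as a resource of type `Microsoft.SaaS/resources` and that the `azure-identity` and `azure-mgmt-resource` packages are installed; the subscription ID and resource group values are placeholders.

```python
# pip install azure-identity azure-mgmt-resource
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "<your-azure-subscription-id>"  # placeholder
resource_group = "<your-resource-group>"          # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)

# List SaaS resources (Azure Marketplace model subscriptions) in the resource group.
for resource in client.resources.list_by_resource_group(
    resource_group, filter="resourceType eq 'Microsoft.SaaS/resources'"
):
    print(resource.name, resource.id)
```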
::: zone-end
1. At this point, your endpoint is ready to be used.
1. If you need to consume this deployment from a different project or hub, or you plan to use Prompt flow to build intelligent applications, you need to create a connection to the standard deployment. To learn how to configure an existing standard deployment on a new project or hub, see [Consume deployed standard deployment from a different project or from Prompt flow](deploy-models-serverless-connect.md).
> [!TIP]
> If you're using Prompt flow in the same project or hub where the deployment was deployed, you still need to create the connection.
## Use the standard deployment
## Cost and quota considerations for Foundry Models deployed as a standard deployment
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. Additionally, we currently limit one deployment per model per project. Contact Microsoft Azure Support if the current rate limits aren't sufficient for your scenarios.
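If your workload occasionally exceeds these limits, retrying throttled calls with a backoff keeps the client within quota. The following is a minimal sketch, assuming the endpoint signals throttling with HTTP 429 and that you call it through the `azure-ai-inference` client shown earlier; the retry counts and delays are illustrative.

```python
# pip install azure-ai-inference
import time

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import HttpResponseError

client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",  # placeholder Target URI
    credential=AzureKeyCredential("<your-api-key>"),                    # placeholder Key
)

def complete_with_backoff(messages, retries=5, base_delay=2.0):
    """Retry a chat completion when the deployment responds with HTTP 429 (throttled)."""
    for attempt in range(retries):
        try:
            return client.complete(messages=messages)
        except HttpResponseError as error:
            if error.status_code != 429 or attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff before retrying

print(complete_with_backoff([UserMessage(content="Hello!")]).choices[0].message.content)
```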
- You can find pricing information for [Models Sold Directly by Azure](../model-inference/concepts/models.md#models-sold-directly-by-azure?context=/azure/ai-foundry/context/context) on the *Pricing and terms* tab of the _Serverless API deployment_ window.
- [Models from Partners and Community](../model-inference/concepts/models.md#models-from-partners-and-community?context=/azure/ai-foundry/context/context) are offered through Azure Marketplace and integrated with Azure AI Foundry for use. You can find the Azure Marketplace pricing when deploying or fine-tuning these models. Each time a project subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently. For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace).
## Permissions required to subscribe to model offerings
We recommend that you deploy Foundry Models to Azure AI Foundry resources. This deployment method allows you to consume your models via a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. The endpoint follows the [Azure AI Model Inference API](/rest/api/aifoundry/modelinference/) which all the models in Foundry Models support. To learn how to deploy a Foundry Model to the Azure AI Foundry resources, see [Add and configure models to Azure AI Foundry Models](../model-inference/how-to/create-model-deployments.md).