Commit 87b0971

review/updates
1 parent ed0c4ff commit 87b0971

File tree

1 file changed

+39
-17
lines changed

articles/ai-foundry/how-to/deploy-models-managed-pay-go.md

Lines changed: 39 additions & 17 deletions
@@ -6,7 +6,7 @@ manager: scottpolly
66
ms.service: azure-ai-foundry
77
ms.custom:
88
ms.topic: how-to
9-
ms.date: 06/20/2025
9+
ms.date: 06/23/2025
1010
ms.reviewer: tinaem
1111
reviewer: tinaem
1212
ms.author: mopeakande
@@ -15,11 +15,8 @@ author: msakande
1515

1616
# Deploy Azure AI Foundry Models with pay-as-you-go billing to managed compute
1717

18-
[!INCLUDE [feature-preview](../includes/feature-preview.md)]
18+
Azure AI Foundry Models include a comprehensive catalog of models organized into two categories—Models sold directly by Azure, and [Models from partners and community](../concepts/foundry-models-overview.md#models-from-partners-and-community). These models from partners and community, which are available for deployment on a managed compute, are either open or protected models. In this article, you learn how to use protected models from partners and community, offered via Azure Marketplace for deployment on managed compute.
1919

20-
Azure AI Foundry Models include a comprehensive catalog of models organized into two categories—Models sold directly by Azure, and [Models from partners and community](../concepts/foundry-models-overview.md#models-from-partners-and-community). These models from partners and community, which are available for deployment on a managed compute, are either open or protected models. The deployment of protected models on managed compute (preview) involves pay-as-you-go billing for the customer in two dimensions: per-hour Azure Machine Learning compute billing for the virtual machines employed in the deployment, and surcharge billing for the model as set by the model publisher on the Azure Marketplace offer. This pay-as-you-go billing of Azure compute and model surcharge is pro-rated per minute based on the uptime of these managed online deployments.
21-
22-
In this article, you learn how to use protected models from partners and community, offered via Azure Marketplace for deployment on managed compute. Azure AI Foundry enables a seamless subscription and transaction experience for these protected models as you create and consume your dedicated model deployments at scale.
2320

2421
## Prerequisites
2522

@@ -50,39 +47,52 @@ In this article, you learn how to use protected models from partners and communi
5047
- Microsoft.MachineLearningServices/workspaces/marketplaceModelSubscriptions/*
5148
- Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*
5249

53-
## Marketplace offer unit of measure and subscription scope
50+
## Subscription scope and unit of measure for Azure Marketplace offer
51+
52+
Azure AI Foundry enables a seamless subscription and transaction experience for protected models as you create and consume your dedicated model deployments at scale. The deployment of protected models on managed compute involves pay-as-you-go billing for the customer in two dimensions:
53+
54+
- Per-hour Azure Machine Learning compute billing for the virtual machines employed in the deployment.
55+
- Surcharge billing for the model as set by the model publisher on the Azure Marketplace offer.
56+
57+
Pay-as-you-go billing of Azure compute and model surcharge is pro-rated per minute based on the uptime of the managed online deployments. The surcharge for a model is a per GPU-hour price, set by the partner (or model's publisher) on Azure Marketplace, for all the supported GPUs that can be used to deploy the model on Azure AI Foundry managed compute.
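
The two billing dimensions can be illustrated with a rough estimate. The following is a minimal sketch, not an official pricing formula: the function name, parameters, and all rates are hypothetical, and it assumes the model surcharge scales linearly with the number of GPUs per instance and with the instance count.

```python
def estimate_deployment_cost(
    compute_rate_per_hour: float,       # hypothetical VM price per instance-hour
    surcharge_per_gpu_hour: float,      # hypothetical publisher surcharge per GPU-hour
    gpus_per_instance: int,
    instance_count: int,
    uptime_minutes: float,
) -> dict:
    """Estimate pay-as-you-go charges, pro-rated per minute of uptime."""
    # Pro-rate: convert billed minutes of uptime into fractional hours.
    hours = uptime_minutes / 60

    # Dimension 1: Azure Machine Learning compute for the VMs in the deployment.
    compute = compute_rate_per_hour * instance_count * hours

    # Dimension 2: model surcharge (assumed linear in GPUs and instances).
    surcharge = surcharge_per_gpu_hour * gpus_per_instance * instance_count * hours

    return {"compute": compute, "surcharge": surcharge, "total": compute + surcharge}


# Example with made-up rates: one 4-GPU instance running for 90 minutes.
estimate = estimate_deployment_cost(
    compute_rate_per_hour=10.0,
    surcharge_per_gpu_hour=2.0,
    gpus_per_instance=4,
    instance_count=1,
    uptime_minutes=90,
)
print(estimate)
```

Actual charges depend on the VM SKU, the publisher's Azure Marketplace offer, and any reservations or savings plans that apply; the pricing breakdown page in the deployment wizard is the authoritative source.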
58+
59+
A user's subscriptions to Azure Marketplace offers are scoped to the project resource within Azure AI Foundry. If a subscription to the Azure Marketplace offer for a particular model already exists within the project, the deployment wizard informs the user that the subscription already exists for the project.
5460

55-
The surcharge for the models is a per GPU hour price set by the partner / publisher on Azure marketplace, for all the supported GPUs for the model to be deployed on Foundry managed compute.
61+
To find all the SaaS subscriptions that exist in an Azure subscription:
5662

57-
User's subscriptions to the azure marketplace offers are scoped to a project resource within Azure AI Foundry. If a subscription to the marketplace offer already exists within a project, users will be notified of the same in the Deploy Wizard (reference snapshot below).
63+
1. Sign in to the [Azure portal](https://portal.azure.com) and go to your Azure subscription.
5864

59-
<insert image from doc>
65+
1. Select **Subscriptions** and then select your Azure subscription to open its overview page.
6066

61-
All SaaS subscriptions created in an Azure subscription are listed under 'Resources' of the Settings blade of the Azure subscription and can be filtered using Resource Type equals SaaS. The consumption-based surcharge is accrued to the associated SaaS subscription and billed to the user via Azure Marketplace. The user can view his invoice by clicking on 'View Billing' in the Overview tab of the respective SaaS subscription.
67+
1. Select **Settings** > **Resources** to see the list of resources.
68+
69+
1. Use the **Type** filter to select the SaaS resource type.
70+
71+
The consumption-based surcharge is accrued to the associated SaaS subscription and billed to the user via Azure Marketplace. You can view the invoice in the **Overview** tab of the respective SaaS subscription.
6272

6373
## Subscribe and deploy on managed compute
6474

6575
[!INCLUDE [open-catalog](../includes/open-catalog.md)]
6676

6777
1. Select the **Deployment options** filter in the model catalog and choose **Managed compute**.
6878

69-
2. Filter the list further by selecting the **Collection** and model of your choice. In this article, we use **Cohere Command A** for illustration.
79+
1. Filter the list further by selecting the **Collection** and model of your choice. In this article, we use **Cohere Command A** for illustration.
7080

71-
3. From the model's page, select **Use this model** to open the deployment wizard.
81+
1. From the model's page, select **Use this model** to open the deployment wizard.
7282

73-
4. Choose from one of the supported VM SKUs for the model. You need to have Azure Machine Learning Compute quota for that SKU in your Azure subscription.
83+
1. Choose from one of the supported VM SKUs for the model. You need to have Azure Machine Learning Compute quota for that SKU in your Azure subscription.
7484

75-
5. Select **Customize** to specify your deployment configuration for parameters such as the instance count. You can also select an existing endpoint for the deployment or create a new one. For this example, we specify an instance count of **1** and create a new endpoint for the deployment.
85+
1. Select **Customize** to specify your deployment configuration for parameters such as the instance count. You can also select an existing endpoint for the deployment or create a new one. For this example, we specify an instance count of **1** and create a new endpoint for the deployment.
7686

7787
:::image type="content" source="../media/deploy-models-managed-pay-go/deployment-configuration.png" alt-text="Screenshot of the deployment configuration screen for a protected model in Azure AI Foundry." lightbox="../media/deploy-models-managed-pay-go/deployment-configuration.png":::
7888

79-
6. Select **Next** to proceed to the *pricing breakdown* page.
89+
1. Select **Next** to proceed to the *pricing breakdown* page.
8090

81-
7. Review the pricing breakdown for the deployment, terms of use, and license agreement associated with the model's offer on Azure Marketplace. The pricing breakdown tells you what the aggregated pricing for the deployed model would be, where the surcharge for the model is a function of the number of GPUs in the VM instance that is selected in the previous steps. In addition to the applicable surcharge for the model, Azure compute charges also apply, based on your deployment configuration. If you have existing reservations or Azure savings plan, the invoice for the compute charges honors and reflects the discounted VM pricing.
91+
1. Review the pricing breakdown for the deployment, terms of use, and license agreement associated with the model's offer on Azure Marketplace. The pricing breakdown tells you what the aggregated pricing for the deployed model would be, where the surcharge for the model is a function of the number of GPUs in the VM instance that is selected in the previous steps. In addition to the applicable surcharge for the model, Azure compute charges also apply, based on your deployment configuration. If you have existing reservations or Azure savings plan, the invoice for the compute charges honors and reflects the discounted VM pricing.
8292

8393
:::image type="content" source="../media/deploy-models-managed-pay-go/pricing-breakdown.png" alt-text="Screenshot of the pricing breakdown page for a protected model deployment in Azure AI Foundry." lightbox="../media/deploy-models-managed-pay-go/pricing-breakdown.png":::
8494

85-
8. Select the checkbox to acknowledge that you understand and agree to the terms of use. Then, select **Deploy**. Foundry creates the user's subscription to the marketplace offer and further on, the deployment of the model on managed compute. It takes about 15-20 minutes for the deployment to complete.
95+
1. Select the checkbox to acknowledge that you understand and agree to the terms of use. Then, select **Deploy**. Azure AI Foundry creates the user's subscription to the marketplace offer and then creates the deployment of the model on a managed compute. It takes about 15-20 minutes for the deployment to complete.
8696

8797
## Network Isolation of deployments
8898

@@ -92,6 +102,18 @@ Collections in the model catalog can be deployed within your isolated networks u
92102

93103
An Azure AI Foundry project with ingress Public Network Access disabled can only support a single active deployment of one of the protected models from the catalog. Attempts to create more active deployments result in deployment creation failures.
94104

105+
## Supported models for pay-as-you-go billing to managed compute
106+
107+
| Collection | Model | Task |
108+
|--|--|--|
109+
| Paige AI | [Virchow2G](https://ai.azure.com/explore/models/Virchow2G/version/1/registry/azureml-paige) | Image Feature Extraction |
110+
| Paige AI | [Virchow2G-Mini](https://ai.azure.com/explore/models/Virchow2G-Mini/version/1/registry/azureml-paige) | Image Feature Extraction |
111+
| Cohere | [Command A](https://ai.azure.com/explore/models/cohere-command-a/version/3/registry/azureml-cohere) | Chat completion |
112+
| Cohere | [Embed v4](https://ai.azure.com/explore/models/embed-v-4-0/version/4/registry/azureml-cohere) | Embeddings |
113+
| Cohere | [Rerank v3.5](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/2/registry/azureml-cohere) | Text classification |
114+
115+
116+
95117
## Related content
96118

97119
* [How to deploy and inference a managed compute deployment](deploy-models-managed.md)
