Commit 8f842ce

Merge pull request #6450 from s-polly/serverless-deploy-freshness
Deploy models as standard deployment - freshness
2 parents 211a5ca + 1c1de51 commit 8f842ce

6 files changed, +37 -36 lines changed

articles/machine-learning/how-to-deploy-models-serverless.md

Lines changed: 37 additions & 36 deletions
@@ -5,7 +5,7 @@ description: Learn to deploy models as standard deployments, using Azure Machine
 ms.service: azure-machine-learning
 ms.subservice: inferencing
 ms.topic: how-to
-ms.date: 07/19/2024
+ms.date: 08/07/2025
 ms.reviewer: fasantia
 reviewer: santiagxf
 ms.author: scottpolly
@@ -18,7 +18,7 @@ ms.custom: build-2024, serverless, devx-track-azurecli
 
 In this article, you learn how to deploy a model from the model catalog as a standard deployment.
 
-[Certain models in the model catalog](concept-endpoint-serverless-availability.md) can be deployed as a standard deployment with Standard billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
+[Certain models in the model catalog](concept-endpoint-serverless-availability.md) can be deployed as a standard deployment with Standard billing. This deployment type provides a way to consume models as an API without hosting them on your subscription, while maintaining the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
 
 This article uses a Meta Llama model deployment for illustration. However, you can use the same steps to deploy any of the [models in the model catalog that are available for standard deployment](concept-endpoint-serverless-availability.md).
 
@@ -89,42 +89,43 @@ This article uses a Meta Llama model deployment for illustration. However, you c
 
 1. Sign in to [Azure Machine Learning studio](https://ml.azure.com)
 
-1. For models offered through the Azure Marketplace, ensure that your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings).
+1. For models offered through Azure Marketplace, ensure that your account has the **Azure AI Developer** role permissions on the resource group, or that you meet the [permissions required to subscribe to model offerings](#permissions-required-to-subscribe-to-model-offerings).
 
-Models that are offered by non-Microsoft providers (for example, Llama and Mistral models) are billed through the Azure Marketplace. For such models, you're required to subscribe your workspace to the particular model offering. Models that are offered by Microsoft (for example, Phi-3 models) don't have this requirement, as billing is done differently. For details about billing for serverless deployment of models in the model catalog, see [Billing for standard deployments](concept-model-catalog.md#pay-for-model-usage-in-standard-deployment).
+Models that are offered by non-Microsoft providers (for example, Llama and Mistral models) are billed through Azure Marketplace. For such models, you're required to subscribe your workspace to the particular model offering. Models that are offered by Microsoft (for example, Phi-3 models) don't have this requirement, as billing is done differently. For details about billing for serverless deployment of models in the model catalog, see [Billing for standard deployments](concept-model-catalog.md#pay-for-model-usage-in-standard-deployment).
 
 1. Go to your workspace. To use the standard deployment offering, your workspace must belong to one of the [regions that are supported for serverless deployment](concept-endpoint-serverless-availability.md) for the particular model you want to deploy.
 
-1. Select **Model catalog** from the left sidebar and find the model card of the model you want to deploy. In this article, you select a **Meta-Llama-3-8B-Instruct** model.
+1. Select **Model catalog** from the left sidebar and find the model card of the model you want to deploy. In this article, you select a **Bria-2.3-Fast** model.
 
 1. If you're deploying the model using Azure CLI, Python SDK, or ARM, copy the **Model ID**.
 
 > [!IMPORTANT]
-> Do not include the version when copying the **Model ID**. standard deployments always deploy the model's latest version available. For example, for the model ID `azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct/versions/3`, copy `azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct`.
+> Don't include the version when copying the **Model ID**. Standard deployments always deploy the model's latest version available. For example, for the model ID `azureml://registries/azureml-bria/models/Bria-2.3-Fast/versions/1`, copy `azureml://registries/azureml-bria/models/Bria-2.3-Fast`.
 
 :::image type="content" source="media/how-to-deploy-models-serverless/model-card.png" alt-text="A screenshot showing a model's details page." lightbox="media/how-to-deploy-models-serverless/model-card.png":::
 
 The next section covers the steps for subscribing your workspace to a model offering. You can skip this section and go to [Deploy the model to a standard deployment](#deploy-the-model-to-a-standard-deployment), if you're deploying a Microsoft model.
 
 ## Subscribe your workspace to the model offering
 
-standard deployments can deploy both Microsoft and non-Microsoft offered models. For Microsoft models (such as Phi-3 models), you don't need to create an Azure Marketplace subscription and you can [deploy them to standard deployments directly](#deploy-the-model-to-a-standard-deployment) to consume their predictions. For non-Microsoft models, you need to create the subscription first. If it's your first time deploying the model in the workspace, you have to subscribe your workspace for the particular model offering from the Azure Marketplace. Each workspace has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.
+Standard deployments can deploy both Microsoft and non-Microsoft offered models. For Microsoft models (such as Phi-3 models), you don't need to create an Azure Marketplace subscription and you can [deploy them to standard deployments directly](#deploy-the-model-to-a-standard-deployment) to consume their predictions. For non-Microsoft models, you need to create the subscription first. If it's your first time deploying the model in the workspace, you have to subscribe your workspace for the particular model offering from Azure Marketplace. Each workspace has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.
 
 > [!NOTE]
-> Models offered through the Azure Marketplace are available for deployment to standard deployments in specific regions. Check [Region availability for models in standard deployments](concept-endpoint-serverless-availability.md) to verify which models and regions are available. If the one you need is not listed, you can deploy to a workspace in a supported region and then [consume standard deployments from a different workspace](how-to-connect-models-serverless.md).
+> Models offered through Azure Marketplace are available for deployment to standard deployments in specific regions. Check [Region availability for models in standard deployments](concept-endpoint-serverless-availability.md) to verify which models and regions are available. If the one you need isn't listed, you can deploy to a workspace in a supported region and then [consume standard deployments from a different workspace](how-to-connect-models-serverless.md).
 
 1. Create the model's marketplace subscription. When you create a subscription, you accept the terms and conditions associated with the model offer. Remember you don't need to perform this step for Microsoft offered models (like Phi-3).
 
 # [Studio](#tab/azure-studio)
 
-1. On the model's **Details** page, select **Deploy**. A **Deployment options** window opens up, giving you the choice between standard deployment and deployment using a managed compute.
-
-> [!NOTE]
-> For models that can be deployed only via standard deployment, the standard deployment wizard opens up right after you select **Deploy** from the model's details page.
+1. On the model's **Details** page, select **Use this model**. A **Deployment options** window opens up, giving you the choice between standard deployment (serverless API) and deployment using a managed compute.
+
+:::image type="content" source="media/how-to-deploy-models-serverless/purchase-options.png" alt-text="A screenshot depicting the dialog for choosing between standard deployments and managed compute." lightbox="media/how-to-deploy-models-serverless/purchase-options.png":::
 
-1. Select **standard deployment with Azure AI Content Safety (preview)** to open the standard deployment wizard.
-1. Select the checkbox to acknowledge the Microsoft purchase policy.
+> [!NOTE]
+> For models that can be deployed only via standard deployment, the standard deployment wizard opens up right after you select **Use this model** from the model's details page.
 
+1. Select **Serverless API** to open the standard deployment wizard.
+
 :::image type="content" source="media/how-to-deploy-models-serverless/deploy-pay-as-you-go.png" alt-text="A screenshot showing how to deploy a model with the standard deployment option." lightbox="media/how-to-deploy-models-serverless/deploy-pay-as-you-go.png":::
 
 1. If you see the note *You already have an Azure Marketplace subscription for this workspace*, you don't need to create the subscription since you already have one. You can proceed to [Deploy the model to a standard deployment](#deploy-the-model-to-a-standard-deployment).
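
Reviewer note: the **Model ID** callout above says to drop the version segment before the ID is passed to Azure CLI, the Python SDK, or an ARM template. A minimal sketch of that step, assuming a versioned ID always ends in `/versions/<version>`; the helper name `strip_model_version` is hypothetical and not part of any Azure SDK:

```python
import re

# Hypothetical helper (not an Azure SDK function): matches a trailing
# "/versions/<version>" segment at the end of a model catalog ID.
_VERSION_SUFFIX = re.compile(r"/versions/[^/]+/?$")


def strip_model_version(model_id: str) -> str:
    """Return the model ID without its trailing version segment, if any."""
    return _VERSION_SUFFIX.sub("", model_id)


versioned = "azureml://registries/azureml-bria/models/Bria-2.3-Fast/versions/1"
print(strip_model_version(versioned))
# azureml://registries/azureml-bria/models/Bria-2.3-Fast
```

An ID that has no version segment is returned unchanged, so the helper can be applied to either form.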
@@ -141,8 +142,8 @@ standard deployments can deploy both Microsoft and non-Microsoft offered models.
 __subscription.yml__
 
 ```yml
-name: meta-llama3-8b-qwerty
-model_id: azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct
+name: bria-2.3-Fast
+model_id: azureml://registries/azureml-bria/models/Bria-2.3-Fast
 ```
 
 Use the _subscription.yml_ file to create the subscription:
@@ -154,8 +155,8 @@ standard deployments can deploy both Microsoft and non-Microsoft offered models.
 # [Python SDK](#tab/python)
 
 ```python
-model_id="azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct"
-subscription_name="Meta-Llama-3-8B-Instruct"
+model_id="azureml://registries/azureml-bria/models/Bria-2.3-Fast"
+subscription_name="Bria-2.3-Fast"
 
 marketplace_subscription = MarketplaceSubscription(
     model_id=model_id,
@@ -183,11 +184,11 @@ standard deployments can deploy both Microsoft and non-Microsoft offered models.
         "type": "String"
     },
     "subscription_name": {
-        "defaultValue": "Meta-Llama-3-8B-Instruct",
+        "defaultValue": "Bria-2.3-Fast",
         "type": "String"
     },
     "model_id": {
-        "defaultValue": "azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct",
+        "defaultValue": "azureml://registries/azureml-bria/models/Bria-2.3-Fast",
         "type": "String"
     }
 },
@@ -249,17 +250,17 @@ standard deployments can deploy both Microsoft and non-Microsoft offered models.
 
 Once you've created a subscription for a non-Microsoft model, you can deploy the associated model to a standard deployment. For Microsoft models (such as Phi-3 models), you don't need to create a subscription.
 
-The standard deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
+The standard deployment provides a way to consume models as an API without hosting them on your subscription, while maintaining the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
 
-In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
+In this section, you create an endpoint with the name **Bria-2.3-Fast**.
 
 1. Create the serverless endpoint
 
 # [Studio](#tab/azure-studio)
 
-1. To deploy a Microsoft model that doesn't require subscribing to a model offering, select **Deploy** and then select **standard deployment with Azure AI Content Safety (preview)** to open the deployment wizard.
+1. To deploy a Microsoft model that doesn't require subscribing to a model offering, select **Use this model** and then select **Serverless API** to open the deployment wizard.
 
-1. Alternatively, for a non-Microsoft model that requires a model subscription, if you've just subscribed your workspace to the model offer in the previous section, continue to select **Deploy**. Alternatively, select **Continue to deploy** (if your deployment wizard had the note *You already have an Azure Marketplace subscription for this workspace*).
+1. Alternatively, for a non-Microsoft model that requires a model subscription, if you've subscribed your workspace to the model offer in the previous section, continue to select **Deploy**. Alternatively, select **Continue to deploy** (if your deployment wizard had the note *You already have an Azure Marketplace subscription for this workspace*).
 
 :::image type="content" source="media/how-to-deploy-models-serverless/deploy-pay-as-you-go-subscribed-workspace.png" alt-text="A screenshot showing a workspace that is already subscribed to the offering." lightbox="media/how-to-deploy-models-serverless/deploy-pay-as-you-go-subscribed-workspace.png":::
 
@@ -276,8 +277,8 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
 __endpoint.yml__
 
 ```yml
-name: meta-llama3-8b-qwerty
-model_id: azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct
+name: bria-2.3-Fast
+model_id: azureml://registries/azureml-bria/models/Bria-2.3-Fast
 ```
 
 Use the _endpoint.yml_ file to create the endpoint:
@@ -289,7 +290,7 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
 # [Python SDK](#tab/python)
 
 ```python
-endpoint_name="meta-llama3-8b-qwerty"
+endpoint_name="bria-2.3-Fast"
 
 serverless_endpoint = ServerlessEndpoint(
     name=endpoint_name,
@@ -317,15 +318,15 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
         "type": "String"
     },
     "endpoint_name": {
-        "defaultValue": "meta-llama3-8b-qwerty",
+        "defaultValue": "bria-2.3-Fast",
         "type": "String"
     },
     "location": {
         "defaultValue": "eastus2",
         "type": "String"
     },
     "model_id": {
-        "defaultValue": "azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct",
+        "defaultValue": "azureml://registries/azureml-bria/models/Bria-2.3-Fast",
         "type": "String"
     }
 },
@@ -410,7 +411,7 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
 # [Azure CLI](#tab/cli)
 
 ```azurecli
-az ml serverless-endpoint get-credentials -n meta-llama3-8b-qwerty
+az ml serverless-endpoint get-credentials -n bria-2.3-Fast
 ```
 
 # [Python SDK](#tab/python)
@@ -475,14 +476,14 @@ To delete a standard deployment:
 
 ```azurecli
 az ml serverless-endpoint delete \
-    --name "meta-llama3-8b-qwerty"
+    --name "bria-2.3-Fast"
 ```
 
 To delete the associated model subscription:
 
 ```azurecli
 az ml marketplace-subscription delete \
-    --name "Meta-Llama-3-8B-Instruct"
+    --name "bria-2.3-Fast"
 ```
 
 # [Python SDK](#tab/python)
@@ -519,11 +520,11 @@ You can find the pricing information on the __Pricing and terms__ tab of the dep
 
 #### Cost for non-Microsoft models
 
-Non-Microsoft models deployed as standard deployments are offered through the Azure Marketplace and integrated with Azure AI Foundry for use. You can find the Azure Marketplace pricing when deploying or fine-tuning these models.
+Non-Microsoft models deployed as standard deployments are offered through Azure Marketplace and integrated with Azure AI Foundry for use. You can find Azure Marketplace pricing when deploying or fine-tuning these models.
 
-Each time a workspace subscribes to a given offer from the Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently.
+Each time a workspace subscribes to a given offer from Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently.
 
-For more information on how to track costs, see [Monitor costs for models offered through the Azure Marketplace](/azure/ai-studio/how-to/costs-plan-manage#monitor-costs-for-models-offered-through-the-azure-marketplace).
+For more information on how to track costs, see [Monitor costs for models offered through Azure Marketplace](/azure/ai-studio/how-to/costs-plan-manage#monitor-costs-for-models-offered-through-the-azure-marketplace).
 
 :::image type="content" source="media/how-to-deploy-models-serverless/costs-model-as-service-cost-details.png" alt-text="A screenshot showing different resources corresponding to different model offers and their associated meters." lightbox="media/how-to-deploy-models-serverless/costs-model-as-service-cost-details.png":::
 
@@ -532,7 +533,7 @@ For more information on how to track costs, see [Monitor costs for models offere
 
 Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the __Owner__, __Contributor__, or __Azure AI Developer__ role for the Azure subscription. Alternatively, your account can be assigned a custom role that has the following permissions:
 
-- On the Azure subscription—to subscribe the workspace to the Azure Marketplace offering, once for each workspace, per offering:
+- On the Azure subscription—to subscribe the workspace to Azure Marketplace offering, once for each workspace, per offering:
   - `Microsoft.MarketplaceOrdering/agreements/offers/plans/read`
   - `Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action`
   - `Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read`
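
Reviewer note: one way to sanity-check a custom role against the permission list above is to compare the actions it grants with the required ones. The sketch below is illustrative only. It covers just the three `Microsoft.MarketplaceOrdering` actions visible in this hunk (the diff context ends there, so the article's full list may be longer), and `missing_actions` and the sample `granted` set are hypothetical names, not an Azure API. It does exact string matching and does not expand wildcard actions.

```python
# The three subscription-level actions visible in the hunk above.
REQUIRED_MARKETPLACE_ACTIONS = (
    "Microsoft.MarketplaceOrdering/agreements/offers/plans/read",
    "Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action",
    "Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read",
)


def missing_actions(granted):
    """Return the required actions that are absent from `granted`, in list order."""
    granted_set = set(granted)
    return [a for a in REQUIRED_MARKETPLACE_ACTIONS if a not in granted_set]


# Hypothetical custom role that lacks the last action.
granted = {
    "Microsoft.MarketplaceOrdering/agreements/offers/plans/read",
    "Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action",
}
print(missing_actions(granted))
# ['Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read']
```

An empty result means the role grants every action in the visible list.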
Five other changed files (previews not loaded), sizes as displayed: 13.8 KB, -15.6 KB, 5.44 KB, -70.7 KB, 25.6 KB
