
Commit 7197d0d

Merge pull request #281706 from santiagxf/santiagxf-patch-1
Update deploy-models-serverless.md
2 parents cb8be1e + 3ace3aa commit 7197d0d

1 file changed: +135 -3 lines changed


articles/ai-studio/how-to/deploy-models-serverless.md

Lines changed: 135 additions & 3 deletions
@@ -81,6 +81,17 @@ This article uses a Meta Llama model deployment for illustration. However, you c
)
```

# [Bicep](#tab/bicep)

Install the Azure CLI as described at [Azure CLI](/cli/azure/).

Configure the following environment variables according to your settings:

```azurecli
RESOURCE_GROUP="serverless-models-dev"
LOCATION="eastus2"
```
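If you're starting from scratch, you can sign in and create the resource group with the Azure CLI. This is a minimal sketch that assumes the variable values above:

```azurecli
# Sign in and create the resource group if it doesn't exist yet.
az login
az group create --name $RESOURCE_GROUP --location $LOCATION
```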

# [ARM](#tab/arm)

You can use any compatible web browser to [deploy ARM templates](../../azure-resource-manager/templates/deploy-portal.md) in the Microsoft Azure portal or use any of the deployment tools. This tutorial uses the [Azure CLI](/cli/azure/).
@@ -110,6 +121,9 @@ The next section covers the steps for subscribing your project to a model offeri

Serverless API endpoints can deploy both Microsoft and non-Microsoft offered models. For Microsoft models (such as Phi-3 models), you don't need to create an Azure Marketplace subscription and you can [deploy them to serverless API endpoints directly](#deploy-the-model-to-a-serverless-api-endpoint) to consume their predictions. For non-Microsoft models, you need to create the subscription first. If it's your first time deploying the model in the project, you have to subscribe your project for the particular model offering from the Azure Marketplace. Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending.

> [!TIP]
> Skip this step if you are deploying models from the Phi-3 family of models. Directly [deploy the model to a serverless API endpoint](#deploy-the-model-to-a-serverless-api-endpoint).

> [!NOTE]
> Models offered through the Azure Marketplace are available for deployment to serverless API endpoints in specific regions. Check [Model and region availability for Serverless API deployments](deploy-models-serverless-availability.md) to verify which models and regions are available. If the one you need is not listed, you can deploy to a workspace in a supported region and then [consume serverless API endpoints from a different workspace](deploy-models-serverless-connect.md).
@@ -164,11 +178,41 @@ Serverless API endpoints can deploy both Microsoft and non-Microsoft offered mod
).result()
```

# [Bicep](#tab/bicep)

Use the following Bicep configuration to create a model subscription:

__model-subscription.bicep__

```bicep
param projectName string = 'my-project'
param modelId string = 'azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct'

var modelName = substring(modelId, (lastIndexOf(modelId, '/') + 1))
var subscriptionName = '${modelName}-subscription'

resource projectName_subscription 'Microsoft.MachineLearningServices/workspaces/marketplaceSubscriptions@2024-04-01-preview' = if (!startsWith(
    modelId,
    'azureml://registries/azureml/'
  )) {
  name: '${projectName}/${subscriptionName}'
  properties: {
    modelId: modelId
  }
}
```

Then create the resource as follows:

```azurecli
az deployment group create --resource-group $RESOURCE_GROUP --template-file model-subscription.bicep
```
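The Bicep file's parameters default to the values shown above. If your project or model is different, you can override them from the command line; a sketch, assuming a hypothetical project named `my-project-dev`:

```azurecli
az deployment group create \
    --resource-group $RESOURCE_GROUP \
    --template-file model-subscription.bicep \
    --parameters projectName=my-project-dev modelId='azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct'
```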

# [ARM](#tab/arm)

Use the following template to create a model subscription:

- __template.json__
+ __model-subscription.json__

```json
{
@@ -202,6 +246,12 @@ Serverless API endpoints can deploy both Microsoft and non-Microsoft offered mod
}
```

Use the Azure portal or the Azure CLI to create the deployment.

```azurecli
az deployment group create --resource-group $RESOURCE_GROUP --template-file model-subscription.json
```

1. Once you subscribe the project for the particular Azure Marketplace offering, subsequent deployments of the same offering in the same project don't require subscribing again.

1. At any point, you can see the model offers to which your project is currently subscribed:
@@ -233,6 +283,15 @@ Serverless API endpoints can deploy both Microsoft and non-Microsoft offered mod
print(sub.as_dict())
```

# [Bicep](#tab/bicep)

You can use the resource management tools to query the resources. The following code uses Azure CLI:

```azurecli
az resource list \
    --query "[?type=='Microsoft.SaaS']"
```
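To narrow the results to the resource group used in this article and get a compact listing, you can add the standard `--resource-group` and `--output` options:

```azurecli
az resource list \
    --resource-group $RESOURCE_GROUP \
    --query "[?type=='Microsoft.SaaS']" \
    --output table
```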

# [ARM](#tab/arm)

You can use the resource management tools to query the resources. The following code uses Azure CLI:
@@ -298,6 +357,46 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
).result()
```

# [Bicep](#tab/bicep)

Use the following template to create an endpoint:

__serverless-endpoint.bicep__

```bicep
param projectName string = 'my-project'
param endpointName string = 'myserverless-text-1234ss'
param location string = resourceGroup().location
param modelId string = 'azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct'

var modelName = substring(modelId, (lastIndexOf(modelId, '/') + 1))
var subscriptionName = '${modelName}-subscription'

// The Azure Marketplace subscription for non-Microsoft models. This is the same
// resource that model-subscription.bicep declares; it's included here so the
// endpoint's dependsOn reference resolves and the file can be deployed on its own.
resource projectName_subscription 'Microsoft.MachineLearningServices/workspaces/marketplaceSubscriptions@2024-04-01-preview' = if (!startsWith(
    modelId,
    'azureml://registries/azureml/'
  )) {
  name: '${projectName}/${subscriptionName}'
  properties: {
    modelId: modelId
  }
}

resource projectName_endpoint 'Microsoft.MachineLearningServices/workspaces/serverlessEndpoints@2024-04-01-preview' = {
  name: '${projectName}/${endpointName}'
  location: location
  sku: {
    name: 'Consumption'
  }
  properties: {
    modelSettings: {
      modelId: modelId
    }
  }
  dependsOn: [
    projectName_subscription
  ]
}

output endpointUri string = projectName_endpoint.properties.inferenceEndpoint.uri
```

Create the deployment as follows:

```azurecli
az deployment group create --resource-group $RESOURCE_GROUP --template-file serverless-endpoint.bicep
```
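If you want to read the `endpointUri` output back after the deployment finishes, you can query the deployment object. A sketch that assumes the default deployment name, which the CLI derives from the template file's base name:

```azurecli
az deployment group show \
    --resource-group $RESOURCE_GROUP \
    --name serverless-endpoint \
    --query properties.outputs.endpointUri.value \
    --output tsv
```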

# [ARM](#tab/arm)

Use the following template to create an endpoint:
@@ -350,8 +449,7 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.

```azurecli
az deployment group create \
-    --name model-subscription-deployment \
-    --resource-group <resource-group> \
+    --resource-group $RESOURCE_GROUP \
    --template-file template.json
```
@@ -392,6 +490,15 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
).result()
```

# [Bicep](#tab/bicep)

You can use the resource management tools to query the resources. The following code uses Azure CLI:

```azurecli
az resource list \
    --query "[?type=='Microsoft.MachineLearningServices/workspaces/serverlessEndpoints']"
```
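To inspect a single endpoint, including the inference URI exposed in its properties, you can pass the full resource ID to `az resource show`. A sketch; the placeholder values are yours to fill in:

```azurecli
az resource show \
    --ids "/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/serverlessEndpoints/<endpoint-name>" \
    --query properties.inferenceEndpoint.uri \
    --output tsv
```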

# [ARM](#tab/arm)

You can use the resource management tools to query the resources. The following code uses Azure CLI:
@@ -424,6 +531,10 @@ In this section, you create an endpoint with the name **meta-llama3-8b-qwerty**.
print(endpoint_keys.secondary_key)
```

# [Bicep](#tab/bicep)

Use REST APIs to query this information.
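One way to call that REST API from the command line is `az rest`. The following sketch assumes the `listKeys` action on the `serverlessEndpoints` resource type and the same preview API version used in the Bicep files above; verify both against the current REST reference before relying on it:

```azurecli
az rest --method post \
    --url "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/serverlessEndpoints/<endpoint-name>/listKeys?api-version=2024-04-01-preview"
```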

# [ARM](#tab/arm)

Use REST APIs to query this information.
@@ -441,6 +552,18 @@ Models deployed in Azure Machine Learning and Azure AI studio in Serverless API

Read more about the [capabilities of this API](../reference/reference-model-inference-api.md#capabilities) and how [you can use it when building applications](../reference/reference-model-inference-api.md#getting-started).

## Network isolation

Endpoints for models deployed as Serverless APIs follow the public network access (PNA) flag setting of the AI Studio hub that contains the project in which the deployment exists. To secure your MaaS endpoint, disable the PNA flag on your AI Studio hub. You can secure inbound communication from a client to your endpoint by using a private endpoint for the hub.

To set the PNA flag for the Azure AI hub in the portal (a CLI sketch follows these steps):

1. Go to the [Azure portal](https://portal.azure.com).
2. Search for the resource group to which the hub belongs, and select your Azure AI hub from the resources listed for that resource group.
3. On the hub **Overview** page, use the left navigation pane to go to **Settings** > **Networking**.
4. On the **Public access** tab, configure settings for the public network access flag.
5. Save your changes. Your changes might take up to five minutes to propagate.

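If you prefer the command line, the `az ml` extension can set the same flag on the hub workspace. A minimal sketch, assuming a hypothetical hub named `my-hub` and the resource group variable defined earlier:

```azurecli
# Disable public network access on the hub that contains the project.
az ml workspace update \
    --name my-hub \
    --resource-group $RESOURCE_GROUP \
    --public-network-access Disabled
```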

## Delete endpoints and subscriptions

You can delete model subscriptions and endpoints. Deleting a model subscription makes any associated endpoint become *Unhealthy* and unusable.
@@ -500,6 +623,15 @@ To delete the associated model subscription:
client.marketplace_subscriptions.begin_delete(subscription_name).wait()
```

# [Bicep](#tab/bicep)

You can use the resource management tools to manage the resources. The following code uses Azure CLI:

```azurecli
az resource delete --name <resource-name> --resource-group <resource-group> --resource-type <resource-type>
```
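Because serverless endpoints are child resources of the project workspace, it's often simpler to pass the full resource ID (for example, one returned by the `az resource list` queries shown earlier):

```azurecli
az resource delete \
    --ids "/subscriptions/<subscription-id>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/serverlessEndpoints/<endpoint-name>"
```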

# [ARM](#tab/arm)

You can use the resource management tools to manage the resources. The following code uses Azure CLI:
