
Commit 4456a00

committed
init
1 parent ec100d2 commit 4456a00

File tree

2 files changed: +197 −2 lines changed

articles/container-apps/gpu-serverless-overview.md

Lines changed: 2 additions & 2 deletions
@@ -93,10 +93,10 @@ Serverless GPUs are run on consumption GPU workload profiles. You manage a consu
## Improve GPU cold start

- You can improve cold start on your GPU-enabled containers by enabling artifact streaming on your Azure Container Registry.
+ You can improve cold start on your GPU-enabled containers by enabling artifact streaming on your Azure Container Registry. For more details, see [enable artifact streaming](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-artifact-streaming?pivots=development-environment-azure-cli#pushimport-the-image-and-generate-the-streaming-artifact----azure-cli).

> [!NOTE]
- > To use artifact streaming, your container images must be hosted in Azure Container Registry.
+ > To use artifact streaming, your container images must be hosted in a premium Azure Container Registry.

Use the following steps to enable image streaming:

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
---
title: 'Tutorial: Deploy an NVIDIA LLAMA3 NIM to Azure Container Apps'
description: Deploy an NVIDIA NIM to Azure Container Apps.
services: container-apps
author: craigshoemaker
ms.service: azure-container-apps
ms.topic: tutorial
ms.date: 03/16/2025
ms.author: cachai
ms.custom: mode-api, devx-track-azurecli, devx-track-azurepowershell
ms.devlang: azurecli
---

# Tutorial: Deploy an NVIDIA LLAMA3 NIM to Azure Container Apps

NVIDIA Inference Microservices (NIMs) are optimized, containerized AI inference microservices designed to simplify and accelerate the deployment of AI models across various environments. By using Azure Container Apps with serverless GPUs, you can run these NIMs efficiently without managing the underlying infrastructure.

In this tutorial, you deploy a Llama3 NVIDIA NIM to Azure Container Apps using serverless GPUs.

## Prerequisites

- An Azure account with an active subscription. If you don't have one, you [can create one for free](https://azure.microsoft.com/free/).
- Install the [Azure CLI](/cli/azure/install-azure-cli).
- An NVIDIA NGC API key. Obtain an API key from the [NVIDIA NGC website](https://catalog.ngc.nvidia.com).

[!INCLUDE [container-apps-create-cli-steps.md](../../includes/container-apps-create-cli-steps.md)]

[!INCLUDE [container-apps-set-environment-variables.md](../../includes/container-apps-set-environment-variables.md)]

[!INCLUDE [container-apps-create-resource-group.md](../../includes/container-apps-create-resource-group.md)]

[!INCLUDE [container-apps-create-environment.md](../../includes/container-apps-create-environment.md)]

## Initial setup

1. Set up environment variables.

```bash
RESOURCE_GROUP="my-resource-group"
LOCATION="swedencentral"
ACR_NAME="myacrname"
CONTAINERAPPS_ENVIRONMENT="my-environment-name"
CONTAINER_APP_NAME="llama3-nim"
GPU_TYPE="Consumption-GPU-NC24-A100"
```

1. Create an Azure resource group.

```azurecli
az group create --name $RESOURCE_GROUP --location $LOCATION
```

1. Create an Azure Container Registry (ACR).

> [!NOTE]
> This tutorial uses a premium Azure Container Registry, which is recommended when using serverless GPUs for improved cold start performance. If you don't want to use a premium Azure Container Registry, set `--sku` to `Basic` in the following command.

```azurecli
az acr create --resource-group $RESOURCE_GROUP --name $ACR_NAME --sku Premium --location $LOCATION
```

## Pull the image from NGC and push to ACR

> [!NOTE]
> NVIDIA NIMs each have their own hardware requirements. [Make sure the NIM](link) you select is supported by the GPU types available in Azure Container Apps. The Llama3 NIM used in this tutorial can run on NVIDIA A100 GPUs.

1. Authenticate with both the NVIDIA and Azure container registries.

```bash
docker login nvcr.io
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
```

```azurecli
az acr login --name $ACR_NAME
```

1. Pull the Llama3 NIM image and push it to your Azure Container Registry.

Pull the image:

```bash
docker pull nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
```

Tag the image:

```bash
docker tag nvcr.io/nim/meta/llama3-8b-instruct:1.0.0 $ACR_NAME.azurecr.io/llama3-8b-instruct:1.0.0
```

Push the image:

```bash
docker push $ACR_NAME.azurecr.io/llama3-8b-instruct:1.0.0
```

## (Optional, recommended) Enable artifact streaming

Many of the NIM images are large, and your container app can take a long time to start if you don't enable artifact streaming. To enable artifact streaming for the image you pushed, follow these steps:

```azurecli
az acr artifact-streaming create --name $ACR_NAME --image llama3-8b-instruct:1.0.0
```

```azurecli
az acr artifact-streaming update --name $ACR_NAME --repository llama3-8b-instruct --enable-streaming true
```

```azurecli
az acr artifact-streaming operation show --name $ACR_NAME --image llama3-8b-instruct:1.0.0
```

> [!NOTE]
> This may take a few minutes.

## Create your container app with the NGC API key

```azurecli
az containerapp env create \
    --name $CONTAINERAPPS_ENVIRONMENT \
    --resource-group $RESOURCE_GROUP \
    --location $LOCATION \
    --enable-workload-profiles
```

```azurecli
az containerapp env workload-profile add \
    --resource-group $RESOURCE_GROUP \
    --name $CONTAINERAPPS_ENVIRONMENT \
    --workload-profile-type $GPU_TYPE \
    --workload-profile-name <WORKLOAD_PROFILE_NAME>
```

```azurecli
az containerapp create \
    --name $CONTAINER_APP_NAME \
    --resource-group $RESOURCE_GROUP \
    --environment $CONTAINERAPPS_ENVIRONMENT \
    --workload-profile-name <WORKLOAD_PROFILE_NAME> \
    --image $ACR_NAME.azurecr.io/llama3-8b-instruct:1.0.0 \
    --cpu 24 \
    --memory 220 \
    --gpu "NvidiaA100" \
    --secrets ngc-api-key=<PASTE_NGC_API_KEY_HERE> \
    --env-vars NGC_API_KEY=secretref:ngc-api-key \
    --registry-server $ACR_NAME.azurecr.io \
    --registry-username <ACR_USERNAME> \
    --registry-password <ACR_PASSWORD>
```

If you need to rotate the key later, update the secret on the app:

```azurecli
az containerapp secret set \
    --name $CONTAINER_APP_NAME \
    --resource-group $RESOURCE_GROUP \
    --secrets ngc-api-key=<PASTE_NGC_API_KEY_HERE>
```

## Test your NIM

Once deployed, test the NIM by sending a request:

```bash
curl -X POST \
  'http://<YOUR_CONTAINER_APP_URL>/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta/llama3-8b-instruct",
    "prompt": "Once upon a time",
    "max_tokens": 64
  }'
```
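
The same request can also be sent from a small script. The following is a minimal Python sketch, not part of the original tutorial: the helper name `build_completion_request` and the placeholder URL are illustrative assumptions, and the payload mirrors the `curl` call against the NIM's OpenAI-compatible completions endpoint.

```python
import json
import urllib.request


def build_completion_request(base_url: str, prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build a POST request for the NIM's OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": "meta/llama3-8b-instruct",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


# Example usage (replace the placeholder with your container app's URL before running):
# req = build_completion_request("http://<YOUR_CONTAINER_APP_URL>", "Once upon a time")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```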

## (Optional) Improve performance with volume mounts

For even faster cold start times, many of the NIMs provide a volume mount path to mount a cache directory. This cache directory can be used to store the model weights and other files that the NIM needs to run. To set up a volume mount for the Llama3 NIM, see this article.
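
As a sketch of what that setup can look like: an Azure Files share is added to the environment as a storage definition, then mounted into the container through the app's YAML definition. This fragment is an illustration, not part of the original tutorial; the storage name `nim-cache`, the `NIM_CACHE_PATH` variable, and the `/opt/nim/.cache` path are assumptions based on common NIM conventions, so check your NIM's documentation for its actual cache path.

```yaml
# Illustrative fragment of a container app definition (applied with az containerapp update --yaml).
# "nim-cache" is a hypothetical storage definition added to the environment beforehand.
properties:
  template:
    containers:
      - name: llama3-nim
        image: <ACR_NAME>.azurecr.io/llama3-8b-instruct:1.0.0
        env:
          - name: NIM_CACHE_PATH     # assumed cache-path variable; verify for your NIM
            value: /opt/nim/.cache
        volumeMounts:
          - volumeName: nim-cache
            mountPath: /opt/nim/.cache
    volumes:
      - name: nim-cache
        storageType: AzureFile
        storageName: nim-cache
```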

## Clean up resources

If you're not going to continue to use this application, run the following command to delete the resource group along with all the resources created in this tutorial.

> [!CAUTION]
> The following command deletes the specified resource group and all resources contained within it. If resources outside the scope of this tutorial exist in the specified resource group, they will also be deleted.

# [Bash](#tab/bash)

```azurecli
az group delete --name $RESOURCE_GROUP
```

# [PowerShell](#tab/powershell)

```azurepowershell
Remove-AzResourceGroup -Name $ResourceGroupName -Force
```

---

> [!TIP]
> Having issues? Let us know on GitHub by opening an issue in the [Azure Container Apps repo](https://github.com/microsoft/azure-container-apps).

## Next steps

> [!div class="nextstepaction"]
> [Communication between microservices](communicate-between-microservices.md)
