articles/container-apps/serverless-gpu-nim.md (4 additions & 4 deletions)
@@ -12,7 +12,7 @@ ms.devlang: azurecli
# Tutorial: Deploy an NVIDIA Llama3 NIM to Azure Container Apps
-NVIDIA Inference Microservices (NIMs) are optimized, containerized AI inference microservices designed to simplify and accelerate the deployment of AI models. When you use Azure Container Apps with serverless GPUs, you can run these NIMs efficiently without having to manage the underlying infrastructure.
+NVIDIA Inference Microservices (NIMs) are optimized, containerized AI inference microservices that simplify and accelerate the development of AI applications and agentic AI workflows. They provide prepackaged, scalable, performance-tuned models that you can deploy as secure inference endpoints on Azure Container Apps. When you use Azure Container Apps with serverless GPUs, you can run these NIMs efficiently without having to manage the underlying infrastructure.
In this tutorial, you learn to deploy a Llama3 NVIDIA NIM to Azure Container Apps using serverless GPUs.
@@ -38,7 +38,7 @@ This tutorial uses a premium instance of Azure Container Registry to improve col
Next, generate a unique container registry name.
```bash
-SUFFIX=$(head /dev/urandom | tr -dc 'A-Za-z0-9'| head -c 6)
+SUFFIX=$(head /dev/urandom | tr -dc 'a-z0-9'| head -c 6)
ACR_NAME="mygpututorialacr${SUFFIX}"
```
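The lowercase-only change above matters because Azure Container Registry names must be 5–50 alphanumeric characters, and using only lowercase characters avoids mismatches with the registry's all-lowercase login server name. A minimal sketch (hypothetical helper, not part of the tutorial) to sanity-check the generated name before creating the registry:

```shell
# Generate a 6-character lowercase alphanumeric suffix, as in the tutorial
SUFFIX=$(head /dev/urandom | tr -dc 'a-z0-9' | head -c 6)
ACR_NAME="mygpututorialacr${SUFFIX}"

# Sanity-check: ACR names must be 5-50 alphanumeric characters;
# lowercase-only avoids login-server case mismatches
if [[ "$ACR_NAME" =~ ^[a-z0-9]{5,50}$ ]]; then
  echo "valid registry name: $ACR_NAME"
else
  echo "invalid registry name: $ACR_NAME" >&2
  exit 1
fi
```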
@@ -72,7 +72,7 @@ This tutorial uses a premium instance of Azure Container Registry to improve col
Next, pull the image from NVIDIA GPU Cloud and push to Azure Container Registry.
> [!NOTE]
-> NVIDIA NICs each has their own hardware requirements. Make sure the GPU type you select supports the [NIM](link) of your choice. The Llama3 NIM used in this tutorial can run on NVIDIA A100 GPUs.
+> Each NVIDIA NIM has its own hardware requirements. Make sure the GPU type you select supports the [NIM](https://build.nvidia.com/models?filters=nimType%3Anim_type_run_anywhere&q=llama) of your choice. The Llama3 NIM used in this tutorial can run on NVIDIA A100 GPUs.
1. Authenticate to the NVIDIA container registry.
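Authenticating to the NVIDIA container registry requires an NGC API key from your NVIDIA NGC account. A hedged sketch of this step (the `NGC_API_KEY` variable is a placeholder, not from the tutorial): the `nvcr.io` registry expects the literal username `$oauthtoken` with the API key as the password.

```shell
# Placeholder: substitute the API key generated from your NVIDIA NGC account
NGC_API_KEY="<your-ngc-api-key>"

# nvcr.io expects the literal username '$oauthtoken' (single quotes prevent
# shell expansion) with the API key as password; --password-stdin keeps the
# key out of the process list
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```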
@@ -110,7 +110,7 @@ Next, pull the image from NVIDIA GPU Cloud and push to Azure Container Registry.
## Enable artifact streaming (recommended but optional)
-Many of the NIM images are large, and your container app can take a long time to start if you don't enable artifact streaming. Use the following steps to enable artifact streaming.
+When your container app runs, it pulls the container image from your container registry. With larger images, as is common for AI workloads, this image pull can take some time. Enabling artifact streaming reduces the time needed for the image pull, so your container app starts faster. Use the following steps to enable artifact streaming.
> [!NOTE]
> The following commands can take a few minutes to complete.
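The steps that follow can be sketched with the Azure CLI's `az acr artifact-streaming` command group. This is a preview feature, so command names and flags may differ by CLI version; the registry and repository names below are placeholders, and this is a hedged sketch rather than the tutorial's exact commands.

```shell
# Placeholders: substitute your registry name and the NIM repository you pushed
ACR_NAME="<your-registry-name>"
REPOSITORY="<your-nim-repository>"

# Enable artifact streaming for images pushed to this repository in the future
az acr artifact-streaming update \
  --name "$ACR_NAME" \
  --repository "$REPOSITORY" \
  --enable-streaming true

# Generate a streaming artifact for an image already in the registry
az acr artifact-streaming create \
  --name "$ACR_NAME" \
  --image "$REPOSITORY:latest"
```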