Commit 45c0b7e

authored
Update deployment instructions and add security details
1 parent 5dd71d6 commit 45c0b7e

1 file changed: +5 -6 lines changed

articles/ai-foundry/how-to/deploy-nvidia-inference-microservice.md

Lines changed: 5 additions & 6 deletions
@@ -16,7 +16,7 @@ ms.custom: devx-track-azurecli
 # How to deploy NVIDIA Inference Microservices

 In this article, you learn how to deploy NVIDIA Inference Microservices (NIMs) on Managed Compute in the model catalog on Foundry. NVIDIA inference microservices are containers built by NVIDIA for optimized pre-trained and customized AI models serving on NVIDIA GPUs.
-Get improved TCO (total cost of ownership) and performance with NVIDIA NIMs offered for one-click deployment on Foundry, with enterprise production-grade software under NVIDIA AI Enterprise license.
+Get increased throughput and reduced total cost ownership with NVIDIA NIMs offered for one-click deployment on Foundry, with enterprise production-grade software under NVIDIA AI Enterprise license.

 [!INCLUDE [models-preview](../includes/models-preview.md)]

@@ -65,12 +65,12 @@ Get improved TCO (total cost of ownership) and performance with NVIDIA NIMs offe
 4. Select the NVIDIA NIM of your choice. In this article, we are using **Llama-3.3-70B-Instruct-NIM-microservice** as an example.
 5. Select **Deploy**.
 6. Select one of the NVIDIA GPU based VM SKUs supported for the NIM, based on your intended workload. You need to have quota in your Azure subscription.
-7. You can then customize your deployment configuration for the instance count, select an existing endpoint or create a new one, etc. For the example in this article, we consider an instance count of **2** and create a new endpoint.
+7. You can then customize your deployment configuration for the instance count, select an existing endpoint or create a new one, etc. For the example in this article, we consider an instance count of **1** and create a new endpoint.

 :::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png" alt-text="A screenshot showing project customization options in the deployment wizard." lightbox="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png":::

 8. Select **Next**
-9. Then, review the pricing breakdown for the NIM deployment, terms of use and license agreement associated with the NIM offer. The pricing breakdown helps to inform what the aggregated pricing for the NIM software deployed would be, which is a function of the number of NVIDIA GPUs in the VM instance that was selected in the previous steps. In addition to the applicable NIM software price, Azure Compute charges also applies based on your deployment configuration.
+9. Then, review the pricing breakdown for the NIM deployment, terms of use and license agreement associated with the NIM offer. The pricing breakdown helps inform what the aggregated pricing for the NIM software deployed would be, which is a function of the number of NVIDIA GPUs in the VM instance that was selected in the previous steps. In addition to the applicable NIM software price, Azure Compute charges also applies based on your deployment configuration.

 :::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png" alt-text="A screenshot showing the necessary user payment agreement detailing how the user is charged for deploying the models." lightbox="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png":::

@@ -84,14 +84,13 @@ NVIDIA NIMs on Foundry expose an OpenAI compatible API, learn more about the pay

 ## Security scanning for NIMs by NVIDIA

+NVIDIA ensures the security and reliability of NVIDIA NIM container images through best-in-class vulnerability scanning, rigorous patch management, and transparent processes. Learn the details [here](https://docs.nvidia.com/ai-enterprise/planning-resource/security-for-azure-ai-foundry/latest/introduction.html). Microsoft works with NVIDIA to get the latest patches of the NIMs to deliver secure, stable, and reliable production-grade software within AI Foundry.
+Users can refer to the last updated time for the NIM in the model overview page, and you can redeploy to get the latest version of NIM from NVIDIA on Foundry.

 Redeploy to get the latest version of NIM from NVIDIA on Foundry.

 ## Network Isolation support for NIMs

-NVIDIA ensures the security and reliability of NVIDIA NIM container images through best-in-class vulnerability scanning, rigorous patch management, and transparent processes. Learn the details [here](https://docs.nvidia.com/ai-enterprise/planning-resource/security-for-azure-ai-foundry/latest/introduction.html). Microsoft works with NVIDIA to get the latest patches of the NIMs to deliver secure, stable, and reliable production-grade software within AI Foundry.
-Users can refer to the last updated time for the NIM in the model overview page, and you can redeploy to get the latest version of NIM from NVIDIA on Foundry.
-
 While NIMs are in preview on Foundry, workspaces with Public Network Access disabled will have a limitation of being able to create only one successful deployment in the private workspace or project. Note, there can only be a single active deployment in a private workspace, attempts to create more active deployments will end in failure.

 ## Related content
