Skip to content

Commit e45c54d

Browse files
committed
Merge branch 'main' into release-2025-openai-march-latest
2 parents 727203c + 19faaef commit e45c54d

33 files changed

+228
-118
lines changed

articles/ai-foundry/.openpublishing.redirection.ai-studio.json

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1097,6 +1097,11 @@
10971097
"source_path_from_root": "/articles/ai-foundry/model-inference/reference/reference-model-inference-images-embeddings.md",
10981098
"redirect_url": "/rest/api/aifoundry/model-inference/get-image-embeddings/get-image-embeddings",
10991099
"redirect_document_id": false
1100-
}
1100+
},
1101+
{
1102+
"source_path_from_root": "/articles/ai-foundry/how-to/prompt-flow.md",
1103+
"redirect_url": "/azure/ai-foundry/concepts/prompt-flow",
1104+
"redirect_document_id": true
1105+
}
11011106
]
11021107
}

articles/ai-foundry/how-to/prompt-flow.md renamed to articles/ai-foundry/concepts/prompt-flow.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ ms.custom:
99
- build-2024
1010
- ignite-2024
1111
ms.topic: conceptual
12-
ms.date: 11/19/2024
12+
ms.date: 03/18/2025
1313
ms.reviewer: none
1414
ms.author: lagayhar
1515
author: lgayhardt
@@ -108,5 +108,5 @@ If the prompt flow tools in Azure AI Foundry portal don't meet your requirements
108108

109109
## Next steps
110110

111-
- [Build with prompt flow in Azure AI Foundry portal](flow-develop.md)
111+
- [Build with prompt flow in Azure AI Foundry portal](../how-to/flow-develop.md)
112112
- [Get started with prompt flow in VS Code](https://microsoft.github.io/promptflow/how-to-guides/quick-start.html)
Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,100 @@
1+
---
2+
title: How to deploy NVIDIA Inference Microservices
3+
titleSuffix: Azure AI Foundry
4+
description: Learn to deploy NVIDIA Inference Microservices, using Azure AI Foundry.
5+
manager: scottpolly
6+
ms.service: azure-ai-foundry
7+
ms.topic: how-to
8+
ms.date: 03/14/2024
9+
ms.author: ssalgado
10+
author: ssalgadodev
11+
ms.reviewer: tinaem
12+
reviewer: tinaem
13+
ms.custom: devx-track-azurecli
14+
---
15+
16+
# How to deploy NVIDIA Inference Microservices
17+
18+
In this article, you learn how to deploy NVIDIA Inference Microservices (NIMs) on Managed Compute in the model catalog on Foundry​. NVIDIA inference microservices are containers built by NVIDIA for optimized pre-trained and customized AI models serving on NVIDIA GPUs​.
19+
Get improved TCO (total cost of ownership) and performance with NVIDIA NIMs offered for one-click deployment on Foundry, with enterprise production-grade software under NVIDIA AI Enterprise license.
20+
21+
[!INCLUDE [models-preview](../includes/models-preview.md)]
22+
23+
## Prerequisites
24+
25+
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
26+
27+
- An [Azure AI Foundry hub](create-azure-ai-resource.md).
28+
29+
- An [Azure AI Foundry project](create-projects.md).
30+
31+
- Ensure Marketplace purchases are enabled for your Azure subscription. Learn more about it [here](/azure/cost-management-billing/manage/enable-marketplace-purchases).
32+
33+
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned a _custom role_ with the following permissions. User accounts assigned the _Owner_ or _Contributor_ role for the Azure subscription can also create NIM deployments. For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-ai-foundry.md).
34+
35+
- On the Azure subscription—**to subscribe the workspace to the Azure Marketplace offering**, once for each workspace/project:
36+
- Microsoft.MarketplaceOrdering/agreements/offers/plans/read
37+
- Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action
38+
- Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read
39+
- Microsoft.Marketplace/offerTypes/publishers/offers/plans/agreements/read
40+
- Microsoft.SaaS/register/action
41+
42+
- On the resource group—**to create and use the SaaS resource**:
43+
- Microsoft.SaaS/resources/read
44+
- Microsoft.SaaS/resources/write
45+
46+
- On the workspace—**to deploy endpoints**:
47+
- Microsoft.MachineLearningServices/workspaces/marketplaceModelSubscriptions/*
48+
- Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*
49+
50+
51+
## NVIDIA NIM PayGo offer on Azure Marketplace by NVIDIA
52+
53+
NVIDIA NIMs available on Azure AI Foundry model catalog can be deployed with a subscription to the [NVIDIA NIM SaaS offer](https://aka.ms/nvidia-nims-plan) on Azure Marketplace. This offer includes a 90-day trial that applies to all NIMs associated with a particular SaaS subscription scoped to an Azure AI Foundry project, and has a PayGo price of $1 per GPU hour post the trial period.
54+
55+
Azure AI Foundry enables a seamless purchase flow of the NVIDIA NIM offering on Marketplace from NVIDIA collection in the model catalog, and further deployment on managed compute.
56+
57+
## Deploy NVIDIA Inference Microservices on Managed Compute
58+
59+
1. Sign in to [Azure AI Foundry](https://ai.azure.com) and go to the **Home** page.
60+
2. Select **Model catalog** from the left sidebar.
61+
3. In the filters section, select **Collections** and select **NVIDIA**.
62+
63+
:::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/nvidia-collections.png" alt-text="A screenshot showing the Nvidia inference microservices available in the model catalog." lightbox="../media/how-to/deploy-nvidia-inference-microservice/nvidia-collections.png":::
64+
65+
4. Select the NVIDIA NIM of your choice. In this article, we are using **Llama-3.3-70B-Instruct-NIM-microservice** as an example.
66+
5. Select **Deploy**.
67+
6. Select one of the NVIDIA GPU based VM SKUs supported for the NIM, based on your intended workload. You need to have quota in your Azure subscription.
68+
7. You can then customize your deployment configuration for the instance count, select an existing endpoint or create a new one, etc. For the example in this article, we consider an instance count of **2** and create a new endpoint.
69+
70+
:::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png" alt-text="A screenshot showing project customization options in the deployment wizard." lightbox="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png":::
71+
72+
8. Select **Next**
73+
9. Then, review the pricing breakdown for the NIM deployment, terms of use and license agreement associated with the NIM offer. The pricing breakdown helps to inform what the aggregated pricing for the NIM software deployed would be, which is a function of the number of NVIDIA GPUs in the VM instance that was selected in the previous steps. In addition to the applicable NIM software price, Azure Compute charges also applies based on your deployment configuration.
74+
75+
:::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png" alt-text="A screenshot showing the necessary user payment agreement detailing how the user is charged for deploying the models." lightbox="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png":::
76+
77+
10. Select the checkbox to acknowledge understanding of pricing and terms of use, and then, select **Deploy**.
78+
79+
## Consume NVIDIA NIM deployments
80+
81+
After your deployment is successfully created, you can go to **Models + Endpoints** under My assets in your Azure AI Foundry project, select your deployment under "Model deployments" and navigate to the Test tab for sample inference to the endpoint. You can also go to the Chat Playground by selecting **Open in Playground** in Deployment Details tab, to be able to modify parameters for the inference requests.
82+
83+
NVIDIA NIMs on Foundry expose an OpenAI compatible API, learn more about the payload supported [here](https://docs.nvidia.com/nim/large-language-models/latest/api-reference.html#). The 'model' parameter for NIMs on Foundry is set to a default value within the container, and is not required to pass through in the payload to your online endpoint. The **Consume** tab of the NIM deployment on Foundry includes code samples for inference with the target URL of your deployment. You can also consume NIM deployments using the Azure AI Model Inference SDK.
84+
85+
## Security scanning for NIMs by NVIDIA
86+
87+
88+
Redeploy to get the latest version of NIM from NVIDIA on Foundry.
89+
90+
## Network Isolation support for NIMs
91+
92+
NVIDIA ensures the security and reliability of NVIDIA NIM container images through best-in-class vulnerability scanning, rigorous patch management, and transparent processes. Learn the details [here](https://docs.nvidia.com/ai-enterprise/planning-resource/security-for-azure-ai-foundry/latest/introduction.html). Microsoft works with NVIDIA to get the latest patches of the NIMs to deliver secure, stable, and reliable production-grade software within AI Foundry.
93+
Users can refer to the last updated time for the NIM in the model overview page, and you can redeploy to get the latest version of NIM from NVIDIA on Foundry.
94+
95+
While NIMs are in preview on Foundry, workspaces with Public Network Access disabled will have a limitation of being able to create only one successful deployment in the private workspace or project. Note, there can only be a single active deployment in a private workspace, attempts to create more active deployments will end in failure.
96+
97+
## Related content
98+
99+
* Learn more about the [Model Catalog](./model-catalog-overview.md)
100+
* Learn more about [built-in policies for deployment](./built-in-policy-model-deployment.md)
557 KB
Loading
444 KB
Loading
623 KB
Loading

articles/ai-foundry/toc.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,6 +110,8 @@ items:
110110
href: how-to/healthcare-ai/deploy-cxrreportgen.md
111111
- name: MedImageParse - prompted segmentation model
112112
href: how-to/healthcare-ai/deploy-medimageparse.md
113+
- name: Nvidia Inference Microservices (NIM)
114+
href: how-to/deploy-nvidia-inference-microservice.md
113115
- name: Gretel Navigator model
114116
href: how-to/deploy-models-gretel-navigator.md
115117
- name: Mistral-7B and Mixtral models

articles/ai-services/computer-vision/identity-api-reference.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ feedback_help_link_url: https://learn.microsoft.com/answers/tags/156/azure-face
1919
Azure AI Face is a cloud-based service that provides algorithms for face detection and recognition. The Face APIs comprise the following categories:
2020

2121
- Face Algorithm APIs: Cover core functions such as [Detection](/rest/api/face/face-detection-operations/detect), [Find Similar](/rest/api/face/face-recognition-operations/find-similar-from-large-face-list), [Verification](/rest/api/face/face-recognition-operations/verify-face-to-face), [Identification](/rest/api/face/face-recognition-operations/identify-from-large-person-group), and [Group](/rest/api/face/face-recognition-operations/group).
22-
- [DetectLiveness session APIs](/rest/api/face/liveness-session-operations): Used to create and manage a Liveness Detection session. See the [Liveness Detection](/azure/ai-services/computer-vision/tutorials/liveness) tutorial.
22+
- [DetectLiveness session APIs](/rest/api/face/liveness-session-operations?view=rest-face-v1.2): Used to create and manage a Liveness Detection session. See the [Liveness Detection](/azure/ai-services/computer-vision/tutorials/liveness) tutorial.
2323
- [FaceList APIs](/rest/api/face/face-list-operations): Used to manage a FaceList for [Find Similar From Face List](/rest/api/face/face-recognition-operations/find-similar-from-face-list).
2424
- [LargeFaceList APIs](/rest/api/face/face-list-operations): Used to manage a LargeFaceList for [Find Similar From Large Face List](/rest/api/face/face-recognition-operations/find-similar-from-large-face-list).
2525
- [PersonGroup APIs](/rest/api/face/person-group-operations): Used to manage a PersonGroup dataset for [Identification From Person Group](/rest/api/face/face-recognition-operations/identify-from-person-group).

articles/ai-services/computer-vision/toc.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -295,6 +295,8 @@ items:
295295
href: concept-face-recognition.md
296296
- name: Face recognition data structures
297297
href: concept-face-recognition-data-structures.md
298+
- name: Face liveness detection
299+
href: concept-face-liveness-detection.md
298300
- name: Face liveness abuse monitoring
299301
href: concept-liveness-abuse-monitoring.md
300302

articles/ai-services/content-understanding/audio/overview.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ ms.author: lajanuar
77
manager: nitinme
88
ms.service: azure-ai-content-understanding
99
ms.topic: overview
10-
ms.date: 01/14/2025
10+
ms.date: 03/18/2025
1111
ms.custom: ignite-2024-understanding-release
1212
---
1313

@@ -82,7 +82,7 @@ Developers using Content Understanding should review Microsoft's policies on cus
8282

8383
## Next steps
8484

85-
* Try processing your audio content using Content Understanding in [**Azure AI Foundry portal**](https://ai.azure.com/).
85+
* Try processing your audio content using Content Understanding in [**Azure AI Foundry portal**](https://aka.ms/cu-landing).
8686
* Learn how to analyze audio content [**analyzer templates**](../quickstart/use-ai-foundry.md).
8787
* Review code sample: [**audio content extraction**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/blob/main/notebooks/content_extraction.ipynb).
8888
* Review code sample: [**analyzer templates**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/tree/main/analyzer_templates).

0 commit comments

Comments
 (0)