
Commit 3341d93

Merge pull request #3671 from MicrosoftDocs/main
3/21/2025 PM Publish
2 parents a353298 + 269a6da commit 3341d93

26 files changed: +113 −2294 lines

articles/ai-foundry/how-to/data-add.md

Lines changed: 2 additions & 0 deletions
@@ -38,6 +38,8 @@ To create and work with data, you need:
 
 ## Create data
 
+You are charged for the storage used by your data. To help estimate the cost, you can use the [Azure Pricing Calculator](https://azure.microsoft.com/pricing/calculator/). The data is stored in a container called `workspaceblobstore` in your project's Azure Storage account.
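If you want a quick, scriptable estimate of that storage footprint to feed into the pricing calculator, one option is to total the blob sizes in the container. A minimal sketch using the `azure-storage-blob` package, assuming a hypothetical storage account name and that your identity has blob read access:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

# Hypothetical storage account; use the one attached to your project.
container = ContainerClient(
    account_url="https://<your-storage-account>.blob.core.windows.net",
    container_name="workspaceblobstore",
    credential=DefaultAzureCredential(),
)

# Sum blob sizes to estimate the billable storage used by your data.
total_bytes = sum(blob.size for blob in container.list_blobs())
print(f"workspaceblobstore holds ~{total_bytes / 1024**3:.2f} GiB")
```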
+
 When you create your data, you need to set the data type. Azure AI Foundry supports these data types:
 
 |Type |**Canonical Scenarios**|

articles/ai-foundry/how-to/deploy-models-serverless.md

Lines changed: 2 additions & 2 deletions
@@ -31,9 +31,9 @@ This article uses a Meta Llama model deployment for illustration. However, you c
 
 - An [Azure AI Foundry project](create-projects.md).
 
-- You have to disable the feature **Deploy models to Azure AI model inference service**. When this feature is on, serverless API endpoints are not available for deployment when using the Azure AI Foundry portal.
+- Ensure that the **Deploy models to Azure AI model inference service** feature is turned off in the Azure AI Foundry portal. When this feature is on, serverless API endpoints aren't available for deployment in the portal.
 
-:::image type="content" source="../model-inference/media/quickstart-ai-project/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn on the Deploy models to Azure AI model inference service feature in Azure AI Foundry portal." lightbox="../model-inference/media/quickstart-ai-project/ai-project-inference-endpoint.gif":::
+:::image type="content" source="../media/deploy-models-serverless/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn off the Deploy models to Azure AI model inference service feature in Azure AI Foundry portal." lightbox="../media/deploy-models-serverless/ai-project-inference-endpoint.gif":::
 
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-ai-foundry.md).

articles/ai-foundry/how-to/deploy-nvidia-inference-microservice.md

Lines changed: 39 additions & 20 deletions
@@ -15,20 +15,19 @@ ms.custom: devx-track-azurecli
 
 # How to deploy NVIDIA Inference Microservices
 
-In this article, you learn how to deploy NVIDIA Inference Microservices (NIMs) on Managed Compute in the model catalog on Foundry. NVIDIA inference microservices are containers built by NVIDIA for optimized pre-trained and customized AI models serving on NVIDIA GPUs.
-Get increased throughput and reduced total cost ownership with NVIDIA NIMs offered for one-click deployment on Foundry, with enterprise production-grade software under NVIDIA AI Enterprise license.
+In this article, you learn how to deploy NVIDIA Inference Microservices (NIMs) on Managed Compute in the model catalog on Foundry.
+
+NVIDIA inference microservices are containers built by NVIDIA for serving optimized pretrained and customized AI models on NVIDIA GPUs. Get increased throughput and reduced total cost of ownership with NVIDIA NIMs, offered for managed compute deployment on Foundry with enterprise production-grade software under the NVIDIA AI Enterprise license.
 
 [!INCLUDE [models-preview](../includes/models-preview.md)]
 
 ## Prerequisites
 
 - An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
 
-- An [Azure AI Foundry hub](create-azure-ai-resource.md).
-
 - An [Azure AI Foundry project](create-projects.md).
 
-- Ensure Marketplace purchases are enabled for your Azure subscription. Learn more about it [here](/azure/cost-management-billing/manage/enable-marketplace-purchases).
+- Marketplace purchases enabled for your Azure subscription. Learn more about [enabling Marketplace purchases](/azure/cost-management-billing/manage/enable-marketplace-purchases).
 
 - Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned a _custom role_ with the following permissions. User accounts assigned the _Owner_ or _Contributor_ role for the Azure subscription can also create NIM deployments. For more information on permissions, see [Role-based access control in Azure AI Foundry portal](../concepts/rbac-ai-foundry.md).

@@ -48,11 +47,11 @@ Get increased throughput and reduced total cost ownership with NVIDIA NIMs offer
 - Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*
 
 
-## NVIDIA NIM PayGo offer on Azure Marketplace by NVIDIA
+## NVIDIA NIM pay-as-you-go offer on Azure Marketplace by NVIDIA
 
-NVIDIA NIMs available on Azure AI Foundry model catalog can be deployed with a subscription to the [NVIDIA NIM SaaS offer](https://aka.ms/nvidia-nims-plan) on Azure Marketplace. This offer includes a 90-day trial that applies to all NIMs associated with a particular SaaS subscription scoped to an Azure AI Foundry project, and has a PayGo price of $1 per GPU hour post the trial period.
+NVIDIA NIMs available in the Azure AI Foundry model catalog can be deployed with a pay-as-you-go subscription to the [NVIDIA NIM SaaS offer](https://aka.ms/nvidia-nims-plan) on Azure Marketplace. This offer includes a 90-day trial and a pay-as-you-go price of $1 per GPU hour after the trial period. The trial applies to all NIMs associated with a particular SaaS subscription, and starts from the time the SaaS subscription was created. SaaS subscriptions are scoped to an Azure AI Foundry project, so you have to subscribe to the NIM offer only once within a project; you can then deploy all NIMs offered by NVIDIA in the AI Foundry model catalog. If you want to deploy NIMs in a different project with no existing SaaS subscription, you have to resubscribe to the offer.
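To make that rate concrete with a hypothetical configuration: a NIM running on a VM size with 8 NVIDIA GPUs accrues 8 × $1 = $8 per hour in NIM software charges once the 90-day trial ends, in addition to the Azure compute charges for the VM itself.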
 
-Azure AI Foundry enables a seamless purchase flow of the NVIDIA NIM offering on Marketplace from NVIDIA collection in the model catalog, and further deployment on managed compute.
+Azure AI Foundry enables a seamless purchase experience of the NVIDIA NIM offering on Marketplace, from the NVIDIA collection in the model catalog, and further deployment on managed compute.
 
 ## Deploy NVIDIA Inference Microservices on Managed Compute

@@ -64,34 +63,54 @@ Get increased throughput and reduced total cost ownership with NVIDIA NIMs offer
 
 4. Select the NVIDIA NIM of your choice. In this article, we are using **Llama-3.3-70B-Instruct-NIM-microservice** as an example.
 5. Select **Deploy**.
-6. Select one of the NVIDIA GPU based VM SKUs supported for the NIM, based on your intended workload. You need to have quota in your Azure subscription.
-7. You can then customize your deployment configuration for the instance count, select an existing endpoint or create a new one, etc. For the example in this article, we consider an instance count of **1** and create a new endpoint.
+6. Select one of the NVIDIA GPU-accelerated Azure Machine Learning VM SKUs supported for the NIM, based on your intended workload. You need to have quota in your Azure subscription.
+7. You can then customize your deployment configuration for the instance count, and select an existing endpoint or create a new one. For the example in this article, we use an instance count of **1** and create a new endpoint.
 
 :::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png" alt-text="A screenshot showing project customization options in the deployment wizard." lightbox="../media/how-to/deploy-nvidia-inference-microservice/project-customization.png":::
 
 8. Select **Next**
-9. Then, review the pricing breakdown for the NIM deployment, terms of use and license agreement associated with the NIM offer. The pricing breakdown helps inform what the aggregated pricing for the NIM software deployed would be, which is a function of the number of NVIDIA GPUs in the VM instance that was selected in the previous steps. In addition to the applicable NIM software price, Azure Compute charges also applies based on your deployment configuration.
+9. Review the pricing breakdown for the NIM deployment, and the terms of use and license agreement associated with the NIM offer. The pricing breakdown indicates the aggregated price of the deployed NIM software, which is a function of the number of NVIDIA GPUs in the VM instance selected in the previous steps. In addition to the applicable NIM software price, Azure Compute charges also apply based on your deployment configuration.
 
 :::image type="content" source="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png" alt-text="A screenshot showing the necessary user payment agreement detailing how the user is charged for deploying the models." lightbox="../media/how-to/deploy-nvidia-inference-microservice/payment-description.png":::
 
-10. Select the checkbox to acknowledge understanding of pricing and terms of use, and then, select **Deploy**.
+10. Select the checkbox to acknowledge that you understand the pricing and terms of use, and then select **Deploy**.
 
 ## Consume NVIDIA NIM deployments
 
-After your deployment is successfully created, you can go to **Models + Endpoints** under My assets in your Azure AI Foundry project, select your deployment under "Model deployments" and navigate to the Test tab for sample inference to the endpoint. You can also go to the Chat Playground by selecting **Open in Playground** in Deployment Details tab, to be able to modify parameters for the inference requests.
+After your deployment is successfully created, you can go to **Models + Endpoints** under _My assets_ in your Azure AI Foundry project, select your deployment under **Model deployments**, and navigate to the **Test** tab for sample inference against the endpoint. You can also go to the chat playground by selecting **Open in Playground** on the **Deployment Details** tab, to modify parameters for the inference requests.
+
+NVIDIA NIMs on Foundry expose an OpenAI-compatible API; learn more about the supported payload in the [NIM API reference](https://docs.nvidia.com/nim/large-language-models/latest/api-reference.html#). The `model` parameter for NIMs on Foundry is set to a default value within the container, and isn't required in the request payload to your online endpoint. The **Consume** tab of the NIM deployment on Foundry includes code samples for inference with the target URL of your deployment.
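Because the endpoint is OpenAI-compatible, a standard OpenAI client pointed at the deployment URL should work. A minimal sketch, assuming hypothetical endpoint-name and key placeholders copied from the deployment's **Consume** tab (the `model` value here is a client-library requirement; the container applies its own default):

```python
from openai import OpenAI  # pip install openai

# Hypothetical values; copy the real target URL and key from the Consume tab.
client = OpenAI(
    base_url="https://<endpoint-name>.<region>.inference.ml.azure.com/v1",
    api_key="<endpoint-key>",
)

# The NIM container applies its default model; this value matches the
# example NIM used in this article.
response = client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Summarize what a NIM is in one sentence."}],
)
print(response.choices[0].message.content)
```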
+
+You can also consume NIM deployments using the [Azure AI Model Inference SDK](/python/api/overview/azure/ai-inference-readme), with limitations such as no support for [creating and authenticating clients using `load_client`](/python/api/overview/azure/ai-inference-readme#create-and-authenticate-clients-using-load_client) or for calling the client method `get_model_info` to [retrieve model information](/python/api/overview/azure/ai-inference-readme#get-ai-model-information).
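A minimal sketch of that SDK path, constructing the client directly (since `load_client` isn't supported for NIM endpoints) and assuming the same hypothetical endpoint and key:

```python
from azure.ai.inference import ChatCompletionsClient  # pip install azure-ai-inference
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Hypothetical values; copy the real target URL and key from the Consume tab.
client = ChatCompletionsClient(
    endpoint="https://<endpoint-name>.<region>.inference.ml.azure.com/v1",
    credential=AzureKeyCredential("<endpoint-key>"),
)

# get_model_info() isn't supported for NIM endpoints, so go straight to inference.
response = client.complete(messages=[UserMessage(content="Hello!")])
print(response.choices[0].message.content)
```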
+
+### Develop and run agents with NIM endpoints
+
+The following NVIDIA NIMs of the **chat completions** task type in the model catalog can be used to [create and run agents using Agent Service](/python/api/overview/azure/ai-projects-readme#agents-preview) with various supported tools, subject to two additional requirements (see the sketch after the table):
+
+1. Create a _Serverless Connection_ to the project using the NIM endpoint and key. The target URL for the NIM endpoint in the connection should be `https://<endpoint-name>.region.inference.ml.azure.com/v1/`.
+2. Set the _model parameter_ in the request body to be of the form `https://<endpoint>.region.inference.ml.azure.com/v1/@<parameter value per table below>` when creating and running agents.
+
+NVIDIA NIM | `model` parameter value
+--|--
+Llama-3.3-70B-Instruct-NIM-microservice | meta/llama-3.3-70b-instruct
+Llama-3.1-8B-Instruct-NIM-microservice | meta/llama-3.1-8b-instruct
+Mistral-7B-Instruct-v0.3-NIM-microservice | mistralai/mistral-7b-instruct-v0.3
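A minimal sketch of requirement 2 in code, using the preview `azure-ai-projects` package; the connection string, endpoint name, and region are hypothetical placeholders, and the serverless connection from requirement 1 is assumed to already exist in the project:

```python
from azure.ai.projects import AIProjectClient  # pip install azure-ai-projects (preview)
from azure.identity import DefaultAzureCredential

# Hypothetical project connection string from the AI Foundry project overview page.
project = AIProjectClient.from_connection_string(
    conn_str="<region>.api.azureml.ms;<subscription-id>;<resource-group>;<project-name>",
    credential=DefaultAzureCredential(),
)

# model = "<NIM endpoint target URL>@<value from the table above>".
agent = project.agents.create_agent(
    model="https://<endpoint-name>.<region>.inference.ml.azure.com/v1/@meta/llama-3.3-70b-instruct",
    name="nim-agent",
    instructions="You are a helpful assistant.",
)
print(f"Created agent {agent.id}")
```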
+
+## Security scanning
 
-NVIDIA NIMs on Foundry expose an OpenAI compatible API, learn more about the payload supported [here](https://docs.nvidia.com/nim/large-language-models/latest/api-reference.html#). The 'model' parameter for NIMs on Foundry is set to a default value within the container, and is not required to pass through in the payload to your online endpoint. The **Consume** tab of the NIM deployment on Foundry includes code samples for inference with the target URL of your deployment. You can also consume NIM deployments using the Azure AI Model Inference SDK.
+NVIDIA ensures the security and reliability of NVIDIA NIM container images through best-in-class vulnerability scanning, rigorous patch management, and transparent processes. Learn more in the [NVIDIA security documentation for Azure AI Foundry](https://docs.nvidia.com/ai-enterprise/planning-resource/security-for-azure-ai-foundry/latest/introduction.html). Microsoft works with NVIDIA to apply the latest patches to the NIMs, to deliver secure, stable, and reliable production-grade software within AI Foundry.
 
-## Security scanning for NIMs by NVIDIA
+Users can refer to the last updated time for the NIM on the right pane of the model overview page. Redeploy to consume the latest version of the NIM from NVIDIA on AI Foundry.
 
-NVIDIA ensures the security and reliability of NVIDIA NIM container images through best-in-class vulnerability scanning, rigorous patch management, and transparent processes. Learn the details [here](https://docs.nvidia.com/ai-enterprise/planning-resource/security-for-azure-ai-foundry/latest/introduction.html). Microsoft works with NVIDIA to get the latest patches of the NIMs to deliver secure, stable, and reliable production-grade software within AI Foundry.
-Users can refer to the last updated time for the NIM in the model overview page, and you can redeploy to get the latest version of NIM from NVIDIA on Foundry.
+## Network Isolation
 
-Redeploy to get the latest version of NIM from NVIDIA on Foundry.
+Collections in the model catalog can be deployed within your isolated networks using a workspace managed virtual network. For more information on how to configure your workspace managed networks, see [Configure a managed virtual network to allow internet outbound](/azure/machine-learning/how-to-managed-network#configure-a-managed-virtual-network-to-allow-internet-outbound).
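For illustration of the managed network setup the linked article describes, a sketch with the `azure-ai-ml` SDK that creates a workspace with internet-outbound isolation; the names and IDs are hypothetical placeholders, not values from this commit:

```python
from azure.ai.ml import MLClient  # pip install azure-ai-ml
from azure.ai.ml.entities import ManagedNetwork, Workspace
from azure.identity import DefaultAzureCredential

# Hypothetical subscription and resource group; use your own values.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
)

# Managed virtual network with internet outbound allowed, per the linked guidance.
workspace = Workspace(
    name="my-isolated-workspace",
    location="eastus",
    managed_network=ManagedNetwork(isolation_mode="allow_internet_outbound"),
)
ml_client.workspaces.begin_create(workspace).result()
```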
 
-## Network Isolation support for NIMs
+### Limitation
 
-While NIMs are in preview on Foundry, workspaces with Public Network Access disabled will have a limitation of being able to create only one successful deployment in the private workspace or project. Note, there can only be a single active deployment in a private workspace, attempts to create more active deployments will end in failure.
+While NIMs are in preview on Foundry, projects with ingress Public Network Access disabled support the creation of only one successful deployment. There can be only a single active deployment in a private workspace; attempts to create more active deployments result in deployment creation failures. This limitation doesn't exist once NIMs are generally available on AI Foundry.
 
 ## Related content


articles/ai-foundry/model-inference/how-to/quickstart-ai-project.md

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ To configure the project to use the Azure AI model inference capability in Azure
 
 2. At the top navigation bar, over the right corner, select the **Preview features** icon. A contextual blade shows up at the right of the screen.
 
-3. Turn the feature **Deploy models to Azure AI model inference service** on.
+3. Turn on the **Deploy models to Azure AI model inference service** feature.
 
 :::image type="content" source="../media/quickstart-ai-project/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn on the Deploy models to Azure AI model inference service feature in Azure AI Foundry portal." lightbox="../media/quickstart-ai-project/ai-project-inference-endpoint.gif":::

articles/ai-foundry/model-inference/includes/configure-project-connection/portal.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ zone_pivot_groups: azure-ai-models-deployment
 
 * An AI project resource.
 
-* The feature **Deploy models to Azure AI model inference service** on.
+* The **Deploy models to Azure AI model inference service** feature is turned on.
 
 :::image type="content" source="../../media/quickstart-ai-project/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn on the Deploy models to Azure AI model inference service feature in Azure AI Foundry portal." lightbox="../../media/quickstart-ai-project/ai-project-inference-endpoint.gif":::
articles/ai-foundry/model-inference/includes/create-resources/portal.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ To create a project with an Azure AI Services account, follow these steps:
 
 10. Azure AI model inference is a Preview feature that needs to be turned on in Azure AI Foundry. At the top navigation bar, over the right corner, select the **Preview features** icon. A contextual blade shows up at the right of the screen.
 
-11. Turn the feature **Deploy models to Azure AI model inference service** on.
+11. Turn on the **Deploy models to Azure AI model inference service** feature.
 
 :::image type="content" source="../../media/quickstart-ai-project/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn on the Azure AI model inference service deploy models feature in Azure AI Foundry portal." lightbox="../../media/quickstart-ai-project/ai-project-inference-endpoint.gif":::

articles/ai-foundry/model-inference/tutorials/get-started-deepseek-r1.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ To create an Azure AI project that supports model inference for DeepSeek-R1, fol
 
 10. Azure AI model inference is a Preview feature that needs to be turned on in Azure AI Foundry. At the top navigation bar, over the right corner, select the **Preview features** icon. A contextual blade shows up at the right of the screen.
 
-11. Turn the feature **Deploy models to Azure AI model inference service** on.
+11. Turn on the **Deploy models to Azure AI model inference service** feature.
 
 :::image type="content" source="../media/quickstart-ai-project/ai-project-inference-endpoint.gif" alt-text="An animation showing how to turn on the Azure AI model inference service deploy models feature in Azure AI Foundry portal." lightbox="../media/quickstart-ai-project/ai-project-inference-endpoint.gif":::

articles/ai-services/agents/concepts/model-region-support.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ author: aahill
 ms.author: aahi
 ms.service: azure-ai-agent-service
 ms.topic: conceptual
-ms.date: 03/05/2025
+ms.date: 03/21/2025
 ms.custom: azure-ai-agents
 ---
 
@@ -24,7 +24,7 @@ Azure OpenAI provides customers with choices on the hosting structure that fits
 
 All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see our [deployment types guide](../../openai/how-to/deployment-types.md).
 
-Azure AI Agent Service supports the same models as the chat completions API in Azure OpenAI, in the following regions.
+Azure AI Agent Service supports the following Azure OpenAI models in the listed regions.
 
 > [!NOTE]
 > The following table is for pay-as-you-go. For information on Provisioned Throughput Unit (PTU) availability, see [provisioned throughput](../../openai/concepts/provisioned-throughput.md) in the Azure OpenAI documentation. `GlobalStandard` customers also have access to [global standard models](../../openai/concepts/models.md#global-standard-model-availability).

articles/ai-services/openai/how-to/create-resource.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ description: Learn how to get started with Azure OpenAI Service and create your
 #services: cognitive-services
 manager: nitinme
 ms.service: azure-ai-openai
-ms.custom: devx-track-azurecli, build-2023, build-2023-dataai, devx-track-azurepowershell
+ms.custom: devx-track-azurecli, build-2023, build-2023-dataai, devx-track-azurepowershell, innovation-engine
 ms.topic: how-to
 ms.date: 01/31/2025
 zone_pivot_groups: openai-create-resource
