
Commit df50786

Merge branch 'main' into hailey-how-to-iv

2 parents 3c26025 + 53958ec

31 files changed: 365 additions, 106 deletions

articles/ai-foundry/concepts/model-lifecycle-retirement.md

Lines changed: 6 additions & 0 deletions
@@ -80,6 +80,12 @@ The following tables list the timelines for models that are on track for retirement
 | [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-english](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |
 | [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |

+#### DeepSeek
+
+| Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |
+|-------|-------------------|------------------------|-----------------------|-----------------------------|
+| [DeepSeek-V3](https://aka.ms/azureai/landing/DeepSeek-V3) | April 10, 2025 | May 31, 2025 | August 31, 2025 | [DeepSeek-V3-0324](https://aka.ms/azureai/landing/DeepSeek-V3-0324) |
+
 #### Meta

 | Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |

articles/ai-foundry/concepts/models-featured.md

Lines changed: 3 additions & 2 deletions
@@ -141,11 +141,12 @@ For more examples of how to use Jais models, see the following examples:

 ## DeepSeek

-DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks, and DeepSeek-V3, a Mixture-of-Experts (MoE) language model.
+The DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks; DeepSeek-V3-0324, a Mixture-of-Experts (MoE) language model; and more.

 | Model | Type | Capabilities |
 | ------ | ---- | --- |
-| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
 | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | [chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context) | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |

 For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../model-inference/tutorials/get-started-deepseek-r1.md?context=/azure/ai-foundry/context/context).
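
As a quick illustration of the chat-completion entries in the table above, here is a minimal sketch of calling one of these models. It assumes the `azure-ai-inference` Python package and a key-authenticated Azure AI model inference endpoint; the endpoint, key, and model name are placeholders, not values taken from this commit.

```python
# Minimal sketch: chat completion against one of the serverless models listed above.
# Assumptions: azure-ai-inference is installed; endpoint, key, and model name are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-resource>.services.ai.azure.com/models",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                      # placeholder
)

response = client.complete(
    model="DeepSeek-V3-0324",  # model name as listed in the table above
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what a Mixture-of-Experts model is in two sentences."),
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```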

articles/ai-foundry/includes/region-availability-maas.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/par

 | Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
 |---------|---------|---------|---------|
+DeepSeek-V3-0324 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
 DeepSeek-V3 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
 DeepSeek-R1 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@ DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks
 | Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [DeekSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
-| [DeekSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |

 For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).

articles/ai-foundry/model-inference/includes/create-model-deployments/cli.md

Lines changed: 4 additions & 6 deletions
@@ -48,15 +48,16 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 ```azurecli
 accountName="<ai-services-resource-name>"
 resourceGroupName="<resource-group>"
+location="eastus2"
 ```

 3. If you don't have an Azure AI Services account create yet, you can create one as follows:

 ```azurecli
-az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName
+az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName --location $location --kind AIServices --sku S0
 ```

-4. Let's see first which models are available to you and under which SKU. The following command list all the model definitions available:
+4. Let's first see which models are available to you and under which SKU. SKUs, also known as [deployment types](../../concepts/deployment-types.md), define how Azure infrastructure is used to process requests. Models may offer different deployment types. The following command lists all the model definitions available:

 ```azurecli
 az cognitiveservices account list-models \
@@ -77,10 +78,7 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 }
 ```

-6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. Capacity might also be needed depending on the type of deployment.
-
-> [!TIP]
-> Notice that not all the models are available in all the SKUs.
+6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. The property `format` indicates the provider offering the model. Capacity might also be needed depending on the type of deployment.

 7. Add the model deployment to the resource. The following example adds `Phi-3.5-vision-instruct`:
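
The CLI command for step 7 falls outside this hunk. For illustration only, here is a hedged Python sketch of the equivalent operation using the `azure-mgmt-cognitiveservices` SDK; the SDK choice, model format, version, SKU, and capacity are assumptions and placeholders, not values from this commit.

```python
# Illustrative sketch: the programmatic equivalent of step 7 (adding a model deployment
# to the AI Services resource). The format, version, SKU, and capacity below are
# placeholders -- read the real values from `az cognitiveservices account list-models`.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment,
    DeploymentModel,
    DeploymentProperties,
    Sku,
)

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",  # placeholder
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",        # placeholder
    account_name="<ai-services-resource-name>",    # placeholder
    deployment_name="Phi-3.5-vision-instruct",
    deployment=Deployment(
        properties=DeploymentProperties(
            model=DeploymentModel(
                format="Microsoft",          # placeholder: provider reported by list-models
                name="Phi-3.5-vision-instruct",
                version="2",                 # placeholder: version reported by list-models
            )
        ),
        sku=Sku(name="GlobalStandard", capacity=1),  # placeholder SKU and capacity
    ),
)
print(poller.result().id)
```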

articles/ai-foundry/model-inference/quotas-limits.md

Lines changed: 3 additions & 3 deletions
@@ -32,9 +32,9 @@ Azure uses quotas and limits to prevent budget overruns due to fraud, and to hon
 | -------------------- | ------------------- | ----------- |
 | Tokens per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
 | Requests per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
-| Tokens per minute | DeepSeek-R1 | 5,000,000 |
-| Requests per minute | DeepSeek-R1 | 5,000 |
-| Concurrent requests | DeepSeek-R1 | 300 |
+| Tokens per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000,000 |
+| Requests per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000 |
+| Concurrent requests | DeepSeek-R1<br />DeepSeek-V3-0324 | 300 |
 | Tokens per minute | Rest of models | 400,000 |
 | Requests per minute | Rest of models | 1,000 |
 | Concurrent requests | Rest of models | 300 |

articles/ai-services/agents/how-to/tools/fabric.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ You need to first build and publish a Fabric data agent and then connect your Fa

 * Developers and end users have at least `READ` access to the Fabric data agent and the underlying data sources it connects with.

+* Your Fabric Data Agent and Azure AI Agent need to be in the same tenant.
+
 ## Setup
 > [!NOTE]
 > * The model you selected in Azure AI Agent setup is only used for agent orchestration and response generation. It doesn't impact which model Fabric data agent uses for NL2SQL operation.

articles/ai-services/language-service/whats-new.md

Lines changed: 8 additions & 8 deletions
@@ -17,14 +17,14 @@ Azure AI Language is updated on an ongoing basis. To stay up-to-date with recent
 ## March 2025

 * Azure AI Language resource now can be deployed to 3 new regions, Jio India Central, UK West and Canada East, for the following capabilities:
-* Language detection
-* Sentiment analysis
-* Key phrase extraction
-* Named entity recognition (NER)
-* Personally identifiable information (PII) entity recognition
-* Entity linking
-* Text analytics for health
-* Extractive text summarization
+* Language detection
+* Sentiment analysis
+* Key phrase extraction
+* Named entity recognition (NER)
+* Personally identifiable information (PII) entity recognition
+* Entity linking
+* Text analytics for health
+* Extractive text summarization

 * Back-end infrastructure for the Named entity recognition (NER) and Text Personally identifiable information (PII) entity recognition models is now updated with extended context window limits.

articles/ai-services/openai/concepts/models.md

Lines changed: 18 additions & 3 deletions
@@ -4,11 +4,11 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 04/01/2025
+ms.date: 04/14/2025
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
-ms.author: mbullwin #chrhoder
+ms.author: mbullwin #chrhoder#
 recommendations: false
 ---

@@ -18,6 +18,7 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi

 | Models | Description |
 |--|--|
+| [GPT-4.1 series](#gpt-41-series) | Latest model release from Azure OpenAI |
 | [computer-use-preview](#computer-use-preview) | An experimental model trained for use with the Responses API computer use tool. |
 | [GPT-4.5 Preview](#gpt-45-preview) |The latest GPT model that excels at diverse text and image tasks. |
 | [o-series models](#o-series-models) |[Reasoning models](../how-to/reasoning.md) with advanced problem-solving and increased focus and capability. |
@@ -30,6 +31,20 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 | [Whisper](#whisper-models) | A series of models in preview that can transcribe and translate speech to text. |
 | [Text to speech](#text-to-speech-models-preview) (Preview) | A series of models in preview that can synthesize text to speech. |

+## GPT 4.1 series
+
+### Region Availability
+
+| Model | Region |
+|---|---|
+| `gpt-4.1` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard) |
+
+### Capabilities
+
+| Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
+| --- | :--- |:--- |:---|:---: |
+| `gpt-4.1` (2025-04-14) <br> <br> **Latest model from Azure OpenAI** | - Text & image input <br> - Text output <br> - Chat completions API <br> - Responses API <br> - Streaming <br> - Function calling <br> - Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
+
 ## computer-use-preview

 An experimental model trained for use with the [Responses API](../how-to/responses.md) computer use tool. It can be used in conjunction with 3rd-party libraries to allow the model to control mouse & keyboard input while getting context from screenshots of the current environment.
@@ -70,7 +85,7 @@ Once access has been granted, you will need to create a deployment for the model

 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | The **latest GPT model** that excels at diverse text and image tasks. <br>-Structured outputs <br>-Prompt caching <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 128,000 | 16,384 | Oct 2023 |
+| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | [GPT 4.1](#gpt-41-series) is the recommended replacement for this model. Excels at diverse text and image tasks. <br>-Structured outputs <br>-Prompt caching <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 128,000 | 16,384 | Oct 2023 |

 > [!NOTE]
 > It is expected behavior that the model cannot answer questions about itself. If you want to know when the knowledge cutoff for the model's training data is, or other details about the model you should refer to the model documentation above.
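
The new GPT-4.1 capabilities row above lists structured outputs via chat completions. A minimal sketch of that capability with the `openai` Python package and an `AzureOpenAI` client follows; the endpoint, key, API version, and deployment name are placeholders, not values from this commit.

```python
# Minimal sketch of the "structured outputs (chat completions)" capability listed above.
# Assumptions: openai Python package >= 1.40; endpoint, key, API version, and the
# deployment name "gpt-4.1" are placeholders.
from pydantic import BaseModel
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder
    api_key="<your-key>",                                        # placeholder
    api_version="2024-10-21",                                    # placeholder
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4.1",  # your deployment name
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)
print(completion.choices[0].message.parsed)
```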

articles/ai-services/openai/how-to/batch.md

Lines changed: 6 additions & 1 deletion
@@ -6,7 +6,7 @@ manager: nitinme
 ms.service: azure-ai-openai
 ms.custom: references_regions
 ms.topic: how-to
-ms.date: 01/14/2025
+ms.date: 04/14/2025
 author: mrbullwinkle
 ms.author: mbullwin
 recommendations: false
@@ -33,6 +33,11 @@ Key use cases include:

 * **Marketing and Personalization:** Generate personalized content and recommendations at scale.

+> [!TIP]
+> If your batch jobs are so large that you are hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new feature that allows you to queue multiple batch jobs with exponential backoff.
+>
+> Once your enqueued token quota is available, the next batch job can be created and kicked off automatically. To learn more, see [**automating retries of large batch jobs with exponential backoff**](#queueing-batch-jobs).
+
 > [!IMPORTANT]
 > We aim to process batch requests within 24 hours; we don't expire the jobs that take longer. You can [cancel](#cancel-batch) the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
 >
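
The tip added above describes queueing batch jobs with exponential backoff. A client-side sketch of that pattern with the `openai` Python package is shown below; it assumes an already uploaded JSONL input file and that the quota rejection surfaces as a 429/`RateLimitError`, which is an assumption, not something stated in this commit.

```python
# Client-side sketch of the backoff pattern described in the tip above: retry batch
# creation with exponential backoff until the enqueued token quota frees up.
# Assumptions: openai Python package with an AzureOpenAI client; endpoint, key,
# API version, and file ID are placeholders.
import time
from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder
    api_key="<your-key>",                                        # placeholder
    api_version="2024-10-21",                                    # placeholder
)

def create_batch_with_backoff(input_file_id: str, max_retries: int = 6, base_delay: float = 30.0):
    for attempt in range(max_retries):
        try:
            return client.batches.create(
                input_file_id=input_file_id,
                endpoint="/chat/completions",
                completion_window="24h",
            )
        except RateLimitError:
            # Exponential backoff: wait 30s, 60s, 120s, ... before trying to queue again.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("Could not queue the batch job within the retry budget.")

batch = create_batch_with_backoff("<input-file-id>")  # placeholder file ID
print(batch.id, batch.status)
```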
