
Commit df50786

Merge branch 'main' into hailey-how-to-iv

2 parents 3c26025 + 53958ec

31 files changed: 365 additions, 106 deletions

articles/ai-foundry/concepts/model-lifecycle-retirement.md

Lines changed: 6 additions & 0 deletions
@@ -80,6 +80,12 @@ The following tables list the timelines for models that are on track for retirement
 | [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-english](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |
 | [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |

+#### DeepSeek
+
+| Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |
+|-------|-------------------|------------------------|-----------------------|-----------------------------|
+| [DeepSeek-V3](https://aka.ms/azureai/landing/DeepSeek-V3) | April 10, 2025 | May 31, 2025 | August 31, 2025 | [DeepSeek-V3-0324](https://aka.ms/azureai/landing/DeepSeek-V3-0324) |
+
 #### Meta

 | Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |

articles/ai-foundry/concepts/models-featured.md

Lines changed: 3 additions & 2 deletions
@@ -141,11 +141,12 @@ For more examples of how to use Jais models, see the following examples:

 ## DeepSeek

-DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks, and DeepSeek-V3, a Mixture-of-Experts (MoE) language model.
+The DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks; DeepSeek-V3-0324, a Mixture-of-Experts (MoE) language model; and more.

 | Model | Type | Capabilities |
 | ------ | ---- | --- |
-| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
 | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | [chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context) | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |

 For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../model-inference/tutorials/get-started-deepseek-r1.md?context=/azure/ai-foundry/context/context).
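
As a quick illustration of the chat-completion entries in the table above, here is a minimal sketch of calling one of these models. It assumes the `azure-ai-inference` Python package and a key-authenticated Azure AI model inference endpoint; the endpoint, key, and model name are placeholders, not values taken from this commit.

```python
# Minimal sketch: chat completion against one of the serverless models listed above.
# Assumptions: azure-ai-inference is installed; endpoint, key, and model name are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-resource>.services.ai.azure.com/models",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                      # placeholder
)

response = client.complete(
    model="DeepSeek-V3-0324",  # model name as listed in the table above
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what a Mixture-of-Experts model is in two sentences."),
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```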

articles/ai-foundry/includes/region-availability-maas.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/par

 | Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
 |---------|---------|---------|---------|
+DeepSeek-V3-0324 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
 DeepSeek-V3 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
 DeepSeek-R1 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@ DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks
 | Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [DeekSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
-| [DeekSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |

 For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).

articles/ai-foundry/model-inference/includes/create-model-deployments/cli.md

Lines changed: 4 additions & 6 deletions
@@ -48,15 +48,16 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 ```azurecli
 accountName="<ai-services-resource-name>"
 resourceGroupName="<resource-group>"
+location="eastus2"
 ```

 3. If you don't have an Azure AI Services account create yet, you can create one as follows:

 ```azurecli
-az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName
+az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName --location $location --kind AIServices --sku S0
 ```

-4. Let's see first which models are available to you and under which SKU. The following command list all the model definitions available:
+4. Let's first see which models are available to you and under which SKU. SKUs, also known as [deployment types](../../concepts/deployment-types.md), define how Azure infrastructure is used to process requests. Models may offer different deployment types. The following command lists all the model definitions available:

 ```azurecli
 az cognitiveservices account list-models \
@@ -77,10 +78,7 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 }
 ```

-6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. Capacity might also be needed depending on the type of deployment.
-
-> [!TIP]
-> Notice that not all the models are available in all the SKUs.
+6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. The property `format` indicates the provider offering the model. Capacity might also be needed depending on the type of deployment.

 7. Add the model deployment to the resource. The following example adds `Phi-3.5-vision-instruct`:
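
The CLI command for step 7 falls outside this hunk. For illustration only, here is a hedged Python sketch of the equivalent operation using the `azure-mgmt-cognitiveservices` SDK; the SDK choice, model format, version, SKU, and capacity are assumptions and placeholders, not values from this commit.

```python
# Illustrative sketch: the programmatic equivalent of step 7 (adding a model deployment
# to the AI Services resource). The format, version, SKU, and capacity below are
# placeholders -- read the real values from `az cognitiveservices account list-models`.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment,
    DeploymentModel,
    DeploymentProperties,
    Sku,
)

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",  # placeholder
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",        # placeholder
    account_name="<ai-services-resource-name>",    # placeholder
    deployment_name="Phi-3.5-vision-instruct",
    deployment=Deployment(
        properties=DeploymentProperties(
            model=DeploymentModel(
                format="Microsoft",          # placeholder: provider reported by list-models
                name="Phi-3.5-vision-instruct",
                version="2",                 # placeholder: version reported by list-models
            )
        ),
        sku=Sku(name="GlobalStandard", capacity=1),  # placeholder SKU and capacity
    ),
)
print(poller.result().id)
```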

articles/ai-foundry/model-inference/quotas-limits.md

Lines changed: 3 additions & 3 deletions
@@ -32,9 +32,9 @@ Azure uses quotas and limits to prevent budget overruns due to fraud, and to hon
 | -------------------- | ------------------- | ----------- |
 | Tokens per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
 | Requests per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
-| Tokens per minute | DeepSeek-R1 | 5,000,000 |
-| Requests per minute | DeepSeek-R1 | 5,000 |
-| Concurrent requests | DeepSeek-R1 | 300 |
+| Tokens per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000,000 |
+| Requests per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000 |
+| Concurrent requests | DeepSeek-R1<br />DeepSeek-V3-0324 | 300 |
 | Tokens per minute | Rest of models | 400,000 |
 | Requests per minute | Rest of models | 1,000 |
 | Concurrent requests | Rest of models | 300 |

articles/ai-services/agents/how-to/tools/fabric.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ You need to first build and publish a Fabric data agent and then connect your Fa

 * Developers and end users have at least `READ` access to the Fabric data agent and the underlying data sources it connects with.

+* Your Fabric Data Agent and Azure AI Agent need to be in the same tenant.
+
 ## Setup
 > [!NOTE]
 > * The model you selected in Azure AI Agent setup is only used for agent orchestration and response generation. It doesn't impact which model Fabric data agent uses for NL2SQL operation.

articles/ai-services/language-service/whats-new.md

Lines changed: 8 additions & 8 deletions
@@ -17,14 +17,14 @@ Azure AI Language is updated on an ongoing basis. To stay up-to-date with recent
 ## March 2025

 * Azure AI Language resource now can be deployed to 3 new regions, Jio India Central, UK West and Canada East, for the following capabilities:
-* Language detection
-* Sentiment analysis
-* Key phrase extraction
-* Named entity recognition (NER)
-* Personally identifiable information (PII) entity recognition
-* Entity linking
-* Text analytics for health
-* Extractive text summarization
+* Language detection
+* Sentiment analysis
+* Key phrase extraction
+* Named entity recognition (NER)
+* Personally identifiable information (PII) entity recognition
+* Entity linking
+* Text analytics for health
+* Extractive text summarization

 * Back-end infrastructure for the Named entity recognition (NER) and Text Personally identifiable information (PII) entity recognition models is now updated with extended context window limits.

articles/ai-services/openai/concepts/models.md

Lines changed: 18 additions & 3 deletions
@@ -4,11 +4,11 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 04/01/2025
+ms.date: 04/14/2025
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
-ms.author: mbullwin #chrhoder
+ms.author: mbullwin #chrhoder#
 recommendations: false
 ---

@@ -18,6 +18,7 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi

 | Models | Description |
 |--|--|
+| [GPT-4.1 series](#gpt-41-series) | Latest model release from Azure OpenAI |
 | [computer-use-preview](#computer-use-preview) | An experimental model trained for use with the Responses API computer use tool. |
 | [GPT-4.5 Preview](#gpt-45-preview) |The latest GPT model that excels at diverse text and image tasks. |
 | [o-series models](#o-series-models) |[Reasoning models](../how-to/reasoning.md) with advanced problem-solving and increased focus and capability. |
@@ -30,6 +31,20 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 | [Whisper](#whisper-models) | A series of models in preview that can transcribe and translate speech to text. |
 | [Text to speech](#text-to-speech-models-preview) (Preview) | A series of models in preview that can synthesize text to speech. |

+## GPT 4.1 series
+
+### Region Availability
+
+| Model | Region |
+|---|---|
+| `gpt-4.1` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard) |
+
+### Capabilities
+
+| Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
+| --- | :--- |:--- |:---|:---: |
+| `gpt-4.1` (2025-04-14) <br> <br> **Latest model from Azure OpenAI** | - Text & image input <br> - Text output <br> - Chat completions API <br> - Responses API <br> - Streaming <br> - Function calling <br> - Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
+
 ## computer-use-preview

 An experimental model trained for use with the [Responses API](../how-to/responses.md) computer use tool. It can be used in conjunction with 3rd-party libraries to allow the model to control mouse & keyboard input while getting context from screenshots of the current environment.
@@ -70,7 +85,7 @@ Once access has been granted, you will need to create a deployment for the model

 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | The **latest GPT model** that excels at diverse text and image tasks. <br>-Structured outputs <br>-Prompt caching <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 128,000 | 16,384 | Oct 2023 |
+| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | [GPT 4.1](#gpt-41-series) is the recommended replacement for this model. Excels at diverse text and image tasks. <br>-Structured outputs <br>-Prompt caching <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 128,000 | 16,384 | Oct 2023 |

 > [!NOTE]
 > It is expected behavior that the model cannot answer questions about itself. If you want to know when the knowledge cutoff for the model's training data is, or other details about the model you should refer to the model documentation above.
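
The new GPT-4.1 capabilities row above lists structured outputs via chat completions. A minimal sketch of that capability with the `openai` Python package and an `AzureOpenAI` client follows; the endpoint, key, API version, and deployment name are placeholders, not values from this commit.

```python
# Minimal sketch of the "structured outputs (chat completions)" capability listed above.
# Assumptions: openai Python package >= 1.40; endpoint, key, API version, and the
# deployment name "gpt-4.1" are placeholders.
from pydantic import BaseModel
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder
    api_key="<your-key>",                                        # placeholder
    api_version="2024-10-21",                                    # placeholder
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4.1",  # your deployment name
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)
print(completion.choices[0].message.parsed)
```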

articles/ai-services/openai/how-to/batch.md

Lines changed: 6 additions & 1 deletion
@@ -6,7 +6,7 @@ manager: nitinme
 ms.service: azure-ai-openai
 ms.custom: references_regions
 ms.topic: how-to
-ms.date: 01/14/2025
+ms.date: 04/14/2025
 author: mrbullwinkle
 ms.author: mbullwin
 recommendations: false
@@ -33,6 +33,11 @@ Key use cases include:

 * **Marketing and Personalization:** Generate personalized content and recommendations at scale.

+> [!TIP]
+> If your batch jobs are so large that you are hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new feature that allows you to queue multiple batch jobs with exponential backoff.
+>
+> Once your enqueued token quota is available, the next batch job can be created and kicked off automatically. To learn more, see [**automating retries of large batch jobs with exponential backoff**](#queueing-batch-jobs).
+
 > [!IMPORTANT]
 > We aim to process batch requests within 24 hours; we don't expire the jobs that take longer. You can [cancel](#cancel-batch) the job anytime. When you cancel the job, any remaining work is cancelled and any already completed work is returned. You'll be charged for any completed work.
 >
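
The tip added above describes queueing batch jobs with exponential backoff. A client-side sketch of that pattern with the `openai` Python package is shown below; it assumes an already uploaded JSONL input file and that the quota rejection surfaces as a 429/`RateLimitError`, which is an assumption, not something stated in this commit.

```python
# Client-side sketch of the backoff pattern described in the tip above: retry batch
# creation with exponential backoff until the enqueued token quota frees up.
# Assumptions: openai Python package with an AzureOpenAI client; endpoint, key,
# API version, and file ID are placeholders.
import time
from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder
    api_key="<your-key>",                                        # placeholder
    api_version="2024-10-21",                                    # placeholder
)

def create_batch_with_backoff(input_file_id: str, max_retries: int = 6, base_delay: float = 30.0):
    for attempt in range(max_retries):
        try:
            return client.batches.create(
                input_file_id=input_file_id,
                endpoint="/chat/completions",
                completion_window="24h",
            )
        except RateLimitError:
            # Exponential backoff: wait 30s, 60s, 120s, ... before trying to queue again.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("Could not queue the batch job within the retry budget.")

batch = create_batch_with_backoff("<input-file-id>")  # placeholder file ID
print(batch.id, batch.status)
```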
