
Commit 3ebd6b5

Merge pull request #2679 from santiagxf/santiagxf/deepseek-inference
feat: deepseek + reasoning models
2 parents e145d27 + ccf0e13 commit 3ebd6b5

File tree

14 files changed

+1285
-42
lines changed


articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 15 additions & 7 deletions
@@ -17,9 +17,6 @@ ms.custom: references_regions, tool_generated
Azure AI model inference in Azure AI Foundry gives you access to flagship models in Azure AI to consume them as APIs without hosting them on your infrastructure.

- > [!TIP]
- > DeepSeek-R1 (preview) is available for deployment as [Serverless API endpoint](../../../ai-studio/how-to/deploy-models-deepseek.md).

:::image type="content" source="../media/models/models-catalog.gif" alt-text="An animation showing Azure AI studio model catalog section and the models available." lightbox="../media/models/models-catalog.gif":::

Model availability varies by model provider, deployment SKU, and cloud. All models available in Azure AI Model Inference support the [Global standard](deployment-types.md#global-standard) deployment type which uses global capacity to guarantee throughput. [Azure OpenAI models](#azure-openai) also support regional deployments and [sovereign clouds](/entra/identity-platform/authentication-national-cloud)—Azure Government, Azure Germany, and Azure China 21Vianet.
@@ -52,10 +49,11 @@ Azure OpenAI Service offers a diverse set of models with different capabilities
- Models that can transcribe and translate speech to text

| Model | Type | Tier | Capabilities |
- | ------ | ---- | --- | ------------ |
+ | ------ | ---- | ---- | ------------ |
+ | [o3-mini](https://ai.azure.com/explore/models/o3-mini/version/2025-01-31/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
| [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
| [o1-preview](https://ai.azure.com/explore/models/o1-preview/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard<br /> | - **Input:** text (128,000 tokens) <br /> - **Output:** (32,768 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
- | [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+ | [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
| [gpt-4o-realtime-preview](https://ai.azure.com/explore/models/gpt-4o-realtime-preview/version/2024-10-01/registry/azure-openai) | real-time | Global standard | - **Input:** control, text, and audio (131,072 tokens) <br /> - **Output:** text and audio (16,384 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
| [gpt-4o](https://ai.azure.com/explore/models/gpt-4o/version/2024-11-20/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text and image (131,072 tokens) <br /> - **Output:** text (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
| [gpt-4o-mini](https://ai.azure.com/explore/models/gpt-4o-mini/version/2024-07-18/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text, image, and audio (131,072 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
@@ -93,6 +91,16 @@ Core42 includes autoregressive bi-lingual LLMs for Arabic & English with state-o
See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=core42).

+ ### DeepSeek
+
+ The DeepSeek family of models includes DeepSeek-R1, which uses a step-by-step training process to excel at reasoning tasks such as language, scientific reasoning, and coding.
+
+ | Model | Type | Tier | Capabilities |
+ | ------ | ---- | --- | ------------ |
+ | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (16,384 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text (with reasoning content). |
+
+ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=deepseek).
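As an editorial aside to this new table entry (not part of the diff), the following is a minimal sketch of calling such a deployment through the Azure AI model inference chat completions API with the `azure-ai-inference` Python package. The environment variable names and the deployment name `DeepSeek-R1` are assumptions for illustration only.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Assumed environment variables: the Azure AI services endpoint URL and its API key.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# "DeepSeek-R1" is an assumed deployment name. The table above lists no tool
# calling support, so the request sends plain chat messages only.
response = client.complete(
    model="DeepSeek-R1",
    messages=[UserMessage(content="Summarize why the square root of 2 is irrational.")],
)

print(response.choices[0].message.content)
```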
### Meta

Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:
@@ -143,10 +151,10 @@ Mistral AI offers two categories of models: premium models including Mistral Lar
| Model | Type | Tier | Capabilities |
| ------ | ---- | --- | ------------ |
| [Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
- | [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+ | [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) <br /> (deprecated) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
| [Mistral-small](https://ai.azure.com/explore/models/Mistral-small/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
| [Mistral-Nemo](https://ai.azure.com/explore/models/Mistral-Nemo/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
- | [Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+ | [Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral) <br /> (legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
| [Mistral-Large-2411](https://ai.azure.com/explore/models/Mistral-Large-2411/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
| [Codestral-2501](https://ai.azure.com/explore/models/Codestral-2501/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
---
title: How to use reasoning models with Azure AI model inference
titleSuffix: Azure AI Foundry
description: Learn how to use reasoning capabilities from models with Azure AI model inference
manager: scottpolly
author: msakande
reviewer: santiagxf
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/31/2025
ms.author: mopeakande
ms.reviewer: fasantia
ms.custom: generated
zone_pivot_groups: azure-ai-inference-samples
---

# How to use reasoning models with Azure AI model inference

::: zone pivot="programming-language-python"
[!INCLUDE [python](../includes/use-chat-reasoning/python.md)]
::: zone-end

::: zone pivot="programming-language-javascript"
[!INCLUDE [javascript](../includes/use-chat-reasoning/javascript.md)]
::: zone-end

::: zone pivot="programming-language-java"
[!INCLUDE [java](../includes/use-chat-reasoning/java.md)]
::: zone-end

::: zone pivot="programming-language-csharp"
[!INCLUDE [csharp](../includes/use-chat-reasoning/csharp.md)]
::: zone-end

::: zone pivot="programming-language-rest"
[!INCLUDE [rest](../includes/use-chat-reasoning/rest.md)]
::: zone-end

## Related content

* [Use embeddings models](use-embeddings.md)
* [Use image embeddings models](use-image-embeddings.md)
* [Azure AI Model Inference API](.././reference/reference-model-inference-api.md)
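The per-language samples live in the include files referenced above and are not shown in this diff. As a rough, non-authoritative sketch of the pattern they cover: DeepSeek-R1 typically emits its chain of thought inside `<think>...</think>` tags in the message content, which can be split from the final answer. This sketch assumes the same `azure-ai-inference` Python client setup as the chat-completions includes, with hypothetical environment variable and deployment names; other reasoning models may expose reasoning differently or not at all.

```python
import os
import re

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Assumed endpoint/key environment variables and deployment name.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

response = client.complete(
    model="DeepSeek-R1",
    messages=[UserMessage(content="How many languages are in the world?")],
)

content = response.choices[0].message.content

# DeepSeek-R1 places its reasoning between <think> tags before the answer;
# fall back to the raw content when no reasoning block is present.
match = re.match(r"<think>(.*?)</think>(.*)", content, re.DOTALL)
if match:
    print("Reasoning:", match.group(1).strip())
    print("Answer:", match.group(2).strip())
else:
    print("Answer:", content)
```

Whether to surface, summarize, or discard the reasoning block is an application decision; the authoritative guidance is in the per-language includes.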

articles/ai-foundry/model-inference/includes/use-chat-completions/csharp.md

Lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

- * Install the Azure AI inference package with the following command:
+ * Install the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:

  ```bash
  dotnet add package Azure.AI.Inference --prerelease
  ```

- > [!TIP]
- > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).

* If you are using Entra ID, you also need the following package:

  ```bash

articles/ai-foundry/model-inference/includes/use-chat-completions/java.md

Lines changed: 1 addition & 4 deletions
@@ -26,7 +26,7 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

- * Add the Azure AI inference package to your project:
+ * Add the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/java/reference) to your project:

  ```xml
  <dependency>
@@ -36,9 +36,6 @@ To use chat completion models in your application, you need:
  </dependency>
  ```

- > [!TIP]
- > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/java/reference).

* If you are using Entra ID, you also need the following package:

  ```xml

articles/ai-foundry/model-inference/includes/use-chat-completions/javascript.md

Lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

- * Install the Azure Inference library for JavaScript with the following command:
+ * Install the [Azure Inference library for JavaScript](https://aka.ms/azsdk/azure-ai-inference/javascript/reference) with the following command:

  ```bash
  npm install @azure-rest/ai-inference
  ```

- > [!TIP]
- > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/javascript/reference).

## Use chat completions

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.

articles/ai-foundry/model-inference/includes/use-chat-completions/python.md

Lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

- * Install the Azure AI inference package with the following command:
+ * Install the [Azure AI inference package for Python](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:

  ```bash
  pip install -U azure-ai-inference
  ```

- > [!TIP]
- > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).

## Use chat completions

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.
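The diff ends before the snippet that sentence introduces. For orientation only, a minimal sketch of client creation with `azure-ai-inference` that reads an endpoint URL and key from environment variables follows; the variable names here are assumptions, not necessarily the ones used in the include file.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Assumed environment variable names; the include file may use different ones.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)
```

With Entra ID, a token credential such as `DefaultAzureCredential` from `azure-identity` can be passed in place of the key credential.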
