Azure AI model inference in Azure AI Foundry gives you access to flagship models in Azure AI to consume them as APIs without hosting them on your infrastructure.
-> [!TIP]
-> DeepSeek-R1 (preview) is available for deployment as [Serverless API endpoint](../../../ai-studio/how-to/deploy-models-deepseek.md).
-
:::image type="content" source="../media/models/models-catalog.gif" alt-text="An animation showing Azure AI studio model catalog section and the models available." lightbox="../media/models/models-catalog.gif":::
Model availability varies by model provider, deployment SKU, and cloud. All models available in Azure AI Model Inference support the [Global standard](deployment-types.md#global-standard) deployment type, which uses global capacity to guarantee throughput. [Azure OpenAI models](#azure-openai) also support regional deployments and [sovereign clouds](/entra/identity-platform/authentication-national-cloud): Azure Government, Azure Germany, and Azure China 21Vianet.
@@ -52,10 +49,11 @@ Azure OpenAI Service offers a diverse set of models with different capabilities
 - Models that can transcribe and translate speech to text
@@ -93,6 +91,16 @@ Core42 includes autoregressive bi-lingual LLMs for Arabic & English with state-o
See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=core42).
+### DeepSeek
+
+The DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks, such as language, scientific reasoning, and coding tasks, using a step-by-step training process.
+
+| Model | Type | Tier | Capabilities |
+| ------ | ---- | --- | ------------ |
+|[DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek)| chat-completion | Global standard | - **Input:** text (16,384 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text (with reasoning content). |
+
+See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=deepseek).
+
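The DeepSeek-R1 row lists its response format as "Text (with reasoning content)". As a rough illustration of what consuming such output might involve, the sketch below separates the reasoning from the final answer. It assumes the reasoning arrives inline at the start of the message text, wrapped in `<think>`...`</think>` tags, which this diff does not state. `split_reasoning` is a hypothetical helper, not part of any SDK.

```python
import re
from typing import Tuple

# Assumption: reasoning is wrapped in <think>...</think> at the start of the text.
_THINK_BLOCK = re.compile(r"\s*<think>(.*?)</think>(.*)", re.DOTALL)


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no <think> block leads the text."""
    match = _THINK_BLOCK.match(content)
    if match is None:
        return "", content.strip()
    return match.group(1).strip(), match.group(2).strip()


reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>\nThe answer is 4.")
print(answer)  # The answer is 4.
```

Keeping the two parts separate lets an application log or hide the reasoning while showing only the answer to users.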
### Meta
Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:
@@ -143,10 +151,10 @@ Mistral AI offers two categories of models: premium models including Mistral Lar
| Model | Type | Tier | Capabilities |
| ------ | ---- | --- | ------------ |
|[Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-|[Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+|[Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral)<br /> (deprecated) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
|[Mistral-small](https://ai.azure.com/explore/models/Mistral-small/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
|[Mistral-Nemo](https://ai.azure.com/explore/models/Mistral-Nemo/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-|[Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+|[Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral)<br /> (legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
|[Mistral-Large-2411](https://ai.azure.com/explore/models/Mistral-Large-2411/version/2/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
|[Codestral-2501](https://ai.azure.com/explore/models/Codestral-2501/version/2/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
articles/ai-foundry/model-inference/includes/use-chat-completions/csharp.md
+1 -4 lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure AI inference package with the following command:
+* Install the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:
articles/ai-foundry/model-inference/includes/use-chat-completions/java.md
+1 -4 lines changed: 1 addition & 4 deletions
@@ -26,7 +26,7 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Add the Azure AI inference package to your project:
+* Add the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/java/reference) to your project:
```xml
<dependency>
@@ -36,9 +36,6 @@ To use chat completion models in your application, you need:
</dependency>
```
-> [!TIP]
-> Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/java/reference).
-
* If you are using Entra ID, you also need the following package:
articles/ai-foundry/model-inference/includes/use-chat-completions/javascript.md
+1 -4 lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure Inference library for JavaScript with the following command:
+* Install the [Azure Inference library for JavaScript](https://aka.ms/azsdk/azure-ai-inference/javascript/reference) with the following command:
```bash
npm install @azure-rest/ai-inference
```
-> [!TIP]
-> Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/javascript/reference).
-
## Use chat completions
First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.
articles/ai-foundry/model-inference/includes/use-chat-completions/python.md
+1 -4 lines changed: 1 addition & 4 deletions
@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure AI inference package with the following command:
+* Install the [Azure AI inference package for Python](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:
```bash
pip install -U azure-ai-inference
```
-> [!TIP]
-> Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).
-
## Use chat completions
First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.
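A hedged sketch of how that client creation might look in Python. The environment variable names `AZURE_INFERENCE_ENDPOINT` and `AZURE_INFERENCE_CREDENTIAL` are assumptions (the text only says the URL and key are stored in environment variables), and `load_settings` and `create_client` are hypothetical helpers. The SDK import is deferred into `create_client`, so the settings check still works where `azure-ai-inference` isn't installed.

```python
import os
from typing import Mapping, Tuple

# Assumed variable names; the text only says "environment variables".
ENDPOINT_VAR = "AZURE_INFERENCE_ENDPOINT"
KEY_VAR = "AZURE_INFERENCE_CREDENTIAL"


def load_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (endpoint, key), failing early if either variable is missing or empty."""
    missing = [name for name in (ENDPOINT_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[ENDPOINT_VAR], env[KEY_VAR]


def create_client(endpoint: str, key: str):
    """Build a chat completions client from the azure-ai-inference package."""
    # Imported here so load_settings can be used without the SDK installed.
    from azure.ai.inference import ChatCompletionsClient
    from azure.core.credentials import AzureKeyCredential

    return ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
```

With both variables set, `client = create_client(*load_settings())` gives a client whose `complete(messages=[...])` method sends the chat request.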