Commit dd9abaa

Merge branch 'main' into ft-ptum-pup

2 parents: 3a473a3 + b992340

File tree: 154 files changed (+2728, -993 lines)


.openpublishing.redirection.json

Lines changed: 10 additions & 0 deletions

````diff
@@ -15,6 +15,11 @@
       "redirect_url": "/azure/search/search-how-to-dotnet-sdk",
       "redirect_document_id": false
     },
+    {
+      "source_path_from_root": "/articles/ai-services/agents/how-to/tools/overview.md",
+      "redirect_url": "/azure/ai-services/agents/overview",
+      "redirect_document_id": false
+    },
     {
       "source_path_from_root": "/articles/search/search-howto-index-csv-blobs.md",
       "redirect_url": "/azure/search/search-how-to-index-csv-blobs",
@@ -25,6 +30,11 @@
       "redirect_url": "/azure/search/search-how-to-large-index",
       "redirect_document_id": false
     },
+    {
+      "source_path_from_root": "/articles/ai-services/agents/concepts/agents.md",
+      "redirect_url": "/azure/ai-services/agents/overview",
+      "redirect_document_id": false
+    },
     {
       "source_path_from_root": "/articles/ai-services/openai/how-to/use-your-data-securely.md",
       "redirect_url": "/azure/ai-services/openai/how-to/on-your-data-configuration",
````
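The redirect entries added above all follow the same fixed shape. As an illustrative sketch only (not part of the repository's build tooling), a check that entries carry the expected keys might look like:

```python
import json

# Illustrative only: validate that each redirect entry carries the keys used
# by the entries shown in this diff.
REQUIRED_KEYS = {"source_path_from_root", "redirect_url", "redirect_document_id"}

def validate_redirects(entries):
    problems = []
    for index, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append((index, sorted(missing)))
        elif not entry["source_path_from_root"].startswith("/articles/"):
            problems.append((index, ["source_path_from_root should start with /articles/"]))
    return problems

entries = json.loads("""[
  {"source_path_from_root": "/articles/ai-services/agents/how-to/tools/overview.md",
   "redirect_url": "/azure/ai-services/agents/overview",
   "redirect_document_id": false}
]""")
print(validate_redirects(entries))  # → [] (entry is well-formed)
```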

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 15 additions & 7 deletions

````diff
@@ -17,9 +17,6 @@ ms.custom: references_regions, tool_generated
 
 Azure AI model inference in Azure AI Foundry gives you access to flagship models in Azure AI to consume them as APIs without hosting them on your infrastructure.
 
-> [!TIP]
-> DeepSeek-R1 (preview) is available for deployment as [Serverless API endpoint](../../../ai-studio/how-to/deploy-models-deepseek.md).
-
 :::image type="content" source="../media/models/models-catalog.gif" alt-text="An animation showing Azure AI studio model catalog section and the models available." lightbox="../media/models/models-catalog.gif":::
 
 Model availability varies by model provider, deployment SKU, and cloud. All models available in Azure AI Model Inference support the [Global standard](deployment-types.md#global-standard) deployment type which uses global capacity to guarantee throughput. [Azure OpenAI models](#azure-openai) also support regional deployments and [sovereign clouds](/entra/identity-platform/authentication-national-cloud)—Azure Government, Azure Germany, and Azure China 21Vianet.
@@ -52,10 +49,11 @@ Azure OpenAI Service offers a diverse set of models with different capabilities
 - Models that can transcribe and translate speech to text
 
 | Model | Type | Tier | Capabilities |
-| ------ | ---- | --- | ------------ |
+| ------ | ---- | ---- | ------------ |
+| [o3-mini](https://ai.azure.com/explore/models/o3-mini/version/2025-01-31/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 | [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 | [o1-preview](https://ai.azure.com/explore/models/o1-preview/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard<br /> | - **Input:** text (128,000 tokens) <br /> - **Output:** (32,768 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [gpt-4o-realtime-preview](https://ai.azure.com/explore/models/gpt-4o-realtime-preview/version/2024-10-01/registry/azure-openai) | real-time | Global standard | - **Input:** control, text, and audio (131,072 tokens) <br /> - **Output:** text and audio (16,384 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [gpt-4o](https://ai.azure.com/explore/models/gpt-4o/version/2024-11-20/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text and image (131,072 tokens) <br /> - **Output:** text (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 | [gpt-4o-mini](https://ai.azure.com/explore/models/gpt-4o-mini/version/2024-07-18/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text, image, and audio (131,072 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
@@ -93,6 +91,16 @@ Core42 includes autoregressive bi-lingual LLMs for Arabic & English with state-o
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=core42).
 
+### DeepSeek
+
+The DeepSeek family of models includes DeepSeek-R1, which uses a step-by-step training process to excel at reasoning tasks such as language, scientific reasoning, and coding.
+
+| Model | Type | Tier | Capabilities |
+| ------ | ---- | --- | ------------ |
+| [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (16,384 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
+
+See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=deepseek).
+
 ### Meta
 
 Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:
@@ -143,10 +151,10 @@ Mistral AI offers two categories of models: premium models including Mistral Lar
 | Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) <br /> (deprecated) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-small](https://ai.azure.com/explore/models/Mistral-small/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-Nemo](https://ai.azure.com/explore/models/Mistral-Nemo/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral) <br /> (legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-Large-2411](https://ai.azure.com/explore/models/Mistral-Large-2411/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en, fr, de, es, it, zh, ja, ko, pt, nl, and pl <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Codestral-2501](https://ai.azure.com/explore/models/Codestral-2501/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
````

articles/ai-foundry/model-inference/how-to/inference.md

Lines changed: 9 additions & 1 deletion

````diff
@@ -48,6 +48,14 @@ For a chat model, you can create a request as follows:
 
 If you specify a model name that doesn't match any given model deployment, you get an error that the model doesn't exist. You can control which models are available for users by creating model deployments as explained at [add and configure model deployments](create-model-deployments.md).
 
+## Key-less authentication
+
+Models deployed to Azure AI model inference in Azure AI Services support key-less authorization using Microsoft Entra ID. Key-less authorization enhances security, simplifies the user experience, reduces operational complexity, and provides robust compliance support for modern development. This makes it a strong choice for organizations adopting secure and scalable identity management solutions.
+
+To use key-less authentication, [configure your resource and grant access to users](configure-entra-id.md) to perform inference. Once configured, you can authenticate as follows:
+
+[!INCLUDE [code-create-chat-client-entra](../includes/code-create-chat-client-entra.md)]
+
 ## Limitations
 
 * Azure OpenAI Batch can't be used with the Azure AI model inference endpoint. You have to use the dedicated deployment URL as explained at [Batch API support in Azure OpenAI documentation](../../../ai-services/openai/how-to/batch.md#api-support).
@@ -56,4 +64,4 @@ If you specify a model name that doesn't match any given model deployment, you g
 ## Next steps
 
 * [Use embedding models](use-embeddings.md)
-* [Use chat completion models](use-chat-completions.md)
+* [Use chat completion models](use-chat-completions.md)
````
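To make the key-less flow described in this diff concrete, here is a minimal, illustrative sketch of the request it produces: the same endpoint and body as a key-based call, but an `Authorization: Bearer` header carrying an Entra ID access token (which in real code would come from the `azure-identity` package, for example `DefaultAzureCredential`). The resource name `contoso` and deployment name `my-deployment` are placeholders:

```python
import json

def build_chat_request(resource: str, deployment: str, prompt: str, entra_token: str):
    """Assemble the pieces of a chat-completions call against the Azure AI
    model inference endpoint, using key-less (Entra ID) authentication."""
    url = (f"https://{resource}.services.ai.azure.com/models/chat/completions"
           "?api-version=2024-05-01-preview")
    headers = {
        # Key-less auth: an Entra ID access token instead of an api-key header.
        "Authorization": f"Bearer {entra_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": deployment,  # must match a model deployment name on the resource
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request("contoso", "my-deployment", "Say hello.", "<token>")
print(url)
```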

articles/ai-foundry/model-inference/how-to/quickstart-github-models.md

Lines changed: 2 additions & 1 deletion

````diff
@@ -85,6 +85,7 @@ Use the parameter `model="<deployment-name>` to route your request to this deplo
 Azure AI model inference supports additional features not available in GitHub Models, including:
 
 * [Explore the model catalog](https://ai.azure.com/github/models) to see additional models not available in GitHub Models.
+* Configure [key-less authentication](configure-entra-id.md).
 * Configure [content filtering](configure-content-filters.md).
 * Configure rate limiting (for specific models).
 * Explore additional [deployment SKUs (for specific models)](../concepts/deployment-types.md).
@@ -97,4 +98,4 @@ See the [FAQ section](../faq.yml) to explore more help.
 ## Next steps
 
 * [Explore the model catalog](https://ai.azure.com/github/models) in Azure AI studio.
-* [Add more models](create-model-deployments.md) to your endpoint.
+* [Add more models](create-model-deployments.md) to your endpoint.
````
Lines changed: 53 additions & 0 deletions

````diff
@@ -0,0 +1,53 @@
+---
+title: How to use reasoning models with Azure AI model inference
+titleSuffix: Azure AI Foundry
+description: Learn how to use reasoning capabilities from models with Azure AI model inference
+manager: scottpolly
+author: msakande
+reviewer: santiagxf
+ms.service: azure-ai-model-inference
+ms.topic: how-to
+ms.date: 1/31/2025
+ms.author: mopeakande
+ms.reviewer: fasantia
+ms.custom: generated
+zone_pivot_groups: azure-ai-inference-samples
+---
+
+# How to use reasoning models with Azure AI model inference
+
+
+::: zone pivot="programming-language-python"
+
+[!INCLUDE [python](../includes/use-chat-reasoning/python.md)]
+::: zone-end
+
+
+::: zone pivot="programming-language-javascript"
+
+[!INCLUDE [javascript](../includes/use-chat-reasoning/javascript.md)]
+::: zone-end
+
+
+::: zone pivot="programming-language-java"
+
+[!INCLUDE [java](../includes/use-chat-reasoning/java.md)]
+::: zone-end
+
+
+::: zone pivot="programming-language-csharp"
+
+[!INCLUDE [csharp](../includes/use-chat-reasoning/csharp.md)]
+::: zone-end
+
+
+::: zone pivot="programming-language-rest"
+
+[!INCLUDE [rest](../includes/use-chat-reasoning/rest.md)]
+::: zone-end
+
+## Related content
+
+* [Use embeddings models](use-embeddings.md)
+* [Use image embeddings models](use-image-embeddings.md)
+* [Azure AI Model Inference API](.././reference/reference-model-inference-api.md)
````
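The per-language includes referenced above cover the SDK calls; on the client side, reasoning models such as DeepSeek-R1 commonly return their chain of thought inline in the message content. As an illustrative sketch only (the `<think>` tag convention is an assumption about the model's output format, not an official API), separating the reasoning from the final answer might look like:

```python
import re

def split_reasoning(content: str):
    """Split a reasoning model's output into (reasoning, answer).
    Assumes the chain of thought arrives wrapped in <think>...</think> tags,
    a common convention for DeepSeek-R1 style responses."""
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()  # no tags: treat everything as the answer

reasoning, answer = split_reasoning(
    "<think>The user wants a greeting, so reply briefly.</think>Hello!"
)
print(answer)  # → Hello!
```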

articles/ai-foundry/model-inference/includes/code-create-chat-client-entra.md

Lines changed: 1 addition & 4 deletions

````diff
@@ -12,12 +12,9 @@ author: santiagxf
 Install the package `azure-ai-inference` using your package manager, like pip:
 
 ```bash
-pip install azure-ai-inference>=1.0.0b5
+pip install azure-ai-inference
 ```
 
-> [!WARNING]
-> Azure AI Services resource requires the version `azure-ai-inference>=1.0.0b5` for Python.
-
 Then, you can use the package to consume the model. The following example shows how to create a client to consume chat completions with Entra ID:
 
 ```python
````
articles/ai-foundry/model-inference/includes/code-create-chat-client.md

Lines changed: 3 additions & 6 deletions

````diff
@@ -12,12 +12,9 @@ author: santiagxf
 Install the package `azure-ai-inference` using your package manager, like pip:
 
 ```bash
-pip install azure-ai-inference>=1.0.0b5
+pip install azure-ai-inference
 ```
 
-> [!WARNING]
-> Azure AI Services resource requires the version `azure-ai-inference>=1.0.0b5` for Python.
-
 Then, you can use the package to consume the model. The following example shows how to create a client to consume chat completions:
 
 ```python
@@ -115,7 +112,7 @@ __Request__
 
 ```HTTP/1.1
 POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
-Authorization: Bearer <bearer-token>
+api-key: <api-key>
 Content-Type: application/json
 ```
----
+---
````
articles/ai-foundry/model-inference/includes/code-create-chat-completion.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -78,7 +78,7 @@ __Request__
 
 ```HTTP/1.1
 POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
-Authorization: Bearer <bearer-token>
+api-key: <api-key>
 Content-Type: application/json
 ```
 
@@ -98,4 +98,4 @@ Content-Type: application/json
 }
 ```
 
----
+---
````

articles/ai-foundry/model-inference/includes/code-create-embeddings-client.md

Lines changed: 3 additions & 6 deletions

````diff
@@ -12,12 +12,9 @@ author: santiagxf
 Install the package `azure-ai-inference` using your package manager, like pip:
 
 ```bash
-pip install azure-ai-inference>=1.0.0b5
+pip install azure-ai-inference
 ```
 
-> [!WARNING]
-> Azure AI Services resource requires the version `azure-ai-inference>=1.0.0b5` for Python.
-
 Then, you can use the package to consume the model. The following example shows how to create a client to consume embeddings:
 
 ```python
@@ -132,7 +129,7 @@ __Request__
 
 ```HTTP/1.1
 POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
-Authorization: Bearer <bearer-token>
+api-key: <api-key>
 Content-Type: application/json
 ```
----
+---
````
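For reference, the embeddings request shown above carries a JSON body whose `model` field routes to a deployment and whose `input` field lists the strings to embed. A minimal sketch of building that body (the deployment name is a placeholder):

```python
import json

def build_embeddings_body(deployment: str, texts: list[str]) -> str:
    """Build the POST body for .../models/embeddings?api-version=2024-05-01-preview.
    One embedding vector is returned per input string."""
    return json.dumps({
        "model": deployment,  # routes to a specific embeddings deployment
        "input": texts,
    })

body = build_embeddings_body("my-embeddings-deployment", ["first phrase", "second phrase"])
print(json.loads(body)["input"])  # → ['first phrase', 'second phrase']
```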

articles/ai-foundry/model-inference/includes/code-create-embeddings.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -54,7 +54,7 @@ __Request__
 
 ```HTTP/1.1
 POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
-Authorization: Bearer <bearer-token>
+api-key: <api-key>
 Content-Type: application/json
 ```
 
@@ -100,4 +100,4 @@ __Response__
 }
 ```
 
----
+---
````
