
Commit 6810473

Merge pull request #2830 from santiagxf/santiagxf/reasoning-patch
fix: reasoning
2 parents bfe8d43 + bfd8ca5 commit 6810473

File tree

7 files changed: +54 −35 lines
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
---
manager: nitinme
ms.service: azure-ai-model-inference
ms.topic: include
ms.date: 1/31/2025
ms.author: fasantia
author: santiagxf
---

## Reasoning models

Reasoning models can reach higher levels of performance in domains like math, coding, science, strategy, and logistics. These models produce outputs by explicitly using chain of thought to explore possible paths before generating an answer. They verify their answers as they produce them, which helps them arrive at more accurate conclusions. As a result, reasoning models may require less context in the prompt to produce effective results.

This way of scaling a model's performance is referred to as *inference compute time*, because it trades performance against higher latency and cost. It contrasts with approaches that scale through *training compute time*.

Reasoning models produce two types of outputs:

> [!div class="checklist"]
> * Reasoning completions
> * Output completions

Both completions count towards the content generated by the model, and hence towards the token limits and costs associated with it. Some models, like `DeepSeek-R1`, output the reasoning content. Others, like `o1`, output only the answer portion of the completion.
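Models that return their reasoning inline, like `DeepSeek-R1`, commonly wrap it in `<think>` tags. As an illustrative sketch (the tag convention and this helper are assumptions for illustration, not part of this commit), the reasoning content can be separated from the final answer like this:

```python
import re


def split_reasoning(content: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer).

    Assumes the model wraps its chain of thought in <think>...</think>,
    as DeepSeek-R1 does. Models like o1 return no reasoning block, in
    which case the reasoning part comes back empty.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()
```

Keep in mind that, either way, the tokens in both parts were generated and billed by the model; splitting them only changes what your application displays or stores.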
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
manager: nitinme
ms.service: azure-ai-model-inference
ms.topic: include
ms.date: 1/31/2025
ms.author: fasantia
author: santiagxf
---

When building prompts for reasoning models, take the following into consideration:

> [!div class="checklist"]
> * Use simple instructions and avoid chain-of-thought techniques.
> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
> * When providing additional context or documents, as in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.
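The last point in the checklist above can be sketched as follows. This is a hypothetical helper for illustration, not part of this commit; it assumes the model marks reasoning with `<think>` tags, as `DeepSeek-R1` does:

```python
import re


def append_assistant_turn(history: list[dict], reply: str) -> list[dict]:
    """Append the model's reply to the conversation history without its
    reasoning content, so reasoning tokens are not resent (and re-billed)
    on every subsequent turn."""
    # Strip any <think>...</think> block emitted by the model.
    answer = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).strip()
    history.append({"role": "assistant", "content": answer})
    return history
```

The trimmed history is then what you send back as `messages` on the next request.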

articles/ai-foundry/model-inference/includes/use-chat-reasoning/csharp.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -73,13 +75,7 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.Complete(requestOptions);
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/java.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -93,13 +95,7 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.complete(requestOptions);
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/javascript.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -78,13 +80,7 @@ var response = await client.path("/chat/completions").post({

});
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/python.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -75,13 +77,7 @@ response = client.complete(

)
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/rest.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -58,13 +60,7 @@ The following example shows how you can create a basic chat request to the model

}
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:
