
Commit 6810473

Merge pull request #2830 from santiagxf/santiagxf/reasoning-patch
fix: reasoning
2 parents bfe8d43 + bfd8ca5 commit 6810473

File tree

7 files changed: +54 −35 lines
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
---
manager: nitinme
ms.service: azure-ai-model-inference
ms.topic: include
ms.date: 1/31/2025
ms.author: fasantia
author: santiagxf
---

## Reasoning models

Reasoning models can reach higher levels of performance in domains like math, coding, science, strategy, and logistics. These models produce outputs by explicitly using chain of thought to explore possible paths before generating an answer. They verify their answers as they produce them, which helps them arrive at more accurate conclusions. As a result, reasoning models may require less context in the prompt to produce effective results.

This way of scaling a model's performance is referred to as *inference compute time*, because it trades performance against higher latency and cost. It contrasts with approaches that scale through *training compute time*.

Reasoning models produce two types of outputs:

> [!div class="checklist"]
> * Reasoning completions
> * Output completions

Both completions count towards the content generated by the model, and hence towards the token limits and costs associated with it. Some models, like `DeepSeek-R1`, output the reasoning content. Others, like `o1`, output only the answer portion of the completion.
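Models that return their reasoning inline, like `DeepSeek-R1`, commonly wrap it in `<think>` tags. As an illustrative sketch (the tag convention and this helper are assumptions for illustration, not part of this commit), the reasoning content can be separated from the final answer like this:

```python
import re


def split_reasoning(content: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer).

    Assumes the model wraps its chain of thought in <think>...</think>,
    as DeepSeek-R1 does. Models like o1 return no reasoning block, in
    which case the reasoning part comes back empty.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()
```

Keep in mind that, either way, the tokens in both parts were generated and billed by the model; splitting them only changes what your application displays or stores.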
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
manager: nitinme
ms.service: azure-ai-model-inference
ms.topic: include
ms.date: 1/31/2025
ms.author: fasantia
author: santiagxf
---

When building prompts for reasoning models, take the following into consideration:

> [!div class="checklist"]
> * Use simple instructions and avoid chain-of-thought techniques.
> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
> * When providing additional context or documents, as in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.
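The last point in the checklist above can be sketched as follows. This is a hypothetical helper for illustration, not part of this commit; it assumes the model marks reasoning with `<think>` tags, as `DeepSeek-R1` does:

```python
import re


def append_assistant_turn(history: list[dict], reply: str) -> list[dict]:
    """Append the model's reply to the conversation history without its
    reasoning content, so reasoning tokens are not resent (and re-billed)
    on every subsequent turn."""
    # Strip any <think>...</think> block emitted by the model.
    answer = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).strip()
    history.append({"role": "assistant", "content": answer})
    return history
```

The trimmed history is then what you send back as `messages` on the next request.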

articles/ai-foundry/model-inference/includes/use-chat-reasoning/csharp.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -73,13 +75,7 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.Complete(requestOptions);
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/java.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -93,13 +95,7 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.complete(requestOptions);
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/javascript.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -78,13 +80,7 @@ var response = await client.path("/chat/completions").post({

});
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/python.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -75,13 +77,7 @@ response = client.complete(

)
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/rest.md

Lines changed: 3 additions & 7 deletions
@@ -11,6 +11,8 @@ author: santiagxf

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.

+ [!INCLUDE [about-reasoning](about-reasoning.md)]

## Prerequisites

To complete this tutorial, you need:

@@ -58,13 +60,7 @@ The following example shows how you can create a basic chat request to the model

}
```

- When building prompts for reasoning models, take the following into consideration:
-
- > [!div class="checklist"]
- > * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
- > * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
- > * Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
- > * When creating multi-turn applications, consider only appending the final answer from the model, without it's reasoning content as explained at [Reasoning content](#reasoning-content) section.
+ [!INCLUDE [best-practices](best-practices.md)]

The response is as follows, where you can see the model's usage statistics:
