Commit 755cc31

fix: reasoning
1 parent eba41eb commit 755cc31

9 files changed: +51, -41 lines

articles/ai-foundry/model-inference/includes/use-chat-completions/csharp.md

Lines changed: 1 addition & 4 deletions

@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:

* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure AI inference package with the following command:
+* Install the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:

  ```bash
  dotnet add package Azure.AI.Inference --prerelease
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).
-
* If you are using Entra ID, you also need the following package:

  ```bash

articles/ai-foundry/model-inference/includes/use-chat-completions/java.md

Lines changed: 1 addition & 4 deletions

@@ -26,7 +26,7 @@ To use chat completion models in your application, you need:

* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Add the Azure AI inference package to your project:
+* Add the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/java/reference) to your project:

  ```xml
  <dependency>

@@ -36,9 +36,6 @@ To use chat completion models in your application, you need:

  </dependency>
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/java/reference).
-
* If you are using Entra ID, you also need the following package:

  ```xml

articles/ai-foundry/model-inference/includes/use-chat-completions/javascript.md

Lines changed: 1 addition & 4 deletions

@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:

* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure Inference library for JavaScript with the following command:
+* Install the [Azure Inference library for JavaScript](https://aka.ms/azsdk/azure-ai-inference/javascript/reference) with the following command:

  ```bash
  npm install @azure-rest/ai-inference
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/javascript/reference).
-
## Use chat completions

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.

articles/ai-foundry/model-inference/includes/use-chat-completions/python.md

Lines changed: 1 addition & 4 deletions

@@ -26,15 +26,12 @@ To use chat completion models in your application, you need:

* A chat completions model deployment. If you don't have one, read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Install the Azure AI inference package with the following command:
+* Install the [Azure AI inference package for Python](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:

  ```bash
  pip install -U azure-ai-inference
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).
-
## Use chat completions

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.

articles/ai-foundry/model-inference/includes/use-chat-reasoning/csharp.md

Lines changed: 7 additions & 1 deletion

@@ -73,7 +73,13 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.Complete(requestOptions);
```

-When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+When building prompts for reasoning models, take the following into consideration:
+
+> [!div class="checklist"]
+> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
+> * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as non-reasoning models.
+> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/java.md

Lines changed: 8 additions & 5 deletions

@@ -21,7 +21,7 @@ To complete this tutorial, you need:

* This example uses `DeepSeek-R1`.

-* Add the Azure AI inference package to your project:
+* Add the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/java/reference) to your project:

  ```xml
  <dependency>

@@ -31,9 +31,6 @@ To complete this tutorial, you need:

  </dependency>
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/java/reference).
-
* If you are using Entra ID, you also need the following package:

  ```xml

@@ -97,7 +94,13 @@ ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()

Response<ChatCompletions> response = client.complete(requestOptions);
```

-When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+When building prompts for reasoning models, take the following into consideration:
+
+> [!div class="checklist"]
+> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
+> * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as non-reasoning models.
+> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/javascript.md

Lines changed: 7 additions & 1 deletion

@@ -76,7 +76,13 @@ var response = await client.path("/chat/completions").post({

});
```

-When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+When building prompts for reasoning models, take the following into consideration:
+
+> [!div class="checklist"]
+> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
+> * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as non-reasoning models.
+> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.

The response is as follows, where you can see the model's usage statistics:

articles/ai-foundry/model-inference/includes/use-chat-reasoning/python.md

Lines changed: 8 additions & 5 deletions

@@ -21,15 +21,12 @@ To complete this tutorial, you need:

* This example uses `DeepSeek-R1`.

-* Install the Azure AI inference package with the following command:
+* Install the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:

  ```bash
  pip install -U azure-ai-inference
  ```

-  > [!TIP]
-  > Read more about the [Azure AI inference package and reference](https://aka.ms/azsdk/azure-ai-inference/python/reference).
-
## Use reasoning capabilities with chat

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.

@@ -78,7 +75,13 @@ response = client.complete(

)
```

-When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+When building prompts for reasoning models, take the following into consideration:
+
+> [!div class="checklist"]
+> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
+> * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as non-reasoning models.
+> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.

The response is as follows, where you can see the model's usage statistics:
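The last checklist item added in this commit (in multi-turn applications, append only the model's final answer, not its reasoning content) can be sketched in Python. This is a minimal illustration, not part of the commit: it assumes the reasoning arrives wrapped in `<think>` tags as DeepSeek-R1 emits it, and the helper name `strip_reasoning` is hypothetical.

```python
import re

# Hypothetical helper: remove the reasoning segment, which DeepSeek-R1 wraps
# in <think>...</think> tags, keeping only the final answer.
def strip_reasoning(content: str) -> str:
    return re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

# Multi-turn history: append the assistant's answer without its reasoning.
history = [{"role": "user", "content": "How many languages are in the world?"}]
model_output = (
    "<think>Ethnologue catalogs living languages...</think>"
    "About 7,000 languages are spoken today."
)
history.append({"role": "assistant", "content": strip_reasoning(model_output)})
```

Keeping reasoning out of the history shortens subsequent prompts and avoids conditioning the model on its own earlier chain of thought.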

articles/ai-foundry/model-inference/includes/use-chat-reasoning/rest.md

Lines changed: 17 additions & 13 deletions

@@ -55,7 +55,13 @@ The following example shows how you can create a basic reasoning capabilities wi

}
```

-When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+When building prompts for reasoning models, take the following into consideration:
+
+> [!div class="checklist"]
+> * Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
+> * When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
+> * Reasoning models may support the use of system messages. However, they may not follow them as strictly as non-reasoning models.
+> * When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the [Reasoning content](#reasoning-content) section.

The response is as follows, where you can see the model's usage statistics:

@@ -84,18 +90,6 @@ The response is as follows, where you can see the model's usage statistics:

}
```

-### Parameters
-
-In general, reasoning models don't support the following parameters, which you can find in chat completion models:
-
-* Temperature
-* Presence penalty
-* Repetition penalty
-* Parameter `top_p`
-
-Some models support the use of tools or structured outputs (including JSON schemas). Read the [Models](../../concepts/models.md) details page to understand each model's support.
-
### Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions that include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags `<think>` and `</think>`. The model may choose the scenarios in which to generate reasoning content.
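The `<think>`/`</think>` convention described above can be separated client-side into reasoning and final answer. A hedged sketch follows: the tag format is taken from this section, while the parsing approach is an assumption about client-side handling, not an API of the service.

```python
import re

# Message content in the shape this section describes: optional reasoning
# inside <think> tags, followed by the final answer.
content = (
    "<think>The user asks for a count of world languages...</think>"
    "There are about 7,000 languages."
)

match = re.match(r"<think>(.*?)</think>(.*)", content, re.DOTALL)
if match:
    reasoning = match.group(1)
    answer = match.group(2).strip()
else:
    # The model may not emit reasoning content for every request.
    reasoning = None
    answer = content.strip()
```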
@@ -177,6 +171,16 @@ The last message in the stream has `finish_reason` set, indicating the reason fo

}
```

+### Parameters
+
+In general, reasoning models don't support the following parameters, which you can find in chat completion models:
+
+* Temperature
+* Presence penalty
+* Repetition penalty
+* Parameter `top_p`
+
+Some models support the use of tools or structured outputs (including JSON schemas). Read the [Models](../../concepts/models.md) details page to understand each model's support.
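Given the parameter restrictions in the relocated section above, a chat completions request for a reasoning model simply omits the unsupported sampling parameters. A minimal sketch of assembling such a payload; the field names follow the common chat completions REST shape, and the penalty spellings are the usual REST ones, which is an assumption since the section names them only in prose:

```python
# Request body for a reasoning model: no temperature, no penalties, no top_p,
# per the Parameters section above.
payload = {
    "model": "DeepSeek-R1",
    "messages": [
        {"role": "user", "content": "How many languages are in the world?"}
    ],
}

# Parameters the section lists as generally unsupported by reasoning models.
unsupported = {"temperature", "presence_penalty", "frequency_penalty", "top_p"}
assert unsupported.isdisjoint(payload)
```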

### Apply content safety
