Skip to content

Commit eba41eb

Browse files
committed
fix: add examples
1 parent 81c4770 commit eba41eb

File tree

7 files changed

+455
-1212
lines changed

7 files changed

+455
-1212
lines changed

articles/ai-foundry/model-inference/includes/use-chat-reasoning/csharp.md

Lines changed: 90 additions & 312 deletions
Large diffs are not rendered by default.

articles/ai-foundry/model-inference/includes/use-chat-reasoning/java.md

Lines changed: 144 additions & 89 deletions
Large diffs are not rendered by default.

articles/ai-foundry/model-inference/includes/use-chat-reasoning/javascript.md

Lines changed: 88 additions & 319 deletions
Large diffs are not rendered by default.

articles/ai-foundry/model-inference/includes/use-chat-reasoning/python.md

Lines changed: 73 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -34,14 +34,13 @@ To complete this tutorial, you need:
3434

3535
First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.
3636

37-
3837
```python
3938
import os
4039
from azure.ai.inference import ChatCompletionsClient
4140
from azure.core.credentials import AzureKeyCredential
4241

4342
client = ChatCompletionsClient(
44-
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
43+
endpoint="https://<resource>.services.ai.azure.com/models",
4544
credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
4645
model="deepseek-r1"
4746
)
@@ -52,16 +51,16 @@ client = ChatCompletionsClient(
5251
5352
If you have configured the resource to with **Microsoft Entra ID** support, you can use the following code snippet to create a client.
5453

55-
5654
```python
5755
import os
5856
from azure.ai.inference import ChatCompletionsClient
5957
from azure.identity import DefaultAzureCredential
6058

6159
client = ChatCompletionsClient(
62-
endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
60+
endpoint="https://<resource>.services.ai.azure.com/models",
6361
credential=DefaultAzureCredential(),
64-
model="mistral-large-2407"
62+
credential_scopes=["https://cognitiveservices.azure.com/.default"],
63+
model="deepseek-r1"
6564
)
6665
```
6766

@@ -74,18 +73,15 @@ from azure.ai.inference.models import SystemMessage, UserMessage
7473

7574
response = client.complete(
7675
messages=[
77-
SystemMessage(content="You are a helpful assistant."),
7876
UserMessage(content="How many languages are in the world?"),
7977
],
8078
)
8179
```
8280

83-
> [!NOTE]
84-
> Some models don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
81+
When building prompts for reasoning models, built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods. When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help preventing the model from over-complicating its response.
8582

8683
The response is as follows, where you can see the model's usage statistics:
8784

88-
8985
```python
9086
print("Response:", response.choices[0].message.content)
9187
print("Model:", response.model)
@@ -96,17 +92,51 @@ print("\tCompletion tokens:", response.usage.completion_tokens)
9692
```
9793

9894
```console
99-
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
100-
Model: mistral-large-2407
95+
Response: <think>Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer...</think>As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
96+
Model: deepseek-r1
10197
Usage:
102-
Prompt tokens: 19
103-
Total tokens: 91
104-
Completion tokens: 72
98+
Prompt tokens: 11
99+
Total tokens: 897
100+
Completion tokens: 886
105101
```
106102

107-
Inspect the `usage` section in the response to see the number of tokens used for the prompt, the total number of tokens generated, and the number of tokens used for the completion.
108103

109-
#### Stream content
104+
### Reasoning content
105+
106+
Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind it. The reasoning associated with the completion is included in the response's content within the tags `<think>` and `</think>`. The model may select on which scenarios to generate reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows:
107+
108+
```python
109+
import re
110+
111+
match = re.match(r"<think>(.*?)</think>(.*)", response.choices[0].message.content, re.DOTALL)
112+
113+
print("Response:", )
114+
if match:
115+
print("\tThinking:", match.group(1))
116+
print("\tAnswer:", match.group(2))
117+
else:
118+
print("\tAnswer:", response.choices[0].message.content)
119+
print("Model:", response.model)
120+
print("Usage:")
121+
print("\tPrompt tokens:", response.usage.prompt_tokens)
122+
print("\tTotal tokens:", response.usage.total_tokens)
123+
print("\tCompletion tokens:", response.usage.completion_tokens)
124+
```
125+
126+
```console
127+
Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.
128+
129+
Answer: The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
130+
Model: DeepSeek-R1
131+
Usage:
132+
Prompt tokens: 11
133+
Total tokens: 897
134+
Completion tokens: 886
135+
```
136+
137+
When making multi-turn conversations, it's useful to avoid sending the reasoning content in the chat history as reasoning tends to generate long explanations.
138+
139+
### Stream content
110140

111141
By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.
112142

@@ -117,28 +147,34 @@ To stream completions, set `stream=True` when you call the model.
117147

118148
```python
119149
result = client.complete(
150+
model="deepseek-r1",
120151
messages=[
121-
SystemMessage(content="You are a helpful assistant."),
122152
UserMessage(content="How many languages are in the world?"),
123153
],
124-
temperature=0,
125-
top_p=1,
126154
max_tokens=2048,
127155
stream=True,
128156
)
129157
```
130158

131-
To visualize the output, define a helper function to print the stream.
159+
To visualize the output, define a helper function to print the stream. The following example implements a routing that stream only the answer without the reasoning content:
132160

133161
```python
134162
def print_stream(result):
135163
"""
136164
Prints the chat completion with streaming.
137165
"""
138-
import time
139-
for update in result:
140-
if update.choices:
141-
print(update.choices[0].delta.content, end="")
166+
is_thinking = False
167+
for event in completion:
168+
if event.choices:
169+
content = event.choices[0].delta.content
170+
if content == "<think>":
171+
is_thinking = True
172+
print("🧠 Thinking...", end="", flush=True)
173+
elif content == "</think>":
174+
is_thinking = False
175+
print("🛑\n\n")
176+
elif content:
177+
print(content, end="", flush=True)
142178
```
143179

144180
You can visualize how streaming generates content:
@@ -148,7 +184,16 @@ You can visualize how streaming generates content:
148184
print_stream(result)
149185
```
150186

151-
### Considerations when working with reasoning models
187+
### Parameters
188+
189+
In general, reasoning models don't support the following parameters you can find in chat completion models:
190+
191+
* Temperature
192+
* Presence penalty
193+
* Repetition penalty
194+
* Parameter `top_p`
195+
196+
Some models support the use of tools or structured outputs (including JSON-schemas). Read the [Models](../../concepts/models.md) details page to understand each model's support.
152197

153198
### Apply content safety
154199

@@ -158,14 +203,14 @@ The following example shows how to handle events when the model detects harmful
158203

159204

160205
```python
161-
from azure.ai.inference.models import AssistantMessage, UserMessage, SystemMessage
206+
from azure.ai.inference.models import AssistantMessage, UserMessage
162207

163208
try:
164209
response = client.complete(
210+
model="deepseek-r1",
165211
messages=[
166-
SystemMessage(content="You are an AI assistant that helps people find information."),
167212
UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."),
168-
]
213+
],
169214
)
170215

171216
print(response.choices[0].message.content)

0 commit comments

Comments
 (0)