Commit a8220a9

Update python.md

1 parent c724ac4
1 file changed: +16 −16 lines
articles/ai-foundry/model-inference/includes/use-chat-reasoning/python.md

Lines changed: 16 additions & 16 deletions
````diff
@@ -29,7 +29,7 @@ To complete this tutorial, you need:
 
 First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 import os
````
````diff
@@ -42,7 +42,7 @@ client = AzureOpenAI(
 )
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 import os
````
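The hunks above only show the edges of the client-creation code. For reference, a minimal runnable sketch of the key-based setup for both tabs; the environment variable names and the `api_version` value are illustrative assumptions, not necessarily the article's exact ones:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from openai import AzureOpenAI

# OpenAI API flavor. The env var names and api_version are placeholders.
openai_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # assumption: any currently supported version
)

# Model Inference API flavor. The env var names are placeholders.
inference_client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)
```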
````diff
@@ -59,7 +59,7 @@ client = ChatCompletionsClient(
 
 If you have configured the resource with **Microsoft Entra ID** support, you can use the following code snippet to create a client.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 import os
````
````diff
@@ -77,7 +77,7 @@ client = AzureOpenAI(
 )
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 import os
````
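A sketch of the Microsoft Entra ID variant, assuming `DefaultAzureCredential`; the token scope shown is the standard Cognitive Services scope and, together with the environment variable names, is an assumption rather than the article's exact code:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# OpenAI API flavor: exchange the Entra ID credential for bearer tokens.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
openai_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21",  # assumption
)

# Model Inference API flavor: pass the credential directly.
inference_client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=DefaultAzureCredential(),
    credential_scopes=["https://cognitiveservices.azure.com/.default"],
)
```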
````diff
@@ -99,7 +99,7 @@ client = ChatCompletionsClient(
 
 The following example shows how you can create a basic chat request to the model.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 response = client.chat.completions.create(
````
````diff
@@ -112,7 +112,7 @@ response = client.chat.completions.create(
 print(response.model_dump_json(indent=2))
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 from azure.ai.inference.models import SystemMessage, UserMessage
````
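The request body itself falls outside these hunks. A minimal sketch of a complete call through the Model Inference API, reusing the `inference_client` from the earlier sketch; the `DeepSeek-R1` deployment name is an assumption, and the question comes from the sample output later in this diff:

```python
from azure.ai.inference.models import UserMessage

response = inference_client.complete(
    model="DeepSeek-R1",  # assumption: the name of your deployment
    messages=[
        UserMessage(content="How many languages are in the world?"),
    ],
)
print(response.choices[0].message.content)
```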
````diff
@@ -129,7 +129,7 @@ response = client.complete(
 
 The response is as follows, where you can see the model's usage statistics:
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 print("Response:", response.choices[0].message.content)
````
````diff
@@ -149,7 +149,7 @@ Usage:
 	Completion tokens: 886
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 print("Response:", response.choices[0].message.content)
````
````diff
@@ -174,7 +174,7 @@ Usage:
 
 Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 The reasoning associated with the completion is included in the field `reasoning_content`. The model may select the scenarios in which to generate reasoning content.
 
````
````diff
@@ -186,7 +186,7 @@ print("Thinking:", response.choices[0].message.reasoning_content)
 Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer...
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 The reasoning associated with the completion is included in the response's content within the tags `<think>` and `</think>`. The model may select the scenarios in which to generate reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows:
 
````
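The extraction code itself is cut off by the hunk. A minimal sketch of one way to split the `<think>` block from the answer, assuming a single reasoning block at the start of the content:

```python
import re

content = response.choices[0].message.content
match = re.match(r"<think>(.*?)</think>(.*)", content, re.DOTALL)
if match:
    print("Thinking:", match.group(1).strip())
    print("Answer:", match.group(2).strip())
else:
    # The model chose not to emit a reasoning block for this request.
    print("Answer:", content)
```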

````diff
@@ -216,7 +216,7 @@ You can _stream_ the content to get it as it's being generated. Streaming conten
 
 To stream completions, set `stream=True` when you call the model.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 response = client.chat.completions.create(
````
````diff
@@ -228,7 +228,7 @@ response = client.chat.completions.create(
 )
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 response = client.complete(
````
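A sketch of the complete streaming call these hunks truncate, under the same assumptions as the earlier request example:

```python
from azure.ai.inference.models import UserMessage

response = inference_client.complete(
    model="DeepSeek-R1",  # assumption
    messages=[
        UserMessage(content="How many languages are in the world?"),
    ],
    stream=True,  # yields incremental updates instead of a single final message
)
```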
````diff
@@ -244,7 +244,7 @@ response = client.complete(
 
 To visualize the output, define a helper function to print the stream. The following example implements a routine that streams only the answer without the reasoning content:
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 Reasoning content is also included inside of the delta pieces of the response, in the key `reasoning_content`.
 
````
````diff
@@ -268,7 +268,7 @@ def print_stream(completion):
         print(content, end="", flush=True)
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 When streaming, pay closer attention to the `<think>` tag that may be included inside of the `content` field.
 
````
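The full helper sits outside the hunk. A sketch of one way such a routine might look for the Model Inference API, assuming the `<think>` and `</think>` markers arrive as standalone tokens in the stream (they may instead be split across deltas, which would need buffering):

```python
def print_stream(completion):
    """Print only the answer from a streamed completion, skipping reasoning content."""
    is_thinking = False
    for event in completion:
        if not event.choices:
            continue
        content = event.choices[0].delta.content
        if content == "<think>":
            is_thinking = True   # reasoning starts; suppress output
        elif content == "</think>":
            is_thinking = False  # reasoning ends; resume output
        elif content and not is_thinking:
            print(content, end="", flush=True)

print_stream(response)
```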

````diff
@@ -316,7 +316,7 @@ The Azure AI Model Inference API supports [Azure AI Content Safety](https://aka.
 
 The following example shows how to handle events when the model detects harmful content in the input prompt.
 
-# [OpenAI](#tab/openai)
+# [OpenAI API](#tab/openai)
 
 ```python
 try:
````
````diff
@@ -339,7 +339,7 @@ except HttpResponseError as ex:
         raise
 ```
 
-# [Model Inference (preview)](#tab/inference)
+# [Model Inference API (preview)](#tab/inference)
 
 ```python
 from azure.ai.inference.models import AssistantMessage, UserMessage
````
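The body of the error handler is cut off by the hunk. A minimal sketch of the pattern for the Model Inference API, assuming the service returns HTTP 400 with a `content_filter` error code when the prompt is blocked (both the deployment name and the error code are assumptions):

```python
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.exceptions import HttpResponseError

try:
    response = inference_client.complete(
        model="DeepSeek-R1",  # assumption
        messages=[
            SystemMessage(content="You are an AI assistant that helps people find information."),
            UserMessage(content="<a prompt that the content filter would block>"),
        ],
    )
    print(response.choices[0].message.content)
except HttpResponseError as ex:
    if ex.status_code == 400 and ex.error and ex.error.code == "content_filter":
        print("Your request was blocked by the content filter:", ex.error.message)
    else:
        raise
```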
