Commit f4f2f9f

Merge pull request #257056 from mrbullwinkle/mrb_11_1_2023_python_updates

[Azure OpenAI] Python v1.0 tabs

2 parents b185c5e + c552eaf

3 files changed: +353 −65 lines changed

articles/ai-services/openai/includes/chat-completion.md

Lines changed: 229 additions & 15 deletions
@@ -6,26 +6,28 @@ author: mrbullwinkle #dereklegenzoff
ms.author: mbullwin #delegenz
ms.service: azure-ai-openai
ms.topic: include
-ms.date: 05/31/2023
+ms.date: 11/02/2023
manager: nitinme
keywords: ChatGPT

---

-## Working with the GPT-35-Turbo and GPT-4 models
+## Working with the GPT-3.5-Turbo and GPT-4 models

-The following code snippet shows the most basic way to use the GPT-35-Turbo and GPT-4 models with the Chat Completion API. If this is your first time using these models programmatically, we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
+The following code snippet shows the most basic way to use the GPT-3.5-Turbo and GPT-4 models with the Chat Completion API. If this is your first time using these models programmatically, we recommend starting with our [GPT-3.5-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
+
+# [OpenAI Python 0.28.1](#tab/python)

```python
import os
import openai
openai.api_type = "azure"
openai.api_version = "2023-05-15"
-openai.api_base = os.getenv("OPENAI_API_BASE") # Your Azure OpenAI resource's endpoint value.
-openai.api_key = os.getenv("OPENAI_API_KEY")
+openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # Your Azure OpenAI resource's endpoint value.
+openai.api_key = os.getenv("AZURE_OPENAI_KEY")

response = openai.ChatCompletion.create(
-    engine="gpt-35-turbo", # The deployment name you chose when you deployed the GPT-35-Turbo or GPT-4 model.
+    engine="gpt-35-turbo", # The deployment name you chose when you deployed the GPT-3.5-Turbo or GPT-4 model.
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "Who were the founders of Microsoft?"}
@@ -34,12 +36,15 @@ response = openai.ChatCompletion.create(

print(response)

-print(response['choices'][0]['message']['content'])
+# To print only the response content text:
+# print(response['choices'][0]['message']['content'])
```

### Output

-```
+JSON formatting added artificially for ease of reading.
+
+```json
{
  "choices": [
    {
@@ -64,6 +69,100 @@ print(response['choices'][0]['message']['content'])

```

+# [OpenAI Python 1.0](#tab/python-new)
+
+```python
+import os
+from openai import AzureOpenAI
+
+client = AzureOpenAI(
+    api_key = os.getenv("AZURE_OPENAI_KEY"),
+    api_version = "2023-05-15",
+    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
+)
+
+response = client.chat.completions.create(
+    model="gpt-35-turbo", # model = "deployment_name".
+    messages=[
+        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
+        {"role": "user", "content": "Who were the founders of Microsoft?"}
+    ]
+)
+
+#print(response)
+print(response.model_dump_json(indent=2))
+print(response.choices[0].message.content)
+```
+
+```output
+{
+  "id": "chatcmpl-8GHoQAJ3zN2DJYqOFiVysrMQJfe1P",
+  "choices": [
+    {
+      "finish_reason": "stop",
+      "index": 0,
+      "message": {
+        "content": "Microsoft was founded by Bill Gates and Paul Allen. They established the company on April 4, 1975. Bill Gates served as the CEO of Microsoft until 2000 and later as Chairman and Chief Software Architect until his retirement in 2008, while Paul Allen left the company in 1983 but remained on the board of directors until 2000.",
+        "role": "assistant",
+        "function_call": null
+      },
+      "content_filter_results": {
+        "hate": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "self_harm": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "sexual": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "violence": {
+          "filtered": false,
+          "severity": "safe"
+        }
+      }
+    }
+  ],
+  "created": 1698892410,
+  "model": "gpt-35-turbo",
+  "object": "chat.completion",
+  "usage": {
+    "completion_tokens": 73,
+    "prompt_tokens": 29,
+    "total_tokens": 102
+  },
+  "prompt_filter_results": [
+    {
+      "prompt_index": 0,
+      "content_filter_results": {
+        "hate": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "self_harm": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "sexual": {
+          "filtered": false,
+          "severity": "safe"
+        },
+        "violence": {
+          "filtered": false,
+          "severity": "safe"
+        }
+      }
+    }
+  ]
+}
+Microsoft was founded by Bill Gates and Paul Allen. They established the company on April 4, 1975. Bill Gates served as the CEO of Microsoft until 2000 and later as Chairman and Chief Software Architect until his retirement in 2008, while Paul Allen left the company in 1983 but remained on the board of directors until 2000.
+```
+
+---
+
> [!NOTE]
> The following parameters aren't available with the new GPT-35-Turbo and GPT-4 models: `logprobs`, `best_of`, and `echo`. If you set any of these parameters, you'll get an error.
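All of the updated samples in this commit read the resource endpoint and key from the `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_KEY` environment variables. A minimal pre-flight check, sketched here for illustration rather than taken from the commit, can fail fast when either variable is missing:

```python
import os

# Sketch: verify the environment variables the samples above rely on.
# The variable names match those used throughout this commit.
missing = [name for name in ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY") if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```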
@@ -205,13 +304,16 @@ The examples so far have shown you the basic mechanics of interacting with the C

This means that every time a new question is asked, a running transcript of the conversation so far is sent along with the latest question. Since the model has no memory, you need to send an updated transcript with each new question or the model will lose context of the previous questions and answers.

-```Python
+
+# [OpenAI Python 0.28.1](#tab/python)
+
+```python
import os
import openai
openai.api_type = "azure"
openai.api_version = "2023-05-15"
-openai.api_base = os.getenv("OPENAI_API_BASE") # Your Azure OpenAI resource's endpoint value .
-openai.api_key = os.getenv("OPENAI_API_KEY")
+openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # Your Azure OpenAI resource's endpoint value.
+openai.api_key = os.getenv("AZURE_OPENAI_KEY")

conversation=[{"role": "system", "content": "You are a helpful assistant."}]

@@ -220,14 +322,43 @@ while True:
    conversation.append({"role": "user", "content": user_input})

    response = openai.ChatCompletion.create(
-        engine="gpt-3.5-turbo", # The deployment name you chose when you deployed the GPT-35-turbo or GPT-4 model.
+        engine="gpt-35-turbo", # The deployment name you chose when you deployed the GPT-35-turbo or GPT-4 model.
        messages=conversation
    )

    conversation.append({"role": "assistant", "content": response["choices"][0]["message"]["content"]})
    print("\n" + response['choices'][0]['message']['content'] + "\n")
```

+# [OpenAI Python 1.0](#tab/python-new)
+
+```python
+import os
+from openai import AzureOpenAI
+
+client = AzureOpenAI(
+    api_key = os.getenv("AZURE_OPENAI_KEY"),
+    api_version = "2023-05-15",
+    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") # Your Azure OpenAI resource's endpoint value.
+)
+
+conversation=[{"role": "system", "content": "You are a helpful assistant."}]
+
+while True:
+    user_input = input("Q:")
+    conversation.append({"role": "user", "content": user_input})
+
+    response = client.chat.completions.create(
+        model="gpt-35-turbo", # model = "deployment_name".
+        messages=conversation
+    )
+
+    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
+    print("\n" + response.choices[0].message.content + "\n")
+```
+
+---
+
When you run the code above, you will get a blank console window. Enter your first question in the window and then hit Enter. Once the response is returned, you can repeat the process and keep asking questions.

## Managing conversations
@@ -241,7 +372,9 @@ It's your responsibility to ensure the prompt and completion falls within the to

The following code sample shows a simple chat loop example with a technique for handling a 4096 token count using OpenAI's tiktoken library.

-The code requires tiktoken `0.3.0`. If you have an older version run `pip install tiktoken --upgrade`.
+The code uses tiktoken `0.5.1`. If you have an older version, run `pip install tiktoken --upgrade`.
+
+# [OpenAI Python 0.28.1](#tab/python)

```python
import tiktoken
@@ -250,8 +383,8 @@ import os

openai.api_type = "azure"
openai.api_version = "2023-05-15"
-openai.api_base = os.getenv("OPENAI_API_BASE") # Your Azure OpenAI resource's endpoint value.
-openai.api_key = os.getenv("OPENAI_API_KEY")
+openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # Your Azure OpenAI resource's endpoint value.
+openai.api_key = os.getenv("AZURE_OPENAI_KEY")

system_message = {"role": "system", "content": "You are a helpful assistant."}
max_response_tokens = 250
@@ -319,6 +452,87 @@ while True:
    print("\n" + response['choices'][0]['message']['content'] + "\n")
```

+# [OpenAI Python 1.0](#tab/python-new)
+
+```python
+import tiktoken
+import os
+from openai import AzureOpenAI
+
+client = AzureOpenAI(
+    api_key = os.getenv("AZURE_OPENAI_KEY"),
+    api_version = "2023-05-15",
+    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") # Your Azure OpenAI resource's endpoint value.
+)
+
+system_message = {"role": "system", "content": "You are a helpful assistant."}
+max_response_tokens = 250
+token_limit = 4096
+conversation = []
+conversation.append(system_message)
+
+def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0613"):
+    """Return the number of tokens used by a list of messages."""
+    try:
+        encoding = tiktoken.encoding_for_model(model)
+    except KeyError:
+        print("Warning: model not found. Using cl100k_base encoding.")
+        encoding = tiktoken.get_encoding("cl100k_base")
+    if model in {
+        "gpt-3.5-turbo-0613",
+        "gpt-3.5-turbo-16k-0613",
+        "gpt-4-0314",
+        "gpt-4-32k-0314",
+        "gpt-4-0613",
+        "gpt-4-32k-0613",
+    }:
+        tokens_per_message = 3
+        tokens_per_name = 1
+    elif model == "gpt-3.5-turbo-0301":
+        tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n
+        tokens_per_name = -1  # if there's a name, the role is omitted
+    elif "gpt-3.5-turbo" in model:
+        print("Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.")
+        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0613")
+    elif "gpt-4" in model:
+        print("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
+        return num_tokens_from_messages(messages, model="gpt-4-0613")
+    else:
+        raise NotImplementedError(
+            f"""num_tokens_from_messages() is not implemented for model {model}. See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens."""
+        )
+    num_tokens = 0
+    for message in messages:
+        num_tokens += tokens_per_message
+        for key, value in message.items():
+            num_tokens += len(encoding.encode(value))
+            if key == "name":
+                num_tokens += tokens_per_name
+    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
+    return num_tokens
+
+while True:
+    user_input = input("Q:")
+    conversation.append({"role": "user", "content": user_input})
+    conv_history_tokens = num_tokens_from_messages(conversation)
+
+    while conv_history_tokens + max_response_tokens >= token_limit:
+        del conversation[1]
+        conv_history_tokens = num_tokens_from_messages(conversation)
+
+    response = client.chat.completions.create(
+        model="gpt-35-turbo", # model = "deployment_name".
+        messages=conversation,
+        temperature=0.7,
+        max_tokens=max_response_tokens
+    )
+
+    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
+    print("\n" + response.choices[0].message.content + "\n")
+```
+
+---
+
In this example, once the token count is reached, the oldest messages in the conversation transcript will be removed. `del` is used instead of `pop()` for efficiency, and we start at index 1 so as to always preserve the system message and only remove user/assistant messages. Over time, this method of managing the conversation can cause the conversation quality to degrade as the model will gradually lose context of the earlier portions of the conversation.

An alternative approach is to limit the conversation duration to the max token length or a certain number of turns. Once the max token limit is reached and the model would lose context if you were to allow the conversation to continue, you can prompt the user that they need to begin a new conversation and clear the messages list to start a brand new conversation with the full token limit available.
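A minimal sketch of that alternative approach, shown here for illustration rather than taken from the commit, assuming the same `client`, `system_message`, `num_tokens_from_messages`, `max_response_tokens`, and `token_limit` as in the OpenAI Python 1.0 sample above:

```python
# Sketch: instead of trimming old turns, prompt the user and start a fresh
# conversation once the next reply would no longer fit in the token budget.
# Assumes client, system_message, num_tokens_from_messages, max_response_tokens,
# and token_limit are defined as in the OpenAI Python 1.0 sample above.
conversation = [system_message]

while True:
    user_input = input("Q:")
    conversation.append({"role": "user", "content": user_input})

    if num_tokens_from_messages(conversation) + max_response_tokens >= token_limit:
        print("Token limit reached. Starting a new conversation.")
        conversation = [system_message, {"role": "user", "content": user_input}]

    response = client.chat.completions.create(
        model="gpt-35-turbo",  # model = "deployment_name".
        messages=conversation,
        max_tokens=max_response_tokens
    )

    conversation.append({"role": "assistant", "content": response.choices[0].message.content})
    print("\n" + response.choices[0].message.content + "\n")
```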
