Commit a425274

Merge pull request #229864 from dereklegenzoff/delegenz-chatgpt

Updating the chatgpt how-to

2 parents ebebd2a + ada2d80

File tree

2 files changed: +74 −33 lines changed

articles/cognitive-services/openai/how-to/chatgpt.md

Lines changed: 73 additions & 32 deletions
@@ -1,29 +1,26 @@
 ---
-title: How to work with the ChatGPT model (preview)
+title: How to work with the Chat Markup Language (preview)
 titleSuffix: Azure OpenAI
-description: Learn how to work with the ChatGPT model (preview)
+description: Learn how to work with Chat Markup Language (preview)
 author: dereklegenzoff
 ms.author: delegenz
 ms.service: cognitive-services
 ms.topic: conceptual
-ms.date: 03/01/2023
+ms.date: 03/09/2023
 manager: nitinme
 keywords: ChatGPT
 ---

-# Learn how to work with the ChatGPT model (preview)
+# Learn how to work with Chat Markup Language (preview)

-The ChatGPT model (gpt-35-turbo) is a language model designed for conversational interfaces and the model behaves differently than previous GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the ChatGPT model is conversation-in and message-out. The model expects a prompt string formatted in a specific chat-like transcript format, and returns a completion that represents a model-written message in the chat.
+The ChatGPT model (`gpt-35-turbo`) is a language model designed for conversational interfaces, and it behaves differently from previous GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. The ChatGPT model, however, is conversation-in and message-out. The model expects a prompt string formatted in a specific chat-like transcript format, and returns a completion that represents a model-written message in the chat. While the prompt format was designed specifically for multi-turn conversations, you'll find it can also work well for non-chat scenarios.

-The ChatGPT model uses the same [completion API](/azure/cognitive-services/openai/reference#completions) that you use for other models like text-davinci-002, but it requires a unique prompt format. It's important to use the new prompt format to get the best results. Without the right prompts, the model tends to be verbose and provides less useful responses.
+The ChatGPT model can be used with the same [completion API](/azure/cognitive-services/openai/reference#completions) that you use for other models like text-davinci-002, but it requires a unique prompt format known as Chat Markup Language (ChatML). It's important to use the new prompt format to get the best results. Without the right prompts, the model tends to be verbose and provide less useful responses.

 ## Working with the ChatGPT model

 The following code snippet shows the most basic way to use the ChatGPT model. We also have a UI-driven experience that you can learn about in the [ChatGPT Quickstart](../chatgpt-quickstart.md).

-> [!NOTE]
-> OpenAI continues to improve the ChatGPT model and release new versions. During the preview of this model, we'll continue updating to the latest version of the model in place. This means that you may see small changes in the behavior of the model during the preview.
-
 ```python
 import os
 import openai
@@ -36,25 +33,40 @@ response = openai.Completion.create(
   engine="gpt-35-turbo",
   prompt="<|im_start|>system\nAssistant is a large language model trained by OpenAI.\n<|im_end|>\n<|im_start|>user\nWhat's the difference between garbanzo beans and chickpeas?\n<|im_end|>\n<|im_start|>assistant\n",
   temperature=0,
-  max_tokens=800,
+  max_tokens=500,
   top_p=0.5,
   stop=["<|im_end|>"])

 print(response['choices'][0]['text'])
 ```
+> [!NOTE]
+> The following parameters aren't available with the gpt-35-turbo model: `logprobs`, `best_of`, and `echo`. If you set any of these parameters to a value other than their default, you'll get an error.
+
+The `<|im_end|>` token indicates the end of a message. We recommend including the `<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of a message. You can read more about the special tokens in the [Chat Markup Language (ChatML)](#chatml) section.
+
+Consider setting `max_tokens` to a slightly higher value than normal, such as 300 or 500. This ensures that the model doesn't stop generating text before it reaches the end of the message.
+
+## Model versioning
+
+> [!NOTE]
+> `gpt-35-turbo` is equivalent to the `gpt-3.5-turbo` model from OpenAI.
+
+Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model will continue to be updated. When creating a [deployment](./create-resource.md#deploy-a-model) of `gpt-35-turbo`, you'll also need to specify a model version.

-The `<|im_end|>` token indicates the end of a message. We recommend including `<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of the message. When you include the `<|im_end|>` token as a stop sequence, this ensures that the model stops generating text when it reaches the end of the message.
+Currently, only version `"0301"` is available. This is equivalent to the `gpt-3.5-turbo-0301` model from OpenAI. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.

-Consider setting `max_tokens` to a slightly higher value than normal such as 500 or 800. This ensures that the model doesn't stop generating text before it reaches the end of the message.
+It's important to note that Chat Markup Language (ChatML) will continue to evolve with new versions of the model. You may need to update your prompts when you upgrade to a new version of the model.

-## ChatGPT prompt format
+<a id="chatml"></a>
+
+## Working with Chat Markup Language (ChatML)

 > [!NOTE]
-> OpenAI continues to improve the ChatGPT model and the prompt format may change or evolve in the future. We'll keep this document updated with the latest information.
+> OpenAI continues to improve the `gpt-35-turbo` model, and the Chat Markup Language used with the model will continue to evolve. We'll keep this document updated with the latest information.

-OpenAI trained the ChatGPT model on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model followed by a series of messages between the user and the assistant.
+OpenAI trained the `gpt-35-turbo` model on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model, followed by a series of messages between the user and the assistant.

-When starting a conversation, you should have a prompt that looks similar to the following code block:
+The format of a basic ChatML prompt is as follows:

 ```
 <|im_start|>system
@@ -68,12 +80,12 @@ The user's message goes here

 ### System message

-The system message is included at the beginning of the prompt between the `<|im_start|>system` and `<|im_end|>` tokens. This message provides the initial instructions to the model. You can provide a variety of information including:
+The system message is included at the beginning of the prompt, between the `<|im_start|>system` and `<|im_end|>` tokens. This message provides the initial instructions to the model. You can provide various kinds of information in the system message, including:

 * A brief description of the assistant
-* The personality of the assistant
-* Instructions for the assistant
-* Data or information needed for the model
+* Personality traits of the assistant
+* Instructions or rules you would like the assistant to follow
+* Data or information needed for the model, such as relevant questions from an FAQ

 You can customize the system message for your use case or just include a basic system message. The system message is optional, but it's recommended to at least include a basic one to get the best results.

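As a sketch of how the special tokens and system message fit together in code, the following illustrative helper assembles a ChatML prompt string programmatically. This is not from the article; the function name `build_chatml_prompt` and its structure are assumptions for demonstration only.

```python
# Illustrative sketch (not from the article): assemble a basic ChatML prompt
# from a system message and a single user message. The prompt ends with an
# open assistant turn so the model's completion becomes the assistant reply.

def build_chatml_prompt(system_message: str, user_message: str) -> str:
    """Build a basic ChatML prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}\n<|im_end|>\n"
        f"<|im_start|>user\n{user_message}\n<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "Assistant is a large language model trained by OpenAI.",
    "What's the difference between garbanzo beans and chickpeas?",
)
print(prompt)
```

A string built this way could be passed as the `prompt` argument of the completion call shown earlier, together with `stop=["<|im_end|>"]`.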
@@ -120,14 +132,14 @@ Instructions:
 - If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information.
 <|im_end|>
 <|im_start|>user
-What is the IRS?
+When are my taxes due?
 <|im_end|>
 <|im_start|>assistant
 ```

 #### Using data for grounding

-You can also include relevant data or information in the system message to give the model additional context for the conversation. If you only need to include a small amount of information, you can hard code it in the system message. If you have a large amount of data that the model should be aware of, you can use [embeddings](/azure/cognitive-services/openai/tutorials/embeddings?tabs=command-line) or a product like [Azure Cognitive Search](https://azure.microsoft.com/services/search/) to retrieve the most relevant information at query time.
+You can also include relevant data or information in the system message to give the model extra context for the conversation. If you only need to include a small amount of information, you can hard code it in the system message. If you have a large amount of data that the model should be aware of, you can use [embeddings](/azure/cognitive-services/openai/tutorials/embeddings?tabs=command-line) or a product like [Azure Cognitive Search](https://azure.microsoft.com/services/search/) to retrieve the most relevant information at query time.

 ```
 <|im_start|>system
@@ -142,12 +154,11 @@ Context:
 What is Azure OpenAI Service?
 <|im_end|>
 <|im_start|>assistant
-
 ```

-#### Few shot learning with ChatGPT
+#### Few shot learning with ChatML

-You can also give few shot examples to the model. The approach for few shot learning has changed slightly because of the new prompt format. You can now include a series of messages between the user and the assistant in the prompt as few shot examples. These examples can be used to seed answers to common questions to prime the model.
+You can also give few shot examples to the model. The approach for few shot learning has changed slightly because of the new prompt format. You can now include a series of messages between the user and the assistant in the prompt as few shot examples. These examples can be used to seed answers to common questions to prime the model or teach particular behaviors to the model.

 This is only one example of how you can use few shot learning with ChatGPT. You can experiment with different approaches to see what works best for your use case.

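One way to picture few shot examples in ChatML is to interleave example user/assistant exchanges between the system message and the live user message. The following sketch is illustrative, not from the article; the helper name `build_few_shot_prompt` is hypothetical.

```python
# Illustrative sketch (hypothetical helper, not from the article): interleave
# few shot (user, assistant) example pairs into a ChatML prompt, ending with
# the live user message and an open assistant turn.

def build_few_shot_prompt(system_message, examples, user_message):
    """examples is a list of (user_text, assistant_text) pairs."""
    parts = [f"<|im_start|>system\n{system_message}\n<|im_end|>"]
    for user_text, assistant_text in examples:
        parts.append(f"<|im_start|>user\n{user_text}\n<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant_text}\n<|im_end|>")
    parts.append(f"<|im_start|>user\n{user_message}\n<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)
```

Because the examples look exactly like real conversation turns, the model treats them as prior exchanges and tends to imitate their style and content when answering the final message.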
@@ -169,9 +180,40 @@ You can check the status of your tax refund by visiting https://www.irs.gov/refu
 <|im_end|>
 ```

+#### Using Chat Markup Language for non-chat scenarios
+
+ChatML is designed to make multi-turn conversations easier to manage, but it also works well for non-chat scenarios.
+
+For example, for an entity extraction scenario, you might use the following prompt:
+
+```
+<|im_start|>system
+You are an assistant designed to extract entities from text. Users will paste in a string of text and you will respond with entities you've extracted from the text as a JSON object. Here's an example of your output format:
+{
+   "name": "",
+   "company": "",
+   "phone_number": ""
+}
+<|im_end|>
+<|im_start|>user
+Hello. My name is Robert Smith. I'm calling from Contoso Insurance, Delaware. My colleague mentioned that you are interested in learning about our comprehensive benefits policy. Could you give me a call back at (555) 346-9322 when you get a chance so we can go over the benefits?
+<|im_end|>
+<|im_start|>assistant
+```
+
+## Preventing unsafe user inputs
+
+It's important to add mitigations to your application to ensure safe use of the Chat Markup Language.
+
+We recommend that you prevent end users from including special tokens such as `<|im_start|>` and `<|im_end|>` in their input. We also recommend that you include additional validation to ensure the prompts you're sending to the model are well formed and follow the Chat Markup Language format as described in this document.
+
+You can also provide instructions in the system message to guide the model on how to respond to certain types of user inputs. For example, you can instruct the model to only reply to messages about a certain subject. You can also reinforce this behavior with few shot examples.
+
 ## Managing conversations with ChatGPT

-The token limit of the ChatGPT model is 4096 tokens. This limit includes the token count from both the prompt and completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.
+The token limit for `gpt-35-turbo` is 4096 tokens. This limit includes the token count from both the prompt and the completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.

 It's your responsibility to ensure the prompt and completion fall within the token limit. This means that for longer conversations, you need to keep track of the token count and only send the model a prompt that falls within the limit.

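A minimal way to implement the input validation suggested above is to reject any user input that contains the ChatML special tokens before it is spliced into a prompt. This is an illustrative sketch, not from the article; the function name `is_safe_user_input` is hypothetical, and rejecting (rather than silently rewriting) suspicious input is one possible policy.

```python
# Illustrative sketch (hypothetical helper, not from the article): reject
# user input that tries to smuggle ChatML special tokens into the prompt,
# which could let a user impersonate the system or assistant turns.

SPECIAL_TOKENS = ("<|im_start|>", "<|im_end|>")

def is_safe_user_input(text: str) -> bool:
    """Return False if the input contains any ChatML special token."""
    return not any(token in text for token in SPECIAL_TOKENS)
```

An application would call this check on each incoming message and refuse to build a prompt from input that fails it, alongside any other well-formedness validation.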
@@ -194,16 +236,16 @@ def create_prompt(system_message, messages):
     return prompt

 # defining the user input and the system message
-user_input = "--the user's message--" # allow user input
-system_message = f"<|im_start|>system\n{'--your system message--'}\n<|im_end|>"
+user_input = "<your user input>"
+system_message = f"<|im_start|>system\n{'<your system message>'}\n<|im_end|>"

 # creating a list of messages to track the conversation
 messages = [{"sender": "user", "text": user_input}]

 response = openai.Completion.create(
   engine="gpt-35-turbo",
   prompt=create_prompt(system_message, messages),
-  temperature=1,
+  temperature=0.5,
   max_tokens=250,
   top_p=0.9,
   frequency_penalty=0,
@@ -221,7 +263,7 @@ The simplest approach to staying under the token limit is to truncate the oldest

 You can choose to always include as many tokens as possible while staying under the limit, or you can always include a set number of previous messages, assuming those messages stay within the limit. It's important to keep in mind that longer prompts take longer to generate a response and incur a higher cost than shorter prompts.

-You can estimate the number of tokens in a string by using the [tiktoken](https://github.com/openai/tiktoken) Python library. While the exact encoding used by ChatGPT isn't supported yet in tiktoken, you can recreate it yourself by building off of the cl100k_base encoding.
+You can estimate the number of tokens in a string by using the [tiktoken](https://github.com/openai/tiktoken) Python library as shown below.

 ```python
 import tiktoken
@@ -235,8 +277,7 @@ enc = tiktoken.Encoding(
     special_tokens={
         **cl100k_base._special_tokens,
         "<|im_start|>": 100264,
-        "<|im_end|>": 100265,
-        "<|im_sep|>": 100266,
+        "<|im_end|>": 100265
     }
 )

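The truncation strategy the article describes (drop the oldest messages until the conversation fits the token budget) can be sketched as below. This is illustrative code, not from the article: `count_tokens` is a crude whitespace-based stand-in so the sketch runs anywhere, and in practice you would plug in a real tokenizer such as the tiktoken encoder from the snippet above.

```python
# Illustrative sketch (not from the article): keep only the newest messages
# that fit within a token budget. count_tokens is a rough whitespace-based
# stand-in; substitute a real tokenizer (e.g. tiktoken) for accurate counts.

def count_tokens(text: str) -> int:
    return len(text.split())  # rough approximation only

def truncate_conversation(messages, budget, count=count_tokens):
    """Return the longest suffix of messages whose total token count fits."""
    kept = []
    total = 0
    for message in reversed(messages):  # walk newest to oldest
        cost = count(message["text"])
        if total + cost > budget:
            break  # including this older message would exceed the budget
        kept.append(message)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

The budget here would be the model's 4096-token limit minus the system message, the ChatML token overhead, and the `max_tokens` you reserve for the completion.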
articles/cognitive-services/openai/toc.yml

Lines changed: 1 addition & 1 deletion

@@ -32,7 +32,7 @@ items:
   items:
   - name: Create a resource
     href: ./how-to/create-resource.md
-  - name: ChatGPT
+  - name: Use Chat Markup Language (ChatML)
     href: ./how-to/chatgpt.md
   - name: Completions
     href: ./how-to/completions.md
