articles/cognitive-services/openai/concepts/models.md
Lines changed: 12 additions & 12 deletions
@@ -19,8 +19,8 @@ Azure OpenAI provides access to many different models, grouped by family and cap
 | Model family | Description |
 |--|--|
-|[GPT-4](#gpt-4-models)| A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code.|
-|[GPT-3](#gpt-3-models)| A series of models that can understand and generate natural language. This includes the new [ChatGPT model](#chatgpt-gpt-35-turbo). |
+|[GPT-4](#gpt-4-models)| A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code. These models are currently in preview.|
+|[GPT-3](#gpt-3-models)| A series of models that can understand and generate natural language. This includes the new [ChatGPT model (preview)](#chatgpt-gpt-35-turbo). |
 |[Codex](#codex-models)| A series of models that can understand and generate code, including translating natural language to code. |
 |[Embeddings](#embeddings-models)| A set of models that can understand and use embeddings. An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. Currently, we offer three families of Embeddings models for different functionalities: similarity, text search, and code search. |
@@ -56,15 +56,15 @@ You can get a list of models that are available for both inference and fine-tuni
 We recommend starting with the most capable model in a model family to confirm whether the model capabilities meet your requirements. Then you can stay with that model or move to a model with lower capability and cost, optimizing around that model's capabilities.

-## GPT-4 models (limited preview)
+## GPT-4 models (preview)

-GPT-4 is a large multimodal model meaning while it currently accepts text inputs and emits text outputs. It will eventually be able to accept image inputs as well. GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like gpt-35-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks.
+GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like gpt-35-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks.

-These models are currently in limited preview. For access, existing Azure OpenAI customers can [apply by filling out this form](https://aka.ms/oai/get-gpt4).
+These models are currently in preview. For access, existing Azure OpenAI customers can [apply by filling out this form](https://aka.ms/oai/get-gpt4).
 -`gpt-4`
 -`gpt-4-32k`

-The `gpt-4` supports 8192 max input tokens and the `gpt-4-32k` supports up to 32,768. The full name of the model will also indicate version so the first set of models are named `gpt-4-0314`, and `gpt-4-32k-0314`.
+The `gpt-4` supports 8192 max input tokens and the `gpt-4-32k` supports up to 32,768 tokens.

 ## GPT-3 models
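The two token limits quoted in this hunk lend themselves to a simple lookup. The following is a minimal sketch, not part of the changed file: the dictionary and the helper name `smallest_model_for` are hypothetical, and only the two limits (8,192 and 32,768) come from the text above.

```python
# Token limits as stated in the hunk above; the names here are illustrative only.
MAX_TOKENS_BY_MODEL = {
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}


def smallest_model_for(required_tokens: int) -> str:
    """Return the model with the smallest limit that still covers required_tokens."""
    # Walk the models from the smallest limit to the largest.
    for model, limit in sorted(MAX_TOKENS_BY_MODEL.items(), key=lambda kv: kv[1]):
        if required_tokens <= limit:
            return model
    raise ValueError(f"{required_tokens} tokens exceeds every known limit")
```

Choosing the smallest model that fits mirrors the recommendation above to move to a lower-cost model once the capabilities are confirmed.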
@@ -103,7 +103,7 @@ Ada is usually the fastest model and can perform tasks like parsing text, addres
The ChatGPT model (gpt-35-turbo) is a language model designed for conversational interfaces and the model behaves differently than previous GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the ChatGPT model is conversation-in and message-out. The model expects a prompt string formatted in a specific chat-like transcript format, and returns a completion that represents a model-written message in the chat.
@@ -186,20 +186,20 @@ When using our embeddings models, keep in mind their limitations and risks.
 | text-davinci-002 | Yes | No | East US, South Central US, West Europe | N/A | 4,097 | Jun 2021 |
 | text-davinci-003 | Yes | No | East US, West Europe | N/A | 4,097 | Jun 2021 |
 | text-davinci-fine-tune-002<sup>1</sup> | Yes | No | N/A | East US, West Europe<sup>2</sup> |||
-| gpt-35-turbo<sup>3</sup> (ChatGPT) | Yes | No | East US, South Central US | N/A | 4,096 | Sep 2021
+| gpt-35-turbo<sup>3</sup> (ChatGPT) (preview) | Yes | No | East US, South Central US | N/A | 4,096 | Sep 2021

 <sup>1</sup> The model is available by request only. Currently we aren't accepting new requests to use the model.
 <br><sup>2</sup> East US and West Europe are currently unavailable for new customers to fine-tune due to high demand. Please use US South Central region for fine-tuning.
-<br><sup>3</sup> Currently, only version `"0301"` of this model is available. This version of the model will be deprecated on 8/1/2023 in favor of newer version of the gpt-35-model. See [ChatGPT model versioning](../how-to/chatgpt.md#model-versioning) for more details.
+<br><sup>3</sup> Currently, only version `0301` of this model is available. This version of the model will be deprecated on 8/1/2023 in favor of newer version of the gpt-35-model. See [ChatGPT model versioning](../how-to/chatgpt.md#model-versioning) for more details.

 ### GPT-4 Models

 | Model ID | Supports Completions | Supports Embeddings | Base model Regions | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
articles/cognitive-services/openai/includes/chat-completion.md
Lines changed: 10 additions & 10 deletions
@@ -12,11 +12,11 @@ keywords: ChatGPT
 ---

-## Working with the ChatGPT and GPT-4 models
+## Working with the ChatGPT and GPT-4 models (preview)

-The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with the ChatCompletion API. If this is your first time using these models programmatically, we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).
+The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with the Chat Completion API. If this is your first time using these models programmatically, we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).

-**GPT-4 models are currently in limited preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).
+**GPT-4 models are currently in preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).

 ```python
 import os
@@ -86,13 +86,13 @@ Consider setting `max_tokens` to a slightly higher value than normal such as 300
 Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as the `gpt-4` and `gpt-4-32k` models will continue to be updated. When creating a [deployment](../how-to/create-resource.md#deploy-a-model) of these models, you'll also need to specify a model version.

-Currently, only version `"0301"` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.
+Currently, only version `0301` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.

-## Working with the ChatCompletion API
+## Working with the Chat Completion API

 OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation. The messages parameter takes an array of dictionaries with a conversation organized by role.

-The format of a basic ChatCompletion is as follows:
+The format of a basic Chat Completion is as follows:

 ```
 {"role": "system", "content": "Provide some context and/or instructions to the model"},
@@ -169,7 +169,7 @@ Context:
 {"role": "user", "content": "What is Azure OpenAI Service?"}
 ```

-#### Few shot learning with ChatCompletion
+#### Few shot learning with Chat Completion

 You can also give few shot examples to the model. The approach for few shot learning has changed slightly because of the new prompt format. You can now include a series of messages between the user and the assistant in the prompt as few shot examples. These examples can be used to seed answers to common questions to prime the model or teach particular behaviors to the model.
@@ -183,9 +183,9 @@ This is only one example of how you can use few shot learning with ChatGPT and G
 {"role": "assistant", "content": "You can check the status of your tax refund by visiting https://www.irs.gov/refunds"},
 ```

-#### Using ChatCompletion for non-chat scenarios
+#### Using Chat Completion for non-chat scenarios

-The ChatCompletion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.
+The Chat Completion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.

 For example, for an entity extraction scenario, you might use the following prompt:
@@ -201,7 +201,7 @@ For example, for an entity extraction scenario, you might use the following prom
 ## Creating a basic conversation loop

-The examples so far have shown you the basic mechanics of interacting with the ChatCompletion API. This example shows you how to create a conversation loop that performs the following actions:
+The examples so far have shown you the basic mechanics of interacting with the Chat Completion API. This example shows you how to create a conversation loop that performs the following actions:

 - Continuously takes console input, and properly formats it as part of the messages array as user role content.
 - Outputs responses that are printed to the console and formatted and added to the messages array as assistant role content.
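The two actions listed above can be sketched without calling the service at all. This is a minimal illustration, not the example from the changed file: `run_conversation`, `get_reply`, and `echo_reply` are hypothetical names, and `get_reply` stands in for whatever function wraps the Chat Completion call in your application.

```python
from typing import Callable, Dict, Iterable, List

Message = Dict[str, str]


def run_conversation(
    user_inputs: Iterable[str],
    get_reply: Callable[[List[Message]], str],
    system_prompt: str = "You are a helpful assistant.",
) -> List[Message]:
    """Feed each user input to get_reply and keep the full message history."""
    messages: List[Message] = [{"role": "system", "content": system_prompt}]
    for text in user_inputs:
        # Format the input as user role content.
        messages.append({"role": "user", "content": text})
        # get_reply is a placeholder for the real model call (an assumption).
        reply = get_reply(messages)
        print(reply)
        # Add the response to the messages array as assistant role content.
        messages.append({"role": "assistant", "content": reply})
    return messages


def echo_reply(messages: List[Message]) -> str:
    """Stand-in model that repeats the last user message."""
    return f"You said: {messages[-1]['content']}"


history = run_conversation(["Hello", "What is Azure OpenAI Service?"], echo_reply)
```

For console input, `iter(input, "quit")` could be passed as `user_inputs`, which keeps reading lines until the user types `quit`.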
articles/cognitive-services/openai/includes/chat-markup-language.md
Lines changed: 10 additions & 12 deletions
@@ -11,14 +11,12 @@ manager: nitinme
 keywords: ChatGPT
 ---

-## Working with the ChatGPT and GPT-4 models
+## Working with the ChatGPT models (preview)

 > [!NOTE]
-> The ChatCompletion API is the recommended method of interacting with the ChatGPT and GPT-4 models.
+> The Chat Completion API is the recommended method of interacting with the ChatGPT (gpt-35-turbo) models.

-The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with ChatML. If this is your first time using these models programmatically we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).
-
-**GPT-4 models are currently in limited preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).
+The following code snippet shows the most basic way to use the ChatGPT models with ChatML. If this is your first time using these models programmatically we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).
-engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
+engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT model
 prompt="<|im_start|>system\nAssistant is a large language model trained by OpenAI.\n<|im_end|>\n<|im_start|>user\nWhat's the difference between garbanzo beans and chickpeas?\n<|im_end|>\n<|im_start|>assistant\n",
 temperature=0,
 max_tokens=500,
@@ -53,16 +51,16 @@ Consider setting `max_tokens` to a slightly higher value than normal such as 300
 Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as the `gpt-4` and `gpt-4-32k` models will continue to be updated. When creating a [deployment](../how-to/create-resource.md#deploy-a-model) of these models, you'll also need to specify a model version.

-Currently, only version `"0301"` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.
+Currently, only version `0301` is available for ChatGPT. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.

 <a id="chatml"></a>

 ## Working with Chat Markup Language (ChatML)

 > [!NOTE]
-> OpenAI continues to improve the ChatGPT and GPT-4 models and the Chat Markup Language used with the models will continue to evolve in the future. We'll keep this document updated with the latest information.
+> OpenAI continues to improve the ChatGPT model, and the Chat Markup Language used with the model will continue to evolve in the future. We'll keep this document updated with the latest information.

-OpenAI trained the ChatGPT and GPT-4 models on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model followed by a series of messages between the user and the assistant.
+OpenAI trained the ChatGPT model on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model followed by a series of messages between the user and the assistant.

 The format of a basic ChatML prompt is as follows:
@@ -211,7 +209,7 @@ You can also provide instructions in the system message to guide the model on ho
 ## Managing conversations

-The token limit for `gpt-35-turbo` is 4096 tokens, whereas the token limits for `gpt-4` and `gpt-4-32k` are 8192 and 32768 respectively. This limit includes the token count from both the prompt and completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.
+The token limit for `gpt-35-turbo` is 4096 tokens. This limit includes the token count from both the prompt and completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.

 It’s your responsibility to ensure the prompt and completion falls within the token limit. This means that for longer conversations, you need to keep track of the token count and only send the model a prompt that falls within the token limit.
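The budget rule in this hunk is a single comparison, so it is easy to check before sending a request. The sketch below is illustrative only: `fits_token_limit` is a hypothetical helper, the prompt token count is assumed to come from whatever tokenizer you use, and the strict comparison follows the "must stay under 4096" wording conservatively.

```python
TOKEN_LIMIT = 4096  # limit stated above for gpt-35-turbo


def fits_token_limit(prompt_tokens: int, max_tokens: int, limit: int = TOKEN_LIMIT) -> bool:
    """Return True when prompt tokens plus max_tokens stay strictly under the limit."""
    # Strictly "under" the limit, taken literally from the text (a conservative reading).
    return prompt_tokens + max_tokens < limit
```

For example, a 3,000-token prompt with `max_tokens=500` passes the check, while a 3,800-token prompt with the same `max_tokens` does not.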
@@ -241,7 +239,7 @@ system_message = f"<|im_start|>system\n{'<your system message>'}\n<|im_end|>"
-The simplest approach to staying under the token limit is to truncate the oldest messages in the conversation when you reach the token limit.
+The simplest approach to staying under the token limit is to remove the oldest messages in the conversation when you reach the token limit.

 You can choose to always include as many tokens as possible while staying under the limit or you could always include a set number of previous messages assuming those messages stay within the limit. It's important to keep in mind that longer prompts take longer to generate a response and incur a higher cost than shorter prompts.
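Removing the oldest messages first can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the code from the changed file: `trim_oldest` and `approximate_token_count` are hypothetical names, messages are plain strings, and the word-based counter is a crude stand-in for a real tokenizer, which will give different counts.

```python
from typing import Callable, List


def approximate_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_oldest(
    system_message: str,
    messages: List[str],
    max_prompt_tokens: int,
    count_tokens: Callable[[str], int] = approximate_token_count,
) -> List[str]:
    """Drop the oldest messages until the system message plus the rest fit the budget."""
    kept = list(messages)  # copy so the caller's list is left untouched

    def total() -> int:
        return count_tokens(system_message) + sum(count_tokens(m) for m in kept)

    # The system message is always kept; only conversation messages are removed.
    while kept and total() > max_prompt_tokens:
        kept.pop(0)
    return kept
```

Keeping the system message and trimming from the front preserves the most recent turns, which is the first of the two options described above.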