
Commit 997bacd

committed: update based on feedback
1 parent ac43be9 commit 997bacd

File tree: 3 files changed (+32 −34 lines changed)


articles/cognitive-services/openai/concepts/models.md

Lines changed: 12 additions & 12 deletions
@@ -19,8 +19,8 @@ Azure OpenAI provides access to many different models, grouped by family and cap
 | Model family | Description |
 |--|--|
-| [GPT-4](#gpt-4-models) | A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code.|
-| [GPT-3](#gpt-3-models) | A series of models that can understand and generate natural language. This includes the new [ChatGPT model](#chatgpt-gpt-35-turbo). |
+| [GPT-4](#gpt-4-models) | A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code. These models are currently in preview.|
+| [GPT-3](#gpt-3-models) | A series of models that can understand and generate natural language. This includes the new [ChatGPT model (preview)](#chatgpt-gpt-35-turbo). |
 | [Codex](#codex-models) | A series of models that can understand and generate code, including translating natural language to code. |
 | [Embeddings](#embeddings-models) | A set of models that can understand and use embeddings. An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. Currently, we offer three families of Embeddings models for different functionalities: similarity, text search, and code search. |

@@ -56,15 +56,15 @@ You can get a list of models that are available for both inference and fine-tuni

 We recommend starting with the most capable model in a model family to confirm whether the model capabilities meet your requirements. Then you can stay with that model or move to a model with lower capability and cost, optimizing around that model's capabilities.

-## GPT-4 models (limited preview)
+## GPT-4 models (preview)

-GPT-4 is a large multimodal model meaning while it currently accepts text inputs and emits text outputs. It will eventually be able to accept image inputs as well. GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like gpt-35-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks.
+GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like gpt-35-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks.

-These models are currently in limited preview. For access, existing Azure OpenAI customers can [apply by filling out this form](https://aka.ms/oai/get-gpt4).
+These models are currently in preview. For access, existing Azure OpenAI customers can [apply by filling out this form](https://aka.ms/oai/get-gpt4).
 - `gpt-4`
 - `gpt-4-32k`

-The `gpt-4` supports 8192 max input tokens and the `gpt-4-32k` supports up to 32,768. The full name of the model will also indicate version so the first set of models are named `gpt-4-0314`, and `gpt-4-32k-0314`.
+The `gpt-4` model supports 8,192 max input tokens and the `gpt-4-32k` model supports up to 32,768 tokens.

 ## GPT-3 models

@@ -103,7 +103,7 @@ Ada is usually the fastest model and can perform tasks like parsing text, addres

 **Use for**: Parsing text, simple classification, address correction, keywords

-### ChatGPT (gpt-35-turbo)
+### ChatGPT (gpt-35-turbo) (preview)

 The ChatGPT model (gpt-35-turbo) is a language model designed for conversational interfaces and the model behaves differently than previous GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the ChatGPT model is conversation-in and message-out. The model expects a prompt string formatted in a specific chat-like transcript format, and returns a completion that represents a model-written message in the chat.

@@ -186,20 +186,20 @@ When using our embeddings models, keep in mind their limitations and risks.
 | text-davinci-002 | Yes | No | East US, South Central US, West Europe | N/A | 4,097 | Jun 2021 |
 | text-davinci-003 | Yes | No | East US, West Europe | N/A | 4,097 | Jun 2021 |
 | text-davinci-fine-tune-002<sup>1</sup> | Yes | No | N/A | East US, West Europe<sup>2</sup> | | |
-| gpt-35-turbo<sup>3</sup> (ChatGPT) | Yes | No | East US, South Central US | N/A | 4,096 | Sep 2021 |
+| gpt-35-turbo<sup>3</sup> (ChatGPT) (preview) | Yes | No | East US, South Central US | N/A | 4,096 | Sep 2021 |

 <sup>1</sup> The model is available by request only. Currently we aren't accepting new requests to use the model.
 <br><sup>2</sup> East US and West Europe are currently unavailable for new customers to fine-tune due to high demand. Please use US South Central region for fine-tuning.
-<br><sup>3</sup> Currently, only version `"0301"` of this model is available. This version of the model will be deprecated on 8/1/2023 in favor of newer version of the gpt-35-model. See [ChatGPT model versioning](../how-to/chatgpt.md#model-versioning) for more details.
+<br><sup>3</sup> Currently, only version `0301` of this model is available. This version of the model will be deprecated on 8/1/2023 in favor of a newer version of the gpt-35-turbo model. See [ChatGPT model versioning](../how-to/chatgpt.md#model-versioning) for more details.

 ### GPT-4 Models

 | Model ID | Supports Completions | Supports Embeddings | Base model Regions | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
 | ----------------------- | -------------------- | ------------------- | ------------------------- | ------------------- | -------------------- | ---------------------- |
-| `gpt-4` <sup>1,</sup><sup>2</sup> | Yes | No | East US, South Central US | N/A | 8,192 | September 2021 |
-| `gpt-4-32k` <sup>1,</sup><sup>2</sup> | Yes | No | East US, South Central US | N/A | 32,768 | September 2021 |
+| `gpt-4` <sup>1,</sup><sup>2</sup> (preview) | Yes | No | East US, South Central US | N/A | 8,192 | September 2021 |
+| `gpt-4-32k` <sup>1,</sup><sup>2</sup> (preview) | Yes | No | East US, South Central US | N/A | 32,768 | September 2021 |

-<sup>1</sup> The model is in limited preview and only available by request.<br>
+<sup>1</sup> The model is in preview and only available by request.<br>
 <sup>2</sup> Currently, only version `0314` of this model is available.

 ### Codex Models

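As a quick illustration of the context windows listed in the tables above, a small client-side check can catch over-limit requests before they're sent. This is a hypothetical helper, not part of any Azure OpenAI SDK; the limits come from this article's model tables:

```python
# Hypothetical client-side helper; limits taken from the model tables above.
MAX_TOKENS_BY_MODEL = {
    "gpt-35-turbo": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}

def fits_in_context(model_id: str, prompt_tokens: int, max_tokens: int) -> bool:
    """Return True if the prompt tokens plus the requested completion budget
    stay within the model's context window."""
    limit = MAX_TOKENS_BY_MODEL[model_id]
    return prompt_tokens + max_tokens <= limit

# A 3,000-token prompt with max_tokens=500 fits gpt-35-turbo's 4,096 window...
print(fits_in_context("gpt-35-turbo", 3000, 500))  # True
# ...but a 4,000-token prompt with the same budget does not.
print(fits_in_context("gpt-35-turbo", 4000, 500))  # False
```

Counting the prompt's tokens is a separate problem; a tokenizer such as tiktoken is typically used for that.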
articles/cognitive-services/openai/includes/chat-completion.md

Lines changed: 10 additions & 10 deletions
@@ -12,11 +12,11 @@ keywords: ChatGPT

 ---

-## Working with the ChatGPT and GPT-4 models
+## Working with the ChatGPT and GPT-4 models (preview)

-The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with the ChatCompletion API. If this is your first time using these models programmatically, we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).
+The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with the Chat Completion API. If this is your first time using these models programmatically, we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).

-**GPT-4 models are currently in limited preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).
+**GPT-4 models are currently in preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).

 ```python
 import os
@@ -86,13 +86,13 @@ Consider setting `max_tokens` to a slightly higher value than normal such as 300

 Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as the `gpt-4` and `gpt-4-32k` models will continue to be updated. When creating a [deployment](../how-to/create-resource.md#deploy-a-model) of these models, you'll also need to specify a model version.

-Currently, only version `"0301"` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.
+Currently, only version `0301` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.

-## Working with the ChatCompletion API
+## Working with the Chat Completion API

 OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation. The messages parameter takes an array of dictionaries with a conversation organized by role.

-The format of a basic ChatCompletion is as follows:
+The format of a basic Chat Completion is as follows:

 ```
 {"role": "system", "content": "Provide some context and/or instructions to the model"},
@@ -169,7 +169,7 @@ Context:
 {"role": "user", "content": "What is Azure OpenAI Service?"}
 ```

-#### Few shot learning with ChatCompletion
+#### Few shot learning with Chat Completion

 You can also give few shot examples to the model. The approach for few shot learning has changed slightly because of the new prompt format. You can now include a series of messages between the user and the assistant in the prompt as few shot examples. These examples can be used to seed answers to common questions to prime the model or teach particular behaviors to the model.

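The few shot pattern described above can be sketched as a complete messages array: seeded user/assistant pairs come before the live question. The Q&A content below is illustrative, echoing the tax-assistant example used in this article:

```python
# A few shot messages array: primed user/assistant pairs precede the real question.
few_shot_messages = [
    {"role": "system", "content": "You are a helpful tax-assistant bot."},
    # Few shot examples that seed answers and teach the desired behavior.
    {"role": "user", "content": "When do I need to file my taxes by?"},
    {"role": "assistant", "content": "In 2023, you will need to file your taxes by April 18th."},
    # The actual question from the end user goes last.
    {"role": "user", "content": "How can I check the status of my tax refund?"},
]

# Every message carries a role and content, and the final turn is the user's.
assert all({"role", "content"} <= set(m) for m in few_shot_messages)
print(few_shot_messages[-1]["role"])  # user
```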
@@ -183,9 +183,9 @@ This is only one example of how you can use few shot learning with ChatGPT and G
 {"role": "assistant", "content": "You can check the status of your tax refund by visiting https://www.irs.gov/refunds"},
 ```

-#### Using ChatCompletion for non-chat scenarios
+#### Using Chat Completion for non-chat scenarios

-The ChatCompletion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.
+The Chat Completion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.

 For example, for an entity extraction scenario, you might use the following prompt:

@@ -201,7 +201,7 @@ For example, for an entity extraction scenario, you might use the following prom

 ## Creating a basic conversation loop

-The examples so far have shown you the basic mechanics of interacting with the ChatCompletion API. This example shows you how to create a conversation loop that performs the following actions:
+The examples so far have shown you the basic mechanics of interacting with the Chat Completion API. This example shows you how to create a conversation loop that performs the following actions:

 - Continuously takes console input, and properly formats it as part of the messages array as user role content.
 - Outputs responses that are printed to the console and formatted and added to the messages array as assistant role content.
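Those two bookkeeping steps can be sketched without the service call itself. Here the model response is stubbed with a placeholder function so the message-array handling is visible; in a real loop, the Chat Completion API would be called where `get_reply` stands:

```python
from typing import Callable, Dict, List

def run_turn(messages: List[Dict[str, str]],
             user_input: str,
             get_reply: Callable[[List[Dict[str, str]]], str]) -> str:
    """Append the user's input, fetch a reply, and record it as assistant content."""
    messages.append({"role": "user", "content": user_input})
    reply = get_reply(messages)  # stand-in for the Chat Completion call
    messages.append({"role": "assistant", "content": reply})
    return reply

# Stub responder so the sketch runs offline; a real loop would call the API here.
echo = lambda msgs: f"You said: {msgs[-1]['content']}"

conversation = [{"role": "system", "content": "You are a helpful assistant."}]
print(run_turn(conversation, "Hello!", echo))  # You said: Hello!
print(len(conversation))                       # 3 (system + user + assistant)
```

Wrapping `run_turn` in a `while` loop over console input gives the continuous conversation loop the bullets describe.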

articles/cognitive-services/openai/includes/chat-markup-language.md

Lines changed: 10 additions & 12 deletions
@@ -11,14 +11,12 @@ manager: nitinme
 keywords: ChatGPT
 ---

-## Working with the ChatGPT and GPT-4 models
+## Working with the ChatGPT models (preview)

 > [!NOTE]
-> The ChatCompletion API is the recommended method of interacting with the ChatGPT and GPT-4 models.
+> The Chat Completion API is the recommended method of interacting with the ChatGPT (gpt-35-turbo) models.

-The following code snippet shows the most basic way to use the ChatGPT and GPT-4 models with ChatML. If this is your first time using these models programmatically we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).
-
-**GPT-4 models are currently in limited preview.** Existing Azure OpenAI customers can [apply for access by filling out this form](https://aka.ms/oai/get-gpt4).
+The following code snippet shows the most basic way to use the ChatGPT models with ChatML. If this is your first time using these models programmatically, we recommend starting with our [ChatGPT & GPT-4 Quickstart](../chatgpt-quickstart.md).

 ```python
 import os
@@ -29,7 +27,7 @@ openai.api_version = "2022-12-01"
 openai.api_key = os.getenv("OPENAI_API_KEY")

 response = openai.Completion.create(
-  engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
+  engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT model
   prompt="<|im_start|>system\nAssistant is a large language model trained by OpenAI.\n<|im_end|>\n<|im_start|>user\nWhat's the difference between garbanzo beans and chickpeas?\n<|im_end|>\n<|im_start|>assistant\n",
   temperature=0,
   max_tokens=500,
@@ -53,16 +51,16 @@ Consider setting `max_tokens` to a slightly higher value than normal such as 300

 Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as the `gpt-4` and `gpt-4-32k` models will continue to be updated. When creating a [deployment](../how-to/create-resource.md#deploy-a-model) of these models, you'll also need to specify a model version.

-Currently, only version `"0301"` is available for ChatGPT and `0314` for GPT-4 models. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.
+Currently, only version `0301` is available for ChatGPT. We'll continue to make updated versions available in the future. You can find model deprecation times on our [models](../concepts/models.md) page.

 <a id="chatml"></a>

 ## Working with Chat Markup Language (ChatML)

 > [!NOTE]
-> OpenAI continues to improve the ChatGPT and GPT-4 models and the Chat Markup Language used with the models will continue to evolve in the future. We'll keep this document updated with the latest information.
+> OpenAI continues to improve the ChatGPT models, and the Chat Markup Language used with the models will continue to evolve in the future. We'll keep this document updated with the latest information.

-OpenAI trained the ChatGPT and GPT-4 models on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model followed by a series of messages between the user and the assistant.
+OpenAI trained the ChatGPT models on special tokens that delineate the different parts of the prompt. The prompt starts with a system message that is used to prime the model followed by a series of messages between the user and the assistant.

 The format of a basic ChatML prompt is as follows:

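Assembling that format by hand is mechanical, so a small helper is common. This sketch is consistent with the `create_prompt(system_message, messages)` call that appears later in this file, assuming each message dictionary carries `sender` and `text` keys:

```python
def create_prompt(system_message: str, messages: list) -> str:
    """Assemble a ChatML prompt: the system message, then each turn wrapped in
    <|im_start|>/<|im_end|> tokens, ending with an open assistant turn."""
    prompt = system_message
    for message in messages:
        prompt += f"\n<|im_start|>{message['sender']}\n{message['text']}\n<|im_end|>"
    prompt += "\n<|im_start|>assistant\n"  # the model completes from here
    return prompt

system_message = "<|im_start|>system\nAssistant is a large language model trained by OpenAI.\n<|im_end|>"
messages = [{"sender": "user", "text": "What's the difference between garbanzo beans and chickpeas?"}]
print(create_prompt(system_message, messages))
```

Leaving the final assistant turn open is what cues the model to generate its reply there.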
@@ -211,7 +209,7 @@ You can also provide instructions in the system message to guide the model on ho

 ## Managing conversations

-The token limit for `gpt-35-turbo` is 4096 tokens, whereas the token limits for `gpt-4` and `gpt-4-32k` are 8192 and 32768 respectively. This limit includes the token count from both the prompt and completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.
+The token limit for `gpt-35-turbo` is 4096 tokens. This limit includes the token count from both the prompt and completion. The number of tokens in the prompt combined with the value of the `max_tokens` parameter must stay under 4096 or you'll receive an error.

 It’s your responsibility to ensure the prompt and completion falls within the token limit. This means that for longer conversations, you need to keep track of the token count and only send the model a prompt that falls within the token limit.

@@ -241,7 +239,7 @@ system_message = f"<|im_start|>system\n{'<your system message>'}\n<|im_end|>"
 messages = [{"sender": "user", "text": user_input}]

 response = openai.Completion.create(
-  engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
+  engine="gpt-35-turbo", # The deployment name you chose when you deployed the ChatGPT model.
   prompt=create_prompt(system_message, messages),
   temperature=0.5,
   max_tokens=250,
@@ -257,7 +255,7 @@ print(response['choices'][0]['text'])

 ## Staying under the token limit

-The simplest approach to staying under the token limit is to truncate the oldest messages in the conversation when you reach the token limit.
+The simplest approach to staying under the token limit is to remove the oldest messages in the conversation when you reach the token limit.

 You can choose to always include as many tokens as possible while staying under the limit or you could always include a set number of previous messages assuming those messages stay within the limit. It's important to keep in mind that longer prompts take longer to generate a response and incur a higher cost than shorter prompts.

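That oldest-first removal strategy can be sketched with a stand-in token counter. Real code would count tokens with a tokenizer such as tiktoken; `count_tokens` here is a rough word-count placeholder so the sketch runs on its own:

```python
def count_tokens(text: str) -> int:
    # Rough placeholder: real code would use a tokenizer (e.g., tiktoken).
    return len(text.split())

def trim_conversation(messages, token_limit: int):
    """Drop the oldest non-system messages until the conversation fits the limit."""
    trimmed = list(messages)  # work on a copy; caller's history is untouched

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    # Keep the system message (index 0) and remove the oldest turns after it.
    while total(trimmed) > token_limit and len(trimmed) > 1:
        del trimmed[1]
    return trimmed

history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "first question with several extra words here"},
    {"role": "assistant", "content": "first answer"},
    {"role": "user", "content": "latest question"},
]
print([m["content"] for m in trim_conversation(history, 10)])
# ['You are a helpful assistant', 'first answer', 'latest question']
```

Pinning the system message while trimming, as above, preserves the model's priming even in long conversations.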