
Commit ad37578

Merge pull request #258895 from MicrosoftDocs/release-azopenai-nov-2023
[Publishing] [Out of Band Publish] - release-azopenai-nov-2023 11/17 - 8:00 AM PST
2 parents c39959a + 2c362af commit ad37578

File tree

5 files changed: +373 -13 lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 16 additions & 12 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 10/04/2023
+ms.date: 11/17/2023
 ms.custom: event-tier1-build-2022, references_regions, build-2023, build-2023-dataai
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -19,13 +19,13 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 
 | Models | Description |
 |--|--|
-| [GPT-4](#gpt-4) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
+| [GPT-4](#gpt-4-and-gpt-4-turbo-preview) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
 | [GPT-3.5](#gpt-35) | A set of models that improve on GPT-3 and can understand and generate natural language and code. |
 | [Embeddings](#embeddings-models) | A set of models that can convert text into numerical vector form to facilitate text similarity. |
 | [DALL-E](#dall-e-models-preview) (Preview) | A series of models in preview that can generate original images from natural language. |
 | [Whisper](#whisper-models-preview) (Preview) | A series of models in preview that can transcribe and translate speech to text. |
 
-## GPT-4
+## GPT-4 and GPT-4 Turbo Preview
 
 GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. Use the Chat Completions API to use GPT-4. To learn more about how to interact with GPT-4 and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
 
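As an aside, here is a minimal sketch of the Chat Completions call referenced above, using the Azure OpenAI Python client. This snippet isn't part of this commit; the deployment name `gpt-4`, the environment variable names, and the API version are assumptions borrowed from the JSON mode example added later in this changeset.

```python
import os
from openai import AzureOpenAI

# Client setup mirrors the JSON mode example in this commit (assumed env var names).
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

response = client.chat.completions.create(
    model="gpt-4",  # should match the deployment name you chose for your GPT-4 deployment (assumption)
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the difference between GPT-4 and GPT-3.5 Turbo in one sentence."}
    ]
)
print(response.choices[0].message.content)
```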
@@ -72,7 +72,7 @@ You can also use the Whisper model via Azure AI Speech [batch transcription](../
 >
 > - South Central US is temporarily unavailable for creating new resources and deployments.
 
-### GPT-4 models
+### GPT-4 and GPT-4 Turbo Preview models
 
 GPT-4 and GPT-4-32k models are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
 
@@ -86,20 +86,23 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
 > Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
 
 | Model ID | Max Request (tokens) | Training Data (up to) |
-| --- | :---: | :---: |
+| --- | :--- | :---: |
 | `gpt-4` (0314) | 8,192 | Sep 2021 |
 | `gpt-4-32k` (0314) | 32,768 | Sep 2021 |
 | `gpt-4` (0613) | 8,192 | Sep 2021 |
 | `gpt-4-32k` (0613) | 32,768 | Sep 2021 |
+| `gpt-4` (1106-preview)**<sup>1</sup>** | Input: 128,000 <br> Output: 4096 | Apr 2023 |
+
+**<sup>1</sup>** We don't recommend using this model in production. We will upgrade all deployments of this model to a future stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
 
 > [!NOTE]
-> Regions where GPT-4 is listed as available have access to both the 8K and 32K versions of the model
+> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
 
-### GPT-4 model availability
+### GPT-4 and GPT-4 Turbo Preview model availability
 
-| Model Availability | gpt-4 (0314) | gpt-4 (0613) |
-|---|:---|:---|
-| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North |
+| Model Availability | gpt-4 (0314) | gpt-4 (0613) | gpt-4 (1106-preview) |
+|---|:---|:---|:---|
+| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North | Australia East <br> Canada East <br> East US 2 <br> France Central <br> Sweden Central <br> UK South |
 | Available to subscriptions with current access to the model version in the region | East US <br> France Central <br> South Central US <br> UK South | East US <br> East US 2 <br> Japan East <br> UK South |
 
 ### GPT-3.5 models
@@ -117,12 +120,13 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
 
 | Model ID | Model Availability | Max Request (tokens) | Training Data (up to) |
 | --------- | -------------------- |:------:|:----:|
-| `gpt-35-turbo`<sup>1</sup> (0301) | East US <br> France Central <br> South Central US <br> UK South <br> West Europe | 4096 | Sep 2021 |
+| `gpt-35-turbo`**<sup>1</sup>** (0301) | East US <br> France Central <br> South Central US <br> UK South <br> West Europe | 4096 | Sep 2021 |
 | `gpt-35-turbo` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North <br> UK South | 4096 | Sep 2021 |
 | `gpt-35-turbo-16k` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North <br> UK South | 16,384 | Sep 2021 |
 | `gpt-35-turbo-instruct` (0914) | East US <br> Sweden Central | 4097 | Sep 2021 |
+| `gpt-35-turbo` (1106) | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> UK South | Input: 16,385 <br> Output: 4,096 | Sep 2021 |
 
-<sup>1</sup> This model will accept requests > 4096 tokens. It is not recommended to exceed the 4096 input token limit as the newer version of the model are capped at 4096 tokens. If you encounter issues when exceeding 4096 input tokens with this model this configuration is not officially supported.
+**<sup>1</sup>** This model accepts requests larger than 4096 tokens. However, we don't recommend exceeding the 4096 input token limit because newer versions of the model are capped at 4096 tokens. If you encounter issues when exceeding 4096 input tokens with this model, this configuration isn't officially supported.
 
 ### Embeddings models
 
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
---
title: 'How to use JSON mode with Azure OpenAI Service'
titleSuffix: Azure OpenAI
description: Learn how to improve your chat completions with Azure OpenAI JSON mode
services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.topic: how-to
ms.date: 11/17/2023
author: mrbullwinkle
ms.author: mbullwin
recommendations: false
keywords:

---

# Learn how to use JSON mode

JSON mode allows you to set the model's response format to return a valid JSON object as part of a chat completion. While generating valid JSON was possible previously, there could be issues with response consistency that would lead to invalid JSON objects being generated.

## JSON mode support

JSON mode is currently only supported with the following:

### Supported models

- `gpt-4-1106-preview`
- `gpt-35-turbo-1106`

### API version

- `2023-12-01-preview`

## Example

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # should match the deployment name you chose for your 1106-preview model deployment
    response_format={ "type": "json_object" },
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Who won the world series in 2020?"}
    ]
)
print(response.choices[0].message.content)
```

### Output

```output
{
  "winner": "Los Angeles Dodgers",
  "event": "World Series",
  "year": 2020
}
```

There are two key factors that need to be present to successfully use JSON mode:

- `response_format={ "type": "json_object" }`
- We have told the model to output JSON as part of the system message.

Including guidance to the model that it should produce JSON as part of the messages conversation is **required**. We recommend adding this instruction as part of the system message. According to OpenAI, failure to add this instruction can cause the model to *"generate an unending stream of whitespace and the request may run continually until it reaches the token limit."*

When using the [OpenAI Python API library](https://github.com/openai/openai-python), failure to include "JSON" within the messages will return:

### Output

```output
BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}
```
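For completeness, here is a minimal sketch (not part of the committed article) of catching this error with the `openai` v1 client. It reuses the same client setup and deployment name as the example above, which remain assumptions about your environment.

```python
import os
import openai
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

try:
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # should match your 1106-preview deployment name (assumption)
        response_format={ "type": "json_object" },
        # No mention of JSON anywhere in the messages, which triggers the 400 error shown above.
        messages=[{"role": "user", "content": "Who won the world series in 2020?"}]
    )
except openai.BadRequestError as e:
    # Surfaces the "'messages' must contain the word 'json' in some form" message.
    print(f"Request rejected: {e}")
```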

## Additional considerations

You should check `finish_reason` for the value `length` before parsing the response. When it's present, you might have generated partial JSON, meaning that the output from the model was larger than the `max_tokens` set as part of the request, or that the conversation itself exceeded the token limit.

JSON mode produces JSON that is valid and parses without errors. However, this doesn't mean that the output will match a specific schema.
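A minimal sketch of those two checks, reusing the `response` object from the example above; the expected keys are assumptions based on the sample output, not a schema the service guarantees.

```python
import json

choice = response.choices[0]

# If generation stopped at the token limit, the JSON may be truncated.
if choice.finish_reason == "length":
    raise ValueError("Response was cut off by max_tokens; the JSON may be incomplete.")

data = json.loads(choice.message.content)  # JSON mode output should parse cleanly

# Valid JSON isn't the same as matching your schema, so verify the keys you rely on.
expected_keys = {"winner", "event", "year"}  # assumed keys from the sample output above
missing = expected_keys - data.keys()
if missing:
    raise ValueError(f"Response JSON is missing expected keys: {missing}")
```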
