
Commit ad37578

Merge pull request #258895 from MicrosoftDocs/release-azopenai-nov-2023
[Publishing] [Out of Band Publish] - release-azopenai-nov-2023 11/17 - 8:00 AM PST
2 parents c39959a + 2c362af commit ad37578

File tree

5 files changed: +373 -13 lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 16 additions & 12 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 10/04/2023
+ms.date: 11/17/2023
 ms.custom: event-tier1-build-2022, references_regions, build-2023, build-2023-dataai
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -19,13 +19,13 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 
 | Models | Description |
 |--|--|
-| [GPT-4](#gpt-4) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
+| [GPT-4](#gpt-4-and-gpt-4-turbo-preview) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
 | [GPT-3.5](#gpt-35) | A set of models that improve on GPT-3 and can understand and generate natural language and code. |
 | [Embeddings](#embeddings-models) | A set of models that can convert text into numerical vector form to facilitate text similarity. |
 | [DALL-E](#dall-e-models-preview) (Preview) | A series of models in preview that can generate original images from natural language. |
 | [Whisper](#whisper-models-preview) (Preview) | A series of models in preview that can transcribe and translate speech to text. |
 
-## GPT-4
+## GPT-4 and GPT-4 Turbo Preview
 
 GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. Use the Chat Completions API to use GPT-4. To learn more about how to interact with GPT-4 and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
 
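As an aside, here is a minimal sketch of the Chat Completions call referenced above, using the Azure OpenAI Python client. This snippet isn't part of this commit; the deployment name `gpt-4`, the environment variable names, and the API version are assumptions borrowed from the JSON mode example added later in this changeset.

```python
import os
from openai import AzureOpenAI

# Client setup mirrors the JSON mode example in this commit (assumed env var names).
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

response = client.chat.completions.create(
    model="gpt-4",  # should match the deployment name you chose for your GPT-4 deployment (assumption)
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the difference between GPT-4 and GPT-3.5 Turbo in one sentence."}
    ]
)
print(response.choices[0].message.content)
```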
@@ -72,7 +72,7 @@ You can also use the Whisper model via Azure AI Speech [batch transcription](../
 >
 > - South Central US is temporarily unavailable for creating new resources and deployments.
 
-### GPT-4 models
+### GPT-4 and GPT-4 Turbo Preview models
 
 GPT-4 and GPT-4-32k models are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
 
@@ -86,20 +86,23 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
 > Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
 
 | Model ID | Max Request (tokens) | Training Data (up to) |
-| --- | :---: | :---: |
+| --- | :--- | :---: |
 | `gpt-4` (0314) | 8,192 | Sep 2021 |
 | `gpt-4-32k` (0314) | 32,768 | Sep 2021 |
 | `gpt-4` (0613) | 8,192 | Sep 2021 |
 | `gpt-4-32k` (0613) | 32,768 | Sep 2021 |
+| `gpt-4` (1106-preview)**<sup>1</sup>** | Input: 128,000 <br> Output: 4096 | Apr 2023 |
+
+**<sup>1</sup>** We don't recommend using this model in production. We will upgrade all deployments of this model to a future stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
 
 > [!NOTE]
-> Regions where GPT-4 is listed as available have access to both the 8K and 32K versions of the model
+> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
 
-### GPT-4 model availability
+### GPT-4 and GPT-4 Turbo Preview model availability
 
-| Model Availability | gpt-4 (0314) | gpt-4 (0613) |
-|---|:---|:---|
-| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North |
+| Model Availability | gpt-4 (0314) | gpt-4 (0613) | gpt-4 (1106-preview) |
+|---|:---|:---|:---|
+| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North | Australia East <br> Canada East <br> East US 2 <br> France Central <br> Sweden Central <br> UK South |
 | Available to subscriptions with current access to the model version in the region | East US <br> France Central <br> South Central US <br> UK South | East US <br> East US 2 <br> Japan East <br> UK South |
 
 ### GPT-3.5 models
@@ -117,12 +120,13 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
 
 | Model ID | Model Availability | Max Request (tokens) | Training Data (up to) |
 | --------- | -------------------- |:------:|:----:|
-| `gpt-35-turbo`<sup>1</sup> (0301) | East US <br> France Central <br> South Central US <br> UK South <br> West Europe | 4096 | Sep 2021 |
+| `gpt-35-turbo`**<sup>1</sup>** (0301) | East US <br> France Central <br> South Central US <br> UK South <br> West Europe | 4096 | Sep 2021 |
 | `gpt-35-turbo` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North <br> UK South | 4096 | Sep 2021 |
 | `gpt-35-turbo-16k` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North <br> UK South | 16,384 | Sep 2021 |
 | `gpt-35-turbo-instruct` (0914) | East US <br> Sweden Central | 4097 | Sep 2021 |
+| `gpt-35-turbo` (1106) | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> UK South | Input: 16,385 <br> Output: 4,096 | Sep 2021 |
 
-<sup>1</sup> This model will accept requests > 4096 tokens. It is not recommended to exceed the 4096 input token limit as the newer version of the model are capped at 4096 tokens. If you encounter issues when exceeding 4096 input tokens with this model this configuration is not officially supported.
+**<sup>1</sup>** This model accepts requests larger than 4096 tokens. However, we don't recommend exceeding the 4096 input token limit because newer versions of the model are capped at 4096 tokens. If you encounter issues when exceeding 4096 input tokens with this model, this configuration isn't officially supported.
 
 ### Embeddings models
 
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
---
title: 'How to use JSON mode with Azure OpenAI Service'
titleSuffix: Azure OpenAI
description: Learn how to improve your chat completions with Azure OpenAI JSON mode
services: cognitive-services
manager: nitinme
ms.service: azure-ai-openai
ms.topic: how-to
ms.date: 11/17/2023
author: mrbullwinkle
ms.author: mbullwin
recommendations: false
keywords:

---

# Learn how to use JSON mode

JSON mode allows you to set the model's response format to return a valid JSON object as part of a chat completion. While generating valid JSON was possible previously, there could be issues with response consistency that would lead to invalid JSON objects being generated.

## JSON mode support

JSON mode is currently only supported with the following:

### Supported models

- `gpt-4-1106-preview`
- `gpt-35-turbo-1106`

### API version

- `2023-12-01-preview`

## Example

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # should match the deployment name you chose for your 1106-preview model deployment
    response_format={ "type": "json_object" },
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Who won the world series in 2020?"}
    ]
)
print(response.choices[0].message.content)
```

### Output

```output
{
  "winner": "Los Angeles Dodgers",
  "event": "World Series",
  "year": 2020
}
```

There are two key factors that need to be present to successfully use JSON mode:

- `response_format={ "type": "json_object" }`
- We have told the model to output JSON as part of the system message.

Including guidance to the model that it should produce JSON as part of the messages conversation is **required**. We recommend adding this instruction as part of the system message. According to OpenAI, failure to add this instruction can cause the model to *"generate an unending stream of whitespace and the request may run continually until it reaches the token limit."*

When using the [OpenAI Python API library](https://github.com/openai/openai-python), failure to include "JSON" within the messages will return:

### Output

```output
BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}
```
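For completeness, here is a minimal sketch (not part of the committed article) of catching this error with the `openai` v1 client. It reuses the same client setup and deployment name as the example above, which remain assumptions about your environment.

```python
import os
import openai
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2023-12-01-preview"
)

try:
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # should match your 1106-preview deployment name (assumption)
        response_format={ "type": "json_object" },
        # No mention of JSON anywhere in the messages, which triggers the 400 error shown above.
        messages=[{"role": "user", "content": "Who won the world series in 2020?"}]
    )
except openai.BadRequestError as e:
    # Surfaces the "'messages' must contain the word 'json' in some form" message.
    print(f"Request rejected: {e}")
```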

## Additional considerations

You should check `finish_reason` for the value `length` before parsing the response. When it's present, you might have generated partial JSON, meaning that the output from the model was larger than the `max_tokens` set as part of the request, or that the conversation itself exceeded the token limit.

JSON mode produces JSON that is valid and parses without errors. However, this doesn't mean that the output will match a specific schema.
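A minimal sketch of those two checks, reusing the `response` object from the example above; the expected keys are assumptions based on the sample output, not a schema the service guarantees.

```python
import json

choice = response.choices[0]

# If generation stopped at the token limit, the JSON may be truncated.
if choice.finish_reason == "length":
    raise ValueError("Response was cut off by max_tokens; the JSON may be incomplete.")

data = json.loads(choice.message.content)  # JSON mode output should parse cleanly

# Valid JSON isn't the same as matching your schema, so verify the keys you rely on.
expected_keys = {"winner", "event", "year"}  # assumed keys from the sample output above
missing = expected_keys - data.keys()
if missing:
    raise ValueError(f"Response JSON is missing expected keys: {missing}")
```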
