MicrosoftDocs
diff --git a/articles/ai-services/openai/concepts/model-retirements.md b/articles/ai-services/openai/concepts/model-retirements.md
Lines changed: 5 additions & 3 deletions
diff --git a/articles/ai-services/openai/concepts/models.md b/articles/ai-services/openai/concepts/models.md
Lines changed: 54 additions & 43 deletions
diff --git a/articles/ai-services/openai/concepts/provisioned-throughput.md b/articles/ai-services/openai/concepts/provisioned-throughput.md
Lines changed: 5 additions & 1 deletion
diff --git a/articles/ai-services/openai/gpt-v-quickstart.md b/articles/ai-services/openai/gpt-v-quickstart.md
Lines changed: 6 additions & 0 deletions
diff --git a/articles/ai-services/openai/how-to/gpt-with-vision.md b/articles/ai-services/openai/how-to/gpt-with-vision.md
Lines changed: 4 additions & 0 deletions
diff --git a/articles/ai-services/openai/how-to/reproducible-output.md b/articles/ai-services/openai/how-to/reproducible-output.md
Lines changed: 2 additions & 2 deletions
diff --git a/articles/ai-services/openai/includes/gpt-4-turbo.md b/articles/ai-services/openai/includes/gpt-4-turbo.md
Lines changed: 36 additions & 0 deletions
@@ -66,15 +66,17 @@ These models are currently available for use in Azure OpenAI Service.
 | `gpt-35-turbo` | 0125 | No earlier than Feb 22, 2025 |
 | `gpt-4`<br>`gpt-4-32k` | 0314 | No earlier than July 13, 2024 |
 | `gpt-4`<br>`gpt-4-32k` | 0613 | No earlier than Sep 30, 2024 |
-| `gpt-4` | 1106-preview | To be upgraded to a stable version with date to be announced |
-| `gpt-4` | 0125-preview | To be upgraded to a stable version with date to be announced |
-| `gpt-4` | vision-preview | To be upgraded to a stable version with date to be announced |
+| `gpt-4` | 1106-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
+| `gpt-4` | 0125-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
+| `gpt-4` | vision-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
 | `gpt-3.5-turbo-instruct` | 0914 | No earlier than Sep 14, 2025 |
 | `text-embedding-ada-002` | 2 | No earlier than April 3, 2025 |
 | `text-embedding-ada-002` | 1 | No earlier than April 3, 2025 |
 | `text-embedding-3-small` | | No earlier than Feb 2, 2025 |
 | `text-embedding-3-large` | | No earlier than Feb 2, 2025 |
 
+ **<sup>1</sup>** We will notify all customers with these preview deployments at least two weeks before the start of the upgrades. We will publish an upgrade schedule detailing the order of regions and model versions that we will follow during the upgrades, and link to that schedule from here.
+
 
 ## Deprecated models
 
 
@@ -18,24 +18,64 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 
 | Models | Description |
 |--|--|
-| [GPT-4](#gpt-4-and-gpt-4-turbo-preview) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
+| [GPT-4 Turbo 🆕](#gpt-4-turbo) | The latest, most capable Azure OpenAI models, with multimodal versions that can accept both text and images as input. |
+| [GPT-4](#gpt-4) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
 | [GPT-3.5](#gpt-35) | A set of models that improve on GPT-3 and can understand and generate natural language and code. |
 | [Embeddings](#embeddings-models) | A set of models that can convert text into numerical vector form to facilitate text similarity. |
 | [DALL-E](#dall-e-models) | A series of models that can generate original images from natural language. |
 | [Whisper](#whisper-models) | A series of models in preview that can transcribe and translate speech to text. |
 | [Text to speech](#text-to-speech-models-preview) (Preview) | A series of models in preview that can synthesize text to speech. |
 
-## GPT-4 and GPT-4 Turbo Preview
+## GPT-4 Turbo
 
- GPT-4 is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. Use the Chat Completions API to use GPT-4. To learn more about how to interact with GPT-4 and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
+GPT-4 Turbo is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo and older GPT-4 models, GPT-4 Turbo is optimized for chat and works well for traditional completions tasks.
 
- GPT-4 Turbo with Vision is the version of GPT-4 that accepts image inputs.  It is available as the `vision-preview` model of `gpt-4`.
+[!INCLUDE [GPT-4 Turbo](../includes/gpt-4-turbo.md)]
 
-- `gpt-4`
-- `gpt-4-32k`
+## GPT-4
+
+GPT-4 is the predecessor to GPT-4 Turbo. Both the GPT-4 and GPT-4 Turbo models have a base model name of `gpt-4`. You can distinguish between the GPT-4 and Turbo models by examining the model version.
+
+- `gpt-4` **Version** `0314`
+- `gpt-4` **Version** `0613`
+- `gpt-4-32k` **Version** `0613`
 
 You can see the token context length supported by each model in the [model summary table](#model-summary-table-and-region-availability).
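
Because GPT-4 and GPT-4 Turbo share the base model name `gpt-4`, telling them apart comes down to a lookup on the model version. The following is a minimal sketch, assuming the version strings listed in this article; the function and constant names are hypothetical and aren't part of any Azure SDK.

```python
# Version strings come from the lists and tables in this article.
# The names below are hypothetical helpers, not Azure OpenAI APIs.
GPT4_VERSIONS = {"0314", "0613"}
GPT4_TURBO_VERSIONS = {
    "1106-preview",
    "0125-preview",
    "vision-preview",
    "turbo-2024-04-09",
}


def classify_gpt4_deployment(model_name: str, model_version: str) -> str:
    """Return 'GPT-4', 'GPT-4 Turbo', or 'unknown' for a model name/version pair."""
    version = model_version.strip().lower()
    if model_name == "gpt-4-32k" and version in GPT4_VERSIONS:
        return "GPT-4"
    if model_name == "gpt-4":
        if version in GPT4_VERSIONS:
            return "GPT-4"
        if version in GPT4_TURBO_VERSIONS:
            return "GPT-4 Turbo"
    return "unknown"
```

The comparison is case-insensitive because the article writes the preview versions as both `1106-preview` and `1106-Preview`.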
 
+## GPT-4 and GPT-4 Turbo models
+
+- These models can only be used with the Chat Completion API.
+
+See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
+
+|  Model ID  | Max Request (tokens) | Training Data (up to)  |
+|  --- |  :--- | :---: |
+| `gpt-4` (0314) | 8,192 | Sep 2021         |
+| `gpt-4-32k`(0314)  | 32,768               | Sep 2021         |
+| `gpt-4` (0613)     | 8,192                | Sep 2021         |
+| `gpt-4-32k` (0613) | 32,768               | Sep 2021         |
+| `gpt-4` (1106-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000  <br> Output: 4,096           | Apr 2023         |
+| `gpt-4` (0125-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000  <br> Output: 4,096           | Dec 2023         |
+| `gpt-4` (vision-preview)**<sup>2</sup>**<br>**GPT-4 Turbo with Vision Preview**  | Input: 128,000  <br> Output: 4,096              | Apr 2023       |
+| `gpt-4` (turbo-2024-04-09) 🆕 <br>**GPT-4 Turbo with Vision GA** | Input: 128,000  <br> Output: 4,096  | Dec 2023 |
+
+**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
+
+**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
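
The limits in the table above can be encoded as a small lookup to check a request before sending it. This is an illustrative sketch only: the names are hypothetical, and it assumes that the single "Max Request" figure for the 8K and 32K models is a combined budget for prompt and completion tokens, which the table doesn't state explicitly.

```python
from typing import NamedTuple, Optional


class TokenLimits(NamedTuple):
    max_request: Optional[int]  # single combined limit (assumed: prompt + completion)
    max_input: Optional[int]    # separate input limit, where the table lists one
    max_output: Optional[int]   # separate output limit, where the table lists one


# Values copied from the table above; keys are (model ID, lowercase version).
LIMITS = {
    ("gpt-4", "0314"): TokenLimits(8_192, None, None),
    ("gpt-4-32k", "0314"): TokenLimits(32_768, None, None),
    ("gpt-4", "0613"): TokenLimits(8_192, None, None),
    ("gpt-4-32k", "0613"): TokenLimits(32_768, None, None),
    ("gpt-4", "1106-preview"): TokenLimits(None, 128_000, 4_096),
    ("gpt-4", "0125-preview"): TokenLimits(None, 128_000, 4_096),
    ("gpt-4", "vision-preview"): TokenLimits(None, 128_000, 4_096),
    ("gpt-4", "turbo-2024-04-09"): TokenLimits(None, 128_000, 4_096),
}


def fits(model: str, version: str, prompt_tokens: int, completion_tokens: int) -> bool:
    """Check a planned request against the documented token limits."""
    limits = LIMITS[(model, version.lower())]
    if limits.max_request is not None:
        return prompt_tokens + completion_tokens <= limits.max_request
    return prompt_tokens <= limits.max_input and completion_tokens <= limits.max_output
```

Token counts must come from a tokenizer that matches the model; this sketch takes them as given.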
+
+> [!CAUTION]
+> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
+
+> [!NOTE]
+> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024.  Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024.  See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
+
+- GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview.  
+- GPT-4 version 0125-preview completes tasks such as code generation more thoroughly than gpt-4-1106-preview. As a result, depending on the task, customers may find that GPT-4-0125-preview generates more output than gpt-4-1106-preview. We recommend that customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages.
+- GPT-4 version `turbo-2024-04-09` is the latest GA release and replaces `0125-Preview`, `1106-preview`, and `vision-preview`.
+
+> [!IMPORTANT]
+>
+> - `gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. The deployment upgrade of `gpt-4` 1106-Preview to `gpt-4` 0125-Preview scheduled for March 8, 2024 is no longer taking place.  Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released.  For each deployment, a model version upgrade takes place with no interruption in service for API calls.  Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region.
+
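
The upgrade behavior described in the note above depends only on the deployment's version and its upgrade option. The sketch below restates that rule as code to make the three cases explicit. The enum and function names are hypothetical, and the returned strings are descriptive labels rather than values reported by the service.

```python
from enum import Enum


class UpgradeOption(Enum):
    AUTO_UPDATE_TO_DEFAULT = "Auto-update to default"
    UPGRADE_WHEN_EXPIRED = "Upgrade when expired"
    NO_AUTOUPGRADE = "No autoupgrade"


# Preview versions covered by the note above.
PREVIEW_VERSIONS = {"1106-preview", "0125-preview"}


def outcome_after_regional_upgrade(version: str, option: UpgradeOption) -> str:
    """Describe what happens to a gpt-4 deployment once its region is upgraded."""
    if version.lower() not in PREVIEW_VERSIONS:
        return "not covered by this note"
    if option is UpgradeOption.NO_AUTOUPGRADE:
        # Deployments pinned to a preview version aren't upgraded.
        return "stops operating"
    # Both automatic options are upgraded after the stable version is released.
    return "upgraded to the stable version"
```

For example, a `0125-Preview` deployment set to "No autoupgrade" maps to `"stops operating"`, which is the case most likely to need manual action before the regional upgrade.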
 ## GPT-3.5
 
 GPT-3.5 models can understand and generate natural language or code. The most capable and cost effective model in the GPT-3.5 family is GPT-3.5 Turbo, which has been optimized for chat and works well for traditional completions tasks as well. GPT-3.5 Turbo is available for use with the Chat Completions API. GPT-3.5 Turbo Instruct has similar capabilities to `text-davinci-003` using the Completions API instead of the Chat Completions API.  We recommend using GPT-3.5 Turbo and GPT-3.5 Turbo Instruct over [legacy GPT-3.5 and GPT-3 models](./legacy-models.md).
@@ -86,58 +126,29 @@ You can also use the OpenAI text to speech voices via Azure AI Speech. To learn
 ## Model summary table and region availability
 
 > [!NOTE]
-> This article only covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. These tables also do not apply to customers using only **Provisioned** deployment types which have their own unique model/region availability matrix. For more information on **Provisioned** deployments refer to our [Provisioned guidance](./provisioned-throughput.md).
+> This article primarily covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
 
 ### Standard deployment model availability
 
 [!INCLUDE [Standard Models](../includes/model-matrix/standard-models.md)]
 
-This table does not include fine-tuning regional availability, consult the dedicated [fine-tuning section](#fine-tuning-models) for this information.
+This table doesn't include fine-tuning regional availability. Consult the dedicated [fine-tuning section](#fine-tuning-models) for this information.
 
 ### Standard deployment model quota
 
 [!INCLUDE [Quota](../includes/model-matrix/quota.md)]
 
-### GPT-4 and GPT-4 Turbo Preview models
-
-GPT-4, GPT-4-32k, and GPT-4 Turbo with Vision are now available to all Azure OpenAI Service customers.  Availability varies by region.  If you don't see GPT-4 in your region, please check back later.
-
-These models can only be used with the Chat Completion API.
-
-GPT-4 version 0314 is the first version of the model released.  Version 0613 is the second version of the model and adds function calling support. 
+### Provisioned deployment model availability
 
-See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
+[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
 
-> [!NOTE]
-> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024.  Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024.  See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
+### How do I get access to Provisioned?
 
-GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview.  GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview.  Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to the gpt-4-1106-preview.  We recommend customers compare the outputs of the new model.  GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages. 
+You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, you currently can't purchase provisioned throughput.
 
-> [!IMPORTANT]
->
-> - `gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. The deployment upgrade of `gpt-4` 1106-Preview to `gpt-4` 0125-Preview scheduled for March 8, 2024 is no longer taking place.  Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released.  For each deployment, a model version upgrade takes place with no interruption in service for API calls.  Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region.
-
-|  Model ID  | Max Request (tokens) | Training Data (up to)  |
-|  --- |  :--- | :---: |
-| `gpt-4` (0314) | 8,192 | Sep 2021         |
-| `gpt-4-32k`(0314)  | 32,768               | Sep 2021         |
-| `gpt-4` (0613)     | 8,192                | Sep 2021         |
-| `gpt-4-32k` (0613) | 32,768               | Sep 2021         |
-| `gpt-4` (1106-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000  <br> Output: 4,096           | Apr 2023         |
-| `gpt-4` (0125-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000  <br> Output: 4,096           | Dec 2023         |
-| `gpt-4` (vision-preview)**<sup>2</sup>**<br>**GPT-4 Turbo with Vision Preview**  | Input: 128,000  <br> Output: 4,096              | Apr 2023       |
-
-**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
-
-**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
-
-> [!CAUTION]
-> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
-
-> [!NOTE]
-> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
+For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
 
-### GPT-4 and GPT-4 Turbo Preview model availability
+### GPT-4 and GPT-4 Turbo model availability
 
 #### Public cloud regions
 
 
@@ -3,7 +3,7 @@ title: Azure OpenAI Service provisioned throughput
 description: Learn about provisioned throughput and Azure OpenAI. 
 ms.service: azure-ai-openai
 ms.topic: conceptual 
-ms.date: 1/16/2024 
+ms.date: 04/29/2024 
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
 ms.author: mbullwin #chrhoder
@@ -40,6 +40,10 @@ An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model.
 
 You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, unfortunately at this time, you cannot purchase provisioned throughput.
 
+## What models and regions are available for provisioned throughput?
+
+[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
+
 ## Key concepts
 
 ### Provisioned throughput units
 
@@ -15,6 +15,12 @@ zone_pivot_groups: openai-quickstart-gpt-v
 
 # Quickstart: Use images in your AI chats
 
+Get started using GPT-4 Turbo with images in Azure OpenAI Service.
+
+## GPT-4 Turbo model upgrade
+
+[!INCLUDE [GPT-4 Turbo](./includes/gpt-4-turbo.md)]
+
 ::: zone pivot="programming-language-studio"
 
 [!INCLUDE [Studio quickstart](includes/gpt-v-studio.md)]
 
@@ -20,6 +20,10 @@ The GPT-4 Turbo with Vision model answers general questions about what's present
 > [!TIP]
 > To use GPT-4 Turbo with Vision, you call the Chat Completion API on a GPT-4 Turbo with Vision model that you have deployed. If you're not familiar with the Chat Completion API, see the [GPT-4 Turbo & GPT-4 how-to guide](/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions).
 
+## GPT-4 Turbo model upgrade
+
+[!INCLUDE [GPT-4 Turbo](../includes/gpt-4-turbo.md)]
+
 ## Call the Chat Completion APIs
 
 The following command shows the most basic way to use the GPT-4 Turbo with Vision model with code. If this is your first time using these models programmatically, we recommend starting with our [GPT-4 Turbo with Vision quickstart](../gpt-v-quickstart.md). 
 
@@ -25,8 +25,8 @@ Reproducible output is only currently supported with the following:
 
 * `gpt-35-turbo` (1106) - [region availability](../concepts/models.md#gpt-35-turbo-model-availability)
 * `gpt-35-turbo` (0125) - [region availability](../concepts/models.md#gpt-35-turbo-model-availability)
-* `gpt-4` (1106-Preview) - [region availability](../concepts/models.md#gpt-4-and-gpt-4-turbo-preview-model-availability)
-* `gpt-4` (0125-Preview) - [region availability](../concepts/models.md#gpt-4-and-gpt-4-turbo-preview-model-availability)
+* `gpt-4` (1106-Preview) - [region availability](../concepts/models.md#gpt-4-and-gpt-4-turbo-model-availability)
+* `gpt-4` (0125-Preview) - [region availability](../concepts/models.md#gpt-4-and-gpt-4-turbo-model-availability)
 
 ### API Version
 
 
@@ -0,0 +1,36 @@
+---
+title: GPT-4 Turbo general availability
+titleSuffix: Azure OpenAI Service
+description: Information on GPT-4 Turbo model behavior and limitations
+manager: nitinme
+ms.service: azure-ai-openai
+ms.topic: include
+ms.date: 04/29/2024
+---
+
+The latest GA release of GPT-4 Turbo is:
+
+- `gpt-4` **Version:** `turbo-2024-04-09`
+
+This is the replacement for the following preview models:
+
+- `gpt-4` **Version:** `1106-Preview`
+- `gpt-4` **Version:** `0125-Preview`
+- `gpt-4` **Version:** `vision-preview`
+
+### Differences between OpenAI and Azure OpenAI GPT-4 Turbo with Vision GA model
+
+- OpenAI's version of the latest `0409` turbo model supports JSON mode and function calling for all inference requests.
+- Azure OpenAI's version of the latest `turbo-2024-04-09` currently doesn't support the use of JSON mode and function calling when making inference requests with image (vision) input. Text-based input requests do support JSON mode and function calling.
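
One way to catch the Azure OpenAI limitation above before a request is sent is to inspect the request body. The following is a rough sketch, assuming a Chat Completions style request dictionary in which image inputs appear as content parts with `"type": "image_url"`, JSON mode is requested through `response_format`, and function calling through `tools` or `functions`. The helper names are hypothetical.

```python
from typing import Any


def has_image_input(messages: list[dict[str, Any]]) -> bool:
    """Return True if any message carries an image content part."""
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            for part in content:
                if isinstance(part, dict) and part.get("type") == "image_url":
                    return True
    return False


def unsupported_features(request: dict[str, Any]) -> list[str]:
    """List features that aren't supported together with image input."""
    if not has_image_input(request.get("messages", [])):
        return []  # text-only requests support both features
    problems = []
    response_format = request.get("response_format") or {}
    if response_format.get("type") == "json_object":
        problems.append("JSON mode")
    if request.get("tools") or request.get("functions"):
        problems.append("function calling")
    return problems
```

An empty list means the request doesn't combine image input with either feature. The check is purely client-side and doesn't replace the service's own validation.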
+
+### Differences from gpt-4 vision-preview
+
+- Azure AI-specific Vision enhancement integrations with GPT-4 Turbo with Vision aren't supported for `gpt-4` **Version:** `turbo-2024-04-09`. This includes Optical Character Recognition (OCR), object grounding, video prompts, and improved handling of your data with images.
+
+### Region availability
+
+For information on model regional availability, consult the [model matrix](../concepts/models.md#gpt-4-and-gpt-4-turbo-model-availability).
+
+### Deploying GPT-4 Turbo with Vision GA
+
+To deploy the GA model from the Studio UI, select `GPT-4` and then choose the `turbo-2024-04-09` version from the dropdown menu. The default quota for the `gpt-4-turbo-2024-04-09` model will be the same as the current quota for GPT-4 Turbo. See the [regional quota limits](../concepts/models.md#standard-deployment-model-quota).