Skip to content

Commit ce9e2b7

Browse files
committed
Merge branch 'main' into release-aio-april-updates
2 parents 3597f0f + 4f9b100 commit ce9e2b7

File tree

819 files changed

+2833
-2000
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

819 files changed

+2833
-2000
lines changed

articles/ai-services/openai/api-version-deprecation.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ services: cognitive-services
55
manager: nitinme
66
ms.service: azure-ai-openai
77
ms.topic: conceptual
8-
ms.date: 03/28/2024
8+
ms.date: 05/02/2024
99
author: mrbullwinkle
1010
ms.author: mbullwin
1111
recommendations: false
@@ -14,14 +14,14 @@ ms.custom:
1414

1515
# Azure OpenAI API preview lifecycle
1616

17-
This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explictly indicated.
17+
This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explicitly indicated.
1818

1919
> [!NOTE]
2020
> The `2023-06-01-preview` API will remain supported at this time, as `DALL-E 2` is only available in this API version. `DALL-E 3` is supported in the latest API releases. The `2023-10-01-preview` API will also remain supported at this time.
2121
2222
## Latest preview API release
2323

24-
Azure OpenAI API version [2024-03-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-03-01-preview/inference.json)
24+
Azure OpenAI API version [2024-04-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-04-01-preview/inference.json)
2525
is currently the latest preview release.
2626

2727
This version contains support for all the latest Azure OpenAI features including:

articles/ai-services/openai/concepts/model-retirements.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
44
description: Learn about the model deprecations and retirements in Azure OpenAI.
55
ms.service: azure-ai-openai
66
ms.topic: conceptual
7-
ms.date: 04/24/2024
7+
ms.date: 05/01/2024
88
ms.custom:
99
manager: nitinme
1010
author: mrbullwinkle
@@ -66,15 +66,17 @@ These models are currently available for use in Azure OpenAI Service.
6666
| `gpt-35-turbo` | 0125 | No earlier than Feb 22, 2025 |
6767
| `gpt-4`<br>`gpt-4-32k` | 0314 | No earlier than July 13, 2024 |
6868
| `gpt-4`<br>`gpt-4-32k` | 0613 | No earlier than Sep 30, 2024 |
69-
| `gpt-4` | 1106-preview | To be upgraded to a stable version with date to be announced |
70-
| `gpt-4` | 0125-preview | To be upgraded to a stable version with date to be announced |
71-
| `gpt-4` | vision-preview | To be upgraded to a stable version with date to be announced |
69+
| `gpt-4` | 1106-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
70+
| `gpt-4` | 0125-preview |To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
71+
| `gpt-4` | vision-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
7272
| `gpt-3.5-turbo-instruct` | 0914 | No earlier than Sep 14, 2025 |
7373
| `text-embedding-ada-002` | 2 | No earlier than April 3, 2025 |
7474
| `text-embedding-ada-002` | 1 | No earlier than April 3, 2025 |
7575
| `text-embedding-3-small` | | No earlier than Feb 2, 2025 |
7676
| `text-embedding-3-large` | | No earlier than Feb 2, 2025 |
7777

78+
**<sup>1</sup>** We will notify all customers with these preview deployments at least two weeks before the start of the upgrades. We will publish an upgrade schedule detailing the order of regions and model versions that we will follow during the upgrades, and link to that schedule from here.
79+
7880

7981
## Deprecated models
8082

articles/ai-services/openai/concepts/models.md

Lines changed: 54 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -18,24 +18,64 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
1818

1919
| Models | Description |
2020
|--|--|
21-
| [GPT-4](#gpt-4-and-gpt-4-turbo-preview) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
21+
| [GPT-4 Turbo 🆕](#gpt-4-turbo) | The latest most capable Azure OpenAI models with multimodal versions which can accept both text and images as input. |
22+
| [GPT-4](#gpt-4) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
2223
| [GPT-3.5](#gpt-35) | A set of models that improve on GPT-3 and can understand and generate natural language and code. |
2324
| [Embeddings](#embeddings-models) | A set of models that can convert text into numerical vector form to facilitate text similarity. |
2425
| [DALL-E](#dall-e-models) | A series of models that can generate original images from natural language. |
2526
| [Whisper](#whisper-models) | A series of models in preview that can transcribe and translate speech to text. |
2627
| [Text to speech](#text-to-speech-models-preview) (Preview) | A series of models in preview that can synthesize text to speech. |
2728

28-
## GPT-4 and GPT-4 Turbo Preview
29+
## GPT-4 Turbo
2930

30-
GPT-4 is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. Use the Chat Completions API to use GPT-4. To learn more about how to interact with GPT-4 and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
31+
GPT-4 Turbo is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, and older GPT-4 models GPT-4 Turbo is optimized for chat and works well for traditional completions tasks.
3132

32-
GPT-4 Turbo with Vision is the version of GPT-4 that accepts image inputs. It is available as the `vision-preview` model of `gpt-4`.
33+
[!INCLUDE [GPT-4 Turbo](../includes/gpt-4-turbo.md)]
3334

34-
- `gpt-4`
35-
- `gpt-4-32k`
35+
## GPT-4
36+
37+
GPT-4 is the predecessor to GPT-4 Turbo. Both the GPT-4 and GPT-4 Turbo models have a base model name of `gpt-4`. You can distinguish between the GPT-4 and Turbo models by examining the model version.
38+
39+
- `gpt-4` **Version** `0314`
40+
- `gpt-4` **Version** `0613`
41+
- `gpt-4-32k` **Version** `0613`
3642

3743
You can see the token context length supported by each model in the [model summary table](#model-summary-table-and-region-availability).
3844

45+
## GPT-4 and GPT-4 Turbo models
46+
47+
- These models can only be used with the Chat Completion API.
48+
49+
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
50+
51+
| Model ID | Max Request (tokens) | Training Data (up to) |
52+
| --- | :--- | :---: |
53+
| `gpt-4` (0314) | 8,192 | Sep 2021 |
54+
| `gpt-4-32k`(0314) | 32,768 | Sep 2021 |
55+
| `gpt-4` (0613) | 8,192 | Sep 2021 |
56+
| `gpt-4-32k` (0613) | 32,768 | Sep 2021 |
57+
| `gpt-4` (1106-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000 <br> Output: 4,096 | Apr 2023 |
58+
| `gpt-4` (0125-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
59+
| `gpt-4` (vision-preview)**<sup>2</sup>**<br>**GPT-4 Turbo with Vision Preview** | Input: 128,000 <br> Output: 4,096 | Apr 2023 |
60+
| `gpt-4` (turbo-2024-04-09) 🆕 <br>**GPT-4 Turbo with Vision GA** | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
61+
62+
**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
63+
64+
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
65+
66+
> [!CAUTION]
67+
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
68+
69+
> [!NOTE]
70+
> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
71+
72+
- GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview.
73+
- GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to the gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages. GPT-4 version `turbo-2024-04-09` is the latest GA release and replaces `0125-Preview`, `1106-preview`, and `vision-preview`.
74+
75+
> [!IMPORTANT]
76+
>
77+
> - `gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released. For each deployment, a model version upgrade takes place with no interruption in service for API calls. Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region. See [Azure OpenAI model retirements and deprecations](./model-retirements.md) for more information on the timing of the upgrade.
78+
3979
## GPT-3.5
4080

4181
GPT-3.5 models can understand and generate natural language or code. The most capable and cost effective model in the GPT-3.5 family is GPT-3.5 Turbo, which has been optimized for chat and works well for traditional completions tasks as well. GPT-3.5 Turbo is available for use with the Chat Completions API. GPT-3.5 Turbo Instruct has similar capabilities to `text-davinci-003` using the Completions API instead of the Chat Completions API. We recommend using GPT-3.5 Turbo and GPT-3.5 Turbo Instruct over [legacy GPT-3.5 and GPT-3 models](./legacy-models.md).
@@ -86,58 +126,29 @@ You can also use the OpenAI text to speech voices via Azure AI Speech. To learn
86126
## Model summary table and region availability
87127

88128
> [!NOTE]
89-
> This article only covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. These tables also do not apply to customers using only **Provisioned** deployment types which have their own unique model/region availability matrix. For more information on **Provisioned** deployments refer to our [Provisioned guidance](./provisioned-throughput.md).
129+
> This article primarily covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
90130
91131
### Standard deployment model availability
92132

93133
[!INCLUDE [Standard Models](../includes/model-matrix/standard-models.md)]
94134

95-
This table does not include fine-tuning regional availability, consult the dedicated [fine-tuning section](#fine-tuning-models) for this information.
135+
This table doesn't include fine-tuning regional availability, consult the dedicated [fine-tuning section](#fine-tuning-models) for this information.
96136

97137
### Standard deployment model quota
98138

99139
[!INCLUDE [Quota](../includes/model-matrix/quota.md)]
100140

101-
### GPT-4 and GPT-4 Turbo Preview models
102-
103-
GPT-4, GPT-4-32k, and GPT-4 Turbo with Vision are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
104-
105-
These models can only be used with the Chat Completion API.
106-
107-
GPT-4 version 0314 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
108-
109-
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
110-
111-
> [!NOTE]
112-
> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
113-
114-
GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview. GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to the gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages.
115-
116-
> [!IMPORTANT]
117-
>
118-
> - `gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. The deployment upgrade of `gpt-4` 1106-Preview to `gpt-4` 0125-Preview scheduled for March 8, 2024 is no longer taking place. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released. For each deployment, a model version upgrade takes place with no interruption in service for API calls. Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region.
119-
120-
| Model ID | Max Request (tokens) | Training Data (up to) |
121-
| --- | :--- | :---: |
122-
| `gpt-4` (0314) | 8,192 | Sep 2021 |
123-
| `gpt-4-32k`(0314) | 32,768 | Sep 2021 |
124-
| `gpt-4` (0613) | 8,192 | Sep 2021 |
125-
| `gpt-4-32k` (0613) | 32,768 | Sep 2021 |
126-
| `gpt-4` (1106-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000 <br> Output: 4,096 | Apr 2023 |
127-
| `gpt-4` (0125-Preview)**<sup>1</sup>**<br>**GPT-4 Turbo Preview** | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
128-
| `gpt-4` (vision-preview)**<sup>2</sup>**<br>**GPT-4 Turbo with Vision Preview** | Input: 128,000 <br> Output: 4,096 | Apr 2023 |
141+
### Provisioned deployment model availability
129142

130-
**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
143+
[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
131144

132-
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
145+
### How do I get access to Provisioned?
133146

134-
> [!CAUTION]
135-
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
147+
You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, unfortunately at this time, you cannot purchase provisioned throughput.
136148

137-
> [!NOTE]
138-
> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
149+
For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
139150

140-
### GPT-4 and GPT-4 Turbo Preview model availability
151+
### GPT-4 and GPT-4 Turbo model availability
141152

142153
#### Public cloud regions
143154

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ title: Azure OpenAI Service provisioned throughput
33
description: Learn about provisioned throughput and Azure OpenAI.
44
ms.service: azure-ai-openai
55
ms.topic: conceptual
6-
ms.date: 1/16/2024
6+
ms.date: 04/29/2024
77
manager: nitinme
88
author: mrbullwinkle #ChrisHMSFT
99
ms.author: mbullwin #chrhoder
@@ -40,6 +40,10 @@ An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model.
4040

4141
You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, unfortunately at this time, you cannot purchase provisioned throughput.
4242

43+
## What models and regions are available for provisioned throughput?
44+
45+
[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
46+
4347
## Key concepts
4448

4549
### Provisioned throughput units
@@ -64,13 +68,12 @@ az cognitiveservices account deployment create \
6468

6569
### Quota
6670

67-
Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.
71+
Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.
6872

69-
Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-35-turbo. You can raise a support request to move quota across deployment types, models, or regions but the swap isn't guaranteed.
73+
Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-3.5-Turbo.
7074

7175
While we make every attempt to ensure that quota is deployable, quota doesn't represent a guarantee that the underlying capacity is available. The service assigns capacity during the deployment operation and if capacity is unavailable the deployment fails with an out of capacity error.
7276

73-
7477
### Determining the number of PTUs needed for a workload
7578

7679
PTUs represent an amount of model processing capacity. Similar to your computer or databases, different workloads or requests to the model will consume different amounts of underlying processing capacity. The conversion from call shape characteristics (prompt size, generation size and call rate) to PTUs is complex and non-linear. To simplify this process, you can use the [Azure OpenAI Capacity calculator](https://oai.azure.com/portal/calculator) to size specific workload shapes.

articles/ai-services/openai/gpt-v-quickstart.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,12 @@ zone_pivot_groups: openai-quickstart-gpt-v
1515

1616
# Quickstart: Use images in your AI chats
1717

18+
Get started using GPT-4 Turbo with images with the Azure OpenAI Service.
19+
20+
## GPT-4 Turbo model upgrade
21+
22+
[!INCLUDE [GPT-4 Turbo](./includes/gpt-4-turbo.md)]
23+
1824
::: zone pivot="programming-language-studio"
1925

2026
[!INCLUDE [Studio quickstart](includes/gpt-v-studio.md)]

0 commit comments

Comments
 (0)