You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/openai/api-version-deprecation.md
+3-3Lines changed: 3 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,7 +5,7 @@ services: cognitive-services
5
5
manager: nitinme
6
6
ms.service: azure-ai-openai
7
7
ms.topic: conceptual
8
-
ms.date: 03/28/2024
8
+
ms.date: 05/02/2024
9
9
author: mrbullwinkle
10
10
ms.author: mbullwin
11
11
recommendations: false
@@ -14,14 +14,14 @@ ms.custom:
14
14
15
15
# Azure OpenAI API preview lifecycle
16
16
17
-
This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explictly indicated.
17
+
This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explicitly indicated.
18
18
19
19
> [!NOTE]
20
20
> The `2023-06-01-preview` API will remain supported at this time, as `DALL-E 2` is only available in this API version. `DALL-E 3` is supported in the latest API releases. The `2023-10-01-preview` API will also remain supported at this time.
21
21
22
22
## Latest preview API release
23
23
24
-
Azure OpenAI API version [2024-03-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-03-01-preview/inference.json)
24
+
Azure OpenAI API version [2024-04-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-04-01-preview/inference.json)
25
25
is currently the latest preview release.
26
26
27
27
This version contains support for all the latest Azure OpenAI features including:
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/model-retirements.md
+6-4Lines changed: 6 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
4
4
description: Learn about the model deprecations and retirements in Azure OpenAI.
5
5
ms.service: azure-ai-openai
6
6
ms.topic: conceptual
7
-
ms.date: 04/24/2024
7
+
ms.date: 05/01/2024
8
8
ms.custom:
9
9
manager: nitinme
10
10
author: mrbullwinkle
@@ -66,15 +66,17 @@ These models are currently available for use in Azure OpenAI Service.
66
66
|`gpt-35-turbo`| 0125 | No earlier than Feb 22, 2025 |
67
67
|`gpt-4`<br>`gpt-4-32k`| 0314 | No earlier than July 13, 2024 |
68
68
|`gpt-4`<br>`gpt-4-32k`| 0613 | No earlier than Sep 30, 2024 |
69
-
|`gpt-4`| 1106-preview | To be upgraded to a stable version with date to be announced|
70
-
|`gpt-4`| 0125-preview |To be upgraded to a stable version with date to be announced|
71
-
|`gpt-4`| vision-preview | To be upgraded to a stable version with date to be announced|
69
+
|`gpt-4`| 1106-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>**|
70
+
|`gpt-4`| 0125-preview |To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>**|
71
+
|`gpt-4`| vision-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>**|
72
72
|`gpt-3.5-turbo-instruct`| 0914 | No earlier than Sep 14, 2025 |
73
73
|`text-embedding-ada-002`| 2 | No earlier than April 3, 2025 |
74
74
|`text-embedding-ada-002`| 1 | No earlier than April 3, 2025 |
75
75
|`text-embedding-3-small`|| No earlier than Feb 2, 2025 |
76
76
|`text-embedding-3-large`|| No earlier than Feb 2, 2025 |
77
77
78
+
**<sup>1</sup>** We will notify all customers with these preview deployments at least two weeks before the start of the upgrades. We will publish an upgrade schedule detailing the order of regions and model versions that we will follow during the upgrades, and link to that schedule from here.
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/models.md
+54-43Lines changed: 54 additions & 43 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -18,24 +18,64 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
18
18
19
19
| Models | Description |
20
20
|--|--|
21
-
|[GPT-4](#gpt-4-and-gpt-4-turbo-preview)| A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
21
+
|[GPT-4 Turbo 🆕](#gpt-4-turbo)| The latest most capable Azure OpenAI models with multimodal versions which can accept both text and images as input. |
22
+
|[GPT-4](#gpt-4)| A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
22
23
|[GPT-3.5](#gpt-35)| A set of models that improve on GPT-3 and can understand and generate natural language and code. |
23
24
|[Embeddings](#embeddings-models)| A set of models that can convert text into numerical vector form to facilitate text similarity. |
24
25
|[DALL-E](#dall-e-models)| A series of models that can generate original images from natural language. |
25
26
|[Whisper](#whisper-models)| A series of models in preview that can transcribe and translate speech to text. |
26
27
|[Text to speech](#text-to-speech-models-preview) (Preview) | A series of models in preview that can synthesize text to speech. |
27
28
28
-
## GPT-4 and GPT-4 Turbo Preview
29
+
## GPT-4 Turbo
29
30
30
-
GPT-4 is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. Use the Chat Completions API to use GPT-4. To learn more about how to interact with GPT-4 and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
31
+
GPT-4 Turbo is a large multimodal model (accepting text or image inputs and generating text) that can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, and older GPT-4 models GPT-4 Turbo is optimized for chat and works well for traditional completions tasks.
31
32
32
-
GPT-4 Turbo with Vision is the version of GPT-4 that accepts image inputs. It is available as the `vision-preview` model of `gpt-4`.
GPT-4 is the predecessor to GPT-4 Turbo. Both the GPT-4 and GPT-4 Turbo models have a base model name of `gpt-4`. You can distinguish between the GPT-4 and Turbo models by examining the model version.
38
+
39
+
-`gpt-4`**Version**`0314`
40
+
-`gpt-4`**Version**`0613`
41
+
-`gpt-4-32k`**Version**`0613`
36
42
37
43
You can see the token context length supported by each model in the [model summary table](#model-summary-table-and-region-availability).
38
44
45
+
## GPT-4 and GPT-4 Turbo models
46
+
47
+
- These models can only be used with the Chat Completion API.
48
+
49
+
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
50
+
51
+
| Model ID | Max Request (tokens) | Training Data (up to) |
|`gpt-4` (turbo-2024-04-09) 🆕 <br>**GPT-4 Turbo with Vision GA**| Input: 128,000 <br> Output: 4,096 | Dec 2023 |
61
+
62
+
**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
63
+
64
+
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
65
+
66
+
> [!CAUTION]
67
+
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
68
+
69
+
> [!NOTE]
70
+
> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
71
+
72
+
- GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview.
73
+
- GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to the gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages. GPT-4 version `turbo-2024-04-09` is the latest GA release and replaces `0125-Preview`, `1106-preview`, and `vision-preview`.
74
+
75
+
> [!IMPORTANT]
76
+
>
77
+
> -`gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released. For each deployment, a model version upgrade takes place with no interruption in service for API calls. Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region. See [Azure OpenAI model retirements and deprecations](./model-retirements.md) for more information on the timing of the upgrade.
78
+
39
79
## GPT-3.5
40
80
41
81
GPT-3.5 models can understand and generate natural language or code. The most capable and cost effective model in the GPT-3.5 family is GPT-3.5 Turbo, which has been optimized for chat and works well for traditional completions tasks as well. GPT-3.5 Turbo is available for use with the Chat Completions API. GPT-3.5 Turbo Instruct has similar capabilities to `text-davinci-003` using the Completions API instead of the Chat Completions API. We recommend using GPT-3.5 Turbo and GPT-3.5 Turbo Instruct over [legacy GPT-3.5 and GPT-3 models](./legacy-models.md).
@@ -86,58 +126,29 @@ You can also use the OpenAI text to speech voices via Azure AI Speech. To learn
86
126
## Model summary table and region availability
87
127
88
128
> [!NOTE]
89
-
> This article only covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. These tables also do not apply to customers using only **Provisioned** deployment types which have their own unique model/region availability matrix. For more information on **Provisioned** deployments refer to our [Provisioned guidance](./provisioned-throughput.md).
129
+
> This article primarily covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
GPT-4, GPT-4-32k, and GPT-4 Turbo with Vision are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
104
-
105
-
These models can only be used with the Chat Completion API.
106
-
107
-
GPT-4 version 0314 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
108
-
109
-
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
110
-
111
-
> [!NOTE]
112
-
> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
113
-
114
-
GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview. GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to the gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages.
115
-
116
-
> [!IMPORTANT]
117
-
>
118
-
> -`gpt-4` versions 1106-Preview and 0125-Preview will be upgraded with a stable version of `gpt-4` in the future. The deployment upgrade of `gpt-4` 1106-Preview to `gpt-4` 0125-Preview scheduled for March 8, 2024 is no longer taking place. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded after the stable version is released. For each deployment, a model version upgrade takes place with no interruption in service for API calls. Upgrades are staged by region and the full upgrade process is expected to take 2 weeks. Deployments of `gpt-4` versions 1106-Preview and 0125-Preview set to "No autoupgrade" will not be upgraded and will stop operating when the preview version is upgraded in the region.
119
-
120
-
| Model ID | Max Request (tokens) | Training Data (up to) |
**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-Preview) or `gpt-4` (1106-Preview). To deploy this model, under **Deployments** select model **gpt-4**. Under version select (0125-Preview) or (1106-Preview).
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
145
+
### How do I get access to Provisioned?
133
146
134
-
> [!CAUTION]
135
-
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
147
+
You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, unfortunately at this time, you cannot purchase provisioned throughput.
136
148
137
-
> [!NOTE]
138
-
> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
149
+
For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
139
150
140
-
### GPT-4 and GPT-4 Turbo Preview model availability
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/provisioned-throughput.md
+7-4Lines changed: 7 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -3,7 +3,7 @@ title: Azure OpenAI Service provisioned throughput
3
3
description: Learn about provisioned throughput and Azure OpenAI.
4
4
ms.service: azure-ai-openai
5
5
ms.topic: conceptual
6
-
ms.date: 1/16/2024
6
+
ms.date: 04/29/2024
7
7
manager: nitinme
8
8
author: mrbullwinkle #ChrisHMSFT
9
9
ms.author: mbullwin #chrhoder
@@ -40,6 +40,10 @@ An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model.
40
40
41
41
You need to speak with your Microsoft sales/account team to acquire provisioned throughput. If you don't have a sales/account team, unfortunately at this time, you cannot purchase provisioned throughput.
42
42
43
+
## What models and regions are available for provisioned throughput?
@@ -64,13 +68,12 @@ az cognitiveservices account deployment create \
64
68
65
69
### Quota
66
70
67
-
Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.
71
+
Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.
68
72
69
-
Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-35-turbo. You can raise a support request to move quota across deployment types, models, or regions but the swap isn't guaranteed.
73
+
Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-3.5-Turbo.
70
74
71
75
While we make every attempt to ensure that quota is deployable, quota doesn't represent a guarantee that the underlying capacity is available. The service assigns capacity during the deployment operation and if capacity is unavailable the deployment fails with an out of capacity error.
72
76
73
-
74
77
### Determining the number of PTUs needed for a workload
75
78
76
79
PTUs represent an amount of model processing capacity. Similar to your computer or databases, different workloads or requests to the model will consume different amounts of underlying processing capacity. The conversion from call shape characteristics (prompt size, generation size and call rate) to PTUs is complex and non-linear. To simplify this process, you can use the [Azure OpenAI Capacity calculator](https://oai.azure.com/portal/calculator) to size specific workload shapes.
0 commit comments