articles/ai-services/openai/concepts/model-retirements.md
+7 -2 (7 additions & 2 deletions)
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
description: Learn about the model deprecations and retirements in Azure OpenAI.
ms.service: azure-ai-openai
ms.topic: conceptual
-ms.date: 10/02/2024
+ms.date: 10/25/2024
ms.custom:
manager: nitinme
author: mrbullwinkle
@@ -91,6 +91,8 @@ These models are currently available for use in Azure OpenAI Service.
| Model | Version | Retirement date | Suggested replacements |
| ---- | ---- | ---- | --- |
+|`babbage-002`| 1 | Deprecation Date: November 15, 2024 <br>Retirement Date: January 27, 2025 ||
+|`davinci-002`| 1 | Deprecation Date: November 15, 2024 <br>Retirement Date: January 27, 2025 ||
|`dall-e-2`| 2 | January 27, 2025 |`dall-e-3`|
|`dall-e-3`| 3 | No earlier than April 30, 2025 ||
|`gpt-35-turbo`| 0301 | January 27, 2025<br><br> Deployments set to [**Auto-update to default**](/azure/ai-services/openai/how-to/working-with-models?tabs=powershell#auto-update-to-default) will be automatically upgraded to version: `0125`, starting on November 13, 2024. |`gpt-35-turbo` (0125) <br><br> `gpt-4o-mini`|
@@ -158,9 +160,12 @@ If you're an existing customer looking for information about these models, see [
| code-search-babbage-code-001 | July 6, 2023 | June 14, 2024 | text-embedding-3-small |
| code-search-babbage-text-001 | July 6, 2023 | June 14, 2024 | text-embedding-3-small |
-
## Retirement and deprecation history
+## October 25, 2024
+
+* `babbage-002` & `davinci-002` deprecation date: November 15, 2024 and retirement date: January 27, 2025.
+
## September 12, 2024
* `gpt-35-turbo` (0301), (0613), (1106) and `gpt-35-turbo-16k` (0613) auto-update to default upgrade date updated to November 13, 2024.
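
If you prefer that a deployment stay pinned to its current version rather than following the auto-update-to-default behavior referenced above, one option is to set the deployment's version-upgrade behavior through the Cognitive Services management REST API. The following is a minimal sketch, assuming the `versionUpgradeOption` property, the `2023-05-01` API version, and placeholder subscription, resource group, resource, and deployment names; verify the property name and allowed values against the management API reference before relying on it.

```python
# Sketch: pin a gpt-35-turbo deployment so it is not auto-upgraded to the default
# version. The `versionUpgradeOption` property, its values, and the api-version
# are assumptions to verify; resource names and sku capacity are placeholders.
import requests
from azure.identity import DefaultAzureCredential

SUB, RG, ACCOUNT, DEPLOYMENT = "<subscription-id>", "<resource-group>", "<aoai-resource>", "gpt35-deployment"
url = (
    f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
    f"/providers/Microsoft.CognitiveServices/accounts/{ACCOUNT}/deployments/{DEPLOYMENT}"
)
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
body = {
    "sku": {"name": "Standard", "capacity": 30},  # placeholder sku/capacity
    "properties": {
        "model": {"format": "OpenAI", "name": "gpt-35-turbo", "version": "0125"},
        # Assumed value meaning: keep the pinned version until its retirement date.
        "versionUpgradeOption": "NoAutoUpgrade",
    },
}
resp = requests.put(
    url,
    params={"api-version": "2023-05-01"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
print(resp.json()["properties"].get("versionUpgradeOption"))
```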
@@ -357,16 +357,40 @@ You can also use the OpenAI text to speech voices via Azure AI Speech. To learn
## Model summary table and region availability
-> [!NOTE]
-> This article primarily covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
+### Models by deployment type
+
+Azure OpenAI provides customers with choices on the hosting structure that fits their business and usage patterns. The service offers two main types of deployment:
+
+- **Standard** is offered with a global deployment option, routing traffic globally to provide higher throughput.
+- **Provisioned** is also offered with a global deployment option, allowing customers to purchase and deploy provisioned throughput units across Azure global infrastructure.
+
+All deployments can perform the exact same inference operations; however, the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types, see our [deployment types guide](../how-to/deployment-types.md).
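
Since the deployment type is carried on a deployment's SKU, one way to check which type your existing deployments use is to list them through the management API. The sketch below is a minimal illustration, assuming the deployments list endpoint, the `2023-05-01` API version, and SKU names such as `GlobalStandard` and `ProvisionedManaged`; the resource identifiers are placeholders.

```python
# Sketch: list deployments under an Azure OpenAI resource and print each SKU,
# which reflects the deployment type (for example Standard, GlobalStandard,
# ProvisionedManaged). Endpoint shape, api-version, and SKU names are assumptions.
import requests
from azure.identity import DefaultAzureCredential

SUB, RG, ACCOUNT = "<subscription-id>", "<resource-group>", "<aoai-resource>"
url = (
    f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
    f"/providers/Microsoft.CognitiveServices/accounts/{ACCOUNT}/deployments"
)
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
resp = requests.get(
    url,
    params={"api-version": "2023-05-01"},
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
for dep in resp.json().get("value", []):
    model = dep["properties"]["model"]
    print(f'{dep["name"]}: sku={dep["sku"]["name"]}, '
          f'model={model["name"]} ({model.get("version", "default")})')
```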
In addition to the regions above, which are available to all Azure OpenAI customers, some select pre-existing customers have been granted access to versions of GPT-4 in additional regions:
@@ -406,23 +426,14 @@ In addition to the regions above which are available to all Azure OpenAI custome
### GPT-3.5 models
-> [!IMPORTANT]
-> The NEW `gpt-35-turbo (0125)` model has various improvements, including higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.
-
-GPT-3.5 Turbo is used with the Chat Completion API. GPT-3.5 Turbo version 0301 can also be used with the Completions API, though this is not recommended. GPT-3.5 Turbo versions 0613 and 1106 only support the Chat Completions API.
-
-GPT-3.5 Turbo version 0301 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
-
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-3.5 Turbo deployments.
`babbage-002` and `davinci-002` are not trained to follow instructions. Querying these base models should only be done as a point of reference against a fine-tuned version, to evaluate the progress of your training (see the sketch below).
`gpt-35-turbo` - fine-tuning of this model is limited to a subset of regions, and is not available in every region where the base model is available.
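
As a rough illustration of the note above about treating base models only as a reference point, the sketch below sends the same prompt to a base `babbage-002` deployment and to a hypothetical fine-tuned deployment and prints both completions side by side. The deployment names, environment variables, and API version are assumptions; it uses the `openai` Python package's `AzureOpenAI` client with the Completions API.

```python
# Sketch: compare a base model deployment against a fine-tuned deployment on the
# same prompt. Deployment names, endpoint variables, and api_version are
# placeholders/assumptions, not values from this article.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # assumed GA API version
)

prompt = "Classify the sentiment of: 'The battery life is fantastic.' ->"

# Hypothetical deployment names: one base model, one fine-tuned variant.
for deployment in ("babbage-002-base", "babbage-002-finetuned"):
    completion = client.completions.create(
        model=deployment,   # Azure OpenAI takes the *deployment* name here
        prompt=prompt,
        max_tokens=5,
        temperature=0,
    )
    print(deployment, "=>", completion.choices[0].text.strip())
```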
| Model ID | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
@@ -468,20 +509,7 @@ These models can only be used with Embedding API requests.
**<sup>1</sup>** GPT-4 is currently in public preview.
-### Whisper models
-
-| Model ID | Model Availability | Max Request (audio file size) |
-| --- | --- | :---: |
-|`whisper`| East US 2 <br> North Central US <br> Norway East <br> South India <br> Sweden Central <br> West Europe | 25 MB |
-
-### Text to speech models (Preview)
-
-| Model ID | Model Availability |
-| --- | --- | :---: |
-|`tts-1`| North Central US <br> Sweden Central |
-|`tts-1-hd`| North Central US <br> Sweden Central |
-
-### Assistants (Preview)
+## Assistants (Preview)
For Assistants, you need a combination of a supported model and a supported region. Certain tools and capabilities require the latest models. The following models are available in the Assistants API, SDK, Azure AI Studio, and Azure OpenAI Studio. The following table is for pay-as-you-go. For information on Provisioned Throughput Unit (PTU) availability, see [provisioned throughput](./provisioned-throughput.md). The listed models and regions can be used with both Assistants v1 and v2. You can use [global standard models](#global-standard-model-availability) if they are supported in the regions listed below.
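
A minimal sketch of creating an assistant against a supported model deployment, using the `openai` Python package's `AzureOpenAI` client; the deployment name, API version, and tool selection are assumptions rather than values taken from this article.

```python
# Sketch: create an assistant with the code interpreter tool against a supported
# model deployment. Deployment name and api_version are assumptions.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-05-01-preview",  # assumed Assistants-capable preview version
)

assistant = client.beta.assistants.create(
    name="data-helper",
    instructions="You are a helpful assistant that analyzes CSV files.",
    model="gpt-4o-deployment",  # the *deployment* name in the Azure case
    tools=[{"type": "code_interpreter"}],
)
print(assistant.id)
```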
articles/ai-services/openai/concepts/provisioned-throughput.md
+2 -2 (2 additions & 2 deletions)
@@ -49,7 +49,7 @@ To help with simplifying the sizing effort, the following table outlines the TPM
| Input TPM per PTU | 2,500 | 37,000 |
| Output TPM per PTU | 833 | 12,333 |
-\**For a full list see the [AOAI Studio calcualator](https://oai.azure.com/portal/calculator)
+For a full list see the [AOAI Studio calculator](https://oai.azure.com/portal/calculator).
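
As a back-of-the-envelope companion to the TPM-per-PTU figures above, the sketch below estimates a PTU count from expected input and output TPM and rounds up to a deployment increment. The minimum and increment values are placeholders and the simple max-of-two-ratios formula is an assumption; real sizing depends on factors the calculator accounts for, so treat the calculator linked above as authoritative.

```python
import math

def estimate_ptus(input_tpm: float, output_tpm: float,
                  input_tpm_per_ptu: float, output_tpm_per_ptu: float,
                  minimum: int = 15, increment: int = 5) -> int:
    """Rough PTU estimate: take the larger of the input- and output-driven
    requirements, then round up to a deployment increment and apply a minimum.
    `minimum` and `increment` are placeholders; check the calculator/docs for
    the actual values for your model."""
    raw = max(input_tpm / input_tpm_per_ptu, output_tpm / output_tpm_per_ptu)
    stepped = math.ceil(raw / increment) * increment
    return max(minimum, stepped)

# Example using the per-PTU figures from the first column of the table above.
print(estimate_ptus(input_tpm=150_000, output_tpm=20_000,
                    input_tpm_per_ptu=2_500, output_tpm_per_ptu=833))  # -> 60
```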
## Key concepts
@@ -114,7 +114,7 @@ In Azure OpenAI Studio, the deployment experience identifies when a region lacks
Details on the new deployment experience can be found in the Azure OpenAI [Provisioned get started guide](../how-to/provisioned-get-started.md).
-The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) can be used to programmatically identify the maximum sized deployment of a specified model. The API consideres both the your quota and service capacity in the region.
+The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) can be used to programmatically identify the maximum sized deployment of a specified model. The API considers both your quota and service capacity in the region.
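
A sketch of calling that API from Python follows. The endpoint path, query parameters (`modelFormat`, `modelName`, `modelVersion`), API version, and response field names reflect one reading of the linked reference and should be treated as assumptions to verify there; the model name/version values are placeholders.

```python
# Sketch: query per-region capacity for a given model/version via the model
# capacities API. Query parameter and response field names are assumptions to
# check against the linked REST reference.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"
url = (f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
       "/providers/Microsoft.CognitiveServices/modelCapacities")
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
resp = requests.get(
    url,
    headers={"Authorization": f"Bearer {token}"},
    params={
        "api-version": "2024-04-01-preview",
        "modelFormat": "OpenAI",
        "modelName": "gpt-4o",        # placeholder model
        "modelVersion": "2024-05-13", # placeholder version
    },
)
resp.raise_for_status()
for item in resp.json().get("value", []):
    props = item.get("properties", {})
    # "availableCapacity" is an assumed field name; inspect the payload.
    print(item.get("location"), props.get("availableCapacity"))
```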
If an acceptable region isn't available to support the desired model, version, and/or PTUs, customers can also try the following steps: