Commit 301d03e

Merge pull request #274125 from MicrosoftDocs/main
5/2/2024 AM Publish
2 parents 53c43a3 + 4f9b100 commit 301d03e

75 files changed: +819 -687 lines changed

articles/ai-services/openai/api-version-deprecation.md

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ services: cognitive-services
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 03/28/2024
+ms.date: 05/02/2024
 author: mrbullwinkle
 ms.author: mbullwin
 recommendations: false
@@ -14,14 +14,14 @@ ms.custom:

 # Azure OpenAI API preview lifecycle

-This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explictly indicated.
+This article is to help you understand the support lifecycle for the Azure OpenAI API previews. New preview APIs target a monthly release cadence. After July 1, 2024, the latest three preview APIs will remain supported while older APIs will no longer be supported unless support is explicitly indicated.

 > [!NOTE]
 > The `2023-06-01-preview` API will remain supported at this time, as `DALL-E 2` is only available in this API version. `DALL-E 3` is supported in the latest API releases. The `2023-10-01-preview` API will also remain supported at this time.

 ## Latest preview API release

-Azure OpenAI API version [2024-03-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-03-01-preview/inference.json)
+Azure OpenAI API version [2024-04-01-preview](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-04-01-preview/inference.json)
 is currently the latest preview release.

 This version contains support for all the latest Azure OpenAI features including:
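
As a rough illustration of the support rule stated in the article text above (the latest three preview APIs stay supported, plus any version whose support is explicitly indicated), the following Python sketch may help. The function name, the `keep_latest` parameter, and the sample version list are assumptions made for illustration; they are not part of the documentation or of any Azure SDK, and the sample list is not an authoritative release history.

```python
from typing import Iterable, List

# Versions the note in the article says remain supported "at this time".
EXPLICITLY_SUPPORTED = {"2023-06-01-preview", "2023-10-01-preview"}


def supported_preview_versions(released: Iterable[str], keep_latest: int = 3) -> List[str]:
    """Return preview API versions that stay supported under the stated rule.

    `released` is a hypothetical list of version strings in the
    `YYYY-MM-DD-preview` form. ISO-style dates sort correctly as plain
    strings, so no date parsing is needed.
    """
    ordered = sorted(set(released), reverse=True)  # newest first
    latest = ordered[:keep_latest]
    # Older versions survive only if support is explicitly indicated.
    exceptions = [v for v in ordered[keep_latest:] if v in EXPLICITLY_SUPPORTED]
    return latest + exceptions


if __name__ == "__main__":
    # Illustrative input only.
    sample = [
        "2023-06-01-preview",
        "2023-10-01-preview",
        "2023-12-01-preview",
        "2024-02-15-preview",
        "2024-03-01-preview",
        "2024-04-01-preview",
    ]
    for version in supported_preview_versions(sample):
        print(version)
```

With the sample list, `2023-12-01-preview` is the only entry that drops out: it is older than the latest three and is not named in the note.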

articles/ai-services/openai/concepts/model-retirements.md

Lines changed: 3 additions & 3 deletions
@@ -66,9 +66,9 @@ These models are currently available for use in Azure OpenAI Service.
 | `gpt-35-turbo` | 0125 | No earlier than Feb 22, 2025 |
 | `gpt-4`<br>`gpt-4-32k` | 0314 | No earlier than July 13, 2024 |
 | `gpt-4`<br>`gpt-4-32k` | 0613 | No earlier than Sep 30, 2024 |
-| `gpt-4` | 1106-preview | To be upgraded to `gpt-4` Version: `2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
-| `gpt-4` | 0125-preview |To be upgraded to `gpt-4` Version: `2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
-| `gpt-4` | vision-preview | To be upgraded to `gpt-4` Version: `2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
+| `gpt-4` | 1106-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
+| `gpt-4` | 0125-preview |To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
+| `gpt-4` | vision-preview | To be upgraded to `gpt-4` Version: `turbo-2024-04-09`, starting on June 10, 2024, or later **<sup>1</sup>** |
 | `gpt-3.5-turbo-instruct` | 0914 | No earlier than Sep 14, 2025 |
 | `text-embedding-ada-002` | 2 | No earlier than April 3, 2025 |
 | `text-embedding-ada-002` | 1 | No earlier than April 3, 2025 |

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 2 additions & 3 deletions
@@ -68,13 +68,12 @@ az cognitiveservices account deployment create \

 ### Quota

-Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.
+Provisioned throughput quota represents a specific amount of total throughput you can deploy. Quota in the Azure OpenAI Service is managed at the subscription level. All Azure OpenAI resources within the subscription share this quota.

-Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-35-turbo. You can raise a support request to move quota across deployment types, models, or regions but the swap isn't guaranteed.
+Quota is specified in Provisioned throughput units and is specific to a (deployment type, model, region) triplet. Quota isn't interchangeable. Meaning you can't use quota for GPT-4 to deploy GPT-3.5-Turbo.

 While we make every attempt to ensure that quota is deployable, quota doesn't represent a guarantee that the underlying capacity is available. The service assigns capacity during the deployment operation and if capacity is unavailable the deployment fails with an out of capacity error.

-
 ### Determining the number of PTUs needed for a workload

 PTUs represent an amount of model processing capacity. Similar to your computer or databases, different workloads or requests to the model will consume different amounts of underlying processing capacity. The conversion from call shape characteristics (prompt size, generation size and call rate) to PTUs is complex and non-linear. To simplify this process, you can use the [Azure OpenAI Capacity calculator](https://oai.azure.com/portal/calculator) to size specific workload shapes.
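
The Quota section in this hunk says quota is tracked per (deployment type, model, region) triplet and is not interchangeable. A minimal Python sketch of that idea follows. The `QuotaKey` type, the `can_deploy` helper, the `"provisioned"` label, and all PTU numbers are hypothetical and exist only to illustrate the rule; none of them comes from the Azure OpenAI API.

```python
from typing import Dict, NamedTuple


class QuotaKey(NamedTuple):
    """The triplet that quota is specific to, per the text above."""
    deployment_type: str
    model: str
    region: str


def can_deploy(quota: Dict[QuotaKey, int], key: QuotaKey, ptus: int) -> bool:
    """Check whether `ptus` fits within the quota held for exactly this triplet.

    Quota for a different model, region, or deployment type is never
    consulted, which models "quota isn't interchangeable".
    """
    return quota.get(key, 0) >= ptus


if __name__ == "__main__":
    # Hypothetical subscription-level quota, in PTUs.
    quota = {QuotaKey("provisioned", "gpt-4", "eastus"): 300}

    print(can_deploy(quota, QuotaKey("provisioned", "gpt-4", "eastus"), 100))
    print(can_deploy(quota, QuotaKey("provisioned", "gpt-35-turbo", "eastus"), 100))
```

As the same section notes, passing a quota check like this one would not guarantee that capacity is available; a deployment can still fail with an out of capacity error.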

articles/ai-services/openai/includes/model-matrix/quota.md

Lines changed: 1 addition & 1 deletion
@@ -32,6 +32,6 @@ Quota for standard deployments is described in of terms of [Tokens-Per-Minute (T
 | uksouth | - | - | 80 K | - | 240 K | - | 350 K | - | - | - | - | - | - | - | - | - |
 | westeurope | - | - | - | - | 240 K | - | 240 K | - | - | - | - | - | - | - | - | - |
 | westus | - | - | 80 K | 30 K | 300 K | - | 350 K | - | - | - | - | - | - | - | - | - |
-| westus3 | - | - | - | - | - | - | 350 K | - | - | - | - | - | - | - | - | - |
+| westus3 | - | - | 80 K | - | - | - | 350 K | - | - | - | - | - | - | - | - | - |

 1 K = 1000 Tokens-Per-Minute (TPM). The relationship between TPM and Requests Per Minute (RPM) is [currently defined as 6 RPM per 1000 TPM](../../how-to/quota.md#understanding-rate-limits).
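
The TPM-to-RPM relationship stated above (6 RPM per 1000 TPM, with 1 K meaning 1000 TPM) can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the helper names are made up for illustration, and the ratio is the one the text describes as "currently defined", so it may change.

```python
from typing import Optional

RPM_PER_1000_TPM = 6  # ratio quoted in the quota text above


def parse_quota_cell(cell: str) -> Optional[int]:
    """Convert a table cell such as '80 K' to TPM; '-' means no quota listed."""
    cell = cell.strip()
    if cell == "-":
        return None
    value, unit = cell.split()
    if unit != "K":
        raise ValueError(f"unexpected unit in quota cell: {cell!r}")
    return int(value) * 1000  # 1 K = 1000 TPM


def rpm_limit(tpm: int) -> int:
    """Requests-per-minute limit implied by a tokens-per-minute quota."""
    return tpm * RPM_PER_1000_TPM // 1000


if __name__ == "__main__":
    # Two of the westus3 cells from the added table row.
    for cell in ("80 K", "350 K"):
        tpm = parse_quota_cell(cell)
        print(cell, "->", tpm, "TPM ->", rpm_limit(tpm), "RPM")
```

Under that ratio, 80 K TPM corresponds to 480 RPM and 350 K TPM to 2100 RPM.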

articles/ai-services/openai/includes/model-matrix/standard-gpt-4.md

Lines changed: 18 additions & 16 deletions
@@ -8,19 +8,21 @@ ms.topic: include
 ms.date: 04/29/2024
 ---

-| **Region** | **gpt-4**, **0613** | **gpt-4**, **1106-Preview** | **gpt-4**, **0125-Preview** |**gpt-4**, **vision-preview** | **gpt-4**, **turbo-2024-04-09** | **gpt-4-32k**, **0613** |
-|:-----------------|:-------------------:|:---------------------------:|:---------------------------:|:-----------------------------:|:-----------------------------:|:-----------------------:|
-| australiaeast ||| - || - ||
-| canadaeast ||| - | - | - ||
-| eastus | - | - || - | - | - |
-| eastus2 | - || - | - || - |
-| francecentral ||| - | - | - ||
-| japaneast | - | - | - || - | - |
-| northcentralus | - | - || - | - | - |
-| norwayeast | - || - | - | - | - |
-| southcentralus | - | - || - | - | - |
-| southindia | - || - | - | - | - |
-| swedencentral ||| - ||||
-| switzerlandnorth || - | - || - ||
-| uksouth | - ||| - | - | - |
-| westus | - || - || - | - |
+| **Region** | **gpt-4**, **0613** | **gpt-4**, **1106-Preview** | **gpt-4**, **0125-Preview** | **gpt-4**, **vision-preview** | **gpt-4**, **turbo-2024-04-09** | **gpt-4-32k**, **0613** |
+|:-----------------|:-------------------:|:---------------------------:|:---------------------------:|:-----------------------------:|:-------------------------------:|:-----------------------:|
+| australiaeast ||| - || - ||
+| canadaeast ||| - | - | - ||
+| eastus | - | - || - | - | - |
+| eastus2 | - || - | - || - |
+| francecentral ||| - | - | - ||
+| japaneast | - | - | - || - | - |
+| northcentralus | - | - || - | - | - |
+| norwayeast | - || - | - | - | - |
+| southcentralus | - | - || - | - | - |
+| southindia | - || - | - | - | - |
+| swedencentral ||| - ||||
+| switzerlandnorth || - | - || - ||
+| uksouth | - ||| - | - | - |
+| westus | - || - || - | - |
+| westus3 | - || - | - | - | - |
+