Skip to content

Commit db7ad0b

Browse files
Merge pull request #281839 from mrbullwinkle/mrb_07_25_2024_gpt4o-mini
[Azure OpenAI] GPT-4o mini
2 parents fcfe845 + ff44ad2 commit db7ad0b

File tree

9 files changed

+53
-48
lines changed

9 files changed

+53
-48
lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 19 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
44
description: Learn about the different model capabilities that are available with Azure OpenAI.
55
ms.service: azure-ai-openai
66
ms.topic: conceptual
7-
ms.date: 07/18/2024
7+
ms.date: 07/31/2024
88
ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
99
manager: nitinme
1010
author: mrbullwinkle #ChrisHMSFT
@@ -18,7 +18,7 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
1818

1919
| Models | Description |
2020
|--|--|
21-
| [GPT-4o & GPT-4 Turbo](#gpt-4o-and-gpt-4-turbo) | The latest most capable Azure OpenAI models with multimodal versions, which can accept both text and images as input. |
21+
| [GPT-4o & GPT-4o mini & GPT-4 Turbo](#gpt-4o-and-gpt-4-turbo) | The latest most capable Azure OpenAI models with multimodal versions, which can accept both text and images as input. |
2222
| [GPT-4](#gpt-4) | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. |
2323
| [GPT-3.5](#gpt-35) | A set of models that improve on GPT-3 and can understand and generate natural language and code. |
2424
| [Embeddings](#embeddings-models) | A set of models that can convert text into numerical vector form to facilitate text similarity. |
@@ -30,27 +30,16 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
3030

3131
GPT-4o integrates text and images in a single model, enabling it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions. GPT-4o matches GPT-4 Turbo in English text and coding tasks while offering superior performance in non-English languages and vision tasks, setting new benchmarks for AI capabilities.
3232

33-
### Early access playground
33+
### How do I access the GPT-4o and GPT-4o mini models?
3434

35-
Existing Azure OpenAI customers can test out the **NEW GPT-4o mini** model in the **Azure OpenAI Studio Early Access Playground (Preview)**.
36-
37-
To test the latest model:
38-
39-
> [!NOTE]
40-
> The GPT-4o mini early access playground is currently only available for resources in **West US3** and **East US**, and is limited to 10 requests every five minutes per subscription. Azure OpenAI content filters are enabled at the default configuration and cannot be modified. GPT-4o mini is a preview model and is currently not available for deployment/direct API access.
41-
42-
1. Navigate to Azure OpenAI Studio at https://oai.azure.com/ and sign-in with credentials that have access to your OpenAI resources.
43-
2. Select an Azure OpenAI resource in the **West US3** or **East US** regions. If you don't have a resource in one of these regions you will need to [create a resource](../how-to/create-resource.md).
44-
3. From the main [Azure OpenAI Studio](https://oai.azure.com/) page select the **Early Access Playground (Preview)** button from under the **Get started** section. (This button will only be visible when a resource in **West US3** or **East US** is selected.)
45-
4. Now you can start asking the model questions just as you would before in the existing [chat playground](../chatgpt-quickstart.md).
46-
47-
### How do I access the GPT-4o model?
48-
49-
GPT-4o is available for **standard** and **global-standard** model deployment.
35+
GPT-4o and GPT-4o mini are available for **standard** and **global-standard** model deployment.
5036

5137
You need to [create](../how-to/create-resource.md) or use an existing resource in a [supported standard](#gpt-4-and-gpt-4-turbo-model-availability) or [global standard](#global-standard-model-availability) region where the model is available.
5238

53-
When your resource is created, you can [deploy](../how-to/create-resource.md#deploy-a-model) the GPT-4o model. If you are performing a programmatic deployment, the **model** name is `gpt-4o`, and the **version** is `2024-05-13`.
39+
When your resource is created, you can [deploy](../how-to/create-resource.md#deploy-a-model) the GPT-4o models. If you are performing a programmatic deployment, the **model** names are:
40+
41+
- `gpt-4o`, **Version** `2024-05-13`
42+
- `gpt-4o-mini` **Version** `2024-07-18`
5443

5544
### GPT-4 Turbo
5645

@@ -76,7 +65,8 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
7665

7766
| Model ID | Description | Max Request (tokens) | Training Data (up to) |
7867
| --- | :--- |:--- |:---: |
79-
|`gpt-4o` (2024-05-13) <br> **GPT-4o (Omni)** | **Latest GA model** <br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks <br> - **Does not support enhancements** |Input: 128,000 <br> Output: 4,096| Oct 2023 |
68+
|`gpt-4o-mini` (2024-07-18) <br> **GPT-4o mini** | **Latest small GA model** <br> - Fast, inexpensive, capable model ideal for replacing GPT-3.5 Turbo series models. <br> - Text, image processing <br>- JSON Mode <br> - parallel function calling <br> - **Does not support enhancements** | Input: 128,000 <br> Output: 16,384 | Oct 2023 |
69+
|`gpt-4o` (2024-05-13) <br> **GPT-4o (Omni)** | **Latest large GA model** <br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks <br> - **Does not support enhancements** |Input: 128,000 <br> Output: 4,096| Oct 2023 |
8070
| `gpt-4` (turbo-2024-04-09) <br>**GPT-4 Turbo with Vision** | **New GA model** <br> - Replacement for all previous GPT-4 preview models (`vision-preview`, `1106-Preview`, `0125-Preview`). <br> - [**Feature availability**](#gpt-4o-and-gpt-4-turbo) is currently different depending on method of input, and deployment type. <br> - **Does not support enhancements**. | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
8171
| `gpt-4` (0125-Preview)*<br>**GPT-4 Turbo Preview** | **Preview Model** <br> -Replaces 1106-Preview <br>- Better code generation performance <br> - Reduces cases where the model doesn't complete a task <br> - JSON Mode <br> - parallel function calling <br> - reproducible output (preview) | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
8272
| `gpt-4` (vision-preview)<br>**GPT-4 Turbo with Vision Preview** | **Preview model** <br> - Accepts text and image input. <br> - Supports enhancements <br> - JSON Mode <br> - parallel function calling <br> - reproducible output (preview) | Input: 128,000 <br> Output: 4,096 | Apr 2023 |
@@ -180,9 +170,7 @@ For more information on Provisioned deployments, see our [Provisioned guidance](
180170

181171
### Global standard model availability
182172

183-
**Supported models:**
184-
185-
- `gpt-4o` **Version:** `2024-05-13`
173+
`gpt-4o` **Version:** `2024-05-13`
186174

187175
**Supported regions:**
188176

@@ -208,14 +196,18 @@ For more information on Provisioned deployments, see our [Provisioned guidance](
208196
- westus
209197
- westus3
210198

199+
`gpt-4o-mini` **Version:** `2024-07-18`
200+
201+
**Supported regions:**
202+
203+
- eastus
204+
211205
### GPT-4 and GPT-4 Turbo model availability
212206

213207
#### Public cloud regions
214208

215209
[!INCLUDE [GPT-4](../includes/model-matrix/standard-gpt-4.md)]
216210

217-
218-
219211
#### Select customer access
220212

221213
In addition to the regions above which are available to all Azure OpenAI customers, some select pre-existing customers have been granted access to versions of GPT-4 in additional regions:
@@ -283,9 +275,9 @@ These models can only be used with Embedding API requests.
283275
| `gpt-35-turbo` (0613) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 4,096 | Sep 2021 |
284276
| `gpt-35-turbo` (1106) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | Input: 16,385<br> Output: 4,096 | Sep 2021|
285277
| `gpt-35-turbo` (0125) | East US2 <br> North Central US <br> Sweden Central <br> Switzerland West | 16,385 | Sep 2021 |
286-
| `gpt-4` (0613) <sup>**1**<sup> | North Central US <br> Sweden Central | 8192 | Sep 2021 |
278+
| `gpt-4` (0613) <sup>**1**</sup> | North Central US <br> Sweden Central | 8192 | Sep 2021 |
287279

288-
**<sup>1<sup>** GPT-4 fine-tuning is currently in public preview. See our [GPT-4 fine-tuning safety evaluation guidance](/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython-new&pivots=programming-language-python#safety-evaluation-gpt-4-fine-tuning---public-preview) for more information.
280+
**<sup>1</sup>** GPT-4 fine-tuning is currently in public preview. See our [GPT-4 fine-tuning safety evaluation guidance](/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython-new&pivots=programming-language-python#safety-evaluation-gpt-4-fine-tuning---public-preview) for more information.
289281

290282
### Whisper models
291283

articles/ai-services/openai/how-to/fine-tuning.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ manager: nitinme
77
ms.service: azure-ai-openai
88
ms.custom: build-2023, build-2023-dataai, devx-track-python
99
ms.topic: how-to
10-
ms.date: 05/16/2024
10+
ms.date: 07/25/2024
1111
author: mrbullwinkle
1212
ms.author: mbullwin
1313
zone_pivot_groups: openai-fine-tuning-new

articles/ai-services/openai/how-to/function-calling.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@ At a high level you can break down working with functions into three steps:
3636
* `gpt-4` (vision-preview)
3737
* `gpt-4` (2024-04-09)
3838
* `gpt-4o` (2024-05-13)
39+
* `gpt-4o-mini` (2024-07-18)
3940

4041
Support for parallel function was first added in API version [`2023-12-01-preview`](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-12-01-preview/inference.json)
4142

articles/ai-services/openai/includes/fine-tuning-openai-in-ai-studio.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,9 @@ The following models support fine-tuning:
3131
- `gpt-35-turbo` (0613)
3232
- `gpt-35-turbo` (1106)
3333
- `gpt-35-turbo` (0125)
34-
- `gpt-4` (0613)
34+
- `gpt-4` (0613)**<sup>*</sup>**
35+
36+
**<sup>*</sup>** Fine-tuning for this model is currently in public preview.
3537

3638
Consult the [models page](../concepts/models.md#fine-tuning-models) to check which regions currently support fine-tuning.
3739

articles/ai-services/openai/includes/fine-tuning-python.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,9 @@ The following models support fine-tuning:
3131
- `gpt-35-turbo` (0613)
3232
- `gpt-35-turbo` (1106)
3333
- `gpt-35-turbo` (0125)
34-
- `gpt-4` (0613)
34+
- `gpt-4` (0613)**<sup>*</sup>**
35+
36+
**<sup>*</sup>** Fine-tuning for this model is currently in public preview.
3537

3638
If you plan to use `gpt-4` for fine-tuning, please refer to the [GPT-4 public preview safety evaluation guidance](#safety-evaluation-gpt-4-fine-tuning---public-preview)
3739

articles/ai-services/openai/includes/fine-tuning-rest.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,9 @@ The following models support fine-tuning:
3030
- `gpt-35-turbo` (0613)
3131
- `gpt-35-turbo` (1106)
3232
- `gpt-35-turbo` (0125)
33-
- `gpt-4` (0613)
33+
- `gpt-4` (0613)**<sup>*</sup>**
34+
35+
**<sup>*</sup>** Fine-tuning for this model is currently in public preview.
3436

3537
Consult the [models page](../concepts/models.md#fine-tuning-models) to check which regions currently support fine-tuning.
3638

articles/ai-services/openai/includes/fine-tuning-studio.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,9 @@ The following models support fine-tuning:
2929
- `gpt-35-turbo` (0613)
3030
- `gpt-35-turbo` (1106)
3131
- `gpt-35-turbo` (0125)
32-
- `gpt-4` (0613)
32+
- `gpt-4` (0613)**<sup>*</sup>**
33+
34+
**<sup>*</sup>** Fine-tuning for this model is currently in public preview.
3335

3436
Consult the [models page](../concepts/models.md#fine-tuning-models) to check which regions currently support fine-tuning.
3537

articles/ai-services/openai/quotas-limits.md

Lines changed: 14 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ ms.custom:
1010
- ignite-2023
1111
- references_regions
1212
ms.topic: conceptual
13-
ms.date: 07/24/2024
13+
ms.date: 07/31/2024
1414
ms.author: mbullwin
1515
---
1616

@@ -55,23 +55,27 @@ The following sections provide you with a quick guide to the default quotas and
5555

5656
## gpt-4o rate limits
5757

58-
`gpt-4o` introduces rate limit tiers with higher limits for certain customer types.
58+
`gpt-4o` and `gpt-4o-mini` have rate limit tiers with higher limits for certain customer types.
5959

6060
### gpt-4o global standard
6161

62-
|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
63-
|---|:---:|:---:|
64-
|Enterprise agreement | 30 M | 180 K |
65-
|Default | 450 K | 2.7 K |
62+
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
63+
|---|---|:---:|:---:|
64+
|`gpt-4o`|Enterprise agreement | 30 M | 180 K |
65+
|`gpt-4o-mini` | Enterprise agreement | 50 M | 300 K |
66+
|`gpt-4o` |Default | 450 K | 2.7 K |
67+
|`gpt-4o-mini` | Default | 2 M | 12 K |
6668

6769
M = million | K = thousand
6870

6971
### gpt-4o standard
7072

71-
|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
72-
|---|:---:|:---:|
73-
|Enterprise agreement | 1 M | 6 K |
74-
|Default | 150 K | 900 |
73+
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
74+
|---|---|:---:|:---:|
75+
|`gpt-4o`|Enterprise agreement | 1 M | 6 K |
76+
|`gpt-4o-mini` | Enterprise agreement | 2 M | 12 K |
77+
|`gpt-4o`|Default | 150 K | 900 |
78+
|`gpt-4o-mini` | Default | 450 K | 2.7 K |
7579

7680
M = million | K = thousand
7781

articles/ai-services/openai/whats-new.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ ms.custom:
1010
- ignite-2023
1111
- references_regions
1212
ms.topic: whats-new
13-
ms.date: 07/18/2024
13+
ms.date: 07/31/2024
1414
recommendations: false
1515
---
1616

@@ -20,15 +20,15 @@ This article provides a summary of the latest releases and major documentation u
2020

2121
## July 2024
2222

23-
### GPT-4o mini preview model available for early access
23+
### GPT-4o mini model available for deployment
2424

25-
GPT-4o mini is the latest model from OpenAI [launched on July 18, 2024](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/).
25+
GPT-4o mini is the latest Azure OpenAI model first [announced on July 18, 2024](https://azure.microsoft.com/blog/openais-fastest-model-gpt-4o-mini-is-now-available-on-azure-ai/):
2626

27-
From OpenAI:
27+
*"GPT-4o mini allows customers to deliver stunning applications at a lower cost with blazing speed. GPT-4o mini is significantly smarter than GPT-3.5 Turbo—scoring 82% on Measuring Massive Multitask Language Understanding (MMLU) compared to 70%—and is more than 60% cheaper.1 The model delivers an expanded 128K context window and integrates the improved multilingual capabilities of GPT-4o, bringing greater quality to languages from around the world."*
2828

29-
*"GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, which can enable developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo."*
29+
The model is currently available for both [standard and global standard deployment](./how-to/deployment-types.md) in the East US region.
3030

31-
To start testing out the model today in Azure OpenAI, see the [**Azure OpenAI Studio early access playground**](./concepts/models.md#early-access-playground).
31+
For information on model quota, consult the [quota and limits page](./quotas-limits.md) and for the latest info on model availability refer to the [models page](./concepts/models.md).
3232

3333
### New Responsible AI default content filtering policy
3434

0 commit comments

Comments
 (0)