articles/ai-foundry/openai/how-to/reasoning.md
Lines changed: 24 additions & 23 deletions
@@ -28,13 +28,14 @@ Azure OpenAI reasoning models are designed to tackle reasoning and problem-solvi
 | Model | Region | Limited access |
 |---|---|---|
-|`gpt-5`|[Model availability](../concepts/models.md#global-standard-model-availability)| Request access: [gpt-5 limited access model application](https://aka.ms/oai/gpt5access). If you already have `o3 access` no request is required |
+|`gpt-5-codex`| East US2 & Sweden Central (Global Standard) | Request access: [Limited access model application](https://aka.ms/oai/gpt5access)|
+|`gpt-5`|[Model availability](../concepts/models.md#global-standard-model-availability)| Request access: [Limited access model application](https://aka.ms/oai/gpt5access). If you already have `o3 access` no request is required |
 |`gpt-5-mini`|[Model availability](../concepts/models.md#global-standard-model-availability)| No access request needed. |
 |`gpt-5-nano`|[Model availability](../concepts/models.md#global-standard-model-availability)| No access request needed. |
-|`o3-pro`| East US2 & Sweden Central (Global Standard) | Request access: [o3 limited access model application](https://aka.ms/oai/o3access). If you already have `o3 access` no request is required. |
+|`o3-pro`| East US2 & Sweden Central (Global Standard) | Request access: [Limited access model application](https://aka.ms/oai/o3access). If you already have `o3 access` no request is required. |
 |`codex-mini`| East US2 & Sweden Central (Global Standard) | No access request needed. |
 |`o4-mini`|[Model availability](../concepts/models.md#global-standard-model-availability)| No access request needed to use the core capabilities of this model.<br><br> Request access: [o4-mini reasoning summary feature](https://aka.ms/oai/o3access)|
-|`o3`|[Model availability](../concepts/models.md#global-standard-model-availability)| Request access: [o3 limited access model application](https://aka.ms/oai/o3access)|
+|`o3`|[Model availability](../concepts/models.md#global-standard-model-availability)| Request access: [Limited access model application](https://aka.ms/oai/o3access)|
 |`o3-mini`|[Model availability](../concepts/models.md#global-standard-model-availability). | Access is no longer restricted for this model. |
 |`o1`|[Model availability](../concepts/models.md#global-standard-model-availability). | Access is no longer restricted for this model. |
 |`o1-mini`|[Model availability](../concepts/models.md#global-standard-model-availability). | No access request needed for Global Standard deployments.<br><br>Standard (regional) deployments are currently only available to select customers who were previously granted access as part of the `o1-preview` release.|
@@ -43,40 +44,40 @@ Azure OpenAI reasoning models are designed to tackle reasoning and problem-solvi
 <sup>1</sup> Parallel tool calls are not supported when `reasoning_effort` is set to `minimal`<br><br>
 <sup>2</sup> Reasoning models will only work with the `max_completion_tokens` parameter when using the Chat Completions API. Use `max_output_tokens` with the Responses API. <br><br>
 <sup>3</sup> The latest reasoning models support system messages to make migration easier. You should not use both a developer message and a system message in the same API request.<br><br>

-
 ### NEW GPT-5 reasoning features

 | Feature | Description |
 |----|----|
-|`reasoning_effort`|`minimal` is now supported with GPT-5 series reasoning models <br><br> **Options**: `minimal`, `low`, `medium`, `high`|
-|`verbosity`| A new parameter giving you more granular control over how concise the model's output will be.<br><br>**Options:**`low`, `medium`, `high`. |
+|`reasoning_effort`|`minimal` is now supported with GPT-5 series reasoning models<sup>*</sup> <br><br> **Options**: `minimal`, `low`, `medium`, `high`|
+|`verbosity`| A new parameter providing more granular control over how concise the model's output will be.<br><br>**Options:**`low`, `medium`, `high`. |
 |`preamble`| GPT-5 series reasoning models have the ability to spend extra time *"thinking"* before executing a function/tool call.<br><br> When this planning occurs the model can provide insight into the planning steps in the model response via a new object called the `preamble` object.<br><br> Generation of preambles in the model response is not guaranteed though you can encourage the model by using the `instructions` parameter and passing content like "You MUST plan extensively before each function call. ALWAYS output your plan to the user before calling any function"|
 |**allowed tools**| You can specify multiple tools under `tool_choice` instead of just one. |
 |**custom tool type**| Enables raw text (non-json) outputs |
 |[`lark_tool`](#python-lark)| Allows you to use some of the capabilities of [Python lark](https://github.com/lark-parser/lark) for more flexible constraining of model responses |

+<sup>*</sup> `gpt-5-codex` does not support `reasoning_effort` minimal.
+
 For more information, we also recommend reading OpenAI's [GPT-5 prompting cookbook guide](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide) and their [GPT-5 feature guide](https://platform.openai.com/docs/guides/latest-model).
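
Review note on the file above: the footnotes and the new `gpt-5-codex` caveat describe constraints that a caller could check before sending a request. The following is only a minimal sketch of such a check, not code from the documentation or from any SDK. The helper name `check_reasoning_options`, the `api` labels `"chat_completions"` and `"responses"`, the `"medium"` defaults, and the flat shape of the returned dictionary are all assumptions made for illustration; only the option lists, the parameter names, and the three restrictions come from the diff.

```python
from typing import Any, Dict, Optional

# Option lists copied from the "NEW GPT-5 reasoning features" table.
REASONING_EFFORTS = ("minimal", "low", "medium", "high")
VERBOSITY_LEVELS = ("low", "medium", "high")

# Added footnote: gpt-5-codex does not support reasoning_effort "minimal".
NO_MINIMAL_EFFORT = {"gpt-5-codex"}

# Footnote 2: each API uses a different name for the output token limit.
# The dictionary keys are hypothetical labels chosen for this sketch.
TOKEN_LIMIT_PARAM = {
    "chat_completions": "max_completion_tokens",
    "responses": "max_output_tokens",
}


def check_reasoning_options(
    model: str,
    api: str,
    reasoning_effort: str = "medium",  # default is an assumption of this sketch
    verbosity: str = "medium",         # default is an assumption of this sketch
    parallel_tool_calls: bool = False,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate options and return them as a flat dict.

    The returned dict is illustrative only; it is not claimed to match the
    exact request body of either API.
    """
    if api not in TOKEN_LIMIT_PARAM:
        raise ValueError(f"unknown api label: {api!r}")
    if reasoning_effort not in REASONING_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unsupported verbosity: {verbosity!r}")
    if reasoning_effort == "minimal" and model in NO_MINIMAL_EFFORT:
        raise ValueError(f"{model} does not support reasoning_effort 'minimal'")
    # Footnote 1: no parallel tool calls when reasoning_effort is "minimal".
    if reasoning_effort == "minimal" and parallel_tool_calls:
        raise ValueError("parallel tool calls are not supported with 'minimal'")

    options: Dict[str, Any] = {
        "model": model,
        "reasoning_effort": reasoning_effort,
        "verbosity": verbosity,
        "parallel_tool_calls": parallel_tool_calls,
    }
    if max_tokens is not None:
        # Pick the token limit parameter name that matches the API in use.
        options[TOKEN_LIMIT_PARAM[api]] = max_tokens
    return options
```

Under these assumptions, `check_reasoning_options("gpt-5", "responses", max_tokens=2048)` returns a dictionary containing `max_output_tokens`, while the same call with `"chat_completions"` yields `max_completion_tokens` instead, and requesting `minimal` effort for `gpt-5-codex` raises `ValueError`.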
articles/ai-foundry/openai/includes/models-azure-direct-openai.md
Lines changed: 6 additions & 4 deletions
@@ -36,19 +36,21 @@ ms.topic: include
 |`gpt-5-mini` (2025-08-07) | See the [models table](#model-summary-table-and-region-availability).|
 |`gpt-5-nano` (2025-08-07) | See the [models table](#model-summary-table-and-region-availability).|
 |`gpt-5-chat` (2025-08-07) | See the [models table](#model-summary-table-and-region-availability).|
+|`gpt-5-codex` (2025-09-11) | East US2 (Global Standard) and Sweden Central (Global Standard) |

--**[Registration is required for access to the gpt-5 model](https://aka.ms/oai/gpt5access).**
+-**[Registration is required for access to the gpt-5 & gpt-5-codex models](https://aka.ms/oai/gpt5access).**

 -`gpt-5-mini`, `gpt-5-nano`, and `gpt-5-chat` do not require registration.

 Access will be granted based on Microsoft's eligibility criteria. Customers who previously applied and received access to `o3`, don't need to reapply as their approved subscriptions will automatically be granted access upon model release.

 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-|`gpt-5` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | October 24, 2024 |
-|`gpt-5-mini` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | June 24, 2024 |
-|`gpt-5-nano` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | May 31, 2024 |
+|`gpt-5` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> - [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | October 24, 2024 |
+|`gpt-5-mini` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> - [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | June 24, 2024 |
+|`gpt-5-nano` (2025-08-07) | - [Reasoning](../how-to/reasoning.md) <br> - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> - [Full summary of capabilities](../how-to/reasoning.md). | 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | May 31, 2024 |
 |`gpt-5-chat` (2025-08-07)<br>**Preview**| - Chat Completions API. <br> - [Responses API](../how-to/responses.md). <br> - **Input**: Text/Image <br> - **Output**: Text only | 128,000 | 16,384 | October 24, 2024 |
+|`gpt-5-codex` (2025-09-11) | - [Responses API](../how-to/responses.md) only. <br> - **Input**: Text/Image <br> - **Output**: Text only <br> - Structured outputs.<br> - Text and image processing. <br> - Functions, tools, and parallel tool calling. <br> - [Full summary of capabilities](../how-to/reasoning.md)| 400,000<br><br>Input: 272,000<br>Output: 128,000 | 128,000 | - |
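
Review note on the table above: for the models that list an input and output split, the 400,000-token context window is the sum of the 272,000-token input limit and the 128,000-token output limit. The sketch below shows that arithmetic as a simple budget check. The names `TokenLimits`, `LIMITS`, and `fits_budget` are hypothetical, and treating the two limits as independent caps is an assumption about how the limits apply, not something the diff states. `gpt-5-chat` is omitted because the table gives no input and output split for it.

```python
from typing import Dict, NamedTuple


class TokenLimits(NamedTuple):
    context_window: int
    max_input: int
    max_output: int


# Values copied from the model summary table in this diff.
LIMITS: Dict[str, TokenLimits] = {
    "gpt-5": TokenLimits(400_000, 272_000, 128_000),
    "gpt-5-mini": TokenLimits(400_000, 272_000, 128_000),
    "gpt-5-nano": TokenLimits(400_000, 272_000, 128_000),
    "gpt-5-codex": TokenLimits(400_000, 272_000, 128_000),
}


def fits_budget(model: str, input_tokens: int, requested_output: int) -> bool:
    """Hypothetical check: does a request stay within the listed limits?

    Assumes the input and output limits are independent caps; this is an
    illustration of the table's arithmetic, not documented service behavior.
    """
    limits = LIMITS[model]
    return (
        0 <= input_tokens <= limits.max_input
        and 0 <= requested_output <= limits.max_output
    )


# The listed split adds up to the context window for every model in LIMITS.
assert all(
    lim.max_input + lim.max_output == lim.context_window
    for lim in LIMITS.values()
)
```

Under these assumptions, `fits_budget("gpt-5", 272_000, 128_000)` is `True`, while one extra input token makes it `False`.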
articles/ai-foundry/openai/quotas-limits.md
Lines changed: 11 additions & 8 deletions
@@ -75,17 +75,20 @@ The following section provides you with a quick guide to the default quotas and
| Model | Global Default<br>Tokens per minute (TPM) | Global Enterprise and MCA-E <br>Tokens per minute (TPM) | Data Zone Default <br>Tokens per minute (TPM) | Data Zone Enterprise and MCA-E <br>Tokens per minute (TPM) |
| Model | Global Default<br>Requests per minute (RPM) | Global Enterprise and MCA-E <br>Requests per minute (RPM) | Data Zone Default <br>Requests per minute (RPM) | Data Zone Enterprise and MCA-E <br>Requests per minute (RPM) |