
Commit 371d6e9

models
1 parent ae07dbf commit 371d6e9

2 files changed

+2
-2
lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-| `gpt-4.1` (2025-04-14) | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
+| `gpt-4.1` (2025-04-14) | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 <br> 128,000 (provisioned managed deployments) | 32,768 | May 31, 2024 |
 | `gpt-4.1-nano` (2025-04-14) <br><br> **Fastest 4.1 model** | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
 | `gpt-4.1-mini` (2025-04-14) | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ Customers that require long-term usage of provisioned, data zoned provisioned, a
 The amount of throughput (measured in tokens per minute or TPM) a deployment gets per PTU is a function of the input and output tokens in a given minute. Generating output tokens requires more processing than input tokens.  Starting with GPT 4.1 models and later, the system matches the global standard price ratio between input and output tokens. Cached tokens are deducted 100% from the utilization.
 
-For example, for 'gpt-4.1:2025-04-14', 1 output token counts as 4 input tokens towards your utilization limit which matches the [pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). Older models use a different ratio and for a deeper understanding on how different ratios of input and output tokens impact the throughput your workload needs, see the [Azure OpenAI capacity calculator](https://ai.azure.com/resource/calculator).
+For example, for `gpt-4.1:2025-04-14`, 1 output token counts as 4 input tokens towards your utilization limit which matches the [pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/). Older models use a different ratio and for a deeper understanding on how different ratios of input and output tokens impact the throughput your workload needs, see the [Azure OpenAI capacity calculator](https://ai.azure.com/resource/calculator).
 
 
 |Topic| **gpt-4o** | **gpt-4o-mini** | **o1**| gpt-4.1 |
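
Reviewer note: the utilization rule described in the changed paragraph (1 output token counts as 4 input tokens for `gpt-4.1:2025-04-14`, with cached tokens deducted 100%) can be illustrated with a small sketch. This is not the service's actual accounting code. The function name `input_equivalent_tokens` and its parameters are hypothetical, and it assumes that cached tokens are a subset of the input tokens for the request.

```python
def input_equivalent_tokens(input_tokens: int,
                            output_tokens: int,
                            cached_tokens: int = 0,
                            output_weight: int = 4) -> int:
    """Estimate the input-equivalent tokens a request counts toward utilization.

    Assumptions (not confirmed by the diff):
      - cached_tokens is a subset of input_tokens;
      - "deducted 100%" means cached tokens contribute nothing;
      - output_weight=4 is the gpt-4.1 ratio quoted in the paragraph;
        older models use a different ratio.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    # Cached input tokens are fully deducted from utilization.
    uncached_input = input_tokens - cached_tokens
    # Each output token counts as `output_weight` input tokens.
    return uncached_input + output_weight * output_tokens


if __name__ == "__main__":
    # 1,000 input tokens (400 of them cached) plus 200 output tokens.
    print(input_equivalent_tokens(1000, 200, cached_tokens=400))
```

For sizing a real workload, the capacity calculator linked in the paragraph remains the authoritative source; the sketch only shows how the 4:1 ratio and the cache deduction combine.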
