Commit a81ce16

moved order
1 parent c5e0a01 commit a81ce16

1 file changed, +7 -5 lines changed

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 7 additions & 5 deletions
@@ -36,12 +36,7 @@ An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model.
 | Utilization | Provisioned-managed Utilization V2 measure provided in Azure Monitor. |
 | Estimating size | Provided calculator in the studio & benchmarking script. |
 
-## What models and regions are available for provisioned throughput?
-
-[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
 
-> [!NOTE]
-> The provisioned version of `gpt-4` **Version:** `turbo-2024-04-09` is currently limited to text only.
 
 ## How much thoughput you get for each model
 The amount of throughput (tokens per minute or TPM) a deployment gets per PTU is a function of the input and output tokens being generated.
@@ -186,6 +181,13 @@ For Provisioned-Managed and Global Provisioned-Managed, we use a variation of th
 
 The number of concurrent calls you can achieve depends on each call's shape (prompt size, max_token parameter, etc.). The service will continue to accept calls until the utilization reach 100%. To determine the approximate number of concurrent calls you can model out the maximum requests per minute for a particular call shape in the [capacity calculator](https://oai.azure.com/portal/calculator). If the system generates less than the number of samplings tokens like max_token, it will accept more requests.
 
+## What models and regions are available for provisioned throughput?
+
+[!INCLUDE [Provisioned](../includes/model-matrix/provisioned-models.md)]
+
+> [!NOTE]
+> The provisioned version of `gpt-4` **Version:** `turbo-2024-04-09` is currently limited to text only.
+
 ## Next steps
 
 - [Learn about the onboarding steps for provisioned deployments](../how-to/provisioned-throughput-onboarding.md)
