Commit b4cb667

adding links
1 parent 3bfd198 commit b4cb667

3 files changed: +9 -1 lines changed


articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 4 additions & 1 deletion

@@ -14,6 +14,9 @@ recommendations: false

The provisioned throughput capability allows you to specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU), which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy and provides different amounts of throughput per PTU.

+> [!NOTE]
+> On July 29th 2024, Microsoft switched to an hourly/reservation PTU offering that offers usability improvements. For more details, see the [PTU migration article](../provisioned-migration.md#whats-changing).
+
## What does the provisioned deployment type provide?

- **Predictable performance:** stable max latency and throughput for uniform workloads.

@@ -47,7 +50,7 @@ An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model.

### Provisioned throughput units

-Provisioned throughput units (PTU) are units of model processing capacity that you can reserve and deploy for processing prompts and generating completions. The minimum PTU deployment, increments, and processing capacity associated with each unit varies by model type & version.
+Provisioned throughput units (PTU) are units of model processing capacity that you can reserve and deploy for processing prompts and generating completions. The minimum PTU deployment, increments, and processing capacity associated with each unit varies by model type & version.

### Deployment types

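The PTU paragraphs in the diff above describe capacity as a count of provisioned throughput units attached to a deployment. As a rough illustration of where that number goes, here is a minimal sketch of creating a provisioned deployment with the Azure management SDK for Python. It assumes the `azure-mgmt-cognitiveservices` package, the `ProvisionedManaged` SKU used for provisioned deployments, and placeholder subscription, resource, and model values; treat it as a sketch rather than the documented procedure.

```python
# Minimal sketch (assumptions noted above): create an Azure OpenAI deployment
# whose SKU capacity is a PTU count. Resource names and the PTU value are
# placeholders; minimum PTUs and increments vary by model type and version.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment,
    DeploymentModel,
    DeploymentProperties,
    Sku,
)

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
)

deployment = Deployment(
    properties=DeploymentProperties(
        model=DeploymentModel(format="OpenAI", name="gpt-4", version="0613"),
    ),
    # sku.capacity carries the number of PTUs for the deployment.
    sku=Sku(name="ProvisionedManaged", capacity=100),
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",
    account_name="<azure-openai-resource>",
    deployment_name="gpt-4-provisioned",
    deployment=deployment,
)
print(poller.result().name)
```
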
articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 3 additions & 0 deletions

@@ -21,6 +21,9 @@ This article walks you through the process of onboarding to [Provisioned Through

You should consider switching from pay-as-you-go to provisioned throughput when you have well-defined, predictable throughput requirements. Typically, this occurs when the application is ready for production or has already been deployed in production and there's an understanding of the expected traffic. This will allow users to accurately forecast the required capacity and avoid unexpected billing.

+> [!NOTE]
+> On July 29th 2024, Microsoft switched to an hourly/reservation PTU offering that offers usability improvements. For more details, see the [PTU migration article](../provisioned-migration.md#whats-changing).
+
### Typical PTU scenarios

- An application that is ready for production or in production.
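
The onboarding guidance above ties the switch to provisioned throughput to being able to forecast required capacity and avoid unexpected billing, and the commit's note points at the hourly/reservation offering. The sketch below only illustrates that comparison arithmetically; the per-PTU hourly rate, the per-PTU monthly reservation rate, and the assumption that a reservation is priced per PTU per month are placeholders and assumptions to replace with actual pricing and your own forecast.

```python
# Sketch of the hourly-versus-reservation comparison implied above.
# All rates here are placeholders, not real Azure prices.
def monthly_ptu_cost(ptus: int,
                     hourly_rate_per_ptu: float,
                     hours_deployed: float,
                     monthly_reservation_rate_per_ptu: float) -> dict:
    """Project one month of cost under hourly billing and under a reservation."""
    hourly_total = ptus * hourly_rate_per_ptu * hours_deployed
    reservation_total = ptus * monthly_reservation_rate_per_ptu
    return {
        "hourly": hourly_total,
        "reservation": reservation_total,
        "cheaper": "reservation" if reservation_total < hourly_total else "hourly",
    }


# Placeholder inputs: 100 PTUs running around the clock for a 30-day month.
print(monthly_ptu_cost(ptus=100,
                       hourly_rate_per_ptu=1.0,
                       hours_deployed=24 * 30,
                       monthly_reservation_rate_per_ptu=500.0))
```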

articles/ai-services/openai/provisioned-migration.md

Lines changed: 2 additions & 0 deletions

@@ -72,6 +72,7 @@ Customers will no longer obtain quota by contacting their sales teams. Instead,

The target is to respond to all quota requests within two business days. However, many requests will be autoapproved and responses can be expected within 30 minutes in these cases.

+
## New hourly reservation payment model

> [!NOTE]

@@ -104,6 +105,7 @@ Microsoft has introduced a new “Hourly/reservation” payment model for provis
> - Create deployments on new Azure OpenAI resources without commitments.
> - Migrate an existing resource off its commitments.

+
## Hourly reservation model details

Details on the hourly/reservation model can be found in the [Azure OpenAI Provisioned Onboarding Guide](./how-to/provisioned-throughput-onboarding.md).
