Commit d039dea

small verbiage change to bring differentiation
1 parent cf68301 commit d039dea

1 file changed: +5 -1 lines changed

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 5 additions & 1 deletion
@@ -21,9 +21,13 @@ The provisioned throughput capability allows you to specify the amount of throug
 ## What do the provisioned deployment types provide?
 
 - **Predictable performance:** stable max latency and throughput for uniform workloads.
-- **Reserved processing capacity:** A deployment configures the amount of throughput. Once deployed, the throughput is available whether used or not.
+- **Allocated processing capacity:** A deployment configures the amount of throughput. Once deployed, the throughput is available whether used or not.
 - **Cost savings:** High throughput workloads might provide cost savings vs token-based consumption.
 
+> [!NOTE]
+> Customers can avail additional cost savings on provisioned deployments when they buy [Microsoft Azure OpenAI service reservations](https://learn.microsoft.com/en-us/azure/cost-management-billing/reservations/azure-openai#buy-a-microsoft-azure-openai-service-reservation).
+
+
 An Azure OpenAI Deployment is a unit of management for a specific OpenAI Model. A deployment provides customer access to a model for inference and integrates more features like Content Moderation ([See content moderation documentation](content-filter.md)). Global provisioned deployments are available in the same Azure OpenAI resources as all other deployment types but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center with the best availability for each request. Similarly, data zone provisioned deployments are also available in the same resources as all other deployment types but allow you to leverage Azure's global infrastructure to dynamically route traffic to the data center within the Microsoft specified data zone with the best availability for each request.
 
 ## What do you get?
