
Commit 21d1369

Learn Editor: Update latency.md
1 parent 1b4c48c commit 21d1369

File tree: 1 file changed (+1, -1 lines changed)


articles/ai-services/openai/how-to/latency.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -26,7 +26,7 @@ For a standard deployment, the quota assigned to your deployment partially deter
 
 In a provisioned deployment, a set amount of model processing capacity is allocated to your endpoint. The amount of throughput that you can achieve on the endpoint is a factor of the workload shape including input token amount, output amount, call rate and cache match rate. The number of concurrent calls and total tokens processed can vary based on these values.
 
-For all deployment types, system level throughput is a key component of performance. The following section explains several approaches that can be used to estimate system level throughput with existing metrics and data from your Azure OpenAI Service environment.
+For all deployment types, understanding system level throughput is a key component of optimizing performance. The following section explains several approaches that can be used to estimate system level throughput with existing metrics and data from your Azure OpenAI Service environment.
 
 #### Estimating system level throughput
 
```
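The changed passage above describes system level throughput as a function of the workload shape (input tokens, output tokens, call rate, and cache match rate). As a rough, illustrative sketch only (not taken from the article, and all names below are hypothetical assumptions), one way to estimate throughput in tokens per minute from those four quantities is:

```python
# Hypothetical sketch: estimating system-level throughput in tokens per
# minute (TPM) from the workload shape. Parameter names and the cache
# discount are illustrative assumptions, not the article's method.

def estimate_tpm(calls_per_minute: float,
                 avg_input_tokens: float,
                 avg_output_tokens: float,
                 cache_match_rate: float = 0.0) -> float:
    """Estimate total tokens processed per minute.

    Assumption for illustration: input tokens that match a cached
    prefix are discounted, since they consume less processing capacity.
    """
    effective_input = avg_input_tokens * (1.0 - cache_match_rate)
    return calls_per_minute * (effective_input + avg_output_tokens)

# Example workload: 60 calls/min, 1,000 input and 200 output tokens per
# call, with a 30% cache match rate on the input.
tpm = estimate_tpm(60, 1000, 200, cache_match_rate=0.3)
print(tpm)  # 54000.0
```

In practice the inputs to such an estimate would come from observed metrics in your Azure OpenAI Service environment, as the following section of the article discusses.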

0 commit comments
