Commit cda1528

update

1 parent c30675e commit cda1528

1 file changed: +1 addition, -2 deletions

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 1 addition & 2 deletions
```diff
@@ -46,8 +46,7 @@ The **Provisioned** option and the capacity planner are only available in certain
 | Version | Version of the model you plan to use, for example 0614 |
 | Peak calls per min | The number of calls per minute that are expected to be sent to the model |
 | Tokens in prompt call | The number of tokens in the prompt for each call to the model. Calls with larger prompts will utilize more of the PTU deployment. Currently this calculator assumes a single prompt value so for workloads with wide variance, we recommend benchmarking your deployment on your traffic to determine the most accurate estimate of PTU needed for your deployment. |
-| Tokens in model response |
-The number of tokens generated from each call to the model. Calls with larger generation sizes will utilize more of the PTU deployment. Currently this calculator assumes a single prompt value so for workloads with wide variance, we recommend benchmarking your deployment on your traffic to determine the most accurate estimate of PTU needed for your deployment. |
+| Tokens in model response | The number of tokens generated from each call to the model. Calls with larger generation sizes will utilize more of the PTU deployment. Currently this calculator assumes a single prompt value so for workloads with wide variance, we recommend benchmarking your deployment on your traffic to determine the most accurate estimate of PTU needed for your deployment. |
 
 After you fill in the required details, select **Calculate** button in the output column.
 
```
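Taken together, the calculator inputs reduce to simple arithmetic: every call consumes both its prompt tokens and its response tokens, which is why larger prompts or generations use more of a PTU deployment. Below is a minimal Python sketch of that combined-throughput calculation; the function name and sample figures are hypothetical, and the actual tokens-to-PTU conversion is model- and version-specific and is performed by the capacity calculator itself.

```python
# Illustrative sketch only (not the official PTU formula): shows how the
# calculator's three workload inputs combine into a per-minute token load.
# The real tokens-to-PTU conversion depends on the model and version and
# is done by the capacity calculator.

def tokens_per_minute(peak_calls_per_min: int,
                      tokens_per_prompt: int,
                      tokens_per_response: int) -> int:
    """Total tokens the workload pushes through the model each minute."""
    return peak_calls_per_min * (tokens_per_prompt + tokens_per_response)

# Hypothetical workload: 300 calls/min, 1,000-token prompts, 250-token responses.
tpm = tokens_per_minute(peak_calls_per_min=300,
                        tokens_per_prompt=1_000,
                        tokens_per_response=250)
print(f"Peak workload: {tpm:,} tokens/minute")  # Peak workload: 375,000 tokens/minute
```

For workloads with wide variance in prompt or response sizes, this single-value arithmetic understates the picture, which is why the guidance above recommends benchmarking your deployment against real traffic.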
