Commit 18b43f4

update file names
1 parent 4cccf60 commit 18b43f4

1 file changed, +12 -6 lines changed

articles/ai-services/translator/text-translation/reference/v4/reference-overview.md

Lines changed: 12 additions & 6 deletions
@@ -39,6 +39,12 @@ Metrics allow you to view the translator usage and availability information in A
 
 :::image type="content" source="../../../media/azure-portal-metrics-v4.png" alt-text="Screenshot of HTTP request metrics in the Azure portal.":::
 
+#### Metrics terminology
+
+* **PTU**: provisioned throughput units
+* **TPS**: transactions per second
+* **TPM**: tokens per minute
+
 The following tables list available metrics with description of how they're used to monitor **Translator resource** API calls.
 
 #### Translator resource HTTP requests
@@ -72,19 +78,19 @@ The following tables list available metrics with description of how they're used
 | Metrics | Description |
 |:----|:-----|
 | `AzureOpenAIAvailabilityRate`|Availability percentage with the following calculation:<br>`(Total Calls - Server Errors) / Total Calls`. Server Errors include any HTTP response >= 500.|
-|`AzureOpenAIRequests`|Number of calls made to the Azure OpenAI API over a period of time. Applies to Provisioned Throughput Units (`PTU`), `PTU`-managed, and Pay-as-you-go deployments. To breakdown API requests, you can add a filter or apply splitting by the following dimensions: <br> `ModelDeploymentName`, `ModelName`, `ModelVersion`, `StatusCode` (successful, client errors, server errors), `StreamType` (streaming vs nonstreaming requests), and `Operation`.|
+|`AzureOpenAIRequests`|Number of calls made to the Azure OpenAI API over a period of time. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To breakdown API requests, you can add a filter or apply splitting by the following dimensions: <br> `ModelDeploymentName`, `ModelName`, `ModelVersion`, `StatusCode` (successful, client errors, server errors), `StreamType` (streaming vs nonstreaming requests), and `Operation`.|
 
 #### Azure OpenAI usage
 
 | Metrics | Description |
 |:----|:-----|
-|`ActiveTokens`|Total tokens minus cached tokens over a period of time. Applies to Provisioned Throughput Units (`PTU`) and `PTU`-managed deployments. Use this metric to understand your TPS- or TPM-based utilization for `PTU`s and compare your benchmarks for target TPS or TPM for your scenarios. <br> To breakdown API requests you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, `ModelVersion`.|
-|`GeneratedTokens`|Number of tokens generated (output) from an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To breakdown this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
+|`ActiveTokens`|Total tokens minus cached tokens over a period of time. Applies to `PTU` and `PTU`-managed deployments. Use this metric to understand your `TPS`- or `TPM`-based utilization for `PTU`s and compare your benchmarks for target `TPS` or `TPM` for your scenarios. <br> To breakdown API requests, you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, `ModelVersion`.|
+|`GeneratedTokens`|Number of tokens generated (output) from an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
 |`FineTunedTrainingHours`|Number of training hours processed on an OpenAI fine-tuned model.|
-|`TokenTransaction`|Number of inference tokens processed on an OpenAI model. Calculated as prompt tokens (input) plus generated tokens (output). Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To breakdown this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
-|`ProcessedPromptTokens`|Number of prompt tokens processed (input) on an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To breakdown this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
+|`TokenTransaction`|Number of inference tokens processed on an OpenAI model. Calculated as prompt tokens (input) plus generated tokens (output). Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
+|`ProcessedPromptTokens`|Number of prompt tokens processed (input) on an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`or `ModelName`.|
 |`AzureOpenAIContextTokensCacheMatchRate`|Percentage of prompt tokens that hit the cache. Applies to `PTU` and `PTU`-managed deployments.
-|`AzureOpenAIProvisionedManagedUtilizationV2`|Utilization percentage for a provisioned-managed deployment, calculated as (PTUs consumed / PTUs deployed) x 100. When utilization is greater than or equal to 100%, calls are throttled and error code 429 is returned. To breakdown this metric, you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, `ModelVersion`, and `StreamType` (streaming vs non-streaming requests).|
+|`AzureOpenAIProvisionedManagedUtilizationV2`|Utilization percentage for a provisioned-managed deployment, calculated as (`PTU`s consumed / `PTU`s deployed) x 100. When utilization is greater than or equal to 100%, calls are throttled and error code 429 is returned. To break down this metric, you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, `ModelVersion`, and `StreamType` (streaming vs nonstreaming requests).|
 
 
 

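Reviewer note: the metric descriptions in this diff quote several formulas (availability, token totals, provisioned utilization). The sketch below restates them in Python as a sanity check on the wording. It is illustrative only: the function names and sample numbers are hypothetical and are not part of any Azure SDK or API, and it assumes the availability formula yields a ratio, as written in the table, rather than a value already multiplied by 100.

```python
def availability_rate(total_calls: int, server_errors: int) -> float:
    """(Total Calls - Server Errors) / Total Calls, returned as a ratio.

    Server errors are HTTP responses with status >= 500, per the table.
    """
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return (total_calls - server_errors) / total_calls


def token_transaction(prompt_tokens: int, generated_tokens: int) -> int:
    """Prompt tokens (input) plus generated tokens (output)."""
    return prompt_tokens + generated_tokens


def active_tokens(total_tokens: int, cached_tokens: int) -> int:
    """Total tokens minus cached tokens."""
    return total_tokens - cached_tokens


def provisioned_utilization(ptus_consumed: float, ptus_deployed: float) -> float:
    """(PTUs consumed / PTUs deployed) x 100, as a percentage."""
    if ptus_deployed <= 0:
        raise ValueError("ptus_deployed must be positive")
    return ptus_consumed / ptus_deployed * 100


def is_throttled(utilization_percent: float) -> bool:
    """At or above 100% utilization, calls are throttled (HTTP 429)."""
    return utilization_percent >= 100


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to exercise the formulas.
    print(availability_rate(total_calls=1000, server_errors=25))
    print(token_transaction(prompt_tokens=120, generated_tokens=80))
    print(active_tokens(total_tokens=200, cached_tokens=50))
    print(is_throttled(provisioned_utilization(ptus_consumed=50, ptus_deployed=100)))
```

The guards against a zero denominator are an addition for safety; the documented formulas do not say how the service reports a period with no calls or no deployed `PTU`s.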