articles/ai-services/translator/text-translation/reference/v4/reference-overview.md

Metrics allow you to view the translator usage and availability information in the Azure portal.

:::image type="content" source="../../../media/azure-portal-metrics-v4.png" alt-text="Screenshot of HTTP request metrics in the Azure portal.":::

The following tables list available metrics with descriptions of how they're used to monitor **Translator resource** API calls.

#### Translator resource HTTP requests

| Metrics | Description |
|:----|:-----|
|`BlockedCalls`| Number of calls that exceeded the rate or quota limit.|
|`ClientErrors`| Number of calls with a client-side error (4XX).|
|`Latency`| Duration to complete a request, in milliseconds.|
|`Ratelimit`| The current rate limit of the rate limit key.|
|`ServerErrors`| Number of calls with a server internal error (5XX).|
|`SuccessfulCalls`| Number of successful calls.|
|`TotalCalls`| Total number of API calls.|
|`TotalErrors`| Number of calls with an error response.|
|`TotalTokenCalls`| Total number of API calls made via the token service using an authentication token.|
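
These metrics are also available programmatically through Azure Monitor. The following is a minimal sketch using the `azure-monitor-query` Python package, assuming `DefaultAzureCredential` can authenticate and that the placeholder resource ID is replaced with your own Translator resource ID:

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

# Placeholder: the full Azure resource ID of your Translator resource.
RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.CognitiveServices/accounts/<translator-resource>"
)

client = MetricsQueryClient(DefaultAzureCredential())

# Pull hourly totals for a few of the HTTP request metrics over the last day.
response = client.query_resource(
    RESOURCE_ID,
    metric_names=["TotalCalls", "TotalErrors", "BlockedCalls"],
    timespan=timedelta(days=1),
    granularity=timedelta(hours=1),
    aggregations=[MetricAggregationType.TOTAL],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(metric.name, point.timestamp, point.total)
```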

#### Translator resource usage

| Metrics | Description |
|:----|:-----|
|`TextCharactersTranslated`|Number of characters in incoming text translation requests.|
|`TextCustomCharactersTranslated`|Number of characters in incoming custom text translation requests.|
|`TextTrainedCharacters`|Number of characters trained using text translation.|
|`DocumentCharactersTranslated`|Number of characters in document translation requests.|
|`DocumentCustomCharactersTranslated`|Number of characters in custom document translation requests.|

The following tables list available metrics with descriptions of how they're used to monitor **Azure OpenAI** API calls.

#### Azure OpenAI HTTP requests

| Metrics | Description |
|:----|:-----|
|`AzureOpenAIAvailabilityRate`|Availability percentage with the following calculation:<br>`(Total Calls - Server Errors) / Total Calls`. Server errors include any HTTP response >= 500.|
|`AzureOpenAIRequests`|Number of calls made to the Azure OpenAI API over a period of time. Applies to Provisioned Throughput Unit (`PTU`), `PTU`-managed, and Pay-as-you-go deployments. To break down API requests, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName`, `ModelName`, `ModelVersion`, `StatusCode` (successful, client errors, server errors), `StreamType` (streaming vs. nonstreaming requests), and `Operation`.|
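
As a quick check of the availability formula above, here's a worked example with illustrative numbers (not measured values):

```python
# Illustrative numbers only: 10,000 total calls, 25 of which returned HTTP >= 500.
total_calls = 10_000
server_errors = 25

# AzureOpenAIAvailabilityRate = (Total Calls - Server Errors) / Total Calls
availability = (total_calls - server_errors) / total_calls * 100
print(f"{availability:.2f}%")  # 99.75%
```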

#### Azure OpenAI usage

| Metrics | Description |
|:----|:-----|
|`ActiveTokens`|Total tokens minus cached tokens over a period of time. Applies to Provisioned Throughput Unit (`PTU`) and `PTU`-managed deployments. Use this metric to understand your TPS- or TPM-based utilization for `PTU`s and to compare it against your target TPS or TPM benchmarks for your scenarios.<br>To break down API requests, you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, and `ModelVersion`.|
|`GeneratedTokens`|Number of tokens generated (output) from an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName` or `ModelName`.|
|`FineTunedTrainingHours`|Number of training hours processed on an OpenAI fine-tuned model.|
|`TokenTransaction`|Number of inference tokens processed on an OpenAI model, calculated as prompt tokens (input) plus generated tokens (output). Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName` or `ModelName`.|
|`ProcessedPromptTokens`|Number of prompt tokens processed (input) on an OpenAI model. Applies to `PTU`, `PTU`-managed, and Pay-as-you-go deployments. To break down this metric, you can add a filter or apply splitting by the following dimensions:<br>`ModelDeploymentName` or `ModelName`.|
|`AzureOpenAIContextTokensCacheMatchRate`|Percentage of prompt tokens that hit the cache. Applies to `PTU` and `PTU`-managed deployments.|
|`AzureOpenAIProvisionedManagedUtilizationV2`|Utilization percentage for a provisioned-managed deployment, calculated as (PTUs consumed / PTUs deployed) x 100. When utilization is greater than or equal to 100%, calls are throttled and error code 429 is returned. To break down this metric, you can add a filter or apply splitting by the following dimensions: `ModelDeploymentName`, `ModelName`, `ModelVersion`, and `StreamType` (streaming vs. nonstreaming requests).|
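
The `AzureOpenAIProvisionedManagedUtilizationV2` calculation works the same way. Here's a small sketch with made-up deployment numbers showing where the 429 throttling threshold kicks in:

```python
# Made-up values: a deployment of 300 PTUs currently consuming 315 PTUs.
ptus_deployed = 300
ptus_consumed = 315

# Utilization = (PTUs consumed / PTUs deployed) x 100
utilization = ptus_consumed / ptus_deployed * 100
print(f"Utilization: {utilization:.0f}%")  # Utilization: 105%

# At >= 100% utilization, calls are throttled and HTTP 429 is returned.
if utilization >= 100:
    print("Requests are throttled (HTTP 429).")
```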