Commit 64c1122

add procedures and images
1 parent 6583270 commit 64c1122

8 files changed, with 67 additions and 24 deletions

articles/ai-services/openai/how-to/monitoring.md

Lines changed: 67 additions & 24 deletions
@@ -1,13 +1,13 @@
11
---
22
title: Monitoring Azure OpenAI Service
3-
description: Learn how to use Azure Monitor tools to capture and analyze metrics and data logs for your Azure OpenAI Service resources.
3+
description: Learn how to use Azure Monitor tools like Log Analytics to capture and analyze metrics and data logs for your Azure OpenAI Service resources.
44
author: mrbullwinkle
55
ms.author: mbullwin
66
ms.service: cognitive-services
77
ms.subservice: openai
88
ms.topic: how-to
99
ms.custom: subject-monitoring
10-
ms.date: 09/06/2023
10+
ms.date: 09/07/2023
1111
---
1212

1313
# Monitoring Azure OpenAI Service
@@ -38,22 +38,49 @@ Azure OpenAI has commonality with a subset of Azure AI services. For a list of a
3838

3939
### Azure OpenAI metrics
4040

41-
This table summarizes the current subset of metrics available in Azure OpenAI. All of the following metrics are exportable by using [Diagnostic settings in Azure Monitor](/azure/azure-monitor/essentials/diagnostic-settings) in Azure Monitor.
41+
The following table summarizes the current subset of metrics available in Azure OpenAI.
4242

43-
|Metric|Display Name|Unit|Aggregation|Description|Dimensions|
43+
|Metric|Display Name|Category|Unit|Aggregation|Description|Dimensions|
4444
|---|---|---|---|---|---|---|
45-
|`BlockedCalls` |Blocked Calls |Count |Total |Number of calls that exceeded rate or quota limit. | `ApiName`, `OperationName`, `Region`, `RatelimitKey` |
46-
|`ClientErrors` |Client Errors |Count |Total |Number of calls with a client side error (HTTP response code 4xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
47-
|`DataIn` |Data In |Bytes |Total |Size of incoming data in bytes. |`ApiName`, `OperationName`, `Region` |
48-
|`DataOut` |Data Out |Bytes |Total |Size of outgoing data in bytes. |`ApiName`, `OperationName`, `Region` |
49-
|`FineTunedTrainingHours` |Processed FineTuned Training Hours |Count |Total |Number of training hours processed on an Azure OpenAI fine-tuned model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
50-
|`Latency` |Latency |MilliSeconds |Average |Latency in milliseconds. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
51-
|`Ratelimit` |Ratelimit |Count |Total |The current rate limit of the rate limit key. |`Region`, `RatelimitKey` |
52-
|`ServerErrors` |Server Errors |Count |Total |Number of calls with a service internal error (HTTP response code 5xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
53-
|`SuccessfulCalls` |Successful Calls |Count |Total |Number of successful calls. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
54-
|`TokenTransaction` |Processed Inference Tokens |Count |Total |Number of inference tokens processed on an Azure OpenAI model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
55-
|`TotalCalls` |Total Calls |Count |Total |Total number of calls. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
56-
|`TotalErrors` |Total Errors |Count |Total |Total number of calls with an error response (HTTP response code 4xx or 5xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
45+
|`BlockedCalls` |Blocked Calls |HTTP | Count |Total |Number of calls that exceeded rate or quota limit. | `ApiName`, `OperationName`, `Region`, `RatelimitKey` |
46+
|`ClientErrors` |Client Errors |HTTP | Count |Total |Number of calls with a client-side error (HTTP response code 4xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
47+
|`DataIn` |Data In |HTTP | Bytes |Total |Size of incoming data in bytes. |`ApiName`, `OperationName`, `Region` |
48+
|`DataOut` |Data Out |HTTP | Bytes |Total |Size of outgoing data in bytes. |`ApiName`, `OperationName`, `Region` |
49+
|`FineTunedTrainingHours` |Processed FineTuned Training Hours |USAGE |Count |Total |Number of training hours processed on an Azure OpenAI fine-tuned model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
50+
|`GeneratedTokens` |Generated Completion Tokens |USAGE |Count |Total |Number of generated tokens from an Azure OpenAI model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
51+
|`Latency` |Latency |HTTP |MilliSeconds |Average |Latency in milliseconds. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
52+
|`ProcessedPromptTokens` |Processed Prompt Tokens |USAGE |Count |Total |Number of prompt tokens processed on an Azure OpenAI model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
53+
|`Ratelimit` |Ratelimit |HTTP |Count |Total |The current rate limit of the rate limit key. |`Region`, `RatelimitKey` |
54+
|`ServerErrors` |Server Errors |HTTP | Count |Total |Number of calls with a service internal error (HTTP response code 5xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
55+
|`SuccessfulCalls` |Successful Calls |HTTP |Count |Total |Number of successful calls. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
56+
|`SuccessRate` |Availability Rate |SLI |Percentage |Total |Availability percentage, calculated as `(TotalCalls - ServerErrors)/TotalCalls`, where `ServerErrors` counts calls with HTTP response code 5xx. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
57+
|`TokenTransaction` |Processed Inference Tokens |USAGE |Count |Total |Number of inference tokens processed on an Azure OpenAI model. |`ApiName`, `ModelDeploymentName`, `FeatureName`, `UsageChannel`, `Region` |
58+
|`TotalCalls` |Total Calls |HTTP |Count |Total |Total number of calls. |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
59+
|`TotalErrors` |Total Errors |HTTP |Count |Total |Total number of calls with an error response (HTTP response code 4xx or 5xx). |`ApiName`, `OperationName`, `Region`, `RatelimitKey` |
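
As a rough illustration of the `SuccessRate` calculation described in the table, the following self-contained Kusto sketch applies `(TotalCalls - ServerErrors)/TotalCalls` to made-up call counts. The `datatable` values and the `AvailabilityPercent` column name are hypothetical and don't come from a real resource.

```kusto
// Hypothetical call counts, for illustration only.
datatable(TotalCalls: long, ServerErrors: long)
[
    1000, 5,
    200, 0
]
// Multiply by 100.0 first so the division is done in floating point.
| extend AvailabilityPercent = 100.0 * (TotalCalls - ServerErrors) / TotalCalls
```

With these sample values, the first row evaluates to 99.5 and the second to 100.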
60+
61+
## Configure diagnostic settings
62+
63+
All of the metrics are exportable with [diagnostic settings in Azure Monitor](/azure/azure-monitor/essentials/diagnostic-settings). To analyze logs and metrics data with Azure Monitor Log Analytics queries, you need to configure diagnostic settings for your Azure OpenAI resource and your Log Analytics workspace.
64+
65+
1. From your Azure OpenAI resource page, under **Monitoring**, select **Diagnostic settings** on the left pane. On the **Diagnostic settings** page, select **Add diagnostic setting**.
66+
67+
:::image type="content" source="../media/monitoring/monitor-add-diagnostic-setting.png" alt-text="Screenshot that shows how to open the Diagnostic setting page for an Azure OpenAI resource in the Azure portal." border="false":::
68+
69+
1. On the **Diagnostic settings** page, configure the following fields:
70+
71+
1. Select **Send to Log Analytics workspace**.
72+
1. Choose your Azure account subscription.
73+
1. Choose your Log Analytics workspace.
74+
1. Under **Logs**, select **allLogs**.
75+
1. Under **Metrics**, select **AllMetrics**.
76+
77+
:::image type="content" source="../media/monitoring/monitor-configure-diagnostics.png" alt-text="Screenshot that shows how to configure diagnostic settings for an Azure OpenAI resource in the Azure portal.":::
78+
79+
1. Enter a **Diagnostic setting name** for the configuration.
80+
81+
1. Select **Save**.
82+
83+
After you configure the diagnostic settings, you can work with metrics and log data for your Azure OpenAI resource in your Log Analytics workspace.
5784

5885
## Analyze logs
5986

@@ -67,19 +94,37 @@ For a list of the types of resource logs available for Azure OpenAI and similar
6794

6895
## Use Kusto queries
6996

70-
After you deploy an Azure OpenAI model and send some completions calls in [Azure AI Studio](https://oai.azure.com/), you have monitoring data available for your resource. You can explore the data to get a sense of the performance information available. To analyze your monitoring data, you can use the [Kusto](/azure/data-explorer/kusto/query/) query language.
97+
After you deploy an Azure OpenAI model, you can send some completions calls by using the **playground** environment in [Azure AI Studio](https://oai.azure.com/).
98+
99+
:::image type="content" source="../media/monitoring/azure-openai-studio-playground.png" alt-text="Screenshot that shows how to generate completions for an Azure OpenAI resource in the Azure OpenAI Studio playground." lightbox="../media/monitoring/azure-openai-studio-playground.png" border="false":::
100+
101+
Any text that you enter in the **Completions playground** or the **Chat completions playground** generates metrics and log data for your Azure OpenAI resource. In the Log Analytics workspace for your resource, you can query the monitoring data by using the [Kusto](/azure/data-explorer/kusto/query/) query language.
102+
103+
> [!IMPORTANT]
104+
> The **Open query** option on the Azure OpenAI resource page browses to Azure Resource Graph, which isn't described in this article.
105+
> The following queries use the query environment for Log Analytics. Be sure to follow the steps in [Configure diagnostic settings](#configure-diagnostic-settings) to prepare your Log Analytics workspace.
106+
107+
1. From your Azure OpenAI resource page, under **Monitoring** on the left pane, select **Logs**.
108+
109+
1. Select the Log Analytics workspace that you configured with diagnostics for your Azure OpenAI resource.
110+
111+
1. From the **Log Analytics workspace** page, under **Overview** on the left pane, select **Logs**.
112+
113+
The Azure portal displays a **Queries** window with sample queries and suggestions by default. You can close this window.
114+
115+
For the following examples, enter the Kusto query into the edit region at the top of the **Query** window, and then select **Run**. The query results display below the query text.
71116

72117
The following Kusto query is useful for an initial analysis of Azure Diagnostics (`AzureDiagnostics`) data about your resource:
73118

74119
```kusto
75120
AzureDiagnostics
76121
| take 100
77-
| project TimeGenerated, _ResourceId, Category,OperationName, DurationMs, ResultSignature, properties_s
122+
| project TimeGenerated, _ResourceId, Category, OperationName, DurationMs, ResultSignature, properties_s
78123
```
79124

80-
This query returns a sample of 100 entries and displays a subset of the available columns of data in the logs:
125+
This query returns a sample of 100 entries and displays a subset of the available columns of data in the logs. In the query results, you can select the arrow next to the table name to view all available columns and associated data types.
81126

82-
:::image type="content" source="../media/monitoring/kusto-results-diagnostics.png" alt-text="Screenshot that shows the results of a Kusto query for Azure Diagnostics data about the Azure OpenAI resource." lightbox="../media/monitoring/kusto-results-diagnostics.png":::
127+
:::image type="content" source="../media/monitoring/log-analytics-diagnostics-query.png" alt-text="Screenshot that shows the Log Analytics query results for Azure Diagnostics data about the Azure OpenAI resource." lightbox="../media/monitoring/log-analytics-diagnostics-query.png":::
83128

84129
To see all available columns of data, you can remove the scoping parameters line `| project ...` from the query:
85130

@@ -88,19 +133,17 @@ AzureDiagnostics
88133
| take 100
89134
```
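
Beyond sampling raw rows, it can help to aggregate the diagnostics data. The following query is a minimal sketch that groups requests by operation and result code, using only columns that appear in the earlier `project` example. The one-day window and the `Requests` and `AvgDurationMs` names are arbitrary choices for illustration, and `todouble()` is applied on the assumption that `DurationMs` holds a numeric value.

```kusto
AzureDiagnostics
// Limit the scan to the last day; adjust the window as needed.
| where TimeGenerated > ago(1d)
// Count requests and average their duration per operation and result code.
| summarize Requests = count(), AvgDurationMs = avg(todouble(DurationMs))
    by OperationName, ResultSignature
| order by Requests desc
```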
90135

91-
In the query results, you can select the arrow next to the table name to view all available columns and associated data types.
92-
93136
To examine the Azure Metrics (`AzureMetrics`) data for your resource, run the following query:
94137

95138
```kusto
96139
AzureMetrics
97140
| take 100
98-
| project TimeGenerated, MetricName, Total, Count, TimeGrain, UnitName
141+
| project TimeGenerated, MetricName, Total, Count, Maximum, Minimum, Average, TimeGrain, UnitName
99142
```
100143

101144
The query returns a sample of 100 entries and displays a subset of the available columns of Azure Metrics data:
102145

103-
:::image type="content" source="../media/monitoring/kusto-results-metrics.png" alt-text="Screenshot that shows the results of a Kusto query for Azure Metrics data about the Azure OpenAI resource" lightbox="../media/monitoring/kusto-results-metrics.png":::
146+
:::image type="content" source="../media/monitoring/log-analytics-metrics-query.png" alt-text="Screenshot that shows the Log Analytics query results for Azure Metrics data about the Azure OpenAI resource." lightbox="../media/monitoring/log-analytics-metrics-query.png":::
104147

105148
> [!NOTE]
106149
> When you select **Monitoring** > **Logs** in the Azure OpenAI menu for your resource, Log Analytics opens with the query scope set to the current resource. The visible log queries include data from that specific resource only. To run a query that includes data from other resources or data from other Azure services, select **Logs** from the **Azure Monitor** menu in the Azure portal. For more information, see [Log query scope and time range in Azure Monitor Log Analytics](../../../azure-monitor/logs/scope.md).
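
To go a step further than a 100-row sample, the metrics data can be bucketed over time. The following sketch sums two metrics per hour. It assumes that the `MetricName` values stored in `AzureMetrics` match the metric names listed in the metrics table (for example, `TotalCalls` and `ServerErrors`). Check the values in your own workspace before relying on it. The `MetricTotal` name and the one-day window are illustrative.

```kusto
AzureMetrics
// Limit the scan to the last day; adjust the window as needed.
| where TimeGenerated > ago(1d)
// Assumed metric names; verify them against the MetricName column.
| where MetricName in ("TotalCalls", "ServerErrors")
// Sum the Total column per metric in one-hour buckets.
| summarize MetricTotal = sum(Total) by MetricName, bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```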