| 1 | +--- |
| 2 | +id: azure-open-ai |
| 3 | +title: Azure OpenAI |
| 4 | +description: Learn about the Sumo Logic collection process for the Azure OpenAI service. |
| 5 | +--- |
| 6 | + |
| 7 | +import useBaseUrl from '@docusaurus/useBaseUrl'; |
| 8 | + |
| 9 | +<img src={useBaseUrl('img/integrations/microsoft-azure/azure-openai.png')} alt="Thumbnail icon" width="50"/> |
| 10 | + |
| 11 | +[Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview) is a fully managed platform that provides access to advanced generative AI models from OpenAI, such as GPT, Codex, and Embeddings, within Azure’s secure, enterprise-grade environment. It integrates seamlessly with Azure services like Cognitive Search, Machine Learning, and Logic Apps, as well as with external applications and data sources, enabling powerful capabilities in natural language processing, code generation, and reasoning. The platform also supports monitoring of key performance metrics, including request volume, token usage, response latency, and error rates, to ensure efficient model utilization and reliable AI-driven application performance. |
| 12 | + |
| 13 | +## Log and metric types |
| 14 | + |
| 15 | +For Azure OpenAI, you can collect the following logs and metrics: |
| 16 | + |
| 17 | +* **Resource logs**. To learn more about the different resource log category types and schemas collected for Azure OpenAI, refer to [Azure documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/monitor-openai-reference#resource-logs). |
| 18 | + |
| 19 | +* **Platform Metrics for Azure OpenAI**. These metrics are available in the namespaces below: |
| 20 | + * [Microsoft.CognitiveServices/accounts](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/monitor-openai-reference#supported-metrics-for-microsoftcognitiveservicesaccounts) |
| 21 | + |
| 22 | +For more information on supported metrics, refer to [Azure documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/monitor-openai-reference#metrics). |
| 23 | + |
| 24 | +## Setup |
| 25 | + |
| 26 | +Azure services send monitoring data to Azure Monitor, which can then [stream data to Event Hub](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/stream-monitoring-data-event-hubs). Sumo Logic supports: |
| 27 | + |
| 28 | +* Logs collection from [Azure Monitor](https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-get-started) using our [Azure Event Hubs source](/docs/send-data/collect-from-other-data-sources/azure-monitoring/ms-azure-event-hubs-source/). |
| 29 | +* Metrics collection using our [Azure Metrics Source](/docs/send-data/hosted-collectors/microsoft-source/azure-metrics-source). |
| 30 | + |
| 31 | +You must explicitly enable diagnostic settings for each Azure OpenAI resource you want to monitor. You can forward logs from multiple resources to the same Event Hub, provided they satisfy the limitations and permissions described in the [Azure documentation on destination limitations](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/diagnostic-settings?tabs=portal#destination-limitations). |
| 32 | + |
| 33 | +When you configure the Event Hubs source or HTTP source, plan your source category to make querying easier. A hierarchical approach lets you use wildcards. For example: `Azure/OpenAI/Logs`, `Azure/OpenAI/Metrics`. |
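As a rough illustration of why a hierarchical source category helps, the sketch below uses Python's `fnmatch` module as a stand-in for wildcard matching. It is not Sumo Logic's query engine, and `Azure/Storage/Logs` is a hypothetical third category added only for contrast.

```python
from fnmatch import fnmatchcase

# Source categories following the hierarchical scheme.
# "Azure/Storage/Logs" is a hypothetical category used only for contrast.
SOURCE_CATEGORIES = [
    "Azure/OpenAI/Logs",
    "Azure/OpenAI/Metrics",
    "Azure/Storage/Logs",
]


def match_categories(pattern, categories):
    """Return the categories matched by a glob-style wildcard pattern."""
    return [c for c in categories if fnmatchcase(c, pattern)]


# One wildcard selects everything collected for Azure OpenAI.
openai_only = match_categories("Azure/OpenAI/*", SOURCE_CATEGORIES)

# A wildcard in the middle selects logs across all Azure services.
all_logs = match_categories("Azure/*/Logs", SOURCE_CATEGORIES)

print(openai_only)
print(all_logs)
```

With a flat naming scheme, each category would have to be listed individually in every query.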
| 34 | + |
| 35 | +### Configure collector |
| 36 | + |
| 37 | +Create a hosted collector if not already configured and tag the `tenant_name` field. You can get the tenant name using the instructions [here](https://learn.microsoft.com/en-us/azure/active-directory-b2c/tenant-management-read-tenant-name#get-your-tenant-name). Make sure you create the required sources in this collector. <br/><img src={useBaseUrl('img/integrations/microsoft-azure/Azure-Storage-Tag-Tenant-Name.png')} alt="Azure Tag Tenant Name" style={{border: '1px solid gray'}} width="500" /> |
| 38 | + |
| 39 | +### Configure metrics collection |
| 40 | + |
| 41 | +import MetricsSource from '../../reuse/metrics-source.md'; |
| 42 | + |
| 43 | +<MetricsSource/> |
| 44 | + |
| 45 | +### Configure logs collection |
| 46 | + |
| 47 | +In this section, you will configure a pipeline for shipping diagnostic logs from Azure Monitor to an Event Hub. |
| 48 | + |
| 49 | +#### Diagnostic logs |
| 50 | + |
| 51 | +1. To set up the Azure Event Hubs source in Sumo Logic, refer to the [Azure Event Hubs Source for Logs](/docs/send-data/collect-from-other-data-sources/azure-monitoring/ms-azure-event-hubs-source/). |
| 52 | +1. To create the diagnostic settings in the Azure portal, refer to the [Azure documentation](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/diagnostic-settings?tabs=portal#create-diagnostic-settings). Perform the steps below for each Azure OpenAI resource that you want to monitor. |
| 53 | + 1. Choose `Stream to an event hub` as the destination. |
| 54 | + 1. Select `allLogs`. |
| 55 | + 1. Use the Event Hub namespace and Event Hub name configured in the previous step in the destination details section. You can use the default policy `RootManageSharedAccessKey` as the policy name.<br/><img src={useBaseUrl('img/send-data/azure-openai-logs.png')} alt="Azure OpenAI logs" style={{border: '1px solid gray'}} width="800" /> |
| 56 | +1. Tag the location field in the source with the right location value. <br/><img src={useBaseUrl('img/integrations/microsoft-azure/Azure-Storage-Tag-Location.png')} alt="Azure OpenAI Tag Location" style={{border: '1px solid gray'}} width="400" /> |
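If you script the diagnostic setting instead of using the portal, the choices in the steps above map onto a request body roughly like the one built below. This is a minimal sketch that only constructs the payload and does not call Azure. The property names follow the Azure Monitor diagnostic settings schema as we understand it, so verify them against the Azure documentation. The function name and all argument values are hypothetical placeholders.

```python
import json


def build_diagnostic_setting(subscription_id, resource_group, namespace,
                             event_hub_name,
                             policy_name="RootManageSharedAccessKey"):
    """Build a diagnostic-setting body that streams allLogs to an event hub.

    Assumption: property names mirror the Azure Monitor diagnostic settings
    schema; confirm against the Azure documentation before use.
    """
    # Resource ID of the Event Hubs namespace authorization rule (policy).
    rule_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.EventHub/namespaces/{namespace}"
        f"/authorizationRules/{policy_name}"
    )
    return {
        "properties": {
            "eventHubAuthorizationRuleId": rule_id,  # destination policy
            "eventHubName": event_hub_name,          # destination event hub
            # Equivalent of selecting `allLogs` in the portal.
            "logs": [{"categoryGroup": "allLogs", "enabled": True}],
        }
    }


# Placeholder values for illustration only.
body = build_diagnostic_setting(
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="my-rg",
    namespace="my-eventhub-namespace",
    event_hub_name="my-event-hub",
)
print(json.dumps(body, indent=2))
```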
| 57 | + |
| 58 | +#### Activity logs (optional) |
| 59 | + |
| 60 | +import ActivityLogs from '../../reuse/apps/azure-activity-logs.md'; |
| 61 | + |
| 62 | +<ActivityLogs/> |
| 63 | + |
| 64 | +## Installing the Azure OpenAI app |
| 65 | + |
| 66 | +import AppInstallIndexV2 from '../../reuse/apps/app-install-index-option.md'; |
| 67 | + |
| 68 | +<AppInstallIndexV2/> |
| 69 | + |
| 70 | +As part of the app installation process, the following fields will be created by default: |
| 71 | + |
| 72 | +- `tenant_name`. This field is tagged at the collector level. You can get the tenant name using the instructions [here](https://learn.microsoft.com/en-us/azure/active-directory-b2c/tenant-management-read-tenant-name#get-your-tenant-name). |
| 73 | +- `location`. The region to which the resource belongs. |
| 74 | +- `subscription_id`. ID associated with a subscription where the resource is present. |
| 75 | +- `resource_group`. The resource group name where the Azure resource is present. |
| 76 | +- `provider_name`. Azure resource provider name (for example, Microsoft.CognitiveServices). |
| 77 | +- `resource_type`. Azure resource type (for example, accounts). |
| 78 | +- `resource_name`. The name of the resource (for example, the Azure OpenAI account name). |
| 79 | +- `service_type`. The type of service that can be accessed with an Azure resource. |
| 80 | +- `service_name`. Services that can be accessed with an Azure resource. (For example, in Azure Container Instances, the service is Subscriptions.) |
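Several of these fields correspond to segments of a standard Azure resource ID. The sketch below shows that correspondence for a hypothetical resource ID. It is an illustration only and not necessarily how the app extracts the fields. `tenant_name`, `location`, `service_type`, and `service_name` do not come from the resource ID and are not covered here.

```python
def parse_resource_id(resource_id):
    """Split an Azure resource ID into the fields it carries.

    Expected layout:
    /subscriptions/<id>/resourceGroups/<rg>/providers/<provider>/<type>/<name>
    """
    parts = resource_id.strip("/").split("/")
    if (
        len(parts) < 8
        or parts[0].lower() != "subscriptions"
        or parts[2].lower() != "resourcegroups"
        or parts[4].lower() != "providers"
    ):
        raise ValueError(f"Unexpected resource ID layout: {resource_id!r}")
    return {
        "subscription_id": parts[1],
        "resource_group": parts[3],
        "provider_name": parts[5],
        "resource_type": parts[6],
        "resource_name": parts[7],
    }


# Hypothetical resource ID for an Azure OpenAI account.
fields = parse_resource_id(
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-rg/providers/Microsoft.CognitiveServices"
    "/accounts/my-openai"
)
print(fields)
```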
| 81 | + |
| 82 | +## Viewing the Azure OpenAI dashboards |
| 83 | + |
| 84 | +import ViewDashboardsIndex from '../../reuse/apps/view-dashboards-index.md'; |
| 85 | + |
| 86 | +<ViewDashboardsIndex/> |
| 87 | + |
| 88 | +### Overview |
| 89 | + |
| 90 | +The **Azure OpenAI - Overview** dashboard provides a high‑level view of the overall health, performance, usage, and safety signals of your Azure OpenAI service. It surfaces key indicators such as availability, request activity, token consumption, latency, and moderation events. Use this dashboard to monitor general service reliability, detect issues quickly, and understand workload patterns across your deployments. |
| 91 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Overview.png')} alt="Azure OpenAI - Overview dashboard" style={{border: '1px solid gray'}} width="800" /> |
| 92 | + |
| 93 | +### Models |
| 94 | + |
| 95 | +The **Azure OpenAI - Models** dashboard enables a deep dive into individual model performance, usage, and health. It tracks model availability, request rates, operations, latency, throughput (tokens per second), and usage split by deployment, model name, and resource. |
| 96 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Models.png')} alt="Azure OpenAI - Models" style={{border: '1px solid gray'}} width="800" /> |
| 97 | + |
| 98 | +### Performance and Latency |
| 99 | + |
| 100 | +The **Azure OpenAI - Performance and Latency** dashboard focuses on the responsiveness of Azure OpenAI APIs and models. It tracks time-to-first-byte (TTFB), time-to-response, time-between-tokens for streaming performance, tokens-per-second speed, and time-to-last-byte. Use this dashboard to identify latency bottlenecks across models and deployments, and to compare streaming and non-streaming performance trends. |
| 101 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Performance-and-Latency.png')} alt="Azure OpenAI - Performance and Latency" style={{border: '1px solid gray'}} width="800" /> |
| 102 | + |
| 103 | +### Reliability and Availability |
| 104 | + |
| 105 | +The **Azure OpenAI - Reliability and Availability** dashboard provides visibility into the operational health of Azure OpenAI across deployments and models. It highlights metrics that track overall API availability, request success vs client/server errors, and throttled calls (429). Use this dashboard to quickly identify availability degradation, error spikes, or throttling events that may affect your applications. |
| 106 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Reliability-and-Availability.png')} alt="Azure OpenAI - Reliability and Availability" style={{border: '1px solid gray'}} width="800" /> |
| 107 | + |
| 108 | +### Usage and Token Consumption |
| 109 | + |
| 110 | +The **Azure OpenAI - Usage and Token Consumption** dashboard provides details on model utilization and token consumption across deployments. The dashboard surfaces prompt tokens (input), generated tokens (output), total tokens processed, and cache match rates. Use this dashboard for cost optimization and understanding workload trends across different models and regions. |
| 111 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Usage-and-Token-Consumption.png')} alt="Azure OpenAI - Usage and Token Consumption" style={{border: '1px solid gray'}} width="800" /> |
| 112 | + |
| 113 | +### Content Safety |
| 114 | + |
| 115 | +The **Azure OpenAI - Content Safety** dashboard provides metrics on responsible AI policies and content safety enforcement. It monitors harmful content detected, requests blocked by filters, abusive user identification, and system safety events. Use this dashboard for compliance, RAI monitoring, and auditing risky behaviors across workloads. |
| 116 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Content-Safety.png')} alt="Azure OpenAI - Content Safety" style={{border: '1px solid gray'}} width="800" /> |
| 117 | + |
| 118 | +### Administrative Operations |
| 119 | + |
| 120 | +The **Azure OpenAI - Administrative Operations** dashboard provides details on the operational activities and status of your Azure OpenAI resources. |
| 121 | + |
| 122 | +Use this dashboard to: |
| 123 | +* Monitor the distribution of operation types and their success rates to ensure proper functioning of your Azure OpenAI resources. |
| 124 | +* Identify potential issues by analyzing the top operations causing errors and correlating them with specific users or applications. |
| 125 | +* Track recent write and delete operations to maintain an audit trail of changes made to your Azure OpenAI resources. |
| 126 | + |
| 127 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Administrative-Operations.png')} alt="Azure OpenAI - Administrative Operations" style={{border: '1px solid gray'}} width="800" /> |
| 128 | + |
| 129 | +### Policy and Recommendations |
| 130 | + |
| 131 | +The **Azure OpenAI - Policy and Recommendations** dashboard provides details on policy events and recommendations for your Azure OpenAI resources. |
| 132 | + |
| 133 | +Use this dashboard to: |
| 134 | +* Monitor the success and failure rates of policy events to ensure proper configuration and compliance. |
| 135 | +* Track and analyze recent recommendations to improve the performance and security of your Azure OpenAI setup. |
| 136 | +* Identify trends in policy events and recommendations over time to proactively address potential issues. |
| 137 | + |
| 138 | +<img src={useBaseUrl('https://sumologic-app-data-v2.s3.us-east-1.amazonaws.com/dashboards/AzureOpenAI/Azure-OpenAI-Policy-and-Recommendations.png')} alt="Azure OpenAI - Policy and Recommendations" style={{border: '1px solid gray'}} width="800" /> |
| 139 | + |
| 140 | +## Create monitors for Azure OpenAI |
| 141 | + |
| 142 | +import CreateMonitors from '../../reuse/apps/create-monitors.md'; |
| 143 | + |
| 144 | +<CreateMonitors/> |
| 145 | + |
| 146 | +### Azure OpenAI alerts |
| 147 | + |
| 148 | +These alerts are metric-based and will work for all Azure OpenAI resources. |
| 149 | + |
| 150 | +| Alert Name | Alert Description and Conditions | Alert Condition | Recover Condition | |
| 151 | +|:--|:--|:--|:------------------| |
| 152 | +| `Azure OpenAI - Availability` | This alert is triggered when the availability of the resource drops below 100%. | Count < 100 | Count >= 100 |
| 153 | +| `Azure OpenAI - Processed Inference Tokens` | This alert is triggered when inference token consumption crosses the value of 1000000 tokens. | Count > 1000000 | Count <= 1000000 |
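The alert and recover conditions in the table are complementary thresholds. The sketch below models that logic in plain Python to make the boundary behavior explicit. It is an illustration under that assumption, not the monitor engine itself, and the function and state names are hypothetical.

```python
def alert_state(value, threshold, trigger_when):
    """Return "triggered" or "recovered" for a single metric value.

    trigger_when="below": alert when value < threshold, recover when >=.
    trigger_when="above": alert when value > threshold, recover when <=.
    """
    if trigger_when == "below":
        triggered = value < threshold
    elif trigger_when == "above":
        triggered = value > threshold
    else:
        raise ValueError("trigger_when must be 'below' or 'above'")
    return "triggered" if triggered else "recovered"


# Azure OpenAI - Availability: alert when availability drops below 100.
availability = alert_state(99.5, threshold=100, trigger_when="below")

# Azure OpenAI - Processed Inference Tokens: alert above 1000000 tokens.
tokens = alert_state(1_000_000, threshold=1_000_000, trigger_when="above")

print(availability, tokens)
```

A value exactly at the threshold falls on the recover side in both alerts.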
| 154 | + |
| 155 | +## Upgrade/Downgrade the Azure OpenAI app (optional) |
| 156 | + |
| 157 | +import AppUpdate from '../../reuse/apps/app-update.md'; |
| 158 | + |
| 159 | +<AppUpdate/> |
| 160 | + |
| 161 | +## Uninstalling the Azure OpenAI app (optional) |
| 162 | + |
| 163 | +import AppUninstall from '../../reuse/apps/app-uninstall.md'; |
| 164 | + |
| 165 | +<AppUninstall/> |
| 166 | + |
| 167 | +## Troubleshooting |
| 168 | + |
| 169 | +### Metrics collection via Azure Metrics Source |
| 170 | + |
| 171 | +To troubleshoot metrics collection via Azure Metrics Source, follow the instructions in [Troubleshooting Azure Metrics Source](/docs/send-data/hosted-collectors/microsoft-source/azure-metrics-source/#troubleshooting). |