Important
This repository is under active development and is subject to the Azure AI Private Preview Terms - Online Experimentation.
Online Experimentation enables you to evaluate feature variations in production. Quality evaluation requires setting up metrics that measure your application's performance, reliability, usage and quality of engagement. The goal of this repository is to provide out-of-the-box GenAI metrics and custom metric samples that make it easy for you to get started with online experimentation.
This repository provides documentation and samples of online experimentation metrics and the required files for integrating with CI/CD.
Sample metric collections are organized into two directories:
- genai-operational: A pre-built GenAI metric collection compatible with instrumentation libraries that adhere to OpenTelemetry semantic conventions for GenAI spans. It includes configuration for GenAI metrics such as frequency of user engagement, token usage and response latency, so that you can monitor usage volume, costs and other operational metrics for GenAI integrations within your application. The summaryrules.yaml file is necessary to provision a corresponding Log Analytics summary rule for data extraction and transformation on GenAI spans.
- custom: Sample metric collections based on Azure Monitor custom events (with corresponding sample code for instrumentation). These samples demonstrate how to instrument custom events and then use them in metric definitions. They can also be used directly in your application. This section also documents the instrumentation requirements for Online Experimentation metrics.
To generate metrics with Online Experimentation, you must first set up the Online Experimentation offering. See the Online Experimentation documentation for full setup instructions. The requirements are:
- App Configuration with Online Experimentation and Azure Monitor resources - See this quickstart guide for details.
- GitHub Action azure/online-experimentation-deploy-metrics in your CI/CD workflow.
- Instrument your application.
- App Configuration provides a custom event logger that automatically adds the App Configuration targeting id to each event. Targeting id is required for any event used in Online Experimentation metrics.
- Send tracked events to Azure Monitor.
- Azure Monitor OpenTelemetry Distro enables collection of OpenTelemetry-based logs.
- Azure Monitor Logs charges are based on data ingested. See pricing.
- [For GenAI metrics] integrate a GenAI instrumentation library which follows the OpenTelemetry GenAI semantic conventions.
- Enrich spans with the custom attribute TargetingId (required): Azure App Configuration's TargetingId must be attached to GenAI traces in order to consume them for Online Experimentation metrics.
- You must also create a summary rule which outputs transformed GenAI spans into the AppEvents_CL table. The summary rule and directions are provided in this repository. See the genai-operational directory.
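The TargetingId requirement for GenAI spans can be sketched in Python as follows. This is a minimal illustration, not this repository's instrumentation code. It assumes a span object that exposes `set_attribute(key, value)`, as spans in the OpenTelemetry Python API do. The helper `attach_targeting_id` and the `RecordingSpan` stand-in are hypothetical names introduced only for this example, and how you obtain the targeting id from App Configuration is left out.

```python
from typing import Any, Protocol


class SupportsSetAttribute(Protocol):
    """Any object with an OpenTelemetry-style set_attribute(key, value) method."""

    def set_attribute(self, key: str, value: Any) -> None: ...


def attach_targeting_id(span: SupportsSetAttribute, targeting_id: str) -> None:
    """Attach the targeting id to a span under the `TargetingId` attribute.

    Hypothetical helper: an empty id is rejected because spans without a
    TargetingId cannot be used for Online Experimentation metrics.
    """
    if not targeting_id:
        raise ValueError("TargetingId is required on GenAI spans")
    span.set_attribute("TargetingId", targeting_id)


class RecordingSpan:
    """Hypothetical stand-in for a real span; it only records attributes."""

    def __init__(self) -> None:
        self.attributes: dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


span = RecordingSpan()
attach_targeting_id(span, "user-123")  # "user-123" is a placeholder id
print(span.attributes)  # {'TargetingId': 'user-123'}
```

In a real application the same helper could be handed the active span from your GenAI instrumentation library instead of `RecordingSpan`.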
The sample application OpenAI Chat App for Online Experimentation provides a contextualized example of how telemetry, metrics and summary rules fit into an application.
To modify the metrics in this sample application:
- Add your customized metrics to a JSON file. Check the file path configured in your GitHub Actions workflow file to ensure that the action processes this file.
- Add the summary rule(s) necessary for consuming OTel-based GenAI spans to your repository's infra path and ensure your main.bicep has a module for summary rule deployment. A sample Bicep module is provided below, with placeholder support files in the infra folder of this samples repo.
Sample bicep module for summary rule deployment:
targetScope = 'subscription'

@description('Log Analytics Workspace name, location, and resource group')
param logAnalyticsWorkspaceName string = 'YOUR_WORKSPACE_NAME'
param logAnalyticsWorkspaceLocation string = 'YOUR_WORKSPACE_REGION'
param logAnalyticsWorkspaceResourceGroupName string = 'YOUR_WORKSPACE_RG'

// existing resource group that contains the Log Analytics workspace
resource logAnalyticsWorkspaceResourceGroup 'Microsoft.Resources/resourceGroups@2021-04-01' existing = {
  name: logAnalyticsWorkspaceResourceGroupName
}

// summary rule module: one deployment per rule defined in summaryrules.yaml
var ruleDefinitions = loadYamlContent('./monitor/summaryrules.yaml')
module summaryRules './monitor/summaryrule.bicep' = [for (rule, i) in ruleDefinitions.summaryRules: if (!empty(logAnalyticsWorkspaceName) && !empty(logAnalyticsWorkspaceLocation)) {
  name: 'loganalytics-summaryrule-${i}'
  scope: logAnalyticsWorkspaceResourceGroup
  params: {
    location: logAnalyticsWorkspaceLocation
    logAnalyticsWorkspaceName: logAnalyticsWorkspaceName
    summaryRuleName: rule.name
    description: rule.description
    query: rule.query
    binSize: rule.binSize
    destinationTable: rule.destinationTable
  }
}]
Important
Ensure destinationTable matches 'AppEvents_CL'. No other custom log tables are used for Online Experimentation metric computation.
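The destinationTable requirement can be checked before deployment with a small script. The sketch below is a hedged illustration, not tooling from this repository. It assumes the rule definitions have already been loaded into a Python dict (for example with a YAML parser, not shown) with a top-level `summaryRules` list whose entries carry the fields the Bicep module reads: `name`, `description`, `query`, `binSize` and `destinationTable`. The function name `validate_summary_rules` and all example values are hypothetical placeholders.

```python
# Fields the sample Bicep module reads from each rule definition.
REQUIRED_FIELDS = ("name", "description", "query", "binSize", "destinationTable")
EXPECTED_TABLE = "AppEvents_CL"


def validate_summary_rules(definitions: dict) -> list[str]:
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    for i, rule in enumerate(definitions.get("summaryRules", [])):
        label = rule.get("name", f"rule #{i}")
        for field in REQUIRED_FIELDS:
            if field not in rule:
                problems.append(f"{label}: missing field '{field}'")
        table = rule.get("destinationTable")
        if table is not None and table != EXPECTED_TABLE:
            problems.append(
                f"{label}: destinationTable is '{table}', expected '{EXPECTED_TABLE}'"
            )
    return problems


# Placeholder definitions shaped like a parsed summaryrules.yaml (values are made up).
example = {
    "summaryRules": [
        {"name": "good-rule", "description": "placeholder", "query": "placeholder",
         "binSize": 60, "destinationTable": "AppEvents_CL"},
        {"name": "bad-rule", "description": "placeholder", "query": "placeholder",
         "binSize": 60, "destinationTable": "Other_CL"},
    ]
}

for problem in validate_summary_rules(example):
    print(problem)  # bad-rule: destinationTable is 'Other_CL', expected 'AppEvents_CL'
```

A check like this could run as a step in your CI/CD workflow ahead of the Bicep deployment, failing the job when the returned list is not empty.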
This module requires two dependent files:
- summaryrule.bicep: the template (can be copied as-is from this repo).
- summaryrules.yaml: a list of parameterized summary rules to create or update. Examples are provided in the genai-operational directory.
- Online Experimentation documentation
- Sample Online Experimentation enabled OpenAI app
- GitHub Action to deploy metrics
- For continuous (online) evaluation in production environments, see How to run evaluations online with the Azure AI Foundry SDK.
- Contact [email protected] for assistance during private preview.