Commit b748b56

Merge pull request #221948 from mrbullwinkle/mrb_12_19_2022_anomaly_detector_ADX
[Cognitive Services] [Anomaly Detector] Azure Data Explorer integration
2 parents 14a76e4 + 2effa9b commit b748b56

File tree

5 files changed: +122 −0 lines changed

Three image files added (40.1 KB, 20.1 KB, 16.6 KB); binary images have no text diff.
articles/cognitive-services/Anomaly-Detector/toc.yml

Lines changed: 2 additions & 0 deletions

@@ -68,6 +68,8 @@
    href: concepts/multivariate-architecture.md
- name: Tutorials
  items:
+ - name: Azure Data Explorer integration
+   href: tutorials/azure-data-explorer.md
  - name: Visualize anomalies as a batch using Power BI (univariate)
    href: tutorials/batch-anomaly-detection-powerbi.md
  - name: Azure Synapse with Multivariate Anomaly Detection
articles/cognitive-services/Anomaly-Detector/tutorials/azure-data-explorer.md

Lines changed: 120 additions & 0 deletions

@@ -0,0 +1,120 @@
---
title: "Tutorial: Use Univariate Anomaly Detector in Azure Data Explorer"
titleSuffix: Azure Cognitive Services
description: Learn how to use the Univariate Anomaly Detector with Azure Data Explorer.
services: cognitive-services
author: jr-MS
manager: nitinme
ms.service: cognitive-services
ms.subservice: anomaly-detector
ms.topic: tutorial
ms.date: 12/19/2022
ms.author: mbullwin
---

# Tutorial: Use Univariate Anomaly Detector in Azure Data Explorer

## Introduction

The [Anomaly Detector API](/azure/cognitive-services/anomaly-detector/overview-multivariate) enables you to check and detect abnormalities in your time series data without having to know machine learning. The Anomaly Detector API's algorithms adapt by automatically finding and applying the best-fitting models to your data, regardless of industry, scenario, or data volume. Using your time series data, the API determines boundaries for anomaly detection, expected values, and which data points are anomalies.

[Azure Data Explorer](/azure/data-explorer/data-explorer-overview) is a fully managed, high-performance, big data analytics platform that makes it easy to analyze high volumes of data in near real time. The Azure Data Explorer toolbox gives you an end-to-end solution for data ingestion, query, visualization, and management.

## Anomaly Detection functions in Azure Data Explorer

### Function 1: series_uv_anomalies_fl()

The function **[series_uv_anomalies_fl()](/azure/data-explorer/kusto/functions-library/series-uv-anomalies-fl?tabs=adhoc)** detects anomalies in time series by calling the [Univariate Anomaly Detection API](/azure/cognitive-services/anomaly-detector/overview). The function accepts a limited set of time series as numerical dynamic arrays and the required anomaly detection sensitivity level. Each time series is converted into the required JSON (JavaScript Object Notation) format and posted to the Anomaly Detector service endpoint. The service response contains dynamic arrays of high/low/all anomalies, the modeled baseline time series, its normal high/low boundaries (a value above or below the high/low boundary is an anomaly), and the detected seasonality.

### Function 2: series_uv_change_points_fl()

The function **[series_uv_change_points_fl()](/azure/data-explorer/kusto/functions-library/series-uv-change-points-fl?tabs=adhoc)** finds change points in time series by calling the Univariate Anomaly Detection API. The function accepts a limited set of time series as numerical dynamic arrays, the change point detection threshold, and the minimum size of the stable trend window. Each time series is converted into the required JSON format and posted to the Anomaly Detector service endpoint. The service response contains dynamic arrays of change points, their respective confidence values, and the detected seasonality.

Both functions are user-defined [tabular functions](/azure/data-explorer/kusto/query/functions/user-defined-functions#tabular-function) that you apply with the [invoke operator](/azure/data-explorer/kusto/query/invokeoperator). You can either embed their code in your query or define them as stored functions in your database.
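
For orientation, the following is a minimal sketch of that invoke pattern. It assumes series_uv_anomalies_fl() is already defined (ad hoc with a `let` statement, as in the full query later in this tutorial, or as a stored function) and that a table `ts` holds each time series as a numerical dynamic array in a column named `value`; the sensitivity value of 90 is purely illustrative (the function's default is 85).

```kusto
// Minimal usage sketch: apply the user-defined tabular function to a table of series.
// Assumes series_uv_anomalies_fl() is already defined (ad hoc or stored) and that
// table `ts` has a dynamic numeric array column named `value`.
ts
| invoke series_uv_anomalies_fl('value', 90)   // 90 = illustrative sensitivity; the default is 85
```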

## Where to use these new capabilities?

These two functions are available to use either in the Azure Data Explorer web UI or in the Kusto Explorer application.

![Screenshot of Azure Data Explorer and Kusto Explorer](../media/data-explorer/way-of-use.png)

## Create resources

1. [Create an Azure Data Explorer Cluster](https://portal.azure.com/#create/Microsoft.AzureKusto) in the Azure portal. After the resource is created successfully, go to the resource and create a database.
2. [Create an Anomaly Detector](https://portal.azure.com/#create/Microsoft.CognitiveServicesAnomalyDetector) resource in the Azure portal and make a note of the keys and endpoint that you'll need later.
3. Enable plugins in Azure Data Explorer:
    * These new functions have inline Python and require [enabling the python() plugin](/azure/data-explorer/kusto/query/pythonplugin#enable-the-plugin) on the cluster.
    * These new functions call the anomaly detection service endpoint and require:
        * Enabling the [http_request plugin / http_request_post plugin](/azure/data-explorer/kusto/query/http-request-plugin) on the cluster.
        * Modifying the [callout policy](/azure/data-explorer/kusto/management/calloutpolicy) for type `webapi` to allow access to the service endpoint (see the sketch after this list).
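
For the callout policy step, a management command along the following lines can open access for type `webapi`. This is only a sketch: `<your-resource-name>` is a placeholder, and you should adapt the URI regex to your actual Anomaly Detector endpoint as described in the callout policy article.

```kusto
// Sketch only: permit webapi callouts from the cluster to the Anomaly Detector endpoint.
// <your-resource-name> is a placeholder for your Anomaly Detector resource name.
.alter-merge cluster policy callout @'[{"CalloutType": "webapi", "CalloutUriRegex": "<your-resource-name>\\.cognitiveservices\\.azure\\.com/.*", "CanCall": true}]'
```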

## Download sample data

This tutorial uses the `request-data.csv` file, which you can download from our [GitHub sample data](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/anomalydetector/azure-ai-anomalydetector/samples/sample_data/request-data.csv).

You can also download the sample data by running:

```cmd
curl "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/main/sdk/anomalydetector/azure-ai-anomalydetector/samples/sample_data/request-data.csv" --output request-data.csv
```

Then ingest the sample data into Azure Data Explorer by following the [ingestion guide](/azure/data-explorer/ingest-sample-data?tabs=ingestion-wizard). Name the new table for the ingested data **univariate**.

Once ingested, your data should look as follows:

:::image type="content" source="../media/data-explorer/project.png" alt-text="Screenshot of Kusto query with sample data." lightbox="../media/data-explorer/project.png":::
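
As a quick sanity check before moving on, you can preview a few rows of the new table. This assumes the ingestion wizard kept the default column names `Column1` (timestamp) and `Column2` (value), which are the names used by the detection query later in this tutorial.

```kusto
// Sanity check: preview a few rows of the ingested sample data.
// Assumes default column names Column1 (timestamp) and Column2 (value).
univariate
| take 10
```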

## Detect anomalies in an entire time series

In Azure Data Explorer, run the following query to make an anomaly detection chart with your onboarded data. You can also [create a function](/azure/data-explorer/kusto/functions-library/series-uv-change-points-fl?tabs=persistent) to save the code as a stored function for persistent use.

```kusto
let series_uv_anomalies_fl=(tbl:(*), y_series:string, sensitivity:int=85, tsid:string='_tsid')
{
    let uri = '[Your-Endpoint]anomalydetector/v1.0/timeseries/entire/detect';
    let headers=dynamic({'Ocp-Apim-Subscription-Key': h'[Your-key]'});
    let kwargs = pack('y_series', y_series, 'sensitivity', sensitivity);
    let code = ```if 1:
        import json
        y_series = kargs["y_series"]
        sensitivity = kargs["sensitivity"]
        json_str = []
        for i in range(len(df)):
            row = df.iloc[i, :]
            ts = [{'value':row[y_series][j]} for j in range(len(row[y_series]))]
            json_data = {'series': ts, "sensitivity":sensitivity}     # auto-detect period, or we can force 'period': 84. We can also add 'maxAnomalyRatio':0.25 for maximum 25% anomalies
            json_str = json_str + [json.dumps(json_data)]
        result = df
        result['json_str'] = json_str
    ```;
    tbl
    | evaluate python(typeof(*, json_str:string), code, kwargs)
    | extend _tsid = column_ifexists(tsid, 1)
    | partition by _tsid (
       project json_str
       | evaluate http_request_post(uri, headers, dynamic(null))
       | project period=ResponseBody.period, baseline_ama=ResponseBody.expectedValues, ad_ama=series_add(0, ResponseBody.isAnomaly), pos_ad_ama=series_add(0, ResponseBody.isPositiveAnomaly)
       , neg_ad_ama=series_add(0, ResponseBody.isNegativeAnomaly), upper_ama=series_add(ResponseBody.expectedValues, ResponseBody.upperMargins), lower_ama=series_subtract(ResponseBody.expectedValues, ResponseBody.lowerMargins)
       | extend _tsid=toscalar(_tsid)
      )
}
;
let stime=datetime(2018-03-01);
let etime=datetime(2018-04-16);
let dt=1d;
let ts = univariate
| make-series value=avg(Column2) on Column1 from stime to etime step dt
| extend _tsid='TS1';
ts
| invoke series_uv_anomalies_fl('value')
| lookup ts on _tsid
| render anomalychart with(xcolumn=Column1, ycolumns=value, anomalycolumns=ad_ama)
```

After you run the code, you'll render a chart like this:

:::image type="content" source="../media/data-explorer/anomaly.png" alt-text="Screenshot of line chart of anomalies." lightbox="../media/data-explorer/anomaly.png":::
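
If you persist series_uv_anomalies_fl() as a stored function instead of defining it inline, the detection query reduces to the data-shaping pipeline plus `invoke`. The following is a sketch under that assumption, reusing the same table, columns, and time range as the query above.

```kusto
// Sketch only: call series_uv_anomalies_fl() as a stored function instead of defining it inline.
// Assumes the function was persisted to the database with the same signature as above.
univariate
| make-series value=avg(Column2) on Column1 from datetime(2018-03-01) to datetime(2018-04-16) step 1d
| extend _tsid='TS1'
| invoke series_uv_anomalies_fl('value')
```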

## Next steps

* [Best practices of Univariate Anomaly Detection](../concepts/anomaly-detection-best-practices.md)
