This repository was archived by the owner on Aug 16, 2022. It is now read-only.

Commit fab160e

Merge pull request #8 from opendistro/master
merge
2 parents 6ead123 + b90223e commit fab160e

4 files changed: +12 -8 lines changed

docs/ad/index.md

Lines changed: 9 additions & 6 deletions
@@ -58,27 +58,30 @@ You can add a maximum of five features for a detector.
 1. On the **Model configuration** page, enter the **Feature name**.
 1. For **Find anomalies based on**, choose the method to find anomalies. For **Field Value** menu, choose the **field** and the **aggregation method**. Or choose **Custom expression**, and add in your own JSON aggregation query.

-#### (Optional) Set a category field
+#### (Optional) Set a category field for high cardinality

 You can categorize anomalies based on a keyword or IP field type.

-If you specify a category in the same time series but sliced with a different dimension like IP addresses, product IDs, country codes, and so on, you’ll see a granular view of anomalies within each entity of that field. This helps to dive deeper into anomalies of a unique entity or ID and isolate and debug issues.
+The category field categorizes or slices the source time series with a dimension like IP addresses, product IDs, country codes, and so on. This helps to see a granular view of anomalies within each entity of the category field to isolate and debug issues.

 To set a category field, choose **Enable a category field** and select a field.

 Only a certain number of unique entities are supported in the category field. Use the following equation to calculate the recommended total number of entities number supported in a cluster:

 ```
-(JvmHeapSizeInMb / 20) * (DataNodesCount)
+(data nodes * heap size * anomaly detection maximum memory percentage) / (entity size of a detector)
 ```

-For example, for a cluster with 3 data nodes, each with 8G of JVM heap size, the total number of unique entities supported is (8096 / 20 ) * 3 = 1200.
+This formula doesn't take into account the query size limit.
+{: .note }
+
+For example, for a cluster with 3 data nodes, each with 8G of JVM heap size, a maximum memory percentage of 10% (default), and the entity size of the detector as 1MB: the total number of unique entities supported is (8.096 * 10^9 * 0.1 / 1M ) * 3 = 2429.

 #### Set a window size

 Set the number of aggregation intervals from your data stream to consider in a detection window. We recommend you choose this value based on your actual data to see which one leads to the best results for your use case.

-Based on experiments performed on a wide variety of one-dimensional data streams, we recommend using a window size between 1 and 16. The default window size is 8.
+Based on experiments performed on a wide variety of one-dimensional data streams, we recommend using a window size between 1 and 16. The default window size is 8. If you have set the category field for high cardinality, the default window size is 1.

 If you expect missing values in your data or if you want the anomalies based on the current interval, choose 1. If your data is continuously ingested and you want the anomalies based on multiple intervals, choose a larger window size.

@@ -113,7 +116,7 @@ If you see the detector pending in "initialization" for longer than a day, aggre

 Anomaly grade is a number between 0 and 1 that indicates the level of severity of how anomalous a data point is. An anomaly grade of 0 represents “not an anomaly,” and a non-zero value represents the relative severity of the anomaly. The confidence score is an estimate of the probability that the reported anomaly grade matches the expected anomaly grade. Confidence increases as the model observes more data and learns the data behavior and trends. Note that confidence is distinct from model accuracy.

-If you set the category field, you see an additional **Heat map** chart. The heat map correlates results for anomalous entities.
+If you set the category field, you see an additional **Heat map** chart. The heat map correlates results for anomalous entities. This chart is empty until you select an anomalous entity. You also see the anomaly and feature line chart for the time period of the anomaly (`anomaly_grade` > 0).

 Choose a filled rectangle to see a more detailed view of the anomaly.
 {: .note }
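
The entity-count arithmetic changed in docs/ad/index.md can be checked with a short sketch. The function and parameter names below are hypothetical and are not part of the plugin or its API. The inputs are the ones quoted in the diff: 3 data nodes, 8.096 * 10^9 bytes of heap per node, a 10% memory share, and 1 MB (taken as 10^6 bytes) per entity.

```python
def max_entities(data_nodes: int, heap_bytes: float,
                 ad_memory_fraction: float, entity_bytes: float) -> int:
    """Added formula: (data nodes * heap size * memory percentage) / entity size.

    All names here are illustrative assumptions, not plugin settings.
    """
    return round(data_nodes * heap_bytes * ad_memory_fraction / entity_bytes)


def max_entities_old(heap_mb: float, data_nodes: int) -> float:
    """Removed formula: (JvmHeapSizeInMb / 20) * (DataNodesCount)."""
    return heap_mb / 20 * data_nodes


if __name__ == "__main__":
    # Values taken from the example text in the added line.
    print(max_entities(3, 8.096e9, 0.10, 1e6))  # 2429
    # Values taken from the example text in the removed line.
    print(max_entities_old(8096, 3))
```

With these inputs the added formula reproduces the 2429 stated in the new text. The removed formula evaluates to roughly 1214, so the 1200 in the removed line appears to have been a rounded figure.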

docs/images/kibana-notebooks.gif

-10.2 MB
Binary file not shown.

docs/notebooks/index.md

Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,11 @@
 ---
 layout: default
-title: Notebook
+title: Notebooks
 nav_order: 38
 has_children: false
 ---

-# Kibana Notebook
+# Kibana Notebooks

 A Kibana notebook is an interface that lets you easily combine live visualizations and narrative text in a single notebook interface.

docs/ppl/index.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ has_toc: false
 # PPL

 Piped Processing Language (PPL) is a query language that makes it easier to query data stored in Elasticsearch as compared to the standard domain-specific language (DSL).
+PPL lets you use pipe (`|`) syntax to explore, discover, and query data stored in Elasticsearch.

 To quickly get up and running with PPL, use **Query Workbench** in Kibana. To learn more, see [Workbench](../sql/workbench/).
