This repository was archived by the owner on Aug 16, 2022. It is now read-only.
docs/ad/index.md
9 additions & 6 deletions
@@ -58,27 +58,30 @@ You can add a maximum of five features for a detector.
1. On the **Model configuration** page, enter the **Feature name**.
1. For **Find anomalies based on**, choose the method to find anomalies. For the **Field Value** menu, choose the **field** and the **aggregation method**. Or choose **Custom expression** and add your own JSON aggregation query.
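As a rough illustration of the **Custom expression** option, the sketch below builds a JSON aggregation query as a Python dictionary and prints it. The aggregation name `total_bytes` and the field `bytes` are hypothetical placeholders, and the single `sum` aggregation is an assumption about what a typical expression looks like, not a required form.

```python
import json

# Hypothetical custom expression: one named aggregation that sums a numeric field.
# "total_bytes" (aggregation name) and "bytes" (field name) are placeholders.
custom_expression = {
    "total_bytes": {
        "sum": {"field": "bytes"}
    }
}

# Serialize to the JSON text you would paste into the custom expression editor.
custom_expression_json = json.dumps(custom_expression, indent=2)
print(custom_expression_json)
```

Replace the placeholder names with a field and aggregation that exist in your own index.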
-
#### (Optional) Set a category field
+
#### (Optional) Set a category field for high cardinality
You can categorize anomalies based on a keyword or IP field type.
-
If you specify a category in the same time series but sliced with a different dimension like IP addresses, product IDs, country codes, and so on, you’ll see a granular view of anomalies within each entity of that field. This helps to dive deeper into anomalies of a unique entity or ID and isolate and debug issues.
+
The category field categorizes or slices the source time series with a dimension like IP addresses, product IDs, country codes, and so on. This gives you a granular view of anomalies within each entity of the category field, which helps you isolate and debug issues.
To set a category field, choose **Enable a category field** and select a field.
Only a certain number of unique entities are supported in the category field. Use the following equation to calculate the recommended total number of entities supported in a cluster:
```
-
(JvmHeapSizeInMb / 20) * (DataNodesCount)
+
(data nodes * heap size * anomaly detection maximum memory percentage) / (entity size of a detector)
```
-
For example, for a cluster with 3 data nodes, each with 8G of JVM heap size, the total number of unique entities supported is (8096 / 20 ) * 3 = 1200.
+
This formula doesn't take into account the query size limit.
+
{: .note }
+
For example, for a cluster with 3 data nodes, each with 8 GB of JVM heap size, a maximum memory percentage of 10% (default), and a detector entity size of 1 MB, the total number of unique entities supported is (8.096 * 10^9 * 0.1 / 1 MB) * 3 ≈ 2429.
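The entity formula and its worked example can be checked with a short calculation. This is a minimal sketch: the function and parameter names are hypothetical, the heap size uses the same 8.096 * 10^9 byte figure as the example, and rounding to the nearest integer is an assumption made to match the stated result.

```python
def max_supported_entities(
    data_nodes: int,
    heap_bytes: float,
    ad_memory_fraction: float,
    entity_bytes: float,
) -> int:
    """Apply (data nodes * heap size * max memory percentage) / entity size."""
    if entity_bytes <= 0:
        raise ValueError("entity_bytes must be positive")
    total_ad_memory = data_nodes * heap_bytes * ad_memory_fraction
    # Rounding to the nearest integer is an assumption chosen to match the example.
    return round(total_ad_memory / entity_bytes)


# 3 data nodes, 8.096e9 bytes of heap each, 10% memory limit, 1 MB per entity.
entities = max_supported_entities(3, 8.096e9, 0.10, 1e6)
print(entities)  # 2429
```

As the note says, this estimate does not account for the query size limit, so treat the result as an upper bound on memory grounds only.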
#### Set a window size
Set the number of aggregation intervals from your data stream to consider in a detection window. We recommend you choose this value based on your actual data to see which one leads to the best results for your use case.
-
Based on experiments performed on a wide variety of one-dimensional data streams, we recommend using a window size between 1 and 16. The default window size is 8.
+
Based on experiments performed on a wide variety of one-dimensional data streams, we recommend using a window size between 1 and 16. The default window size is 8. If you have set the category field for high cardinality, the default window size is 1.
If you expect missing values in your data or if you want the anomalies based on the current interval, choose 1. If your data is continuously ingested and you want the anomalies based on multiple intervals, choose a larger window size.
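To make the effect of the window size concrete, the sketch below groups a series of per-interval aggregate values into overlapping windows. It is an illustration only: the helper name `sliding_windows` and the sample values are hypothetical, and treating a detection window as a run of consecutive intervals is an assumption about how the window is formed, not a description of the plugin's internals.

```python
from typing import List, Sequence


def sliding_windows(values: Sequence[float], window_size: int) -> List[List[float]]:
    """Return every run of `window_size` consecutive interval values."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    last_start = len(values) - window_size
    return [list(values[i:i + window_size]) for i in range(last_start + 1)]


# Hypothetical aggregate values, one per detection interval.
per_interval = [10.0, 12.0, 11.0, 50.0]

print(sliding_windows(per_interval, 1))  # each interval is judged on its own
print(sliding_windows(per_interval, 3))  # each window spans three intervals
```

With a window size of 1, every interval yields a window by itself, which suits data with gaps. With a larger size, a window exists only where enough consecutive intervals are present.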
@@ -113,7 +116,7 @@ If you see the detector pending in "initialization" for longer than a day, aggre
Anomaly grade is a number between 0 and 1 that indicates how anomalous a data point is. An anomaly grade of 0 represents “not an anomaly,” and a non-zero value represents the relative severity of the anomaly. The confidence score is an estimate of the probability that the reported anomaly grade matches the expected anomaly grade. Confidence increases as the model observes more data and learns the data behavior and trends. Note that confidence is distinct from model accuracy.
-
If you set the category field, you see an additional **Heat map** chart. The heat map correlates results for anomalous entities.
+
If you set the category field, you see an additional **Heat map** chart. The heat map correlates results for anomalous entities. This chart is empty until you select an anomalous entity. You also see the anomaly and feature line chart for the time period of the anomaly (`anomaly_grade` > 0).
Choose a filled rectangle to see a more detailed view of the anomaly.
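The `anomaly_grade` > 0 condition above can be sketched as a simple filter over result records. The record layout here is hypothetical: only `anomaly_grade` is named in the text, while the `entity` and `confidence` keys and all sample values are assumptions for illustration.

```python
from typing import Dict, List

# Hypothetical per-entity results; only `anomaly_grade` is named in the docs.
results: List[Dict] = [
    {"entity": "10.0.0.1", "anomaly_grade": 0.0, "confidence": 0.91},
    {"entity": "10.0.0.2", "anomaly_grade": 0.7, "confidence": 0.88},
    {"entity": "10.0.0.3", "anomaly_grade": 0.2, "confidence": 0.95},
]


def anomalous_results(records: List[Dict]) -> List[Dict]:
    """Keep records with a non-zero anomaly grade, most severe first."""
    flagged = [r for r in records if r["anomaly_grade"] > 0]
    return sorted(flagged, key=lambda r: r["anomaly_grade"], reverse=True)


for record in anomalous_results(results):
    print(record["entity"], record["anomaly_grade"], record["confidence"])
```

Sorting by grade puts the most severe entities first, which mirrors choosing the darkest cells in a heat map.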
docs/ppl/index.md
1 addition & 0 deletions
@@ -9,6 +9,7 @@ has_toc: false
# PPL
Piped Processing Language (PPL) is a query language that makes it easier to query data stored in Elasticsearch as compared to the standard domain-specific language (DSL).
+
PPL lets you use pipe (`|`) syntax to explore, discover, and query data stored in Elasticsearch.
To quickly get up and running with PPL, use **Query Workbench** in Kibana. To learn more, see [Workbench](../sql/workbench/).
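The pipe syntax can be illustrated by composing a query from stages joined with `|`. This is a sketch under assumptions: the helper `build_ppl_query`, the index name `accounts`, and the field names are hypothetical, and wrapping the query in a JSON object with a `query` key is an assumed request shape that you should confirm against the PPL reference before use.

```python
import json


def build_ppl_query(index: str, *stages: str) -> str:
    """Join a source clause and any further commands with the pipe character."""
    return " | ".join([f"source={index}", *stages])


# Hypothetical index and field names.
query = build_ppl_query(
    "accounts",
    "where age > 18",
    "fields firstname, lastname",
)
print(query)

# Assumed request body shape: a JSON object with a single "query" key.
request_body = json.dumps({"query": query})
print(request_body)
```

Each stage receives the output of the stage before it, so the query reads left to right: pick a source, filter rows, then select fields.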