explore-analyze/alerts/kibana/rule-type-es-query.md (Lines changed: 1 addition & 1 deletion)
@@ -37,7 +37,7 @@ When you create an {{es}} query rule, your choice of query type affects the info

If you use [KQL](../../query-filter/languages/kql.md) or [Lucene](../../query-filter/languages/lucene-query-syntax.md), you must specify a data view, then define a text-based query. For example, `http.request.referrer: "https://example.com"`.

-If you use [ES|QL](../../query-filter/languages/esorql.md), you must provide a source command followed by an optional series of processing commands, separated by pipe characters (|). [8.16.0] For example:
+If you use [ES|QL](../../query-filter/languages/esql.md), you must provide a source command followed by an optional series of processing commands, separated by pipe characters (|). [8.16.0] For example:
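The changed line ends with "For example:", but the hunk cuts off before the documentation's own example. As an illustrative sketch only (the index pattern and field names are assumptions, not taken from the page), a query of this shape starts with a source command and pipes the result through processing commands, and can be run through the ES|QL query API in Console:

```console
POST /_query
{
  "query": """
    FROM my-logs-*                               // source command: read from an index pattern
    | WHERE http.response.status_code >= 400     // processing command: filter to error responses
    | STATS error_count = COUNT(*) BY host.name  // processing command: aggregate per host
    | SORT error_count DESC
    | LIMIT 10
  """
}
```

`FROM` is the source command; each `|` introduces a processing command.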
explore-analyze/discover/try-esql.md (Lines changed: 1 addition & 1 deletion)
@@ -10,7 +10,7 @@ The Elasticsearch Query Language, {{esql}}, makes it easier to explore your data
In this tutorial we’ll use the {{kib}} sample web logs in Discover and Lens to explore the data and create visualizations.

::::{tip}
-For the complete {{esql}} documentation, including tutorials, examples and the full syntax reference, refer to the [{{es}} documentation](../query-filter/languages/esorql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md).
+For the complete {{esql}} documentation, including tutorials, examples and the full syntax reference, refer to the [{{es}} documentation](../query-filter/languages/esql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md).
explore-analyze/geospatial-analysis.md (Lines changed: 1 addition & 1 deletion)
@@ -34,7 +34,7 @@ Data is often messy and incomplete. [Ingest pipelines](../manage-data/ingest/tra

## ES|QL [esql-query]

-[ES|QL](query-filter/languages/esorql.md) has support for [Geospatial Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points.
+[ES|QL](query-filter/languages/esql.md) has support for [Geospatial Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points.
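For the `ST_DISTANCE` function mentioned in the changed line, a minimal sketch might look like the following (the index and `location` field are hypothetical; the second argument is a WKT point in longitude/latitude order):

```console
POST /_query
{
  "query": """
    FROM my-geo-index                                                            // hypothetical index with a geo_point field
    | EVAL dist_m = ST_DISTANCE(location, TO_GEOPOINT("POINT(12.565 55.673)"))   // distance in meters to a fixed point
    | WHERE dist_m < 200000                                                      // keep documents within 200 km
    | SORT dist_m ASC
    | LIMIT 10
  """
}
```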
explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md (Lines changed: 2 additions & 11 deletions)
@@ -7,7 +7,6 @@ mapped_pages:

If your data includes geographic fields, you can use {{ml-features}} to detect anomalous behavior, such as a credit card transaction that occurs in an unusual location or a web request that has an unusual source location.

-
## Prerequisites [geographic-anomalies-prereqs]

To run this type of {{anomaly-job}}, you must have [{{ml-features}} set up](../setting-up-machine-learning.md). You must also have time series data that contains spatial data types. In particular, you must have:
@@ -21,7 +20,6 @@ The latitude and longitude must be in the range -180 to 180 and represent a poin

This example uses the sample eCommerce orders and sample web logs data sets. For more information, see [Add the sample data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana).

-
## Explore your geographic data [geographic-anomalies-visualize]

To get the best results from {{ml}} analytics, you must understand your data. You can use the **{{data-viz}}** in the **{{ml-app}}** app for this purpose. Search for specific fields or field types, such as geo-point fields in the sample data sets. You can see how many documents contain those fields within a specific time period and sample size. You can also see the number of distinct values, a list of example values, and preview them on a map. For example:
@@ -31,7 +29,6 @@ To get the best results from {{ml}} analytics, you must understand your data. Yo
:class: screenshot
:::

-
## Create an {{anomaly-job}} [geographic-anomalies-jobs]

There are a few limitations to consider before you create this type of job:
@@ -51,6 +48,7 @@ For example, create a job that analyzes the sample eCommerce orders data set to
:::

::::{dropdown} API example
+
```console
PUT _ml/anomaly_detectors/ecommerce-geo <1>
{
@@ -101,10 +99,8 @@ POST _ml/datafeeds/datafeed-ecommerce-geo/_start <4>
3. Open the job.
4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
::::

-
Alternatively, create a job that analyzes the sample web logs data set to detect events with unusual coordinates (`geo.coordinates` values) or unusually high sums of transferred data (`bytes` values):
@@ -113,6 +109,7 @@ Alternatively, create a job that analyzes the sample web logs data set to detect
:::

::::{dropdown} API example
+
```console
PUT _ml/anomaly_detectors/weblogs-geo <1>
{
@@ -167,11 +164,8 @@ POST _ml/datafeeds/datafeed-weblogs-geo/_start <4>
3. Open the job.
4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
::::

-
-
## Analyze the results [geographic-anomalies-results]

After the {{anomaly-jobs}} have processed some data, you can view the results in {{kib}}.
@@ -180,7 +174,6 @@ After the {{anomaly-jobs}} have processed some data, you can view the results in
If you used APIs to create the jobs and {{dfeeds}}, you cannot see them in {{kib}} until you follow the prompts to synchronize the necessary saved objects.
::::

-
When you select a period that contains an anomaly in the **Anomaly Explorer** swim lane results, you can see a map of the typical and actual coordinates. For example, in the eCommerce sample data there is a user with anomalous shopping behavior:
@@ -210,7 +203,6 @@ When you try this type of {{anomaly-job}} with your own data, it might take some

For more information about {{anomaly-detect}} concepts, see [Concepts](https://www.elastic.co/guide/en/machine-learning/current/ml-concepts.html). For the full list of functions that you can use in {{anomaly-jobs}}, see [*Function reference*](ml-functions.md). For more {{anomaly-detect}} examples, see [Examples](https://www.elastic.co/guide/en/machine-learning/current/anomaly-examples.html).

-
## Add anomaly layers to your maps [geographic-anomalies-map-layer]

To integrate the results from your {{anomaly-job}} in **Maps**, click **Add layer**, then select **ML Anomalies**. You must then select or create an {{anomaly-job}} that uses the `lat_long` function.
@@ -222,7 +214,6 @@ For example, you can extend the map example from [Build a map to compare metrics
:class: screenshot
:::

-
## What’s next [geographic-anomalies-next]

* [Learn more about **Maps**](../../visualize/maps.md)
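The API examples for both jobs are truncated after their opening lines in this diff. As a rough sketch of the kind of configuration the `lat_long` function expects (the job ID, bucket span, and field names below are placeholders, not the documentation's values):

```console
PUT _ml/anomaly_detectors/example-geo-job
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "Unusual location by user",
        "function": "lat_long",          // requires a geo_point field or a "lat,lon" string
        "field_name": "geoip.location",
        "by_field_name": "user"
      }
    ],
    "influencers": [ "user" ]
  },
  "data_description": {
    "time_field": "order_date"
  }
}
```

A job created this way still needs a {{dfeed}} pointed at the source index before it can be opened and started.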
explore-analyze/machine-learning/anomaly-detection/mapping-anomalies.md (Lines changed: 1 addition & 8 deletions)
@@ -7,14 +7,12 @@ mapped_pages:

If your data includes vector layers that are defined in the [{{ems}} ({{ems-init}})](../../visualize/maps/maps-connect-to-ems.md), your {{anomaly-jobs}} can generate a map of the anomalies by location.

-
## Prerequisites [mapping-anomalies-prereqs]

If you want to view choropleth maps in **{{data-viz}}** or {{anomaly-job}} results, you must have fields that contain valid vector layers (such as [country codes](https://maps.elastic.co/#file/world_countries) or [postal codes](https://maps.elastic.co/#file/usa_zip_codes)).

This example uses the sample web logs data set. For more information, see [Add the sample data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana).

-
## Explore your data [visualize-vector-layers]

If you have fields that contain valid vector layers, you can use the **{{data-viz}}** in the **{{ml-app}}** app to see a choropleth map, in which each area is colored based on its document count. For example:
@@ -24,7 +22,6 @@ If you have fields that contain valid vector layers, you can use the **{{data-vi
:class: screenshot
:::

-
## Create an {{anomaly-job}} [mapping-anomalies-jobs]

To create an {{anomaly-job}} in {{kib}}, click **Create job** on the **{{ml-cap}} > {{anomaly-detect-cap}}** page and select an appropriate job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html).
@@ -37,6 +34,7 @@ For example, use the multi-metric job wizard to create a job that analyzes the s
:::

::::{dropdown} API example
+
```console
PUT _ml/anomaly_detectors/weblogs-vectors <1>
{
@@ -87,11 +85,8 @@ POST _ml/datafeeds/datafeed-weblogs-vectors/_start <4>
3. Open the job.
4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
::::

-
-
## Analyze the results [mapping-anomalies-results]

After the {{anomaly-jobs}} have processed some data, you can view the results in {{kib}}.
@@ -100,15 +95,13 @@ After the {{anomaly-jobs}} have processed some data, you can view the results in
If you used APIs to create the jobs and {{dfeeds}}, you cannot see them in {{kib}} until you follow the prompts to synchronize the necessary saved objects.
:alt: A screenshot of the anomaly count by location in Anomaly Explorer
:class: screenshot
:::

The **Anomaly Explorer** contains a map, which is affected by your swim lane selections. It colors each location to reflect the number of anomalies in that selected time period. Locations that have few anomalies are indicated in blue; locations with many anomalies are red. Thus you can quickly see the locations that are generating the most anomalies. If your vector layers define regions, counties, or postal codes, you can zoom in for fine details.

-
## What’s next [mapping-anomalies-next]

* [Learn more about **Maps**](../../visualize/maps.md)
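Step 4 of each API example recommends an explicit end date when starting the {{dfeed}} against the sample data. A minimal sketch of that call (the {{dfeed}} ID and timestamps are placeholders):

```console
POST _ml/datafeeds/datafeed-example-geo-job/_start
{
  "start": "2024-01-01T00:00:00Z",   // optional start time
  "end": "2024-06-01T00:00:00Z"      // the datafeed stops when it reaches this time
}
```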
Use the information in this section to troubleshoot common problems and find answers for frequently asked questions.

-
## How to restart failed {{anomaly-jobs}} [ml-ad-restart-failed-jobs]

If an {{anomaly-job}} fails, try to restart the job by following the procedure described below. If the restarted job runs as expected, then the problem that caused the job to fail was transient and no further investigation is needed. If the job fails again quickly after the restart, the problem is persistent and needs further investigation. In this case, find out which node the failed job was running on by checking the job stats on the **Job management** pane in {{kib}}. Then get the logs for that node and look for exceptions and errors whose messages contain the ID of the {{anomaly-job}} to better understand the issue.
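The job stats mentioned here are also available over the API, which can be quicker when scripting the check. A sketch (the job ID is a placeholder):

```console
GET _ml/anomaly_detectors/my-failed-job/_stats
```

The response includes the job `state` and, while the job is assigned to a node, a `node` object identifying where it runs, which tells you whose logs to inspect.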
@@ -35,7 +31,6 @@ If an {{anomaly-job}} has failed, do the following to recover from `failed` stat

3. Restart the {{anomaly-job}} on the **Job management** pane in {{kib}}.

-
## What {{ml}} methods are used for {{anomaly-detect}}? [faq-methods]

For detailed information, refer to the paper [Anomaly Detection in Application Performance Monitoring Data](https://www.ijmlc.org/papers/398-LC018.pdf) by Thomas Veasey and Stephen Dodson, as well as our webinars on [The Math behind Elastic Machine Learning](https://www.elastic.co/elasticon/conf/2018/sf/the-math-behind-elastic-machine-learning) and [Machine Learning and Statistical Methods for Time Series Analysis](https://www.elastic.co/elasticon/conf/2017/sf/machine-learning-and-statistical-methods-for-time-series-analysis).
@@ -47,30 +42,25 @@ Further papers cited in the C++ code:
* [Large-Scale Bayesian Logistic Regression for Text Categorization](http://www.stat.columbia.edu/~madigan/PAPERS/techno.pdf)
* [X-means: Extending K-means with Efficient Estimation of the Number of Clusters](https://www.cs.cmu.edu/~dpelleg/download/xmeans.pdf)

-
## What are the input features used by the model? [faq-features]

All input features are specified by the user, for example, using [diverse statistical functions](https://www.elastic.co/guide/en/machine-learning/current/ml-functions.html) like count or mean over the data of interest.

-
## Does the data used by the model only include customers' data? [faq-data]

Yes. Only the data specified in the {{anomaly-job}} configuration are used for detection.

-
## What does the model output score represent? How is it generated and calibrated? [faq-output-score]

The ensemble model generates a probability value, which is then mapped to an anomaly severity score between 0 and 100. The lower the probability of the observed data, the higher the severity score. Refer to this [advanced concept doc](ml-ad-explain.md) for details. Calibration (also called normalization) happens on two levels:

1. Within the same metric/partition, the scores are re-normalized “back in time” within the window specified by the `renormalization_window_days` parameter. This is the reason, for example, that both `record_score` and `initial_record_score` exist.
2. Over multiple partitions, scores are renormalized as described in [this blog post](https://www.elastic.co/blog/changes-to-elastic-machine-learning-anomaly-scoring-in-6-5).

-
## Is the model static or updated periodically? [faq-model-update]

It’s an online model and updated continuously. Old parts of the model are pruned out based on the parameter `model_prune_window` (usually 30 days).

-
## Is the performance of the model monitored? [faq-model-performance]

There is a set of benchmarks to monitor the performance of the {{anomaly-detect}} algorithms and to ensure no regression occurs as the methods are continuously developed and refined. They are called "data scenarios" and consist of 3 things:
@@ -87,14 +77,12 @@ On the customer side, the situation is different. There is no conventional way t
* Use the forecasting feature to predict the development of the metric of interest in the future.
* Use one or a combination of multiple {{anomaly-jobs}} to identify the significant anomaly influencers.

-
## How to measure the accuracy of the unsupervised {{ml}} model? [faq-model-accuracy]

For each record in a given time series, anomaly detection models provide an anomaly severity score, 95% confidence intervals, and an actual value. This data is stored in an index and can be retrieved using the Get Records API. With this information, you can use standard measures to assess prediction accuracy, interval calibration, and so on. Elasticsearch aggregations can be used to compute these statistics.

The purpose of {{anomaly-detect}} is to achieve the best ranking of periods where an anomaly happened. A practical way to evaluate this is to keep track of real incidents and see how well they correlate with the predictions of {{anomaly-detect}}.
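As a sketch of pulling those scored records back out for offline evaluation (the job ID, score threshold, and time range are placeholders), the get records API accepts a query like:

```console
GET _ml/anomaly_detectors/my-job/results/records
{
  "record_score": 70,               // only return records scored at or above this value
  "sort": "record_score",
  "desc": true,
  "start": "2024-01-01T00:00:00Z",  // limit results to this time range
  "end": "2024-02-01T00:00:00Z"
}
```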

-
## Can the {{anomaly-detect}} model experience model drift? [faq-model-drift]

Elasticsearch’s {{anomaly-detect}} model continuously learns and adapts to changes in the time series. These changes can take the form of slow drifts as well as sudden jumps. Therefore, we take great care to manage the adaptation to changing data characteristics. There is always a fine trade-off between fitting anomalous periods (over-fitting) and not learning new normal behavior. The following are the main approaches Elastic uses to manage this trade-off:
@@ -105,7 +93,6 @@ Elasticsearch’s {{anomaly-detect}} model continuously learns and adapts to cha
* Running continuous hypothesis tests on time windows of various lengths to test for significant evidence of new or changed periodic patterns, and updating the model if the null hypothesis of unchanged features is rejected.
* Accumulating error statistics on calendar days and continuously testing whether predictive calendar features need to be added to or removed from the model.

-
## What is the minimum amount of data for an {{anomaly-job}}? [faq-minimum-data]

Elastic {{ml}} needs a minimum amount of data to be able to build an effective model for {{anomaly-detect}}.
@@ -120,7 +107,6 @@ Rules of thumb:
* more than three weeks for periodic data or a few hundred buckets for non-periodic data
* at least as much data as you want to forecast

-
## Are there any checks or processes to ensure data integrity? [faq-data-integrity]

The Elastic {{ml}} algorithms are programmed to work with missing and noisy data and use denoising and data imputation techniques based on the learned statistical properties.