Commit 0ed1526

Merge branch 'main' into cloud_account_refinement
2 parents 050bca0 + b75d5c5 commit 0ed1526

50 files changed: +435 -667 lines changed

explore-analyze/alerts/kibana/rule-type-es-query.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ When you create an {{es}} query rule, your choice of query type affects the info

 If you use [KQL](../../query-filter/languages/kql.md) or [Lucene](../../query-filter/languages/lucene-query-syntax.md), you must specify a data view then define a text-based query. For example, `http.request.referrer: "https://example.com"`.

-If you use [ES|QL](../../query-filter/languages/esorql.md), you must provide a source command followed by an optional series of processing commands, separated by pipe characters (|). [8.16.0] For example:
+If you use [ES|QL](../../query-filter/languages/esql.md), you must provide a source command followed by an optional series of processing commands, separated by pipe characters (|). [8.16.0] For example:

 ```sh
 FROM kibana_sample_data_logs

explore-analyze/discover/try-esql.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ The Elasticsearch Query Language, {{esql}}, makes it easier to explore your data
 In this tutorial we’ll use the {{kib}} sample web logs in Discover and Lens to explore the data and create visualizations.

 ::::{tip}
-For the complete {{esql}} documentation, including tutorials, examples and the full syntax reference, refer to the [{{es}} documentation](../query-filter/languages/esorql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md).
+For the complete {{esql}} documentation, including tutorials, examples and the full syntax reference, refer to the [{{es}} documentation](../query-filter/languages/esql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md).

 ::::

explore-analyze/geospatial-analysis.md

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ Data is often messy and incomplete. [Ingest pipelines](../manage-data/ingest/tra

 ## ES|QL [esql-query]

-[ES|QL](query-filter/languages/esorql.md) has support for [Geospatial Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points.
+[ES|QL](query-filter/languages/esql.md) has support for [Geospatial Search](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points.

 * [`ST_INTERSECTS`](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-st_intersects)
 * [`ST_DISJOINT`](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-functions-operators.html#esql-st_disjoint)

explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md

Lines changed: 2 additions & 11 deletions
@@ -7,7 +7,6 @@ mapped_pages:

 If your data includes geographic fields, you can use {{ml-features}} to detect anomalous behavior, such as a credit card transaction that occurs in an unusual location or a web request that has an unusual source location.

-
 ## Prerequisites [geographic-anomalies-prereqs]

 To run this type of {{anomaly-job}}, you must have [{{ml-features}} set up](../setting-up-machine-learning.md). You must also have time series data that contains spatial data types. In particular, you must have:
@@ -21,7 +20,6 @@ The latitude and longitude must be in the range -180 to 180 and represent a poin

 This example uses the sample eCommerce orders and sample web logs data sets. For more information, see [Add the sample data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana).

-
 ## Explore your geographic data [geographic-anomalies-visualize]

 To get the best results from {{ml}} analytics, you must understand your data. You can use the **{{data-viz}}** in the **{{ml-app}}** app for this purpose. Search for specific fields or field types, such as geo-point fields in the sample data sets. You can see how many documents contain those fields within a specific time period and sample size. You can also see the number of distinct values, a list of example values, and preview them on a map. For example:
@@ -31,7 +29,6 @@ To get the best results from {{ml}} analytics, you must understand your data. Yo
 :class: screenshot
 :::

-
 ## Create an {{anomaly-job}} [geographic-anomalies-jobs]

 There are a few limitations to consider before you create this type of job:
@@ -51,6 +48,7 @@ For example, create a job that analyzes the sample eCommerce orders data set to
 :::

 ::::{dropdown} API example
+
 ```console
 PUT _ml/anomaly_detectors/ecommerce-geo <1>
 {
@@ -101,10 +99,8 @@ POST _ml/datafeeds/datafeed-ecommerce-geo/_start <4>
 3. Open the job.
 4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
 ::::

-
 Alternatively, create a job that analyzes the sample web logs data set to detect events with unusual coordinates (`geo.coordinates` values) or unusually high sums of transferred data (`bytes` values):

 :::{image} ../../../images/machine-learning-weblogs-advanced-wizard-geopoint.jpg
@@ -113,6 +109,7 @@ Alternatively, create a job that analyzes the sample web logs data set to detect
 :::

 ::::{dropdown} API example
+
 ```console
 PUT _ml/anomaly_detectors/weblogs-geo <1>
 {
@@ -167,11 +164,8 @@ POST _ml/datafeeds/datafeed-weblogs-geo/_start <4>
 3. Open the job.
 4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
 ::::

-
-
 ## Analyze the results [geographic-anomalies-results]

 After the {{anomaly-jobs}} have processed some data, you can view the results in {{kib}}.
@@ -180,7 +174,6 @@ After the {{anomaly-jobs}} have processed some data, you can view the results in
 If you used APIs to create the jobs and {{dfeeds}}, you cannot see them in {{kib}} until you follow the prompts to synchronize the necessary saved objects.
 ::::

-
 When you select a period that contains an anomaly in the **Anomaly Explorer** swim lane results, you can see a map of the typical and actual coordinates. For example, in the eCommerce sample data there is a user with anomalous shopping behavior:

 :::{image} ../../../images/machine-learning-ecommerce-anomaly-explorer-geopoint.jpg
@@ -210,7 +203,6 @@ When you try this type of {{anomaly-job}} with your own data, it might take some

 For more information about {{anomaly-detect}} concepts, see [Concepts](https://www.elastic.co/guide/en/machine-learning/current/ml-concepts.html). For the full list of functions that you can use in {{anomaly-jobs}}, see [*Function reference*](ml-functions.md). For more {{anomaly-detect}} examples, see [Examples](https://www.elastic.co/guide/en/machine-learning/current/anomaly-examples.html).

-
 ## Add anomaly layers to your maps [geographic-anomalies-map-layer]

 To integrate the results from your {{anomaly-job}} in **Maps**, click **Add layer**, then select **ML Anomalies**. You must then select or create an {{anomaly-job}} that uses the `lat_long` function.
@@ -222,7 +214,6 @@ For example, you can extend the map example from [Build a map to compare metrics
 :class: screenshot
 :::

-
 ## What’s next [geographic-anomalies-next]

 * [Learn more about **Maps**](../../visualize/maps.md)
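
Step 4 of the API examples in this file recommends giving the {{dfeed}} an explicit end date, because the sample data can contain timestamps later than the current date. The following is a minimal Python sketch of how such a request could be assembled, assuming the start {{dfeed}} API accepts an `end` value in the request body. The helper name `build_start_datafeed_request` and the end date of 2030-01-01 are illustrative and do not come from this commit. The sketch only builds the request line and body; it does not send anything.

```python
import json
from datetime import datetime, timezone
from typing import Tuple


def build_start_datafeed_request(datafeed_id: str, end: datetime) -> Tuple[str, str]:
    """Return (request line, JSON body) for starting a datafeed with an explicit end time.

    Hypothetical helper: it assumes the `end` body parameter of the
    start datafeed API and formats the timestamp as ISO 8601 in UTC.
    """
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    end_utc = end.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    request_line = f"POST _ml/datafeeds/{datafeed_id}/_start"
    body = json.dumps({"end": end_utc})
    return request_line, body


# Example end date chosen for illustration only.
request_line, body = build_start_datafeed_request(
    "datafeed-ecommerce-geo",
    datetime(2030, 1, 1, tzinfo=timezone.utc),
)
print(request_line)
print(body)
```

Requiring a timezone-aware `datetime` avoids silently sending a local time that the cluster would interpret differently.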

explore-analyze/machine-learning/anomaly-detection/mapping-anomalies.md

Lines changed: 1 addition & 8 deletions
@@ -7,14 +7,12 @@ mapped_pages:

 If your data includes vector layers that are defined in the [{{ems}} ({{ems-init}})](../../visualize/maps/maps-connect-to-ems.md), your {{anomaly-jobs}} can generate a map of the anomalies by location.

-
 ## Prerequisites [mapping-anomalies-prereqs]

 If you want to view choropleth maps in **{{data-viz}}** or {{anomaly-job}} results, you must have fields that contain valid vector layers (such as [country codes](https://maps.elastic.co/#file/world_countries) or [postal codes](https://maps.elastic.co/#file/usa_zip_codes)).

 This example uses the sample web logs data set. For more information, see [Add the sample data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana).

-
 ## Explore your data [visualize-vector-layers]

 If you have fields that contain valid vector layers, you can use the **{{data-viz}}** in the **{{ml-app}}** app to see a choropleth map, in which each area is colored based on its document count. For example:
@@ -24,7 +22,6 @@ If you have fields that contain valid vector layers, you can use the **{{data-vi
 :class: screenshot
 :::

-
 ## Create an {{anomaly-job}} [mapping-anomalies-jobs]

 To create an {{anomaly-job}} in {{kib}}, click **Create job** on the **{{ml-cap}} > {{anomaly-detect-cap}}** page and select an appropriate job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html).
@@ -37,6 +34,7 @@ For example, use the multi-metric job wizard to create a job that analyzes the s
 :::

 ::::{dropdown} API example
+
 ```console
 PUT _ml/anomaly_detectors/weblogs-vectors <1>
 {
@@ -87,11 +85,8 @@ POST _ml/datafeeds/datafeed-weblogs-vectors/_start <4>
 3. Open the job.
 4. Start the {{dfeed}}. Since the sample data sets often contain timestamps that are later than the current date, it is a good idea to specify the appropriate end date for the {{dfeed}}.

-
 ::::

-
-
 ## Analyze the results [mapping-anomalies-results]

 After the {{anomaly-jobs}} have processed some data, you can view the results in {{kib}}.
@@ -100,15 +95,13 @@ After the {{anomaly-jobs}} have processed some data, you can view the results in
 If you used APIs to create the jobs and {{dfeeds}}, you cannot see them in {{kib}} until you follow the prompts to synchronize the necessary saved objects.
 ::::

-
 :::{image} ../../../images/machine-learning-weblogs-anomaly-explorer-vectors.png
 :alt: A screenshot of the anomaly count by location in Anomaly Explorer
 :class: screenshot
 :::

 The **Anomaly Explorer** contains a map, which is affected by your swim lane selections. It colors each location to reflect the number of anomalies in that selected time period. Locations that have few anomalies are indicated in blue; locations with many anomalies are red. Thus you can quickly see the locations that are generating the most anomalies. If your vector layers define regions, counties, or postal codes, you can zoom in for fine details.

-
 ## What’s next [mapping-anomalies-next]

 * [Learn more about **Maps**](../../visualize/maps.md)

explore-analyze/machine-learning/anomaly-detection/ml-ad-resources.md

Lines changed: 0 additions & 3 deletions
@@ -11,6 +11,3 @@ This section contains further resources for using {{anomaly-detect}}.
 * [*Function reference*](ml-functions.md)
 * [Supplied configurations](ootb-ml-jobs.md)
 * [Troubleshooting and FAQ](ml-ad-troubleshooting.md)
-
-
-

explore-analyze/machine-learning/anomaly-detection/ml-ad-troubleshooting.md

Lines changed: 0 additions & 14 deletions
@@ -4,14 +4,10 @@ mapped_pages:
 - https://www.elastic.co/guide/en/machine-learning/current/ml-ad-troubleshooting.html
 ---

-
-
 # Troubleshooting and FAQ [ml-ad-troubleshooting]

-
 Use the information in this section to troubleshoot common problems and find answers for frequently asked questions.

-
 ## How to restart failed {{anomaly-jobs}} [ml-ad-restart-failed-jobs]

 If an {{anomaly-job}} fails, try to restart the job by following the procedure described below. If the restarted job runs as expected, then the problem that caused the job to fail was transient and no further investigation is needed. If the job quickly fails after the restart, then the problem is persistent and needs further investigation. In this case, find out which node the failed job was running on by checking the job stats on the **Job management** pane in {{kib}}. Then get the logs for that node and look for exceptions and errors where the ID of the {{anomaly-job}} is in the message to have a better understanding of the issue.
@@ -35,7 +31,6 @@ If an {{anomaly-job}} has failed, do the following to recover from `failed` stat

 3. Restart the {{anomaly-job}} on the **Job management** pane in {{kib}}.

-
 ## What {{ml}} methods are used for {{anomaly-detect}}? [faq-methods]

 For detailed information, refer to the paper [Anomaly Detection in Application Performance Monitoring Data](https://www.ijmlc.org/papers/398-LC018.pdf) by Thomas Veasey and Stephen Dodson, as well as our webinars on [The Math behind Elastic Machine Learning](https://www.elastic.co/elasticon/conf/2018/sf/the-math-behind-elastic-machine-learning) and [Machine Learning and Statistical Methods for Time Series Analysis](https://www.elastic.co/elasticon/conf/2017/sf/machine-learning-and-statistical-methods-for-time-series-analysis).
@@ -47,30 +42,25 @@ Further papers cited in the C++ code:
 * [Large-Scale Bayesian Logistic Regression for Text Categorization](http://www.stat.columbia.edu/~madigan/PAPERS/techno.pdf)
 * [X-means: Extending K-means with Efficient Estimation of the Number of Clusters](https://www.cs.cmu.edu/~dpelleg/download/xmeans.pdf)

-
 ## What are the input features used by the model? [faq-features]

 All input features are specified by the user, for example, using [diverse statistical functions](https://www.elastic.co/guide/en/machine-learning/current/ml-functions.html) like count or mean over the data of interest.

-
 ## Does the data used by the model only include customers' data? [faq-data]

 Yes. Only the data specified in the {{anomaly-job}} configuration are used for detection.

-
 ## What does the model output score represent? How is it generated and calibrated? [faq-output-score]

 The ensemble model generates a probability value, which is then mapped to an anomaly severity score between 0 and 100. The lower the probability of observed data, the higher the severity score. Refer to this [advanced concept doc](ml-ad-explain.md) for details. Calibration (also called as normalization) happens on two levels:

 1. Within the same metric/partition, the scores are re-normalized “back in time” within the window specified by the `renormalization_window_days` parameter. This is the reason, for example, that both `record_score` and `initial_record_score` exist.
 2. Over multiple partitions, scores are renormalized as described in [this blog post](https://www.elastic.co/blog/changes-to-elastic-machine-learning-anomaly-scoring-in-6-5).

-
 ## Is the model static or updated periodically? [faq-model-update]

 It’s an online model and updated continuously. Old parts of the model are pruned out based on the parameter `model_prune_window` (usually 30 days).

-
 ## Is the performance of the model monitored? [faq-model-performance]

 There is a set of benchmarks to monitor the performance of the {{anomaly-detect}} algorithms and to ensure no regression occurs as the methods are continuously developed and refined. They are called "data scenarios" and consist of 3 things:
@@ -87,14 +77,12 @@ On the customer side, the situation is different. There is no conventional way t
 * Use the forecasting feature to predict the development of the metric of interest in the future.
 * Use one or a combination of multiple {{anomaly-jobs}} to identify the significant anomaly influencers.

-
 ## How to measure the accuracy of the unsupervised {{ml}} model? [faq-model-accuracy]

 For each record in a given time series, anomaly detection models provide an anomaly severity score, 95% confidence intervals, and an actual value. This data is stored in an index and can be retrieved using the Get Records API. With this information, you can use standard measures to assess prediction accuracy, interval calibration, and so on. Elasticsearch aggregations can be used to compute these statistics.

 The purpose of {{anomaly-detect}} is to achieve the best ranking of periods where an anomaly happened. A practical way to evaluate this is to keep track of real incidents and see how well they correlate with the predictions of {{anomaly-detect}}.

-
 ## Can the {{anomaly-detect}} model experience model drift? [faq-model-drift]

 Elasticsearch’s {{anomaly-detect}} model continuously learns and adapts to changes in the time series. These changes can take the form of slow drifts as well as sudden jumps. Therefore, we take great care to manage the adaptation to changing data characteristics. There is always a fine trade-off between fitting anomalous periods (over-fitting) and not learning new normal behavior. The following are the main approaches Elastic uses to manage this trade-off:
@@ -105,7 +93,6 @@ Elasticsearch’s {{anomaly-detect}} model continuously learns and adapts to cha
 * Running continuous hypothesis tests on time windows of various lengths to test for significant evidence of new or changed periodic patterns, and update the model if the null hypothesis of unchanged features is rejected.
 * Accumulating error statistics on calendar days and continuously test whether predictive calendar features need to be added or removed from the model.

-
 ## What is the minimum amount of data for an {{anomaly-job}}? [faq-minimum-data]

 Elastic {{ml}} needs a minimum amount of data to be able to build an effective model for {{anomaly-detect}}.
@@ -120,7 +107,6 @@ Rules of thumb:
 * more than three weeks for periodic data or a few hundred buckets for non-periodic data
 * at least as much data as you want to forecast

-
 ## Are there any checks or processes to ensure data integrity? [faq-data-integrity]

 The Elastic {{ml}} algorithms are programmed to work with missing and noisy data and use denoising and data reputation techniques based on the learned statistical properties.
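
The accuracy FAQ entry in this file notes that each record carries a severity score, a 95% confidence interval, and an actual value, and that these can be used to assess interval calibration. The following is a minimal Python sketch of one such measure: the empirical coverage, meaning the fraction of actual values that fall inside their interval. For a well-calibrated 95% interval, this should be close to 0.95 on enough data. The field names `actual`, `lower`, and `upper` are hypothetical placeholders rather than Get Records API response fields, and the sample values are made up.

```python
from typing import Dict, List


def interval_coverage(records: List[Dict[str, float]]) -> float:
    """Return the fraction of records whose actual value lies within [lower, upper].

    The keys `actual`, `lower`, and `upper` are assumed names for
    illustration; map them from whatever fields your results contain.
    """
    if not records:
        raise ValueError("records must not be empty")
    inside = sum(1 for r in records if r["lower"] <= r["actual"] <= r["upper"])
    return inside / len(records)


# Made-up sample: three of the four actual values fall inside their interval.
sample_records = [
    {"actual": 10.0, "lower": 8.0, "upper": 12.0},
    {"actual": 15.0, "lower": 9.0, "upper": 13.0},
    {"actual": 7.0, "lower": 7.0, "upper": 11.0},
    {"actual": 9.5, "lower": 8.0, "upper": 12.0},
]

coverage = interval_coverage(sample_records)
print(f"Empirical coverage: {coverage:.2f}")
```

The bounds are treated as inclusive here; that is a design choice of the sketch, not something the source specifies.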
