Adds population job how to guide #2784
Merged
Changes from 4 commits
5 commits
3991bf4 Adds population job how to guide. (szabosteve)
5304973 Modifies index file. (szabosteve)
466b624 Update docs/en/stack/ml/anomaly-detection/ml-population-analysis.asci… (szabosteve)
13eb16f Adds links to the job types page. (szabosteve)
0c5ac80 Addresses feedback. (szabosteve)
96 changes: 96 additions & 0 deletions
docs/en/stack/ml/anomaly-detection/ml-population-analysis.asciidoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
[[ml-configuring-populations]] | ||
= Performing population analysis | ||
|
||
Population analysis is a method of detecting anomalies by comparing the behavior of entities or events within a specified population. | ||
In this approach, {ml} analytics create a profile of what is considered "typical" behavior for users, machines, or other entities over a specified time period. | ||
An entity is considered anomalous when its behavior deviates from that of the rest of the population. | ||
|
||
This type of analysis is most effective when the behavior within a group is generally homogeneous, allowing for the identification of unusual patterns. | ||
However, it is less useful when members of the population show vastly different behaviors. | ||
In such cases, you can segment your data into groups with similar behaviors and run separate jobs for each. | ||
This can be done by using a query filter in the datafeed or by applying the `partition_field_name` to split the analysis across different groups. | ||
|
||
Population analysis is resource-efficient and scales well, enabling the analysis of populations consisting of hundreds of thousands or even millions of entities with a lower resource footprint than analyzing each series individually. | ||
|
||
|
||
|
||
[discrete] | ||
[[population-recommendations]] | ||
== Recommendations | ||
|
||
* Use population analysis when the behavior within a group is mostly homogeneous, as it helps identify anomalous patterns effectively. | ||
* Leverage population analysis when dealing with large-scale datasets. | ||
* Avoid using population analysis when members of the population exhibit vastly different behaviors, as it may not be effective. | ||
|
||
|
||
[discrete] | ||
[[creating-population-jobs]] | ||
== Creating population jobs | ||
|
||
. In {kib}, navigate to **{ml-app} > Anomaly Detection > Jobs**. | ||
. Click **Create {anomaly-jobs}**, then select the {data-source} you want to analyze. | ||
. Select the **Population** wizard from the list. | ||
. Choose a population field (`clientip` in this example) and the metric you want to use for the analysis (`Mean(bytes)` in this example). | ||
+ | ||
-- | ||
[role="screenshot"] | ||
image::images/ml-population-wizard.png[Creating a population job in Kibana] | ||
-- | ||
. Click **Next**. | ||
. Provide a job ID and click **Next**. | ||
. If the validation is successful, click **Next** to review the summary of the job creation. | ||
. Click **Create job**. | ||
|
||
[%collapsible] | ||
.API example | ||
==== | ||
To specify the population, use the `over_field_name` property. For example: | ||
|
||
[source,console] | ||
---------------------------------- | ||
PUT _ml/anomaly_detectors/population | ||
{ | ||
"description" : "Population analysis", | ||
"analysis_config" : { | ||
"bucket_span":"15m", | ||
"influencers": [ | ||
"clientip" | ||
], | ||
"detectors": [ | ||
{ | ||
"function": "mean", | ||
"field_name": "bytes", | ||
"over_field_name": "clientip" <1> | ||
} | ||
] | ||
}, | ||
"data_description" : { | ||
"time_field":"timestamp", | ||
"time_format": "epoch_ms" | ||
} | ||
} | ||
---------------------------------- | ||
// TEST[skip:needs-licence] | ||
|
||
<1> This `over_field_name` property indicates that the metrics for each client (as identified by their IP address) are analyzed relative to other clients in each bucket. | ||
==== | ||
|
||
[discrete] | ||
[[population-job-results]] | ||
=== Viewing the job results | ||
|
||
Use the **Anomaly Explorer** in {kib} to view the analysis results: | ||
|
||
[role="screenshot"] | ||
image::images/ml-population-anomalies.png["Population results in the Anomaly Explorer"] | ||
|
||
The results are often quite sparse. | ||
There might be just a few data points for the selected time period. | ||
Population analysis is particularly useful when you have many entities and the data for specific entities is sporadic or sparse. | ||
If you click on a section in the timeline or swim lanes, you can see more details about the anomalies: | ||
|
||
[role="screenshot"] | ||
image::images/ml-population-anomaly.png["Anomaly details for a specific user"] | ||
|
||
In this example, the client IP address `167.145.234.154` received a high volume of bytes on the date and time shown. | ||
This event is anomalous because the mean is four times higher than the expected behavior of the population. |
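The guide mentions that a heterogeneous population can be segmented by applying `partition_field_name`. As a rough sketch of what that might look like, the request body from the API example could be assembled in Python as shown below. The helper name `build_population_job` and the partition field `geo.src` are assumptions made for illustration and are not part of this PR; the remaining values are copied from the API example in the diff.

```python
import json
from typing import Optional


def build_population_job(over_field: str, partition_field: Optional[str] = None) -> dict:
    """Build a request body for PUT _ml/anomaly_detectors/<job_id>.

    Hypothetical helper: it mirrors the API example from the guide and
    optionally adds `partition_field_name` to split the analysis by group.
    """
    detector = {
        "function": "mean",
        "field_name": "bytes",
        # Each entity is compared against the other entities in the population.
        "over_field_name": over_field,
    }
    if partition_field is not None:
        # Assumption: the data contains this field and its values define
        # groups whose members behave similarly.
        detector["partition_field_name"] = partition_field

    return {
        "description": "Population analysis",
        "analysis_config": {
            "bucket_span": "15m",
            "influencers": [over_field],
            "detectors": [detector],
        },
        "data_description": {
            "time_field": "timestamp",
            "time_format": "epoch_ms",
        },
    }


if __name__ == "__main__":
    # `geo.src` is an illustrative field name, not taken from the guide.
    body = build_population_job("clientip", partition_field="geo.src")
    print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the `PUT _ml/anomaly_detectors/population` request. Omitting `partition_field` produces the same configuration as the API example in the guide.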