Commit 97a4a1e

DOCS-1093 - Minimum learning period (#5777)
* Minimum learning period
* Update article titles
* Updates from Paul Tobia review
1 parent 54e55d3 commit 97a4a1e

2 files changed

+12
-9
lines changed

docs/cse/rules/write-first-seen-rule.md

Lines changed: 8 additions & 7 deletions
@@ -1,6 +1,6 @@
11
---
22
id: write-first-seen-rule
3-
title: Write a First Seen rule
3+
title: Write a First Seen Rule
44
sidebar_label: First Seen Rule
55
description: First seen rules allow you to generate a signal when behavior by an entity (user) is encountered that hasn't been seen before.
66
keywords:
@@ -56,15 +56,17 @@ Watch this micro lesson to learn more about first seen rules.
5656

5757
## Baselines for first seen rules
5858

59-
A first seen rule is different from other Cloud SIEM rule types in that you don’t define the criteria for firing a signal. Instead, the rule expression in a first seen rule is simply a filter condition that defines what incoming records the rule will apply to. For each first seen rule, Cloud SIEM automatically creates a baseline model of normal behavior for a defined time period (by default for the last 90 days) evidenced by records that match the Rule Expression. The activity found during this period is considered normal behavior and will not be alerted on.
59+
A first seen rule is different from other Cloud SIEM rule types in that you don’t define the criteria for firing a signal. Instead, the rule expression in a first seen rule is simply a filter condition that defines what incoming records the rule will apply to. For each first seen rule, Cloud SIEM automatically creates a baseline model of normal behavior for a defined time period (by default using data from the last 90 days) evidenced by records that match the Rule Expression. The activity found during this period is considered normal behavior and will not be alerted on.
6060

61-
As soon as you save or update a first seen rule (or disable and re-enable it), the full baseline is built using existing data collected. If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete. If the records gathered for a baseline exceed 50 million, the historical baseline capabilities become inefficient and it’s better to let the baseline gather data over time. You will be notified of this state in the UI, and can either let the baseline gather over the days set in the baseline, or edit the rule to filter more records or reduce the baseline period to keep it under 50 million records.
61+
As soon as you save or update a first seen rule (or disable and re-enable it), the full baseline is built using existing data collected. A minimum of 7 days of baseline information needs to be available in order for a rule to be active and generating signals. (That is, events relevant to the baseline must be at least 7 days old before the baseline is considered complete.) If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete.
6262

6363
Once the baseline is created, when an incoming record includes matching activity not seen during the baseline retention period, the rule creates a signal identifying the activity as *first seen*. The signal indicates that the activity is first seen:
6464

6565
<img src={useBaseUrl('img/cse/first-seen-signal-example.png')} alt="First seen signal example" style={{border: '1px solid gray'}} width="600"/>
6666

67-
For example, for the “First time a user logged in from a new geographic location” use case, Cloud SIEM will build a baseline model of all the geolocations from where a logon event is seen for the entity (user). Once the baseline is created, Cloud SIEM will create a signal for every new geolocation detected and incrementally add to the baseline.
67+
For example, for the “First time a user logged in from a new geographic location” use case, Cloud SIEM will build a baseline model of all the geolocations from where a logon event is seen for the entity (user). Because a minimum of 7 days of baseline information needs to be available, activities within 7 days of the first recorded login to a new location will not generate signals, but the first login to a new location on the 8th day will generate a signal. Once the baseline is created, Cloud SIEM will create a signal for every new geolocation detected and incrementally add to the baseline.
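The interaction between the 7-day minimum learning period and incremental baseline growth can be sketched as follows. This is a minimal illustration of one reading of the text above, not Cloud SIEM's implementation: the `FirstSeenBaseline` class, its `observe` method, and the `MIN_LEARNING_PERIOD` constant are hypothetical names, and events are assumed to arrive in chronological order.

```python
from datetime import datetime, timedelta

# Minimum learning period described in the text (7 days).
MIN_LEARNING_PERIOD = timedelta(days=7)


class FirstSeenBaseline:
    """Hypothetical in-memory baseline; not Cloud SIEM's actual implementation."""

    def __init__(self):
        self.first_event_at = None  # timestamp of the earliest relevant event
        self.seen = set()           # values observed so far

    def observe(self, value, at):
        """Record a value and return True if it should raise a signal."""
        if self.first_event_at is None:
            self.first_event_at = at
        # The baseline counts as complete once its first event is 7 days old.
        learning_done = at - self.first_event_at >= MIN_LEARNING_PERIOD
        is_new = value not in self.seen
        # The baseline grows incrementally whether or not a signal fires.
        self.seen.add(value)
        return learning_done and is_new


start = datetime(2024, 1, 1)
baseline = FirstSeenBaseline()
print(baseline.observe("US", start))                      # learning period
print(baseline.observe("FR", start + timedelta(days=3)))  # still learning
print(baseline.observe("JP", start + timedelta(days=8)))  # new value after learning
print(baseline.observe("JP", start + timedelta(days=9)))  # already in the baseline
```

In this sketch only the third call returns `True`: the first two fall inside the learning period, and the fourth value is already part of the baseline.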
68+
69+
If the records gathered for a baseline exceed 50 million, the historical baseline capabilities to generate a baseline through a query become inefficient and it’s better to let the baseline gather data over time. You will be notified of this state in the UI, and can either let the baseline gather over the days set in the baseline, or edit the rule to filter more records or reduce the baseline period to keep it under 50 million records.
6870

6971
:::tip
7072
Sumo Logic ensures that rule processing does not impact the reliability of production environments through the implementation of "circuit breakers." If a rule matches too many records in too short a period of time, the circuit breaker will trip and the rule will move to a degraded state, and first seen rules are no exception.
@@ -149,12 +151,11 @@ with **has a new value for the field(s)** set to `srcDeviceIP_countryName`
149151

150152
### With a global baseline
151153

152-
With a global baseline, and the default baseline retention period of the last 90 days, the rule creates a baseline of all geolocations that users logged in from for the last 90 days. If a new geolocation is detected, a signal will be created. Then, if a new hire (that wasn’t part of the 90 day baseline) logs in from any geolocation, a signal
153-
will be created. As a global baseline, the 90 day baseline is shared across all entities.
154+
With a global baseline, and the default baseline retention period of the last 90 days, using the previous example the rule creates a baseline of all geolocations that users logged in from using data from the last 90 days. Once the first event of a new geolocation is detected, the 7-day minimum learning period begins. On the 8th day, a signal will be created. Then, if a new hire (that wasn’t part of the 90 day baseline) logs in from any geolocation, a signal will be created. As a global baseline, the 90 day baseline retention period is shared across all entities.
154155

155156
### With per-entity baselines
156157

157-
With a per-entity baseline, and the default baseline retention period of the last 90 days, the rule creates a baseline of all geolocations on a per-entity basis for the last 90 days. It will generate a signal when a new geolocation is not part of a user’s historic baseline. On a new hire’s first login, a baseline for the last 90 days will begin rebuilding. If that user logs on from a new geolocation, the rule will create a signal.
158+
With a per-entity baseline, and the default baseline retention period of the last 90 days, using the previous example the rule creates a baseline of all geolocations on a per-entity basis using data from the last 90 days. It will generate a signal after the minimum learning period of 7 days when a new geolocation is not part of a user’s historic baseline. On a new hire’s first login, a baseline for the last 90 days will begin rebuilding. If that user logs on from a new geolocation, the rule will create a signal.
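The difference between the two baseline types comes down to whether observed values are shared across entities or tracked per entity. The sketch below illustrates only that difference: the function names are hypothetical, and it deliberately ignores the 7-day minimum learning period and the 90-day retention window, so it shows which values would be candidates for a signal rather than which signals Cloud SIEM would actually create.

```python
from collections import defaultdict


def new_values_global(events):
    """Yield (entity, value) pairs whose value no entity has produced before."""
    seen = set()  # one baseline shared across all entities
    for entity, value in events:
        if value not in seen:
            yield entity, value
        seen.add(value)


def new_values_per_entity(events):
    """Yield (entity, value) pairs whose value is new for that entity."""
    seen = defaultdict(set)  # one baseline per entity
    for entity, value in events:
        if value not in seen[entity]:
            yield entity, value
        seen[entity].add(value)


# Hypothetical (user, country) login events.
events = [("alice", "US"), ("bob", "US"), ("alice", "FR"), ("bob", "FR")]
print(list(new_values_global(events)))
print(list(new_values_per_entity(events)))
```

With the global baseline, only the first login from each country is a candidate. With per-entity baselines, every user's first login from each country is a candidate.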
158159

159160
:::tip
160161
If you are unsure whether to use a per-entity or a global baseline, consider your use case. If you’re inclined to select `user_username` in the **Has a new value for the field(s)** prompt, you’re better off creating a global baseline for that behavior. Alternatively, if you want to track a new value for a non-entity record field, a per-entity baseline is appropriate.

docs/cse/rules/write-outlier-rule.md

Lines changed: 4 additions & 2 deletions
@@ -1,6 +1,6 @@
11
---
22
id: write-outlier-rule
3-
title: Write an Outlier rule
3+
title: Write an Outlier Rule
44
sidebar_label: Outlier Rule
55
description: Outlier rules allow you to generate a signal when behavior by an entity (such as a user) is encountered that qualifies as an outlier from expected behavior.
66
keywords:
@@ -61,14 +61,16 @@ Watch this micro lesson to learn more about outlier rules.
6161

6262
When you create the rule, you can set the amount of time Cloud SIEM analyzes data to create a baseline model of behavior, with the default period being for the last 90 days. You can set the rule to build data hourly or daily, depending on how frequently you believe events of interest will occur, and how much data you want to gather. In the rule, you set the model sensitivity threshold to calculate outlier activity based on the number of standard deviations from the mean (z‑score).
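A z-score check of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model Cloud SIEM uses: the `is_outlier` function and its `threshold` parameter are hypothetical, and the history is assumed to be a list of hourly or daily counts.

```python
from statistics import mean, pstdev


def is_outlier(history, value, threshold=3.0):
    """Return True if value lies more than `threshold` standard deviations
    from the mean of history (hypothetical helper, for illustration only)."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation
    if sigma == 0:
        return value != mu   # any change from a constant baseline stands out
    return abs(value - mu) / sigma > threshold


# Hypothetical daily counts of failed logins for one user.
failed_logins = [4, 5, 6, 5, 4, 6, 5, 5]
print(is_outlier(failed_logins, 20))  # far above the usual range
print(is_outlier(failed_logins, 6))   # within the usual range
```

Raising `threshold` makes the check less sensitive, so fewer values qualify as outliers. Lowering it has the opposite effect.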
6363

64-
As soon as you save or update an outlier rule (or disable and re-enable it), the full baseline is built using existing data collected. So if your baseline retention period is for the last 90 days (the default), the system uses data from the last 90 days to build the baseline. If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete. If the records gathered for a baseline exceed 50 million, the historical baseline capabilities become inefficient and it’s better to let the baseline gather data over time. You will be notified of this state in the UI, and can either let the baseline gather over the days set in the baseline, or edit the rule to filter more records or reduce the baseline period to keep it under 50 million records.
64+
As soon as you save or update an outlier rule (or disable and re-enable it), the full baseline is built using existing data collected. So if your baseline retention period is for the last 90 days (the default), the system uses data from the last 90 days to build the baseline. A minimum of 7 days of baseline information needs to be available in order for a rule to be active and generating signals. (That is, the first relevant event or data point must be at least 7 days old before the baseline is considered complete.) If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete.
6565

6666
Once the baseline is created, Cloud SIEM tracks aggregates of count, sum, min, max, and averages of record values, and creates a signal when deviations from the mean occur. For example, for the [spike in failed logins from a user](#use-case-for-a-spike-in-failed-logins-from-a-user) use case, Cloud SIEM builds a baseline model of counts of authentication failures that are associated with a user over time, and creates a signal when outlier behavior is detected:
6767

6868
<img src={useBaseUrl('img/cse/outlier-signal-example.png')} alt="Outlier signal example" style={{border: '1px solid gray'}} width="600"/>
6969

7070
After your rule starts generating signals, evaluate them to determine if they truly represent outliers of concern, and adjust the rule settings as needed. For example, if too many signals are being generated, the baseline model is too sensitive, and you need to set the model sensitivity threshold higher on the rule; if too few signals are generated, set the threshold lower. Among other things, also evaluate if the signals from outliers are generating enough insights. To [generate an insight](/docs/cse/get-started-with-cloud-siem/insight-generation-process/), by default the combined severity scores of signals need to exceed 12, or a custom insight can be used. Change the severity level in the outlier rule or create a custom insight to trigger insights based on this rule for investigation.
7171

72+
If the records gathered for a baseline exceed 50 million, the historical baseline capabilities to generate a baseline via a query become inefficient and it’s better to let the baseline gather data over time. You will be notified of this state in the UI, and can either let the baseline gather over the days set in the baseline, or edit the rule to filter more records or reduce the baseline period to keep it under 50 million records.
73+
7274
:::tip
7375
Sumo Logic ensures that rule processing does not impact the reliability of production environments through the implementation of "circuit breakers." If a rule matches too many records in too short a period of time, the circuit breaker will trip and the rule will move to a degraded state, and outlier rules are no exception.
7476
