Commit 217f338

Processing rule edits
1 parent 916f9ab commit 217f338

15 files changed: +98 -170 lines changed

docs/send-data/opentelemetry-collector/remote-management/processing-rules/include-and-exclude-rules.md

Lines changed: 13 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -5,25 +5,19 @@ sidebar_label: Include and Exclude Rules
55
description: Use include and exclude processing rules to specify what kind of data is sent to Sumo Logic using OpenTelemetry Collector.
66
---
77

8-
<head>
9-
<meta name="robots" content="noindex" />
10-
</head>
11-
12-
<p><a href="/docs/beta"><span className="beta">Beta</span></a></p>
13-
148
import useBaseUrl from '@docusaurus/useBaseUrl';
159

16-
You can use include and exclude processing rules to specify what data is sent to Sumo Logic using OpenTelemetry Collector. Internally these will use [filter processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor) to get the data filtered.
10+
You can use include and exclude processing rules to define which data is sent to Sumo Logic using the OpenTelemetry Collector. These rules internally utilize the [filter processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor) to filter the data.
1711

18-
* An exclude rule functions as a denylist filter where the matching data is not sent to Sumo Logic.
19-
* An include rule functions as an allowlist filter where only matching data is sent to Sumo Logic.
12+
* An exclude rule functions as a denylist filter, ensuring that matching data is not sent to Sumo Logic.
13+
* An include rule functions as an allowlist filter, ensuring that only matching data is sent to Sumo Logic.
2014

21-
As a best practice, specify these rules to match the lesser volume of data.
15+
As a best practice, configure these rules to filter the smaller volume of data for optimal performance:
2216

23-
* If you want to **collect the majority of data** from a source template, provide **exclude** rules to match (filter out) the lesser volume of data.
24-
* If you want to **collect a small set of data** from a source template, provide **include** rules to match (filter in) the lesser volume of data.
17+
* If you want to **collect the majority of data** from a source template, use **exclude** rules to match (filter out) the lesser volume of data.
18+
* If you want to **collect a small set of data** from a source template, use **include** rules to match (filter in) the lesser volume of data.
2519

26-
For example, to include only messages coming from a Windows Event log with ID `8015`, you can add a Logs Filter to the source template and select the **Type** of the filter as "Include message that match", and can use the following filter regular expression:
20+
For example, to include only messages from a Windows Event log with ID `8015`, you can add a Logs Filter to the source template. Select the **Type** of the filter as "Include messages that match" and use the following filter regular expression:
2721

2822
```
2923
.*"id":8015.*
@@ -33,10 +27,10 @@ For example, to include only messages coming from a Windows Event log with ID `8
3327

3428
## Rules and limitations
3529

36-
When writing regular expression rules, you must follow these rules:
30+
When creating regular expression rules, adhere to the following guidelines:
3731

38-
* Your rule must be [RE2 compliant](https://github.com/google/re2/wiki/Syntax).
39-
* If your rule matches *only a section* of the log line, the full log line will be matched.
40-
* For *single line messages*, it is not mandatory to prefix and suffix the regex expression with `.\*`.
41-
* Exclude rules take priority over include rules. Include rules are processed first. However, if an exclude rule matches data that matched the include rule filter, the data is excluded.
42-
* If two or more rules are listed, the assumed Boolean operator is `OR`.
32+
- Your rule must comply with [RE2 syntax](https://github.com/google/re2/wiki/Syntax).
33+
- If your rule matches *any part* of a log line, the entire log line will be matched.
34+
- For *single-line messages*, it is not necessary to prefix or suffix the regex with `.*`.
35+
- *Exclude rules* take precedence over *include rules*. Include rules are processed first, but if an exclude rule matches data that also matches the include rule, the data will be excluded.
36+
- When multiple rules are listed, the assumed Boolean operator is `OR`.
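
The rules in this file can be illustrated with a minimal Python sketch. It is not the collector's implementation: `should_send` is a hypothetical helper, Python's `re` module stands in for RE2 (the patterns used here fall in the subset both engines share), and treating an empty include list as "include everything" is an assumption.

```python
import re


def should_send(line, include_patterns=(), exclude_patterns=()):
    """Hypothetical helper: decide whether a log line would be forwarded."""
    # Several rules of one type combine with OR. re.search is used because a
    # rule that matches any part of the line matches the whole line.
    included = (not include_patterns) or any(
        re.search(p, line) for p in include_patterns
    )
    excluded = any(re.search(p, line) for p in exclude_patterns)
    # Exclude rules take precedence over include rules.
    return included and not excluded


# Include only Windows events with ID 8015; the leading and trailing ".*"
# are optional for single-line messages.
kept = should_send('{"id":8015,"level":"Information"}',
                   include_patterns=[r'.*"id":8015.*'])
dropped = should_send('{"id":4798,"level":"Information"}',
                      include_patterns=[r'"id":8015'])
# An exclude rule wins even when an include rule also matches.
overridden = should_send('{"id":8015,"level":"Debug"}',
                         include_patterns=[r'"id":8015'],
                         exclude_patterns=[r'"level":"Debug"'])
```

Here `kept` is `True`, while `dropped` and `overridden` are both `False`.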

docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules-windows.md

Lines changed: 23 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -1,32 +1,37 @@
11
---
22
id: mask-rules-windows
33
title: Mask Rules for the Windows Source Template
4-
sidebar_label: Mask Rules for Windows
4+
sidebar_label: Mask Rules - Windows Source Template
55
description: Create a mask rule to replace an expression with a mask string.
66
---
7-
<head>
8-
<meta name="robots" content="noindex" />
9-
</head>
10-
11-
<p><a href="/docs/beta"><span className="beta">Beta</span></a></p>
127

138
:::note
14-
This document only support masking logs for Windows source template. Refer to [Mask Rules](mask-rules.md) to mask logs for other source template.
9+
This document supports masking logs specifically for our [Windows source template](/docs/send-data/opentelemetry-collector/remote-management/source-templates/windows). For other source templates, refer to [Mask Rules](mask-rules.md).
1510
:::
1611

17-
A mask rule is a type of processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule, the selected key will have its value matched against a regex pattern, which will then be replaced with a mask string before being sent to Sumo Logic. You can provide a custom mask string or use the default string, `"#####"`.
12+
A mask rule is a type of processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule:
13+
14+
* The selected key’s value is matched against a regular expression (regex).
15+
* The matching portion is replaced with a mask string before being sent to Sumo Logic.
16+
* You can provide a custom mask string or use the default mask string: `"#####"`.
17+
18+
Masking is an effective method for reducing overall ingestion volume. Ingestion volume is calculated after applying the mask filter. If masking reduces the log size, the smaller size will be considered against the ingestion limits.
1819

19-
Ingestion volume is calculated after applying the mask filter. If masking reduces the log size, the smaller size will be considered against the ingestion limits. Masking is an effective method for reducing overall ingestion volume.
20+
## Masking inputs
2021

2122
To mask specific fields in the Windows Event Log, the following inputs are required:
22-
- **Key**. This should point to the key in the Windows Event Log for which the value needs to be masked. This key can be nested, with each level separated by a dot(.). For example, `provider.guid`.
23-
- **Regex**. This identifies the part of the string value that needs to be masked.
24-
- ** Replacement **. This is to get the string that will be substituted in place of the string that was selected through the regex expression.
23+
- **Key**. This should point to the key in the Windows Event Log for which the value needs to be masked. This key can be nested, with each level separated by a dot (`.`). For example, `provider.guid`.
24+
- **Regex**. This pattern identifies the part of the string value that needs to be masked.
25+
- **Replacement**. The string to substitute for the matching portion identified by the regex.
2526

2627
:::important
2728
Any masking expression should be tested and verified with a sample source file before applying it to your production logs.
2829
:::
2930

31+
## Examples
32+
33+
### Masking numbers in a nested field
34+
3035
For example, to mask numbers inside `guid` under `provider` field from this log:
3136

3237
```
@@ -89,8 +94,7 @@ You could use the following masking expression input:
8994

9095
Using the above masking options would provide the following result:
9196

92-
```
93-
{
97+
`{
9498
"record_id": 163054,
9599
"channel": "Security",
96100
"event_data": {
@@ -139,17 +143,16 @@ Using the above masking options would provide the following result:
139143
"id": 4798
140144
},
141145
"level": "Information"
142-
}
143-
```
146+
}`
144147

145148
:::note
146-
- For masking, we use the [replace_pattern](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md#replace_pattern) OTTL function. In this function:
147-
- $ must be escaped as $$ to bypass environment variable substitution logic.
148-
- To input a literal $, use $$$.
149+
- Masking utilizes the [replace_pattern](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md#replace_pattern) OTTL function. In this function:
150+
- Escape `$` as `$$` to bypass environment variable substitution logic.
151+
- Use `$$$` to include a literal `$`.
149152
- When masking strings containing special characters like double quotes (`"`) and backslashes (`\`), these characters will be escaped by a backslash when masking the logs.
150153
:::
151154

152155
## Limitations
153156

154157
- You can *only* mask the data which is a string in the Windows event log JSON.
155-
- You cannot mask a value which is nested inside any array.
158+
- You cannot mask a value that is nested inside any array.
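
As a rough illustration of the three inputs (key, regex, replacement) and the two limitations, the following Python sketch masks a value at a dot-separated key. `mask_nested`, the sample record, and the `\d+` pattern are hypothetical and chosen for illustration only; the collector itself uses the OTTL `replace_pattern` function, not this code.

```python
import re


def mask_nested(record, key, pattern, replacement="#####"):
    """Hypothetical helper: mask the string value found at a dotted key."""
    *parents, leaf = key.split(".")
    node = record
    for part in parents:
        # Only descend through objects; values inside arrays are not reachable.
        node = node.get(part) if isinstance(node, dict) else None
        if node is None:
            return record  # key path not present, nothing to mask
    # Only string values are masked; numbers and other types are left alone.
    if isinstance(node, dict) and isinstance(node.get(leaf), str):
        node[leaf] = re.sub(pattern, replacement, node[leaf])
    return record


# Illustrative record, not taken from a real Windows event.
event = {
    "provider": {"guid": "abc-123-def"},
    "event_id": {"id": 4798},
}
mask_nested(event, "provider.guid", r"\d+")  # string value: masked
mask_nested(event, "event_id.id", r"\d+")    # numeric value: unchanged
```

After these calls, `event["provider"]["guid"]` is `"abc-#####-def"` and `event["event_id"]["id"]` is still `4798`.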

docs/send-data/opentelemetry-collector/remote-management/processing-rules/mask-rules.md

Lines changed: 23 additions & 32 deletions
@@ -3,35 +3,28 @@ id: mask-rules
33
title: Mask Rules
44
description: Create a mask rule to replace an expression with a mask string.
55
---
6-
<head>
7-
<meta name="robots" content="noindex" />
8-
</head>
9-
10-
<p><a href="/docs/beta"><span className="beta">Beta</span></a></p>
116

127
:::note
13-
This document do not support masking logs for Windows source template. Refer to [Mask Rules for Windows Source Template](mask-rules-windows.md) to mask logs for Windows source template.
8+
This document does not cover masking logs for Windows source templates. For details on masking logs for Windows, refer to [Mask Rules for the Windows Source Template](mask-rules-windows.md).
149
:::
1510

16-
A mask rule is a type of processing rule that hides irrelevant or sensitive information from logs before ingestion. When you create a mask rule, whatever expression you choose to mask will be replaced with a mask string before it is sent to Sumo Logic. You can provide a mask string, or use the default `"#####"`.
11+
A mask rule is a processing rule that hides irrelevant or sensitive information from logs before they are ingested. When you create a mask rule, the selected expression will be replaced with a mask string before the data is sent to Sumo Logic. You can either specify a custom mask string or use the default `"#####"`.
1712

18-
Ingestion volume is calculated after applying the mask filter. If the mask reduces the size of the log, the smaller size will be measured against ingestion limits. Masking is a good method for reducing overall ingestion volume.
13+
Ingestion volume is calculated after applying the mask filter. If the mask reduces the size of the log, the smaller size will be measured against ingestion limits. Masking is an effective method to reduce overall ingestion volume.
1914

2015
For example, to mask the email address `[email protected]` from this log:
2116

22-
```
23-
2018-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [auth=User:[email protected]] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
24-
```
17+
`2018-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [auth=User:[email protected]] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]`
2518

2619
You could use the following filter expression:
2720

28-
```
21+
```sh
2922
auth=User:.*\.com
3023
```
3124

32-
Using the masking string `auth=User:AAA` would provide the following result:
25+
Using the masking string `auth=User:AAA` would produce the following result:
3326

34-
```
27+
```sh
3528
2018-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8] [module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [auth=User:AAA] [remote_ip=98.248.40.103] [web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
3629
```
3730

@@ -41,50 +34,46 @@ Using the masking string `auth=User:AAA` would provide the following result:
4134

4235
For example, this log message:
4336

44-
```
45-
{
37+
`{
4638
"reqHdr":{
4739
"auth":"Basic ksoe9wudkej2lfj*jshd6sl.cmei=",
4840
"cookie":"$Version=0; JSESSIONID=6C1BR5DAB897346B70FD2CA7SD4639.localhost_bc; $Path=/"
4941
}
50-
}
51-
```
42+
}`
5243

5344
You would use the following as a mask expression to mask the auth parameter's token:
5445

5546
```
5647
"auth"\s*:\s*"Basic\s*[^"]+"
5748
```
5849
59-
If the masking string given here is `"auth":"#####"`, then the log output will be:
50+
Applying the masking string `"auth":"#####"`, the log output will be:
6051
61-
```
62-
{
52+
`{
6353
"reqHdr": {
6454
"auth":"#####",
6555
"cookie":"$Version=0; JSESSIONID=6C1BR5DAB897346B70FD2CA7SD4639.localhost_bc; $Path=/"
6656
}
67-
}
68-
```
57+
}`
6958
7059
* Do not unnecessarily match on more of the log than needed. As seen in the previous example, avoid using overly broad expressions that could mask the entire log. This ensures that only the sensitive information is masked, not the whole log entry.
7160
7261
```
7362
(?s).*auth"\s*:\s*"Basic\s*([^"]+)".*(?s)
7463
```
7564
76-
* Make sure you do not specify a regular expression that matches a full log line. Doing so will result in the entire log line being masked.
65+
* Avoid regular expressions that match an entire log line, as this will result in the entire line being masked.
7766
78-
* If you need to mask values on multiple lines, use single-line modifiers (?s). For example:
67+
* To mask values spanning multiple lines, use the single-line modifier `(?s)`. For example:
7968
8069
```
8170
auth=User\:(.*(?s).*session=.*?)\]
8271
```
8372
8473
:::note
85-
- For masking, we use the [replace_pattern](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md#replace_pattern) OTTL function. In this function:
86-
- $ must be escaped as $$ to bypass environment variable substitution logic.
87-
- To input a literal $, use $$$.
74+
- Masking utilizes the [replace_pattern](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/ottl/ottlfuncs/README.md#replace_pattern) OTTL function. In this function:
75+
- Escape `$` as `$$` to bypass environment variable substitution logic.
76+
- Use `$$$` to include a literal `$`.
8877
- When masking strings containing special characters like double quotes (`"`) and backslashes (`\`), these characters will be escaped by a backslash when masking the logs.
8978
:::
9079
@@ -98,6 +87,8 @@ Any masking expression should be tested and verified with a sample source file b
9887
9988
You can mask credit card numbers from log messages using a regular expression within a mask rule. Once masked with a known string, you can then perform a search for that string within your logs to detect if credit card numbers may be leaking into your log files.
10089
90+
To mask credit card numbers in logs, you can use a masking filter with the following regular expression:
91+
10192
The following regular expression can be used within a masking filter to mask American Express, Visa (16 digit only), Mastercard, and Discover credit card numbers:
10293
10394
```
@@ -108,7 +99,7 @@ This regular expression covers instances where the number includes dashes, space
10899
109100
Samples include:
110101
111-
* **American Express:** 3711-078176-01234  \|  371107817601234  \|  3711 078176 01234
112-
* **Visa:** 4123-5123-6123-7123  \|  4123512361237123  \|  4123 5123 6123 7123
113-
* **Master Card:** 5123-4123-6123-7123  \|  5123412361237123  \|  5123 4123 6123 7123
114-
* **Discover:** 6011-0009-9013-9424  \|  6500000000000002  \|  6011 0009 9013 9424
102+
* **American Express**. 3711-078176-01234  \|  371107817601234  \|  3711 078176 01234
103+
* **Visa**. 4123-5123-6123-7123  \|  4123512361237123  \|  4123 5123 6123 7123
104+
* **Master Card**. 5123-4123-6123-7123  \|  5123412361237123  \|  5123 4123 6123 7123
105+
* **Discover**. 6011-0009-9013-9424  \|  6500000000000002  \|  6011 0009 9013 9424
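
The masking behavior described in this file can be tried out with a short Python sketch before testing a rule against real data. `apply_mask` is a hypothetical helper, the email address `user@example.com` and the shortened log lines are stand-ins, and Python's `re` module is only an approximation of the RE2 engine behind `replace_pattern`.

```python
import re


def apply_mask(message, pattern, mask="#####"):
    """Hypothetical helper: replace every match of pattern with the mask."""
    # A callable replacement inserts the mask literally, so characters such
    # as backslashes in the mask are not interpreted by re.sub.
    return re.sub(pattern, lambda _match: mask, message)


# Mask an email address in a bracketed log line (shortened, illustrative).
line = "[module=RAW] [auth=User:user@example.com] [remote_ip=98.248.40.103]"
masked_line = apply_mask(line, r"auth=User:.*\.com", "auth=User:AAA")

# Mask only the token of the auth header in a JSON message.
header = '{"reqHdr":{"auth":"Basic ksoe9wudkej2lfj*jshd6sl.cmei=","cookie":"$Version=0"}}'
auth_rule = r'"auth"\s*:\s*"Basic\s*[^"]+"'
masked_header = apply_mask(header, auth_rule, '"auth":"#####"')

# An overly broad expression consumes the whole message, so the entire
# log line is replaced by the mask string.
too_broad = apply_mask(header, r"(?s).*" + auth_rule + r".*", '"auth":"#####"')
```

In this sketch `masked_line` keeps everything except the address, `masked_header` keeps the cookie intact, and `too_broad` collapses to just `"auth":"#####"`, which is the outcome the guidance above warns against.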
