Commit f73dbef

Merge pull request #1458 from fluent/lynettemiles/sc-108185/update-pipeline-filters-grep-fluent-bit-doc
fluent: docs: update grep for style
2 parents c0eefc0 + e6cfeec commit f73dbef

1 file changed

+66
-34
lines changed

pipeline/filters/grep.md

Lines changed: 66 additions & 34 deletions
@@ -1,28 +1,33 @@
11
---
2-
description: Select or exclude records per patterns
2+
description: Select or exclude records using patterns
33
---
44

55
# Grep
66

7-
The _Grep Filter_ plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.
7+
The _Grep Filter_ plugin lets you match or exclude specific records based on
8+
regular expression patterns for values or nested values.
89

9-
## Configuration Parameters
10+
## Configuration parameters
1011

1112
The plugin supports the following configuration parameters:
1213

13-
| Key | Value Format | Description |
14-
| :--- | :--- | :--- |
15-
| Regex | KEY REGEX | Keep records in which the content of KEY matches the regular expression. |
16-
| Exclude | KEY REGEX | Exclude records in which the content of KEY matches the regular expression. |
17-
| Logical_Op| Operation | Specify which logical operator to use. `AND` , `OR` and `legacy` are allowed as an Operation. Default is `legacy` for backward compatibility. In `legacy` mode the behaviour is either AND or OR depending whether the `grep` is including (uses AND) or excluding (uses OR). Only available from 2.1+. |
14+
| Key | Value Format | Description |
15+
| ------------ | ------------ | ----------- |
16+
| `Regex` | KEY REGEX | Keep records where the content of KEY matches the regular expression. |
17+
| `Exclude` | KEY REGEX | Exclude records where the content of KEY matches the regular expression. |
18+
| `Logical_Op` | Operation | Specify a logical operator: `AND`, `OR` or `legacy` (default). In `legacy` mode the behaviour is either `AND` or `OR` depending on whether the `grep` is including (uses AND) or excluding (uses OR). Available from 2.1 or higher. |
1819

19-
#### Record Accessor Enabled
20+
### Record Accessor Enabled
2021

21-
This plugin enables the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md) feature to specify the KEY. Using the _record accessor_ is suggested if you want to match values against nested values.
22+
Enable the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
23+
feature to specify the KEY. Use the record accessor to match values against nested
24+
values.
2225

23-
## Getting Started
26+
## Filter records
2427

25-
In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called `lines.txt` with the following content:
28+
To start filtering records, run the filter from the command line or through the
29+
configuration file. The following example assumes that you have a file named
30+
`lines.txt` with the following content:
2631

2732
```text
2833
{"log": "aaa"}
@@ -35,20 +40,25 @@ In order to start filtering records, you can run the filter from the command lin
3540
{"log": "ggg"}
3641
```
3742

38-
### Command Line
43+
### Command line
3944

40-
> Note: using the command line mode need special attention to quote the regular expressions properly. It's suggested to use a configuration file.
45+
When using the command line, pay close attention to quote the regular expressions.
46+
Using a configuration file might be easier.
4147

42-
The following command will load the _tail_ plugin and read the content of `lines.txt` file. Then the _grep_ filter will apply a regular expression rule over the _log_ field \(created by tail plugin\) and only _pass_ the records which field value starts with _aa_:
48+
The following command loads the [tail](../../pipeline/inputs/tail) plugin and
49+
reads the content of `lines.txt`. Then the `grep` filter applies a regular
50+
expression rule over the `log` field created by the `tail` plugin and only passes
51+
records with a field value starting with `aa`:
4352

4453
```text
4554
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
4655
```
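
To make the effect of the `regex=log aa` rule concrete, the following Python sketch mimics it outside Fluent Bit. It's an approximation only: the sample records are made up, Python's `re` module stands in for the regex engine Fluent Bit uses, and it assumes the filter searches for the pattern anywhere in the value (as the anchored `^\d{4}` pattern used elsewhere on this page suggests). Under that assumption, `aa` also matches values that merely contain `aa`, and `^aa` is the stricter "starts with" form.

```python
import json
import re

# Hypothetical sample input: one JSON object per line, in the style of lines.txt.
# These values are invented for illustration; they are not the real file content.
SAMPLE_LINES = [
    '{"log": "aaa"}',
    '{"log": "baab"}',
    '{"log": "ggg"}',
]


def grep_regex(lines, key, pattern):
    """Keep records whose `key` value matches `pattern` (search semantics assumed)."""
    rx = re.compile(pattern)
    kept = []
    for line in lines:
        record = json.loads(line)
        value = record.get(key)
        # A missing or non-string value never matches, so the record is dropped.
        if isinstance(value, str) and rx.search(value):
            kept.append(record)
    return kept


unanchored = grep_regex(SAMPLE_LINES, "log", "aa")
anchored = grep_regex(SAMPLE_LINES, "log", "^aa")

print(unanchored)  # [{'log': 'aaa'}, {'log': 'baab'}]
print(anchored)    # [{'log': 'aaa'}]
```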
4756

48-
### Configuration File
57+
### Configuration file
4958

5059
{% tabs %}
5160
{% tab title="fluent-bit.conf" %}
61+
5262
```python
5363
[SERVICE]
5464
parsers_file /path/to/parsers.conf
@@ -67,9 +77,11 @@ $ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o
6777
name stdout
6878
match *
6979
```
80+
7081
{% endtab %}
7182

7283
{% tab title="fluent-bit.yaml" %}
84+
7385
```yaml
7486
service:
7587
parsers_file: /path/to/parsers.conf
@@ -87,14 +99,21 @@ pipeline:
8799
match: '*'
88100

89101
```
102+
90103
{% endtab %}
91104
{% endtabs %}
92105

93-
The filter allows to use multiple rules which are applied in order, you can have many _Regex_ and _Exclude_ entries as required.
106+
The filter lets you use multiple rules which are applied in order. You can
107+
have as many `Regex` and `Exclude` entries as required.
94108

95109
### Nested fields example
96110

97-
If you want to match or exclude records based on nested values, you can use a [Record Accessor ](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)format as the KEY name. Consider the following record example:
111+
To match or exclude records based on nested values, you can use
112+
[Record
113+
Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
114+
format as the `KEY` name.
115+
116+
Consider the following record example:
98117

99118
```javascript
100119
{
@@ -113,40 +132,45 @@ If you want to match or exclude records based on nested values, you can use a [R
113132
}
114133
```
115134

116-
if you want to exclude records that match given nested field \(for example `kubernetes.labels.app`\), you can use the following rule:
135+
For example, to exclude records that match the nested field `kubernetes.labels.app`,
136+
use the following rule:
117137

118138
{% tabs %}
119139
{% tab title="fluent-bit.conf" %}
140+
120141
```python
121142
[FILTER]
122143
Name grep
123144
Match *
124145
Exclude $kubernetes['labels']['app'] myapp
125146
```
126-
{% endtab %}
127147

148+
{% endtab %}
128149
{% tab title="fluent-bit.yaml" %}
150+
129151
```yaml
130152
filters:
131153
- name: grep
132154
match: '*'
133155
exclude: $kubernetes['labels']['app'] myapp
134156
```
157+
135158
{% endtab %}
136159
{% endtabs %}
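
The following Python sketch illustrates what the `Exclude $kubernetes['labels']['app'] myapp` rule does conceptually. It isn't Fluent Bit's implementation: `resolve_accessor` and `is_excluded` are hypothetical helpers, only the `$key['sub']['sub']` accessor form is handled, and the sample records are invented.

```python
import re


def resolve_accessor(record, accessor):
    """Resolve a pattern such as $kubernetes['labels']['app'] against a nested dict.

    Returns None when any key along the path is missing.
    """
    head = re.match(r"\$([A-Za-z0-9_]+)", accessor)
    if head is None:
        return None
    # First key comes from the $name part, the rest from each ['name'] segment.
    keys = [head.group(1)] + re.findall(r"\['([^']+)'\]", accessor)
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_excluded(record, accessor, pattern):
    """True when the nested value exists and matches the exclude pattern."""
    value = resolve_accessor(record, accessor)
    return isinstance(value, str) and re.search(pattern, value) is not None


# Hypothetical records, shaped like Kubernetes-enriched logs.
ACCESSOR = "$kubernetes['labels']['app']"
my_app = {"log": "hello", "kubernetes": {"labels": {"app": "myapp"}}}
other_app = {"log": "hello", "kubernetes": {"labels": {"app": "other"}}}
no_labels = {"log": "hello", "kubernetes": {}}

for rec in (my_app, other_app, no_labels):
    print(is_excluded(rec, ACCESSOR, "myapp"))
```

In this sketch, a record that lacks the nested key isn't excluded, because there is no value for the pattern to match.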
137160
138-
### Excluding records missing/invalid fields
139-
140-
It may be that in your processing pipeline you want to drop records that are missing certain keys.
161+
### Excluding records with missing or invalid fields
141162
142-
A simple way to do this is just to `exclude` with a regex that matches anything, a missing key will fail this check.
163+
You might want to drop records that are missing certain keys.
143164
144-
Here is an example that checks for a specific valid value for the key as well:
165+
One way to do this is to `exclude` with a regex that matches anything. A missing
166+
key fails this check.
145167

168+
The following example checks for a specific valid value for the key:
146169

147170
{% tabs %}
148171
{% tab title="fluent-bit.conf" %}
149-
```
172+
173+
```text
150174
# Use Grep to verify the contents of the iot_timestamp value.
151175
# If the iot_timestamp key does not exist, this will fail
152176
# and exclude the row.
@@ -156,30 +180,34 @@ Here is an example that checks for a specific valid value for the key as well:
156180
Match iots_thread.*
157181
Regex iot_timestamp ^\d{4}-\d{2}-\d{2}
158182
```
159-
{% endtab %}
160183

184+
{% endtab %}
161185
{% tab title="fluent-bit.yaml" %}
186+
162187
```yaml
163188
filters:
164189
- name: grep
165190
alias: filter-iots-grep
166191
match: iots_thread.*
167192
regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
168193
```
194+
169195
{% endtab %}
170196
{% endtabs %}
171197

172-
The specified key `iot_timestamp` must match the expected expression - if it does not or is missing/empty then it will be excluded.
198+
The specified key `iot_timestamp` must match the expected expression. If it doesn't,
199+
or is missing or empty, then it will be excluded.
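
As a rough illustration of that behavior, the Python sketch below applies the same `iot_timestamp` pattern to a few invented records. It approximates the filter with Python's `re` module and assumes a missing, empty, or non-string value counts as "no match".

```python
import re

# Same key and pattern as the Regex rule above.
TIMESTAMP_RULE = ("iot_timestamp", r"^\d{4}-\d{2}-\d{2}")


def keep(record, rule=TIMESTAMP_RULE):
    """Return True only when the key exists and its value matches the pattern."""
    key, pattern = rule
    value = record.get(key)
    if not isinstance(value, str):
        return False  # missing key (or non-string value): record is dropped
    return re.search(pattern, value) is not None


# Hypothetical records for illustration only.
records = [
    {"iot_timestamp": "2023-01-22T09:46:49Z", "value": 1},  # valid date prefix
    {"iot_timestamp": "", "value": 2},                      # empty value
    {"iot_timestamp": "yesterday", "value": 3},             # wrong format
    {"value": 4},                                           # key missing
]

kept = [r for r in records if keep(r)]
print(kept)
```

Only the first record survives in this sketch. The other three fail the single `Regex` rule for different reasons.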
173200

174201
### Multiple conditions
175202

176-
If you want to set multiple `Regex` or `Exclude`, you can use `Logical_Op` property to use logical conjuction or disjunction.
177-
178-
Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an error.
203+
If you want to set multiple `Regex` or `Exclude`, use the `Logical_Op` property
204+
to use a logical conjunction or disjunction.
179205

206+
If `Logical_Op` is set, setting both `Regex` and `Exclude` results in an error.
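
The Python sketch below shows one reading of how the three `Logical_Op` modes combine rules, based only on the parameter description on this page. The `grep_keep` function and the two rules are hypothetical, and how `Exclude` rules combine under `AND` and `OR` is an assumption rather than confirmed behavior. The record matches the one shown in the sample output.

```python
import re


def matches(record, rule):
    """True when the rule's key exists and its value matches the rule's pattern."""
    key, pattern = rule
    value = record.get(key)
    return isinstance(value, str) and re.search(pattern, value) is not None


def grep_keep(record, regex=(), exclude=(), logical_op="legacy"):
    """Decide whether to keep a record (a sketch, not Fluent Bit's code)."""
    op = logical_op.lower()
    if op == "legacy":
        # Including rules act like AND, excluding rules act like OR.
        return (all(matches(record, r) for r in regex)
                and not any(matches(record, r) for r in exclude))
    if op not in ("and", "or"):
        raise ValueError(f"unknown Logical_Op: {logical_op}")
    if regex and exclude:
        # Mirrors the documented error when both rule types are set.
        raise ValueError("Regex and Exclude can't be mixed when Logical_Op is set")
    combine = all if op == "and" else any
    if regex:
        return combine(matches(record, r) for r in regex)
    if exclude:
        # Assumption: the record is dropped when the combined exclude rules match.
        return not combine(matches(record, r) for r in exclude)
    return True


record = {"endpoint": "localhost", "value": "something"}
rules = [("value", "something"), ("value", "error")]  # hypothetical rules

print(grep_keep(record, regex=rules, logical_op="OR"))      # True
print(grep_keep(record, regex=rules, logical_op="AND"))     # False
print(grep_keep(record, regex=rules, logical_op="legacy"))  # False
```

With `OR`, a single matching `Regex` rule is enough to keep the record. With `AND` or `legacy`, every `Regex` rule has to match.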
180207

181208
{% tabs %}
182209
{% tab title="fluent-bit.conf" %}
210+
183211
```python
184212
[INPUT]
185213
Name dummy
@@ -196,9 +224,11 @@ Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an e
196224
[OUTPUT]
197225
Name stdout
198226
```
227+
199228
{% endtab %}
200229

201230
{% tab title="fluent-bit.yaml" %}
231+
202232
```yaml
203233
pipeline:
204234
inputs:
@@ -215,11 +245,13 @@ pipeline:
215245
outputs:
216246
- name: stdout
217247
```
248+
218249
{% endtab %}
219250
{% endtabs %}
220251

221-
Output will be
222-
```
252+
The output looks similar to:
253+
254+
```text
223255
Fluent Bit v2.0.9
224256
* Copyright (C) 2015-2022 The Fluent Bit Authors
225257
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
@@ -236,4 +268,4 @@ Fluent Bit v2.0.9
236268
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
237269
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
238270
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
239-
```
271+
```
