Commit 1ffb8be

Merge pull request #1670 from fluent/lynettemiles/sc-135596/update-fluent-bit-fluent-bit-docs-administration
2 parents b2e1732 + 783945a commit 1ffb8be

1 file changed

+78
-66
lines changed

administration/configuring-fluent-bit/multiline-parsing.md

Lines changed: 78 additions & 66 deletions
@@ -1,102 +1,109 @@
1-
# Multiline Parsing
1+
# Multiline parsing
22

3-
In an ideal world, applications might log their messages within a single line, but in reality applications generate multiple log messages that sometimes belong to the same context. But when is time to process such information it gets really complex. Consider application stack traces which always have multiple log lines.
3+
In an ideal world, applications might log their messages within a single line, but in
4+
reality applications generate multiple log messages that sometimes belong to the same
5+
context. Processing this information can be complex, like in application stack traces,
6+
which always have multiple log lines.
47

58
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=e19a4c14-a9e4-4163-8f3a-52196eb9a585" />
69

7-
Starting from Fluent Bit v1.8, we have implemented a unified Multiline core functionality to solve all the user corner cases. In this section, you will learn about the features and configuration options available.
10+
Fluent Bit v1.8 implemented a unified Multiline core capability to solve corner cases.
811

912
## Concepts
1013

11-
The Multiline parser engine exposes two ways to configure and use the functionality:
14+
The Multiline parser engine exposes two ways to configure and use the feature:
1215

13-
* Built-in multiline parser
14-
* Configurable multiline parser
16+
- Built-in multiline parser
17+
- Configurable multiline parser
1518

16-
### Built-in Multiline Parsers
19+
### Built-in multiline parsers
1720

18-
Without any extra configuration, Fluent Bit exposes certain pre-configured parsers (built-in) to solve specific multiline parser cases, e.g:
21+
Fluent Bit exposes certain pre-configured parsers (built-in) to solve specific
22+
multiline parser cases. For example:
1923

20-
| Parser | Description |
21-
| ------ | --------------------------------------------------------------------------------------------------------------------------------------- |
22-
| docker | Process a log entry generated by a Docker container engine. This parser supports the concatenation of log entries split by Docker. |
23-
| cri | Process a log entry generated by CRI-O container engine. Same as the _docker_ parser, it supports concatenation of log entries |
24-
| go | Process log entries generated by a Go based language application and perform concatenation if multiline messages are detected. |
25-
| python | Process log entries generated by a Python based language application and perform concatenation if multiline messages are detected. |
26-
| java | Process log entries generated by a Google Cloud Java language application and perform concatenation if multiline messages are detected. |
24+
| Parser | Description |
25+
| ------ | ----------- |
26+
| `docker` | Process a log entry generated by a Docker container engine. This parser supports the concatenation of log entries split by Docker. |
27+
| `cri` | Process a log entry generated by the CRI-O container engine. Like the `docker` parser, it supports concatenation of log entries. |
28+
| `go` | Process log entries generated by a Go based language application and perform concatenation if multiline messages are detected. |
29+
| `python` | Process log entries generated by a Python based language application and perform concatenation if multiline messages are detected. |
30+
| `java` | Process log entries generated by a Google Cloud Java language application and perform concatenation if multiline messages are detected. |
2731

28-
### Configurable Multiline Parsers
32+
### Configurable multiline parsers
2933

30-
Besides the built-in parsers listed above, through the configuration files is possible to define your own Multiline parsers with their own rules.
34+
You can define your own Multiline parsers with their own rules, using a configuration
35+
file.
3136

32-
A multiline parser is defined in a _parsers configuration file_ by using a `[MULTILINE_PARSER]` section definition. The Multiline parser must have a unique name and a type plus other configured properties associated with each type.
37+
A multiline parser is defined in a `parsers configuration file` by using a
38+
`[MULTILINE_PARSER]` section definition. The multiline parser must have a unique name
39+
and a type, plus other configured properties associated with each type.
3340

34-
To understand which Multiline parser type is required for your use case you have to know beforehand what are the conditions in the content that determines the beginning of a multiline message and the continuation of subsequent lines. We provide a regex based configuration that supports states to handle from the most simple to difficult cases.
41+
To understand which multiline parser type is required for your use case you have to
42+
know the conditions in the content that determine the beginning of a multiline
43+
message, and the continuation of subsequent lines. Fluent Bit provides a regular expression-based
44+
configuration that supports states to handle cases ranging from the simplest to the most difficult.
3545

36-
| Property | Description | Default |
37-
| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
38-
| name | Specify a unique name for the Multiline Parser definition. A good practice is to prefix the name with the word `multiline_` to avoid confusion with normal parser's definitions. | |
39-
| type | Set the multiline mode, for now, we support the type `regex`. | |
40-
| parser | <p>Name of a pre-defined parser that must be applied to the incoming content before applying the regex rule. If no parser is defined, it's assumed that's a raw text and not a structured message. </p><p></p><p>Note: when a parser is applied to a raw text, then the regex is applied against a specific key of the structured message by using the <code>key_content</code> configuration property (see below).</p> | |
41-
| key_content | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | |
42-
| flush_timeout | Timeout in milliseconds to flush a non-terminated multiline buffer. Default is set to 5 seconds. | 5s |
43-
| rule | Configure a rule to match a multiline pattern. The rule has a specific format described below. Multiple rules can be defined. | |
46+
| Property | Description | Default |
47+
| -------- | ----------- | ------- |
48+
| `name` | Specify a unique name for the multiline parser definition. A good practice is to prefix the name with the word `multiline_` to avoid confusion with normal parser definitions. | _none_ |
49+
| `type` | Set the multiline mode. Fluent Bit supports the type `regex`.| _none_ |
50+
| `parser` | Name of a pre-defined parser that must be applied to the incoming content before applying the regular expression rule. If no parser is defined, the content is assumed to be raw text rather than a structured message. <br /> When a parser is applied to raw text, the regular expression is applied against a specific key of the structured message by using the `key_content` configuration property. | _none_ |
51+
| `key_content` | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | _none_ |
52+
| `flush_timeout` | Timeout in milliseconds to flush a non-terminated multiline buffer. | `5s` |
53+
| `rule` | Configure a rule to match a multiline pattern. The rule has a [specific format](#rules-definition). Multiple rules can be defined. | _none_|
4454

45-
#### Lines and States
55+
#### Lines and states
4656

47-
Before start configuring your parser you need to know the answer to the following questions:
57+
Before configuring your parser, you need to know the answers to the following questions:
4858

49-
1. What is the regular expression (regex) that matches the first line of a multiline message ?
50-
2. What are the regular expressions (regex) that match the continuation lines of a multiline message ?
59+
1. What's the regular expression (`regex`) that matches the first line of a multiline message?
60+
1. What are the regular expressions (`regex`) that match the continuation lines of a multiline message?
5161

52-
When matching regex, we have to define **states**, some states define the start of a multiline message while others are states for the continuation of multiline messages. You can have multiple **continuation states** definitions to solve complex cases.
62+
When matching a regular expression, you must define `states`. Some states define the start of a multiline message, while others are states for the continuation of multiline messages. You can have multiple `continuation states` definitions to solve complex cases.
5363

54-
The first regex that matches the start of a multiline message is called **start_state**, then other regexes continuation lines can have different state names.
64+
The first regular expression that matches the start of a multiline message is called
65+
`start_state`. Regular expressions that match continuation lines can have different state names.
5566

56-
#### Rules Definition
67+
#### Rules definition
5768

5869
A rule specifies how to match a multiline pattern and perform the concatenation. A rule is defined by 3 specific components:
5970

60-
1. state name
61-
2. regular expression pattern
62-
3. next state
71+
- state name
72+
- regular expression pattern
73+
- next state
6374

64-
A rule might be defined as follows (comments added to simplify the definition) :
75+
A rule might be defined as follows (comments added to simplify the definition):
6576

66-
```
77+
```text
6778
# rules | state name | regex pattern | next state
6879
# --------|----------------|---------------------------------------------
6980
rule "start_state" "/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/" "cont"
7081
rule "cont" "/^\s+at.*/" "cont"
7182
```
7283

73-
In the example above, we have defined two rules, each one has its own state name, regex patterns, and the next state name. Every field that composes a rule **must be** inside double quotes.
84+
This example defines two rules. Each rule has its own state name, regex pattern, and next state name. Every field that composes a rule must be inside double quotes.
7485

75-
The first rule of state name **must always** be **start_state**, and the regex pattern **must** match the first line of a multiline message, also a next state must be set to specify how the possible continuation lines would look like.
86+
The state name of the first rule must be `start_state`. Its regex pattern must match the first line of a multiline message, and a next state must be set to specify what the possible continuation lines look like.
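
To make the state transitions concrete, the following Python sketch simulates how the two rules from the example could group lines into records. It's an illustration only, not Fluent Bit's implementation: the `concatenate` helper and the `RULES` and `SAMPLE` names are hypothetical, the patterns are rewritten in Python `re` syntax, and details such as flush timeouts are ignored.

```python
import re

# Rules mirror the example: (state name, regex pattern, next state).
# The escaped colons (\:) from the original pattern are written as plain ":".
RULES = [
    ("start_state", re.compile(r"([a-zA-Z]+ \d+ \d+:\d+:\d+)(.*)"), "cont"),
    ("cont", re.compile(r"^\s+at.*"), "cont"),
]

SAMPLE = [
    "single line...",
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: '
    "Something has gone wrong, aborting!",
    "    at com.myproject.module.MyProject.badMethod(MyProject.java:22)",
    "    at com.myproject.module.MyProject.main(MyProject.java:6)",
    "another line...",
]


def concatenate(lines):
    """Group lines into records using the start and continuation rules."""
    records = []
    current = None  # lines of the multiline record being built, if any
    state = None    # state name expected to match the next line
    for line in lines:
        if current is not None:
            # Inside a multiline record: try the rules for the expected state.
            matches = [r for r in RULES if r[0] == state and r[1].search(line)]
            if matches:
                current.append(line)
                state = matches[0][2]
                continue
            # No continuation rule matched, so the record is complete.
            records.append("\n".join(current))
            current, state = None, None
        # Check whether this line starts a new multiline record.
        _, start_pattern, next_state = RULES[0]
        if start_pattern.search(line):
            current, state = [line], next_state
        else:
            records.append(line)
    if current is not None:
        records.append("\n".join(current))
    return records


records = concatenate(SAMPLE)
```

With this sample input, `records` holds three entries: the first and last lines stay as single-line records, and the timestamped line is joined with the two `at ...` lines that follow it.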
7687

7788
{% hint style="info" %}
78-
To simplify the configuration of regular expressions, you can use the Rubular web site. We have posted an example by using the regex described above plus a log line that matches the pattern:\
79-
\
80-
[https://rubular.com/r/NDuyKwlTGOvq2g](https://rubular.com/r/NDuyKwlTGOvq2g)
89+
To simplify the configuration of regular expressions, you can use the [Rubular](https://rubular.com/r/NDuyKwlTGOvq2g) web site. This link uses the regex described in the previous example, plus a log line that matches the pattern.
8190
{% endhint %}
8291

83-
#### Configuration Example
92+
#### Configuration example
8493

85-
The following example provides a full Fluent Bit configuration file for multiline parsing by using the definition explained above.
94+
The following example provides a full Fluent Bit configuration file for multiline parsing by using the definition explained previously.
8695

8796
{% hint style="info" %}
88-
The following example files can be located at:\
89-
\
90-
[https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/regex-001](https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/regex-001)
97+
The example files are available [in the Fluent Bit repository](https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/regex-001).
9198
{% endhint %}
9299

93100
Example files content:
94101

95102
{% tabs %}
96103
{% tab title="fluent-bit.conf" %}
97-
This is the primary Fluent Bit configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. Then it sends the processing to the standard output.
104+
This is the primary Fluent Bit configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. Then it sends the processed records to the standard output.
98105

99-
```
106+
```python
100107
[SERVICE]
101108
flush 1
102109
log_level info
@@ -112,12 +119,13 @@ This is the primary Fluent Bit configuration file. It includes the `parsers_mult
112119
name stdout
113120
match *
114121
```
122+
115123
{% endtab %}
116124

117125
{% tab title="parsers_multiline.conf" %}
118126
This second file defines a multiline parser for the example.
119127

120-
```
128+
```python
121129
[MULTILINE_PARSER]
122130
name multiline-regex-test
123131
type regex
@@ -136,12 +144,13 @@ This second file defines a multiline parser for the example.
136144
rule "start_state" "/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/" "cont"
137145
rule "cont" "/^\s+at.*/" "cont"
138146
```
147+
139148
{% endtab %}
140149

141150
{% tab title="test.log" %}
142151
An example file with multiline content:
143152

144-
```
153+
```text
145154
single line...
146155
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
147156
at com.myproject.module.MyProject.badMethod(MyProject.java:22)
@@ -152,13 +161,14 @@ Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something
152161
another line...
153162
154163
```
164+
155165
{% endtab %}
156166
{% endtabs %}
157167

158168
By running Fluent Bit with the given configuration file you will obtain:
159169

160-
```
161-
$ fluent-bit -c fluent-bit.conf
170+
```text
171+
$ fluent-bit -c fluent-bit.conf
162172
163173
[0] tail.0: [0.000000000, {"log"=>"single line...
164174
"}]
@@ -174,29 +184,28 @@ $ fluent-bit -c fluent-bit.conf
174184
175185
```
176186

177-
The lines that did not match a pattern are not considered as part of the multiline message, while the ones that matched the rules were concatenated properly.
187+
The lines that didn't match a pattern aren't considered as part of the multiline message, while the ones that matched the rules were concatenated properly.
178188

179189
## Limitations
180190

181191
The multiline parser is a very powerful feature, but it has some limitations that you should be aware of:
182192

183-
* The multiline parser is not affected by the `buffer_max_size` configuration option, allowing the composed log record to grow beyond this size.
184-
Hence, the `skip_long_lines` option will not be applied to multiline messages.
185-
* It is not possible to get the time key from the body of the multiline message. However, it can be extracted and set as a new key by using a filter.
193+
- The multiline parser isn't affected by the `buffer_max_size` configuration option, allowing the composed log record to grow beyond this size. The `skip_long_lines` option won't be applied to multiline messages.
194+
- It's not possible to get the time key from the body of the multiline message. However, it can be extracted and set as a new key by using a filter.
186195

187196
## Get structured data from multiline message
188197

189-
Fluent-bit supports `/pat/m` option. It allows `.` matches a new line. It is useful to parse multiline log.
198+
Fluent Bit supports the `/pat/m` option, which allows `.` to match a newline. This can be used to parse multiline logs.
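
As a rough illustration of what the `m` option changes, the following Python sketch applies the same kind of pattern to a concatenated record with and without the equivalent flag. It assumes that Python's `re.DOTALL` is a fair analogue of `/m` in Fluent Bit's Ruby-style regular expressions, and it uses Python's `(?P<name>...)` syntax for named captures. The `record` and `pattern` names are hypothetical.

```python
import re

# A concatenated record, with the newlines kept between the joined lines.
record = (
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: '
    "Something has gone wrong, aborting!\n"
    "    at com.myproject.module.MyProject.badMethod(MyProject.java:22)\n"
)

pattern = r"^(?P<date>[a-zA-Z]+ \d+ \d+:\d+:\d+) (?P<message>.*)"

# Without DOTALL, `.` stops at the first newline, so only the first line is captured.
first_line_only = re.match(pattern, record).group("message")

# With DOTALL, `.` also matches newlines, so the whole stack trace is captured.
with_dotall = re.match(pattern, record, re.DOTALL)
date = with_dotall.group("date")
whole_message = with_dotall.group("message")
```

Here `first_line_only` ends at the end of the first line, while `whole_message` also contains the `at ...` line of the stack trace.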
190199

191-
The following example is to get `date` and `message` from concatenated log.
200+
The following example retrieves `date` and `message` from concatenated logs.
192201

193202
Example files content:
194203

195204
{% tabs %}
196205
{% tab title="fluent-bit.conf" %}
197206
This is the primary Fluent Bit configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. It also parses the concatenated log by applying the parser `named-capture-test`. Then it sends the processed records to the standard output.
198207

199-
```
208+
```python
200209
[SERVICE]
201210
flush 1
202211
log_level info
@@ -218,12 +227,13 @@ This is the primary Fluent Bit configuration file. It includes the `parsers_mult
218227
name stdout
219228
match *
220229
```
230+
221231
{% endtab %}
222232

223233
{% tab title="parsers_multiline.conf" %}
224234
This second file defines a multiline parser for the example.
225235

226-
```
236+
```python
227237
[MULTILINE_PARSER]
228238
name multiline-regex-test
229239
type regex
@@ -247,12 +257,13 @@ This second file defines a multiline parser for the example.
247257
Format regex
248258
Regex /^(?<date>[a-zA-Z]+ \d+ \d+\:\d+\:\d+) (?<message>.*)/m
249259
```
260+
250261
{% endtab %}
251262

252263
{% tab title="test.log" %}
253264
An example file with multiline content:
254265

255-
```
266+
```text
256267
single line...
257268
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
258269
at com.myproject.module.MyProject.badMethod(MyProject.java:22)
@@ -263,12 +274,13 @@ Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something
263274
another line...
264275

265276
```
277+
266278
{% endtab %}
267279
{% endtabs %}
268280

269281
By running Fluent Bit with the given configuration file you will obtain:
270282

271-
```
283+
```text
272284
$ fluent-bit -c fluent-bit.conf
273285

274286
[0] tail.0: [1669160706.737650473, {"log"=>"single line...
@@ -282,4 +294,4 @@ $ fluent-bit -c fluent-bit.conf
282294
"}]
283295
[2] tail.0: [1669160706.737657687, {"log"=>"another line...
284296
"}]
285-
```
297+
```
