56 changes: 28 additions & 28 deletions pipeline/parsers/configuring-parser.md
@@ -38,8 +38,8 @@ Multiple parsers can be defined and each section has it own properties. The foll
| `Time_Keep` | If enabled, when a time key is recognized and parsed, the parser will keep the original time key. If disabled, the parser will drop the original time field. |
| `Time_System_timezone` | If there is no time zone (`%z`) specified in the given `Time_Format`, enabling this option will make the parser detect and use the system's configured time zone. The configured time zone is detected from the [`TZ` environment variable](https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html). |
| `Types` | Specifies the data type of parsed field. The syntax is `types <field_name_1>:<type_name_1> <field_name_2>:<type_name_2> ...`. The supported types are `string` (default), `integer`, `bool`, `float`, `hex`. The option is supported by `ltsv`, `logfmt` and `regex`. |
| `Decode_Field` | If the content can be decoded in a structured message, append the structured message (keys and values) to the original log message. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field <decoder_type> <field_name>`. See [Decoders](pipeline/parsers/decoders.md) for additional information. |
| `Decode_Field_As` | Any decoded content (unstructured or structured) will be replaced in the same key/value, and no extra keys are added. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field_As <decoder_type> <field_name>`. See [Decoders](pipeline/parsers/decoders.md) for additional information. |
| `Decode_Field` | If the content can be decoded in a structured message, append the structured message (keys and values) to the original log message. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. |
| `Decode_Field_As` | Any decoded content (unstructured or structured) will be replaced in the same key/value, and no extra keys are added. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field_As <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. |
| `Skip_Empty_Values` | Specifies a boolean which determines if the parser should skip empty values. The default is `true`. |
| `Time_Strict` | The default value (`true`) tells the parser to be strict with the expected time format. With this option set to false, the parser will be permissive with the format of the time. You can use this when the format expects time fraction but the time to be parsed doesn't include it. |

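To make the difference between `Decode_Field` and `Decode_Field_As` in the table above concrete, here is a minimal Python sketch of the behavior as the table describes it, restricted to the `json` decoder. The function names `decode_field` and `decode_field_as` and the sample record are hypothetical illustrations; this is not Fluent Bit's implementation.

```python
import json


def decode_field(record, field):
    """Sketch of Decode_Field: keep the original value and append decoded keys."""
    out = dict(record)
    try:
        decoded = json.loads(record[field])
    except (KeyError, TypeError, ValueError):
        return out  # content could not be decoded: record is left unchanged
    if isinstance(decoded, dict):
        out.update(decoded)  # structured message is added next to the original
    return out


def decode_field_as(record, field):
    """Sketch of Decode_Field_As: replace the value in place, add no extra keys."""
    out = dict(record)
    try:
        out[field] = json.loads(record[field])
    except (KeyError, TypeError, ValueError):
        pass  # content could not be decoded: keep the original value
    return out


# Hypothetical record whose "log" field holds a JSON-encoded string.
sample = {"log": '{"level": "info", "msg": "ok"}', "stream": "stdout"}
appended = decode_field(sample, "log")
replaced = decode_field_as(sample, "log")
```

In this sketch `appended` still holds the raw `log` string and gains `level` and `msg` as top-level keys, while `replaced` keeps the same two keys and `log` becomes a nested map.
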
@@ -52,40 +52,40 @@ All parsers must be defined in a parsers file (see below for examples), not in t

```yaml
parsers:
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on

- name: syslog-rfc5424
format: regex
regex: '^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$'
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on
types: pid:integer
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on

- name: syslog-rfc5424
format: regex
regex: '^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$'
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on
types: pid:integer
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On

[PARSER]
Name syslog-rfc5424
Format regex
Regex ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
Types pid:integer
Name syslog-rfc5424
Format regex
Regex ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
Types pid:integer
```

{% endtab %}
@@ -161,7 +161,7 @@ The following time zone abbreviations are supported.
| `CLT` | `-04:00` | `-14400` | no | Chile Standard Time |
| `CLST` | `-03:00` | `-10800` | yes | Chile Summer Time |

### Australasian and Oceanian time zones
### Australasian and Oceania time zones

| Abbreviation | UTC Offset (`HH:MM`) | Offset (seconds) | Is DST | Description |
| ------------ | -------------------- | ---------------- | ------ | ---------------------------------- |
86 changes: 44 additions & 42 deletions pipeline/parsers/decoders.md
@@ -32,29 +32,29 @@ For example, the predefined Docker parser has the following definition:

```yaml
parsers:
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on
# Command | Decoder | Field | Optional Action |
# ==========|==========|=======|===================|
decode_field_as: escaped log
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
time_keep: on
# Command | Decoder | Field | Optional Action |
# ==========|==========|=======|===================|
decode_field_as: escaped log
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# Command | Decoder | Field | Optional Action |
# ==============|===========|=======|===================|
Decode_Field_As escaped log
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# Command | Decoder | Field | Optional Action |
# ==============|===========|=======|===================|
Decode_Field_As escaped log
```

{% endtab %}
@@ -99,12 +99,14 @@ Example input from `/path/to/log.log`:
Example output:

```text
...
[24] tail.0: [1519082729.184544400, {"log"=>" Checking indexes...
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845444Z"}]
[25] tail.0: [1519082729.184553600, {"log"=>" Validated: _audit _internal _introspection _telemetry _thefishbucket history main snmp_data summary
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845536Z"}]
[26] tail.0: [1519082729.184562200, {"log"=>" Done
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845622Z"}]
...
```

Decoder example Fluent Bit configuration files:
@@ -114,34 +116,34 @@ Decoder example Fluent Bit configuration files:

```yaml
service:
parsers_file: parsers.yaml
parsers_file: parsers.yaml

pipeline:
inputs:
- name: tail
parser: docker
path: /path/to/log.log

outputs:
- name: stdout
match: '*'
inputs:
- name: tail
parser: docker
path: /path/to/log.log

outputs:
- name: stdout
match: '*'
```
{% endtab %}
{% tab title="fluent-bit.conf" %}
```text
[SERVICE]
Parsers_File parsers.conf
Parsers_File parsers.conf

[INPUT]
Name tail
Parser docker
Path /path/to/log.log
Name tail
Parser docker
Path /path/to/log.log

[OUTPUT]
Name stdout
Match *
Name stdout
Match *
```

{% endtab %}
@@ -154,24 +156,24 @@ The example parsers file:

```yaml
parsers:
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S %z'
decode_field_as: escaped_utf8 log
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S %z'
decode_field_as: escaped_utf8 log
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S %z
Decode_Field_as escaped_utf8 log
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S %z
Decode_Field_as escaped_utf8 log
```

{% endtab %}
{% endtabs %}
{% endtabs %}
16 changes: 8 additions & 8 deletions pipeline/parsers/json.md
@@ -9,21 +9,21 @@ For example, the default parsers configuration file includes a parser for parsin

```yaml
parsers:
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S %z'
- name: docker
format: json
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S %z'
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S %z
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S %z
```

{% endtab %}
22 changes: 11 additions & 11 deletions pipeline/parsers/logfmt.md
@@ -9,17 +9,17 @@ Here is an example parsers configuration:

```yaml
parsers:
- name: logfmt
format: logfmt
- name: logfmt
format: logfmt
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name logfmt
Format logfmt
Name logfmt
Format logfmt
```

{% endtab %}
@@ -46,20 +46,20 @@ If you want to be more strict than the logfmt standard and not parse lines where

```yaml
parsers:
- name: logfmt
format: logfmt
logfmt_no_bare_keys: true
- name: logfmt
format: logfmt
logfmt_no_bare_keys: true
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name logfmt
Format logfmt
Logfmt_No_Bare_Keys true
Name logfmt
Format logfmt
Logfmt_No_Bare_Keys true
```

{% endtab %}
{% endtabs %}
{% endtabs %}
28 changes: 16 additions & 12 deletions pipeline/parsers/ltsv.md
@@ -20,23 +20,23 @@ The following is an example parsers configuration file:

```yaml
parsers:
- name: access_log_ltsv
format: ltsv
time_key: time
time_format: '[%d/%b/%Y:%H:%M:%S %z]'
types: status:integer size:integer
- name: access_log_ltsv
format: ltsv
time_key: time
time_format: '[%d/%b/%Y:%H:%M:%S %z]'
types: status:integer size:integer
```
{% endtab %}
{% tab title="parsers.conf" %}
```text
[PARSER]
Name access_log_ltsv
Format ltsv
Time_Key time
Time_Format [%d/%b/%Y:%H:%M:%S %z]
Types status:integer size:integer
Name access_log_ltsv
Format ltsv
Time_Key time
Time_Format [%d/%b/%Y:%H:%M:%S %z]
Types status:integer size:integer
```

{% endtab %}
@@ -45,19 +45,23 @@ parsers:
The following log entry is valid content for the previously defined parser:

```text
...
host:127.0.0.1 ident:- user:- time:[10/Jul/2018:13:27:05 +0200] req:GET / HTTP/1.1 status:200 size:16218 referer:http://127.0.0.1/ ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1 ident:- user:- time:[10/Jul/2018:13:27:05 +0200] req:GET /assets/plugins/bootstrap/css/bootstrap.min.css HTTP/1.1 status:200 size:121200 referer:http://127.0.0.1/ ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1 ident:- user:- time:[10/Jul/2018:13:27:05 +0200] req:GET /assets/css/headers/header-v6.css HTTP/1.1 status:200 size:37706 referer:http://127.0.0.1/ ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1 ident:- user:- time:[10/Jul/2018:13:27:05 +0200] req:GET /assets/css/style.css HTTP/1.1 status:200 size:1279 referer:http://127.0.0.1/ ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
...
```

After processing, it internal representation will be:
After processing, its internal representation will be:

```text
...
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET / HTTP/1.1", "status"=>200, "size"=>16218, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/plugins/bootstrap/css/bootstrap.min.css HTTP/1.1", "status"=>200, "size"=>121200, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/css/headers/header-v6.css HTTP/1.1", "status"=>200, "size"=>37706, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/css/style.css HTTP/1.1", "status"=>200, "size"=>1279, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
...
```

The time was converted to Unix timestamp (UTC).
The time was converted to Unix timestamp (UTC).
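
As a rough cross-check of the conversion above, the following Python sketch parses one LTSV line in the way the `access_log_ltsv` parser is configured (`Time_Key time`, the same `Time_Format`, and `Types status:integer size:integer`). The helper name `parse_ltsv` is hypothetical, fields are assumed to be tab-separated as LTSV requires, and the sketch only approximates what Fluent Bit does internally.

```python
from datetime import datetime

TIME_KEY = "time"
TIME_FORMAT = "[%d/%b/%Y:%H:%M:%S %z]"  # same pattern as Time_Format above
INTEGER_FIELDS = {"status", "size"}      # mirrors Types status:integer size:integer


def parse_ltsv(line):
    """Return (unix_timestamp, record) for one tab-separated LTSV line."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        # Split on the first colon only, since values may contain colons.
        label, _, value = field.partition(":")
        record[label] = value

    # The time key is removed from the record (Time_Keep is not enabled)
    # and converted to seconds since the Unix epoch, honoring the %z offset.
    timestamp = datetime.strptime(record.pop(TIME_KEY), TIME_FORMAT).timestamp()

    # Convert the fields listed in Types to integers.
    for name in INTEGER_FIELDS & record.keys():
        record[name] = int(record[name])
    return timestamp, record


line = "\t".join([
    "host:127.0.0.1",
    "ident:-",
    "user:-",
    "time:[10/Jul/2018:13:27:05 +0200]",
    "req:GET / HTTP/1.1",
    "status:200",
    "size:16218",
    "referer:http://127.0.0.1/",
])
timestamp, record = parse_ltsv(line)
print(int(timestamp), record["status"], record["size"])  # prints: 1531222025 200 16218
```

The `+0200` offset is what moves 13:27:05 local time to 11:27:05 UTC, which gives the `1531222025` shown in the internal representation.
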