7 changes: 5 additions & 2 deletions CHANGELOG.md
@@ -1,3 +1,6 @@
## 4.4.4
- [DOC] Minor doc fixes and version bump to pick up changes in [#186](https://github.com/logstash-plugins/logstash-filter-grok/pull/186) [#197](https://github.com/logstash-plugins/logstash-filter-grok/pull/197)

## 4.4.3
- Minor typos in docs examples [#176](https://github.com/logstash-plugins/logstash-filter-grok/pull/176)

@@ -9,7 +12,7 @@

## 4.4.0
- Feat: ECS compatibility support [#162](https://github.com/logstash-plugins/logstash-filter-grok/pull/162)

The filter supports using built-in pattern definitions that are fully Elastic Common Schema (ECS) compliant.

## 4.3.0
@@ -30,7 +33,7 @@

## 4.0.3
- Fixed memory leak when run on JRuby 1.x (Logstash 5.x) [#135](https://github.com/logstash-plugins/logstash-filter-grok/issues/135)

## 4.0.2
- Fixed resource leak where this plugin might get double initialized during plugin reload, leaking a thread + some objects

59 changes: 31 additions & 28 deletions docs/index.asciidoc
@@ -24,13 +24,13 @@ Parse arbitrary text and structure it.

Grok is a great way to parse unstructured log data into something structured and queryable.

-This tool is perfect for syslog logs, apache and other webserver logs, mysql
+This tool is great for syslog logs, apache and other webserver logs, mysql
logs, and in general, any log format written for humans
and not for computer consumption.

-Logstash ships with about 120 patterns by default. You can find them here:
-<https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns>. You can add
-your own trivially. (See the `patterns_dir` setting)
+Logstash ships with about 120 https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns[default grok patterns].
+You can also add your own.
+Check out the <<plugins-{type}s-{plugin}-patterns_dir,`patterns_dir` setting>> for more info.

If you need help building patterns to match your logs, try the {kibana-ref}/xpack-grokdebugger.html[Grok debugger] in {kib}.

@@ -39,7 +39,7 @@ If you need help building patterns to match your logs, try the {kibana-ref}/xpac
The {logstash-ref}/plugins-filters-dissect.html[`dissect`] filter plugin
is another way to extract unstructured event data into fields using delimiters.

Dissect differs from Grok in that it does not use regular expressions and is faster.
Dissect works well when data is reliably repeated.
Grok is a better choice when the structure of your text varies from line to line.

@@ -48,12 +48,12 @@ line is reliably repeated, but the entire line is not. The Dissect filter can
deconstruct the section of the line that is repeated. The Grok filter can process
the remaining field values with more regex predictability.
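
To combine the two, a minimal sketch (the log line, field names, and patterns here are invented for illustration): Dissect peels off the reliably repeated prefix, then Grok applies a regex only to the variable remainder.

[source,ruby]
-----
filter {
  dissect {
    # e.g. "May 29 16:37:11 sadness took 0.043s to respond" (hypothetical line)
    mapping => { "message" => "%{ts} %{+ts} %{+ts} %{host} %{rest}" }
  }
  grok {
    # only the variable tail needs a regular expression
    match => { "rest" => "took %{NUMBER:duration:float}s" }
  }
}
-----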

-==== Grok Basics
+==== Grok basics

Grok works by combining text patterns into something that matches your
logs.

-The syntax for a grok pattern is `%{SYNTAX:SEMANTIC}`
+The syntax for a grok pattern is `%{SYNTAX:SEMANTIC}`.

The `SYNTAX` is the name of the pattern that will match your text. For
example, `3.44` will be matched by the `NUMBER` pattern and `55.3.244.1` will
@@ -65,8 +65,11 @@ simply `duration`. Further, a string `55.3.244.1` might identify the `client`
making a request.

For the above example, your grok filter would look something like this:

[source,ruby]
-----
%{NUMBER:duration} %{IP:client}
-----

Optionally you can add a data type conversion to your grok pattern. By default
all semantics are saved as strings. If you wish to convert a semantic's data type,
suffix it with the target type, for example `%{NUMBER:num:int}`, which converts the `num` semantic from a string to an integer.
@@ -106,14 +109,14 @@ After the grok filter, the event will have a few extra fields in it:
* `bytes: 15824`
* `duration: 0.043`
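
For instance, a filter along these lines (a sketch, not necessarily the exact configuration used above) would produce such fields from a line like `55.3.244.1 GET /index.html 15824 0.043`:

[source,ruby]
-----
filter {
  grok {
    # each %{PATTERN:name} pair extracts one field from the line
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
-----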

-==== Regular Expressions
+==== Regular expressions

Grok sits on top of regular expressions, so any regular expressions are valid
in grok as well. The regular expression library is Oniguruma, and you can see
the full supported regexp syntax https://github.com/kkos/oniguruma/blob/master/doc/RE[on the Oniguruma
site].
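
Because Oniguruma is available, you can also use a named capture group directly in a match pattern without defining a reusable pattern first; a small sketch (the `queue_id` name and the regex are illustrative):

[source,ruby]
-----
(?<queue_id>[0-9A-F]{10,11})
-----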

-==== Custom Patterns
+==== Custom patterns

Sometimes Logstash doesn't have a pattern you need. For this, you have
a few options.
@@ -171,7 +174,7 @@ The `timestamp`, `logsource`, `program`, and `pid` fields come from the
`SYSLOGBASE` pattern which itself is defined by other patterns.

Another option is to define patterns _inline_ in the filter using `pattern_definitions`.
This is mostly for convenience and allows the user to define a pattern which can be used just in that
filter. Patterns newly defined in `pattern_definitions` will not be available outside of that particular `grok` filter.
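
A minimal sketch (the `QUEUE_ID` pattern name and its regex are invented for illustration):

[source,ruby]
-----
filter {
  grok {
    # QUEUE_ID exists only inside this grok filter
    pattern_definitions => { "QUEUE_ID" => "[0-9A-F]{10,11}" }
    match => { "message" => "%{QUEUE_ID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}
-----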

[id="plugins-{type}s-{plugin}-ecs"]
@@ -184,7 +187,7 @@ compliant with the schema.

The ECS pattern set has all of the pattern definitions from the legacy set, and is
a drop-in replacement. Use the <<plugins-{type}s-{plugin}-ecs_compatibility>>
setting to switch modes.

New features and enhancements will be added to the ECS-compliant files. The
legacy patterns may still receive bug fixes which are backwards compatible.
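
Assuming the values documented for the <<plugins-{type}s-{plugin}-ecs_compatibility>> setting, switching modes on a single filter looks roughly like this sketch:

[source,ruby]
-----
filter {
  grok {
    ecs_compatibility => "v1"   # or "disabled" to use the legacy pattern set
    match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
}
-----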
@@ -219,7 +222,7 @@ filter plugins.
&nbsp;

[id="plugins-{type}s-{plugin}-break_on_match"]
===== `break_on_match`

* Value type is <<boolean,boolean>>
* Default value is `true`
@@ -243,15 +246,15 @@ Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (E
The value of this setting affects extracted event field names when a composite pattern (such as `HTTPD_COMMONLOG`) is matched.

[id="plugins-{type}s-{plugin}-keep_empty_captures"]
===== `keep_empty_captures`

* Value type is <<boolean,boolean>>
* Default value is `false`

If `true`, keep empty captures as event fields.

[id="plugins-{type}s-{plugin}-match"]
===== `match`

* Value type is <<hash,hash>>
* Default value is `{}`
@@ -280,7 +283,7 @@ If you need to match multiple patterns against a single field, the value can be
}
}
}

To perform matches on multiple fields, just use multiple entries in the `match` hash:

[source,ruby]
@@ -312,15 +315,15 @@ However, if one pattern depends on a field created by a previous pattern, separa


[id="plugins-{type}s-{plugin}-named_captures_only"]
===== `named_captures_only`

* Value type is <<boolean,boolean>>
* Default value is `true`

If `true`, only store named captures from grok.

[id="plugins-{type}s-{plugin}-overwrite"]
===== `overwrite`

* Value type is <<array,array>>
* Default value is `[]`
@@ -342,7 +345,7 @@ overwrite the `message` field with part of the match like so:
In this case, a line like `May 29 16:37:11 sadness logger: hello world`
will be parsed and `hello world` will overwrite the original message.
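
A sketch of such a configuration, assuming the standard `SYSLOGBASE` and `GREEDYDATA` patterns:

[source,ruby]
-----
filter {
  grok {
    # the trailing capture is also named "message", so on a successful
    # match it replaces the original field instead of appending to it
    match => { "message" => "%{SYSLOGBASE} %{GREEDYDATA:message}" }
    overwrite => [ "message" ]
  }
}
-----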

If you are using a field reference in `overwrite`, you must use the field
reference in the pattern. Example:
[source,ruby]
filter {
@@ -354,18 +357,18 @@ reference in the pattern. Example:


[id="plugins-{type}s-{plugin}-pattern_definitions"]
===== `pattern_definitions`

* Value type is <<hash,hash>>
* Default value is `{}`

A hash of pattern-name and pattern tuples defining custom patterns to be used by
the current filter. Patterns matching existing names will override the pre-existing
definition. Think of these as inline patterns available just for this definition of
grok.

[id="plugins-{type}s-{plugin}-patterns_dir"]
===== `patterns_dir`

* Value type is <<array,array>>
* Default value is `[]`
@@ -375,7 +378,7 @@ Logstash ships by default with a bunch of patterns, so you don't
necessarily need to define this yourself unless you are adding additional
patterns. You can point to multiple pattern directories using this setting.
Note that Grok will read all files in the directory matching the patterns_files_glob
and assume it's a pattern file (including any tilde backup files).
and assume it's a pattern file (including any tilde backup files).
[source,ruby]
patterns_dir => ["/opt/logstash/patterns", "/opt/logstash/extra_patterns"]

@@ -390,7 +393,7 @@ For example:
The patterns are loaded when the pipeline is created.
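
Pattern files are plain text, one `NAME PATTERN` pair per line; a hypothetical sketch:

[source,ruby]
-----
# /opt/logstash/extra_patterns/queue (hypothetical file)
QUEUE_ID [0-9A-F]{10,11}
-----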

[id="plugins-{type}s-{plugin}-patterns_files_glob"]
===== `patterns_files_glob`

* Value type is <<string,string>>
* Default value is `"*"`
@@ -399,7 +402,7 @@ Glob pattern, used to select the pattern files in the directories
specified by `patterns_dir`.
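
For example, to load only files with a hypothetical `.pattern` extension from those directories:

[source,ruby]
-----
patterns_files_glob => "*.pattern"
-----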

[id="plugins-{type}s-{plugin}-tag_on_failure"]
===== `tag_on_failure`

* Value type is <<array,array>>
* Default value is `["_grokparsefailure"]`
@@ -408,7 +411,7 @@ Append values to the `tags` field when there has been no
successful match.
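
For example, to record failures under an additional custom tag (the second tag name is invented):

[source,ruby]
-----
tag_on_failure => ["_grokparsefailure", "_my_grok_failure"]
-----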

[id="plugins-{type}s-{plugin}-tag_on_timeout"]
===== `tag_on_timeout`

* Value type is <<string,string>>
* Default value is `"_groktimeout"`
@@ -424,7 +427,7 @@ Tag to apply if a grok regexp times out.
Define target namespace for placing matches.
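
A sketch, assuming captures are nested under the target field:

[source,ruby]
-----
filter {
  grok {
    match => { "message" => "%{NUMBER:duration}" }
    target => "http"   # duration becomes [http][duration] (illustrative)
  }
}
-----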

[id="plugins-{type}s-{plugin}-timeout_millis"]
===== `timeout_millis`

* Value type is <<number,number>>
* Default value is `30000`
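
For example, to give up on a pathological match sooner than the default 30 seconds (the value is illustrative):

[source,ruby]
-----
timeout_millis => 5000
-----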
2 changes: 1 addition & 1 deletion logstash-filter-grok.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'logstash-filter-grok'
-s.version = '4.4.3'
+s.version = '4.4.4'
s.licenses = ['Apache License (2.0)']
s.summary = "Parses unstructured event data into fields"
s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"