The json filter does not parse Logstash events whose source field is an array of values. Such arrays matter with JSON input, because JSON does not allow duplicate keys.
Example JSON input file (reformatted from json_lines for readability; not real values):
{
  ...
  "comment": [
    "{\"foo.bar\": \"value1\"}",
    "{\"foo.world\": \"value2\"}"
  ]
  ...
}
Instead of parsing the array, the filter fails with:
[WARN ][logstash.filters.json ] Error parsing json {:source=>"[comment]", :raw=>["{\"foo.bar\": \"value1\"}", "{\"foo.world\": \"value2\"}"], :exception=>java.lang.ClassCastException: org.jruby.RubyArray cannot be cast to org.jruby.RubyIO}
I think all that is missing here is a loop over the Logstash array, something like https://github.com/logstash-plugins/logstash-filter-kv/blob/5f2429cfd44579a5fa2ba3590a7d6e35636449aa/lib/logstash/filters/kv.rb#L407. In this specific case I can abuse the kv filter instead, but that won't be robust to JSON escape sequences.
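For illustration, a minimal sketch of such a loop (not the plugin's actual code; parse_and_merge is a hypothetical stand-in for the existing single-string parsing logic):

def filter(event)
  source = event.get(@source)
  case source
  when Array
    # Parse each array element the same way a lone string is parsed today.
    source.each { |raw| parse_and_merge(event, raw) }
  when String
    parse_and_merge(event, source)
  end
  filter_matched(event)
end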
I'm aware that the final output would have fields like "foo.bar", but that's fine for Elasticsearch, since it denormalizes fields like that when indexing anyway.
The plugin should also be able to cope with the same field being extracted multiple times:
{
  ...
  "comment": [
    "{\"foo.bar\": \"value1\"}",
    "{\"foo.bar\": \"value2\"}"
  ]
  ...
}
and eventually produce an array in the output, if that doesn't fall out of Logstash magically:
{
  ...
  "foo.bar": [ "value1", "value2" ]
  ...
}
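A sketch of how the merge step could fold repeated keys into arrays (purely illustrative; parsed stands in for the hashes decoded from each array element, it is not existing plugin code):

parsed = [{ "foo.bar" => "value1" }, { "foo.bar" => "value2" }]
merged = {}
parsed.each do |hash|
  hash.each do |key, value|
    if merged.key?(key)
      merged[key] = Array(merged[key]) << value  # promote to an array on repeat
    else
      merged[key] = value
    end
  end
end
merged  # => { "foo.bar" => ["value1", "value2"] }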
Config:
input {
  tcp {
    codec => json_lines
  }
}
filter {
  json {
    source => "[comment]"
    target => "[extracted]"
  }
}
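For now I can work around this with the stock ruby filter instead of the json filter. An untested sketch (field names taken from the config above; the rescue class and merge logic are my assumptions):

filter {
  ruby {
    code => '
      raw = event.get("[comment]")
      if raw.is_a?(Array)
        merged = {}
        raw.each do |str|
          begin
            LogStash::Json.load(str).each do |k, v|
              merged[k] = merged.key?(k) ? (Array(merged[k]) << v) : v
            end
          rescue LogStash::Json::ParserError
            event.tag("_jsonparsefailure")
          end
        end
        event.set("[extracted]", merged)
      end
    '
  }
}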
...
Logstash version 6.4.0