Skip to content

Conversation

@jorgelbg
Copy link

@jorgelbg jorgelbg commented Mar 5, 2018

At the moment when the network property is used in the configuration, like:

filter {
    cidr {
        address => [ "%{clientip}"]
        network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]
        add_tag => [ "matched" ]
    }
}

The different IP/ranges in the network array are processed on each event, in practice we end up creating a new IPAddr instance for each IP in the network +1 from the IP specified in the address field for each event.

This PR moves the initialisation of the different ranges in network to the register method. Also, this removes the call to the event.sprintf when processing the IP networks. Considering that this is coming from the configuration it should use static values. Although, up to this point it was possible to extract the IP networks from the event itself (https://github.com/logstash-plugins/logstash-filter-cidr/blob/master/lib/logstash/filters/cidr.rb#L159).

@jorgelbg jorgelbg changed the title When using the network property on the configuration, process the def… Process the ranges on the network property only on register Mar 5, 2018
@magnusbaeck magnusbaeck self-requested a review March 5, 2018 20:34
@magnusbaeck
Copy link
Collaborator

Have you made any measurements to judge the performance improvement? Dropping the sprintf() call and making the network blocks static would be a backwards-incompatible change. How about initializing the IPAddr objects in register() iff the network contents actually is static (i.e. none of the strings contain any %{fieldname} references) but otherwise defer the initialization to filter()?

Also, this commit broke some of the tests. I had a quick look but couldn't immediately spot what was wrong.

@jorgelbg
Copy link
Author

jorgelbg commented Mar 6, 2018

Test setup: 24,400 messages, with a clientip field and collecting the metrics from the logstash monitoring API:

Before:
Before network-fix patch · GitHub

After:
After network-fix patch · GitHub

On our profiling we saw something like this before (regarding CPU):

logstash_-yourkit_java_profiler_2017_02-b75-_64-bit

Roughly 10% of the CPU time was spent on recreating the same IP networks, in our case we only have 1 range per filter, so this could be a bit more in cases with more IP networks.

After applying our change the profiling looks more like:

logstash_-yourkit_java_profiler_2017_02-b75-_64-bit

Regarding the drop of the event.sprintf call, we can go with your alternative, it's totally fine on my side, but to be honest I only noticed this by accident: in the documentation only static examples are provided. So for me was strange that this line was constantly hit during my profiling, that's why I took a look at it. I thought that it was supposed to work only with static values.

On the tests also all the examples use static values. Nevertheless, I thought it was important to highlight the change, but IMO this was almost an undocumented feature.

The issue with the tests was that the register method it's being called 2 times, and since I was rewriting the @network variable, for the second call an ArgumentError exception was being thrown (and silenced). Strangely enough, this was not happening on my local environment, that installs the normal logstash dependencies gems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants