Split filter is not multi-threaded, how to fix this? #37

@rkhapre

I am using a conf file whose input emits about 80K records after a single read:

input {
  http_poller {
    urls => {
      http_request => {
        method => get
        url => "http://xyzfmt=CSV"
      }
    }
    codec => "json"
  }
}

filter {
  split {
    field => "message"
  }
  csv {
    columns => ["Org", "Network", ......................................"Brand"]
    separator => ","
  }
  mutate {
    remove_field => ["message", "tags"]
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
  }
}

After reading the input I get 80K records, then the split operation runs, and I have to wait 4 hours.

Why can't the filters run on 20 records and send those 20 records to the output, then in a second batch run on another 20 records and send those 20 to the output?

Currently the split operation runs over all 80K records, which takes around 4 hours, and only then do records start being posted to the output. This is a very slow process.

Can't records start appearing in the output after a couple of minutes, at least in batches of 20? Otherwise I have to wait 4 hours to check the output.
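The behaviour being asked for can be sketched in Python. This is only an illustration of "split lazily, then filter and output in batches of 20"; it is not how Logstash or the split filter is implemented. The function names (split_lines, batches, run) are hypothetical, and a plain comma split stands in for the csv filter.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def split_lines(message: str) -> Iterator[str]:
    """Lazily yield one record per line instead of building the full list first."""
    start = 0
    while start < len(message):
        end = message.find("\n", start)
        if end == -1:
            end = len(message)
        line = message[start:end]
        if line:  # skip empty lines
            yield line
        start = end + 1


def batches(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group a stream of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run(message: str, size: int = 20) -> List[List[List[str]]]:
    """Filter and 'output' each batch as soon as it is available."""
    emitted = []
    for batch in batches(split_lines(message), size):
        rows = [line.split(",") for line in batch]  # stand-in for the csv filter
        emitted.append(rows)  # stand-in for the output stage
    return emitted
```

With 80K lines and a batch size of 20, the first 20 rows would reach the output stage after only 20 lines have been split, rather than after the whole message has been processed.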

I have tried raising the number of pipeline workers (2, 4, 6, 12), but nothing helps. It looks like a bug in the split filter.

pipeline.workers: 2
pipeline.output.workers: 2
