Hi! I'm trying to replace a large Filebeat deployment at my company and recently started testing Vector. My question: is there any documentation, or does anyone know, how Vector detects changes in discovered files? I'm trying to fine-tune an instance, but the delivery latency is quite high: the 99th percentile is about 2-6 sec. Is there any way to tune the discovery time? Maybe decrease the file polling interval? Or maybe the frequency of sending events in the sink?

Vector version: 0.30.0

> uname -r
2.6.32-5-amd64
> cat /etc/debian-release
squeeze

Example of config:

sources:
  my_events_src:
    type: file
    include:
      - /var/lib/state/SVC/*.log*
    exclude:
      - '*.gz$'
    file_key: ''
    host_key: ''
    glob_minimum_cooldown_ms: 100
    ignore_not_found: false
    line_delimiter: "\n"
    max_line_bytes: 1048576
    max_read_bytes: 1048576
    multi_line_timeout: 1000
    oldest_first: false
    read_from: beginning
    fingerprint:
      ignored_header_bytes: 0
      lines: 1
      strategy: checksum
transforms:
  my_events_tfm:
    type: remap
    inputs:
      - my_events_src
    drop_on_error: true
    source: |
      . = parse_json!(.message)
      [email protected] = now()
sinks:
  my_events_snk:
    inputs:
      - my_events_tfm
    address: "logstash_tcp_input:1221"
    mode: tcp
    type: socket
    encoding:
      codec: json
    healthcheck:
      enabled: true
    buffer:
      type: disk
      when_full: block
      max_size: 536870912
As far as I understand, a standard Watcher trait is used here to get updates on file changes, am I correct? Line 77 in 4ce3278 But according to my investigation, the latency, measured as the difference between the time a line is written to the file and the time Vector reads it, is quite high for some reason. Maybe it's a misconfiguration on my side?
What you found there is actually Vector's configuration file watcher rather than the file source logic that scans for new files. You can find that logic in this file: https://github.com/vectordotdev/vector/blob/master/lib/file-source/src/file_server.rs

glob_minimum_cooldown_ms is the main toggle there to reduce or increase the time between filesystem scans. However, if I remember right, it is in the same event loop that actually reads from the files, so if you have a lot of files being watched, Vector could be busy reading from them before it scans for new files to watch. You could try tweaking max_read_bytes to reduce the amount Vector tries to read before scanning for new files.
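
As a rough starting point for experiments, the two settings discussed here could be adjusted in the file source like this. This is only a sketch: the max_read_bytes value is an illustrative assumption, not a measured recommendation, and the right number depends on how many files are watched and how fast they grow.

```yaml
sources:
  my_events_src:
    type: file
    include:
      - /var/lib/state/SVC/*.log*
    # Minimum delay between filesystem scans for new files.
    # Already set to 100 ms in the config from the question.
    glob_minimum_cooldown_ms: 100
    # Hypothetical value for testing: a smaller per-file read budget than
    # the 1048576 in the original config, so the loop spends less time
    # reading a single file before it moves on and can scan again.
    max_read_bytes: 65536
```

Changing one setting at a time and comparing the 99th percentile latency after each change should show which of the two actually matters for this workload.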