How does disk buffer cleanup work in Vector when using EFS with stateless pods #24572

wangzhihaocom · 2026-01-29T20:23:32Z

Hi Vector team,
I’m trying to better understand how Vector disk buffers work internally, especially buffer cleanup behavior, when running Vector in Kubernetes with EFS-backed persistence and Spot instances.

Environment / Setup

Deployment type: Kubernetes Deployment
Helm role: Stateless-Aggregator
Autoscaling: Enabled (pods are ephemeral)
Node type: AWS Spot instances
Persistence: AWS EFS mounted to Vector data_dir

Purpose:

Handle frequent pod termination (Spot interruptions, rescheduling)
Avoid data loss during abrupt shutdowns
Sinks: ClickHouse
Buffer type: disk
Acknowledgements: true
Each pod uses a pod-specific data_dir to avoid concurrent writers:

Filesystem layout (inside a running pod)

df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          30G  6.6G   24G  22% /
127.0.0.1:/     8.0E   92M  8.0E   1% /vector-data-dir

EFS directory structure

/vector-data-dir/
├── pod-vector-6b5d85f886-qvnm5
│   └── buffer
│       └── v2
│           ├── clickhouse_syslog
│           │   ├── buffer-data-0.dat
│           │   ├── buffer.db
│           │   └── buffer.lock
│           └── clickhouse_trafficlog
└── pod-vector-6b5d85f886-v48hb

Observed Behavior:

Data is successfully written to ClickHouse

Incoming traffic has been stopped

However:

The size of buffer-data-0.dat does not decrease. (Does this file store all of the buffer data?)

Disk usage does not immediately decrease

Files remain even when there is no more traffic
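
To make the observation above easier to reproduce, here is a minimal sketch that records the size of each file in a sink's buffer directory, so the numbers can be compared before and after traffic stops. The helper names `buffer_file_sizes` and `total_data_bytes` are hypothetical, and the path simply follows the layout shown earlier. It only reads file metadata; it says nothing about when Vector itself deletes or shrinks these files.

```python
from pathlib import Path


def buffer_file_sizes(sink_dir):
    """Return {file name: size in bytes} for regular files directly in sink_dir."""
    sizes = {}
    for entry in sorted(Path(sink_dir).iterdir()):
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


def total_data_bytes(sizes):
    """Sum only the buffer-data-*.dat files, ignoring buffer.db and buffer.lock."""
    return sum(
        size
        for name, size in sizes.items()
        if name.startswith("buffer-data-") and name.endswith(".dat")
    )


if __name__ == "__main__":
    # Path taken from the directory listing in this post; adjust per pod and sink.
    sink_dir = "/vector-data-dir/pod-vector-6b5d85f886-qvnm5/buffer/v2/clickhouse_syslog"
    if Path(sink_dir).is_dir():
        sizes = buffer_file_sizes(sink_dir)
        for name, size in sizes.items():
            print(f"{name}: {size} bytes")
        print(f"total data file bytes: {total_data_bytes(sizes)}")
```

Running this once while traffic is flowing and again some time after it stops would show whether the data files shrink, stay the same size, or get removed.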

Questions

1. Disk buffer lifecycle

What exactly is stored in the buffer-data-0*.dat files? Do they contain all of the buffer data?

I stopped the traffic, but the file size did not decrease. Should the file size decrease when there is no incoming traffic?

2. Handling Spot instance termination

If a pod is terminated because the Spot instance is taken away, what is the recommended way to handle unflushed buffer data?

3. Current design approach
In my current setup:

When a Spot instance is terminated, the old pod is killed.

A new pod is created.

I copy the old pod’s buffer data from:

vector-data-dir/pod-vector-${OLD_POD}/buffer/v2/<sink>

to the new pod’s directory:

vector-data-dir/pod-vector-${NEW_POD}/buffer/v2/<sink>

The idea is that the new pod will continue flushing the old pod’s buffer data.

Is this the correct approach for handling buffer persistence when using Spot instances?
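
For reference, the copy step I describe above looks roughly like the sketch below. It only illustrates the approach in question and is not a confirmed or recommended procedure: `copy_sink_buffer` is a hypothetical helper, the pod and sink names are placeholders, and whether Vector will pick up and keep flushing a copied buffer directory is exactly what I am asking. The sketch refuses to overwrite an existing destination, on the assumption that the copy must happen before the new pod has created its own buffer for that sink.

```python
import shutil
from pathlib import Path


def copy_sink_buffer(data_root, old_pod, new_pod, sink):
    """Copy <data_root>/pod-vector-<old_pod>/buffer/v2/<sink> to the new pod's directory."""
    src = Path(data_root) / f"pod-vector-{old_pod}" / "buffer" / "v2" / sink
    dst = Path(data_root) / f"pod-vector-{new_pod}" / "buffer" / "v2" / sink
    if not src.is_dir():
        raise FileNotFoundError(f"no buffer directory at {src}")
    if dst.exists():
        # Assumption: never merge into or overwrite a buffer that already exists.
        raise FileExistsError(f"destination already exists: {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dst)  # copies buffer-data-*.dat, buffer.db and buffer.lock
    return dst
```

A call would look like `copy_sink_buffer("/vector-data-dir", OLD_POD, NEW_POD, "clickhouse_syslog")`, with the two pod identifiers supplied by whatever handles the rescheduling.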

Any help is much appreciated.

wangzhihaocom (Author) · 2026-02-06T07:35:44Z

Does anyone have any ideas?

jlambatl · 2026-02-19T21:05:33Z

@wangzhihaocom There is documentation in this file: lib/vector-buffers/src/variants/disk_v2/mod.rs

My reading is that if an instance is terminated abruptly, i.e., without a graceful shutdown, the data might still be sitting in buffer files and not yet written to the destination. Depending on how acknowledgements are configured, those events may or may not be replayed and sent via another node. That is, if the source has not yet acknowledged the data and it hasn't been deleted from the source, another node would see those events and process them.

Someone may need to confirm or deny that, but the documentation for how it works is in the file mentioned above.
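
To illustrate the replay reasoning, here is a toy model. It is not Vector code and uses no Vector API: `UpstreamQueue` and `process` are hypothetical names, and the model simply assumes an upstream that keeps each event until it has been acknowledged.

```python
class UpstreamQueue:
    """Toy upstream that keeps every event until it is acknowledged."""

    def __init__(self, events):
        self._pending = list(events)

    def pending(self):
        return list(self._pending)

    def ack(self, event):
        self._pending.remove(event)


def process(queue, delivered, crash_after=None):
    """Deliver pending events, acknowledging each one only after delivery.

    crash_after simulates an abrupt termination after that many deliveries.
    Returns True if the node finished, False if it "crashed".
    """
    for count, event in enumerate(queue.pending()):
        if crash_after is not None and count >= crash_after:
            return False  # node died; the remaining events stay unacknowledged
        delivered.append(event)
        queue.ack(event)
    return True


if __name__ == "__main__":
    queue = UpstreamQueue(["e1", "e2", "e3"])
    delivered = []
    process(queue, delivered, crash_after=1)  # first node dies after one event
    process(queue, delivered)                 # a second node picks up the rest
    print(delivered)
    print(queue.pending())
```

In this model nothing is lost because unacknowledged events stay upstream. If the upstream had already dropped an event before the node crashed, only the buffer files on disk would still hold it.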
