Fix incorrect flush after truncate, other improvements #99

timvisee · 2025-11-21T11:40:30Z

Fix incorrect flush after truncate, because flush_offset was not bumped when truncating. This also improves or simplifies other things.

I recommend reviewing this PR commit by commit.

These changes have been tested as described in qdrant/qdrant#7577.

Before this PR I repeatedly saw the following log line:

2025-11-21T15:23:22.205425Z  WARN wal::segment: CRC mismatch at offset 40: 414236364 != 838619149

which indicates flushing errors. This PR resolves the issue and I haven't seen that log line since.
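The bookkeeping described above can be sketched with a small in-memory model. This is a simplified illustration, not the crate's code: `SketchSegment` and all of its fields and methods are hypothetical names, a `Vec<u8>` stands in for the memory map, and offsets are plain byte positions. It assumes that everything at or after `flush_offset` is treated as not yet flushed, so truncating has to pull the offset back to the first zeroed byte, and never push it forward.

```rust
/// Hypothetical in-memory stand-in for a memory-mapped WAL segment.
/// None of these names are taken from the `wal` crate.
struct SketchSegment {
    /// Backing bytes (stand-in for the memory map).
    data: Vec<u8>,
    /// End of the valid entry data, as a byte offset.
    len: usize,
    /// Everything before this offset is assumed to be flushed already.
    flush_offset: usize,
}

impl SketchSegment {
    fn new(capacity: usize) -> Self {
        Self { data: vec![0; capacity], len: 0, flush_offset: 0 }
    }

    /// Append raw bytes; returns false if they do not fit.
    fn append(&mut self, bytes: &[u8]) -> bool {
        let end = self.len + bytes.len();
        if end > self.data.len() {
            return false;
        }
        self.data[self.len..end].copy_from_slice(bytes);
        self.len = end;
        true
    }

    /// Pretend to flush: mark everything up to `len` as flushed.
    fn flush(&mut self) {
        self.flush_offset = self.len;
    }

    /// Remove everything from byte offset `from` onwards.
    fn truncate(&mut self, from: usize) {
        if from >= self.len {
            return;
        }
        // Zero the whole removed slice, not just a fixed-size prefix.
        self.data[from..self.len].fill(0);
        self.len = from;
        // Move the flush offset back to the first zeroed byte if needed.
        // Never move it forward: bytes between the old offset and `from`
        // may still be unflushed.
        self.flush_offset = self.flush_offset.min(from);
    }
}
```

In this model, a `truncate` that leaves `flush_offset` untouched would keep the zeroed bytes behind the offset, so a later flush would skip them and the old bytes would remain in the file.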

generall · 2025-11-24T10:19:18Z

src/segment.rs


                thread::spawn(move || {
-                    trace!("{log_msg}");
+                    error!("{log_msg}");


Maybe warn? An error should mean something unexpected happened, but as I understand it, this path is possible.

Oh wait, is it expected that with the new changes this path is an actual error?

With the new changes it should be impossible to hit this branch. That's why I promoted it to an error.

Still want to demote it to a warning?
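As a rough sketch of the branch under discussion: the flush range is only valid when the flush offset does not lie beyond the end of the data, and the offset should only advance once the flush has actually succeeded. The names `sketch_flush_range` and `sketch_flush` are hypothetical, `write_out` stands in for the real flush of a memory-mapped range, and `eprintln!` stands in for the `error!` log macro. Returning an `Err` for the invalid range is a choice made for this sketch, not something the PR states.

```rust
use std::ops::Range;

/// Hypothetical helper: work out which bytes a flush has to cover.
/// Returns `None` when the offset lies beyond the end of the data,
/// the case that is expected to be unreachable after this PR.
fn sketch_flush_range(flush_offset: usize, end: usize) -> Option<Range<usize>> {
    (flush_offset <= end).then(|| flush_offset..end)
}

/// Hypothetical flush driver. `write_out` stands in for the real flush.
fn sketch_flush<F>(flush_offset: &mut usize, end: usize, write_out: F) -> Result<(), String>
where
    F: FnOnce(Range<usize>) -> Result<(), String>,
{
    let Some(range) = sketch_flush_range(*flush_offset, end) else {
        // Stand-in for `error!`: this should not happen, so report it loudly.
        eprintln!("invalid flush range: {} > {end}", *flush_offset);
        return Err("invalid flush range".to_string());
    };

    // Propagate failures before touching the offset.
    write_out(range)?;

    // Only advance the offset once the flush has succeeded.
    *flush_offset = end;
    Ok(())
}
```

With this shape, a failed flush leaves the offset where it was, so the same range is retried on the next flush instead of being skipped.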

agourlay · 2025-11-24T10:41:23Z

src/segment.rs

    ///
    /// The entries are not guaranteed to be removed until the segment is
    /// flushed.
    pub fn truncate(&mut self, from: usize) {


Can you please write a basic unit test for this function?

It seems to be completely uncovered.

There is a function that covers it:

wal/src/lib.rs

Lines 1104 to 1211 in 95c4310

    #[test]
    fn test_truncate_flush() {
        init_logger();
        let dir = Builder::new().prefix("wal").tempdir().unwrap();
        // 2 entries should fit in each segment
        let mut wal = Wal::with_options(
            dir.path(),
            &WalOptions {
                segment_capacity: 4096,
                segment_queue_len: 3,
                retain_closed: NonZeroUsize::new(1).unwrap(),
            },
        )
        .unwrap();

        let entry: [u8; 2000] = [42u8; 2000];
        // wal is empty
        assert!(wal.entry(0).is_none());

        // add 10 entries
        for i in 0..10 {
            assert_eq!(i, wal.append(&&entry[..]).unwrap());
        }

        // 4 closed segments
        assert_eq!(wal.num_entries(), 10);
        assert_eq!(wal.first_index(), 0);
        assert_eq!(wal.last_index(), 9);
        assert_eq!(wal.closed_segments.len(), 4); // 4 x 2 entries
        assert_eq!(wal.closed_segments[0].segment.len(), 2);
        assert_eq!(wal.closed_segments[1].segment.len(), 2);
        assert_eq!(wal.closed_segments[2].segment.len(), 2);
        assert_eq!(wal.closed_segments[3].segment.len(), 2);
        assert_eq!(wal.open_segment.segment.len(), 2); // 1 x 2 entries

        // first flush to set `flush_offset
        wal.flush_open_segment().unwrap();

        // content unchanged after flushing
        assert_eq!(wal.num_entries(), 10);
        assert_eq!(wal.first_index(), 0);
        assert_eq!(wal.last_index(), 9);
        assert_eq!(wal.closed_segments.len(), 4); // 4 x 2 entries
        assert_eq!(wal.closed_segments[0].segment.len(), 2);
        assert_eq!(wal.closed_segments[1].segment.len(), 2);
        assert_eq!(wal.closed_segments[2].segment.len(), 2);
        assert_eq!(wal.closed_segments[3].segment.len(), 2);
        assert_eq!(wal.open_segment.segment.len(), 2); // 1 x 2 entries

        wal.truncate(9).unwrap();

        assert_eq!(wal.open_segment.segment.len(), 1); // 1 x 2 entries

        // truncate half of it
        wal.truncate(5).unwrap();

        // assert truncation
        for i in 5..10 {
            assert!(wal.entry(i).is_none());
        }

        // flush again with `flush_offset` > segment size
        wal.flush_open_segment().unwrap();

        assert_eq!(wal.num_entries(), 5); // 5 entries removed
        assert_eq!(wal.first_index(), 0);
        assert_eq!(wal.last_index(), 4);
        assert_eq!(wal.closed_segments.len(), 3); // (0, 1) + (2, 3) + (4, empty slot)
        assert_eq!(wal.closed_segments[0].segment.len(), 2);
        assert_eq!(wal.closed_segments[1].segment.len(), 2);
        assert_eq!(wal.closed_segments[2].segment.len(), 1);
        assert_eq!(wal.open_segment.segment.len(), 0); // empty open segment

        // add 5 more entries
        for i in 0..5 {
            assert_eq!(i + 5, wal.append(&&entry[..]).unwrap());
        }

        // 5 closed segments
        assert_eq!(wal.num_entries(), 10);
        assert_eq!(wal.first_index(), 0);
        assert_eq!(wal.last_index(), 9);
        assert_eq!(wal.closed_segments.len(), 5);
        assert_eq!(wal.closed_segments[0].segment.len(), 2); // 1,2
        assert_eq!(wal.closed_segments[1].segment.len(), 2); // 3
        assert_eq!(wal.closed_segments[2].segment.len(), 1); // 4 empty slot due to truncation
        assert_eq!(wal.closed_segments[3].segment.len(), 2); // 5, 6
        assert_eq!(wal.closed_segments[4].segment.len(), 2); // 7, 8
        assert_eq!(wal.open_segment.segment.len(), 1); // 9

        eprintln!("wal: {wal:?}");
        eprintln!("wal open: {:?}", wal.open_segment);
        eprintln!("wal closed: {:?}", wal.closed_segments);

        // test persistence
        drop(wal);
        let wal = Wal::open(dir.path()).unwrap();
        assert_eq!(wal.num_entries(), 10);
        assert_eq!(wal.first_index(), 0);
        assert_eq!(wal.last_index(), 9);
        assert_eq!(wal.closed_segments.len(), 5);
        assert_eq!(wal.closed_segments[0].segment.len(), 2);
        assert_eq!(wal.closed_segments[1].segment.len(), 2);
        assert_eq!(wal.closed_segments[2].segment.len(), 1); // previously half truncated
        assert_eq!(wal.closed_segments[3].segment.len(), 2);
        assert_eq!(wal.closed_segments[4].segment.len(), 2);
        assert_eq!(wal.open_segment.segment.len(), 1);
    }

But it doesn't hurt to add a bit more testing.

I cannot fully assert the actual flush behavior to disk though, since the kernel takes care of this in a non-deterministic way.

Added test in 68d63ca
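To illustrate why a test can assert the flush offset but not the on-disk flush itself, here is a small standalone sketch using only the standard library. The function name `sketch_write_then_read` is hypothetical and the example uses a plain file rather than a memory map. It writes bytes without syncing and reads them back through a second handle; the read normally succeeds either way, because the operating system serves it from its cache, so reading data back says nothing about whether it has reached the disk.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `path` WITHOUT syncing, then read them back through
/// a separate handle while the writer is still open.
fn sketch_write_then_read(path: &Path, bytes: &[u8]) -> io::Result<Vec<u8>> {
    let mut file = File::create(path)?;
    file.write_all(bytes)?;
    // Deliberately no `file.sync_all()` here: the data may exist only
    // in the operating system's cache at this point.
    let read_back = fs::read(path)?;
    drop(file);
    Ok(read_back)
}
```

Since the read-back looks the same with or without a sync, asserting in-memory bookkeeping such as the flush offset is the part that stays deterministic.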

timvisee added 3 commits November 21, 2025 13:35

Just flush full memory map

1897377

Simplify zeroing

c71e88f

Fix truncate zeroing

90ffd55

Zero the full slice of removed data, not just the first 16 bytes

timvisee force-pushed the truncate-flush-fixes branch from 0411076 to d32ae09 Compare November 21, 2025 12:37

timvisee changed the title ~~Fix truncate and flushing problems~~ Fix incorrect flush after truncate, other improvements Nov 21, 2025

timvisee added 5 commits November 21, 2025 13:40

Bump flush offset to flush new changes after truncation

690c518

Promote invalid flush range to error

86d8449

On load, set flush offset, we don't need to flush existing data

a6a61d3

Don't set flush offset before flush succeeds

5659945

Rust format

827edcc

timvisee force-pushed the truncate-flush-fixes branch from d32ae09 to 827edcc Compare November 21, 2025 12:40

timvisee marked this pull request as ready for review November 21, 2025 12:44

timvisee requested review from ffuugoo and generall November 21, 2025 12:45

timvisee mentioned this pull request Nov 21, 2025

Fix WAL handling on consensus snapshot qdrant/qdrant#7577

Merged

9 tasks

timvisee requested a review from agourlay November 21, 2025 15:43

generall reviewed Nov 24, 2025

generall approved these changes Nov 24, 2025

agourlay reviewed Nov 24, 2025

timvisee added 3 commits November 24, 2025 13:48

Patch flush offset bump, only move offset back to first zero position

4aaa837

We must only move it back, and not forward. Data starting at the current flush offset may still not be flushed.

In test fixture, return temporary directory to not clear eagerly

90c5f7c

Add truncate test, also asserting flush offset behavior

68d63ca

timvisee requested a review from agourlay November 24, 2025 13:12

agourlay approved these changes Nov 24, 2025

ffuugoo reviewed Nov 24, 2025

ffuugoo approved these changes Nov 24, 2025

Always format message, we don't expect to hit it anymore

829ddd2

timvisee merged commit fbb2f2a into master Nov 24, 2025
4 checks passed

timvisee mentioned this pull request Nov 24, 2025

Bump wal dependency to fix flush problems qdrant/qdrant#7587

Merged

3 tasks

5 participants
