
fix(restore): resume S3 streams on drop #1082

Merged: corylanou merged 2 commits into main from codex/s3-restore-connection-drop on Feb 5, 2026



@corylanou (Collaborator) commented Feb 3, 2026

Summary

Add a resumable reader for S3-compatible restores and a Docker integration test that reproduces connection drop EOFs.

Fixes #1077

Problem

Large restores can keep some S3-compatible streams idle long enough for providers to close connections, leading to "unexpected EOF" during restore.

Solution

Use a resumable reader that reopens LTX streams on read errors/premature EOF and add a MinIO + Toxiproxy integration test to reproduce the issue.

Scope

In scope:

  • Resumable reader used during restore
  • Docker integration test for S3-compatible connection drops

Not in scope:

  • Provider-specific timeout tuning
  • Changes to compactor ordering

Test Plan

  • go test -v -count=1 -tags=integration,docker -run TestRestore_S3ConnectionDrop ./tests/integration

Credits

@corylanou (Collaborator, Author) commented

Manual verification (Docker):

  1. Repro (old binary):
     • go test -v -count=1 -tags=integration,docker -run TestRestore_S3ConnectionDrop ./tests/integration
     • Result: restore failed with unexpected EOF (decode page header) when using a bin/litestream built before the resumable reader.
  2. Fix (this branch):
     • go build -o bin/litestream ./cmd/litestream
     • go test -v -count=1 -tags=integration,docker -run TestRestore_S3ConnectionDrop ./tests/integration
     • Result: PASS.

Note: the first run failed because the test uses the on-disk bin/litestream binary; rebuilding after applying the fix makes the test pass.

@corylanou corylanou merged commit 4800f27 into main Feb 5, 2026
19 checks passed
@corylanou corylanou deleted the codex/s3-restore-connection-drop branch February 5, 2026 18:49

Development

Successfully merging this pull request may close these issues.

Restore fails with "error":"decode database: decode page 3345940: read page header 0: unexpected EOF"
