Open
Labels
enhancement (New feature or request)
Description
Woodpecker currently writes logs to files sequentially, which gives high write throughput and persistence guarantees. However, the current file format forces a trade-off: either the data is stored uncompressed, or compression is enabled and efficient partial reads are lost. This limits performance tuning and restricts future extensibility (e.g., range-based recovery, fast seeking).
The goal of this issue is to design and implement a new storage file format that supports both compression and efficient partial reads.
Goals:
- Support compression: Reduce storage footprint and improve I/O efficiency.
- Enable partial reads: Allow reading specific data blocks without decompressing the entire file.
- Preserve sequential write performance: Writing must remain high-throughput and append-friendly.
- Ensure extensibility: Format should be designed to support future metadata like checksums, versioning, and block indices.
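As one possible illustration of the extensibility goal, the sketch below packs a small fixed-size file header carrying a format version, a flags field, and a checksum. Everything in it is an assumption made for illustration, not part of Woodpecker: the `WPLG` magic bytes, the field widths, and the `pack_header` / `unpack_header` names are hypothetical.

```python
import struct
import zlib

# Hypothetical layout (little-endian, 12 bytes total):
#   magic (4 bytes) | format version (uint16) | flags (uint16) | CRC32 of the first 8 bytes (uint32)
MAGIC = b"WPLG"  # hypothetical magic bytes, not an existing Woodpecker constant
HEADER = struct.Struct("<4sHHI")


def pack_header(version, flags=0):
    """Serialize the header; the CRC covers magic, version, and flags."""
    body = struct.pack("<4sHH", MAGIC, version, flags)
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_header(raw):
    """Parse and validate a header, returning (version, flags)."""
    magic, version, flags, crc = HEADER.unpack(raw[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("not a recognized log file")
    if zlib.crc32(raw[:8]) != crc:
        raise ValueError("header checksum mismatch")
    return version, flags
```

A reader can branch on the returned version to pick a parser, and unused flag bits leave room for optional features (for example, marking that a block index is present) without changing the header size.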
Technical Suggestions:
- Use block-based compression (e.g., compress every N records as a block).
- Record metadata for each block: offset, length, compression type.
- Add a lightweight block index section to enable fast seeks.
- Consider compression algorithms like Snappy or Zstd for a balance of speed and ratio.
Alternative designs that meet the goals above are also welcome.
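The suggestions above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a proposed final format: it uses `zlib` from the Python standard library as a stand-in for Snappy or Zstd, and the index entry layout, footer layout, and function names (`write_log`, `read_index`, `read_record`) are all hypothetical.

```python
import io
import struct
import zlib

CODEC_ZLIB = 1  # hypothetical codec id; zlib stands in for Snappy/Zstd here
# Index entry: block offset (uint64), compressed length (uint32),
# record count (uint32), codec id (uint8).
INDEX_ENTRY = struct.Struct("<QIIB")
# Footer: offset of the index section (uint64), number of entries (uint32).
FOOTER = struct.Struct("<QI")


def write_log(out, records, block_records=4):
    """Append records as compressed blocks, then write the index and footer."""
    index = []
    for start in range(0, len(records), block_records):
        chunk = records[start:start + block_records]
        # Length-prefix each record so it can be located inside the block.
        raw = b"".join(struct.pack("<I", len(r)) + r for r in chunk)
        payload = zlib.compress(raw)
        index.append((out.tell(), len(payload), len(chunk), CODEC_ZLIB))
        out.write(payload)
    index_offset = out.tell()
    for entry in index:
        out.write(INDEX_ENTRY.pack(*entry))
    out.write(FOOTER.pack(index_offset, len(index)))


def read_index(f):
    """Load the block index by reading the fixed-size footer first."""
    f.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count = FOOTER.unpack(f.read(FOOTER.size))
    f.seek(index_offset)
    return [INDEX_ENTRY.unpack(f.read(INDEX_ENTRY.size)) for _ in range(count)]


def read_record(f, index, record_no):
    """Fetch one record, decompressing only the block that contains it."""
    first = 0
    for offset, length, count, _codec in index:
        if record_no < first + count:
            f.seek(offset)
            raw = zlib.decompress(f.read(length))
            pos = 0
            for _ in range(record_no - first):  # skip earlier records in the block
                (n,) = struct.unpack_from("<I", raw, pos)
                pos += 4 + n
            (n,) = struct.unpack_from("<I", raw, pos)
            return raw[pos + 4:pos + 4 + n]
        first += count
    raise IndexError(record_no)
```

Blocks are only ever appended, so writes stay sequential; the index is held in memory and written once when the file is closed. A usage example with an in-memory buffer:

```python
buf = io.BytesIO()
write_log(buf, [b"rec-%d" % i for i in range(10)])
index = read_index(buf)
print(read_record(buf, index, 7))  # decompresses only the second block
```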