feat: add flat file storage for blobs and data columns #8899

wemeetagain wants to merge 8 commits into `unstable`
Conversation
Move blob sidecar and data column storage from LevelDB to the filesystem, inspired by Prysm's approach. Files are stored as:

    blob_sidecars/<padSlot>/0x<rootHex>.ssz
    data_columns/<padSlot>/0x<rootHex>.dcol

Key design decisions:

- Atomic writes via `.part` temp file + datasync + rename for crash safety
- `.dcol` binary format with 149-byte header and 128-bit bitmap for O(1) random column access via popcount-based offset calculation
- In-memory `ExistenceCache` tracks (slot, root) presence for blobs and column bitmaps, doubling as the slot-to-root index for finalized canonical lookups (no separate index needed)
- Per-root write locking via promise chaining for concurrent column merges
- Feature flag `--chain.flatFileStorage` (default: `true`)
- Hot/cold migration skipped entirely when enabled (data is already in its final location); pruning via directory-level deletion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
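The atomic-write decision above can be sketched in Node.js roughly as follows. This is a minimal illustration, not the PR's actual code: the name `writeAtomic` and the error handling are assumptions.

```typescript
import {promises as fs} from "node:fs";
import path from "node:path";

// Sketch of the atomic-write pattern: write to a `.part` temp file,
// datasync, then rename into the final location.
async function writeAtomic(finalPath: string, data: Uint8Array): Promise<void> {
  const tmpPath = `${finalPath}.part`;
  await fs.mkdir(path.dirname(finalPath), {recursive: true});
  const handle = await fs.open(tmpPath, "w");
  try {
    await handle.writeFile(data);
    // Flush file contents to disk so a crash cannot leave a torn file at
    // the final path (fdatasync skips non-essential metadata updates).
    await handle.datasync();
  } finally {
    await handle.close();
  }
  // rename() within the same filesystem is atomic on POSIX: readers see
  // either no file or the complete new file, never a partial write.
  await fs.rename(tmpPath, finalPath);
}
```

If the process crashes between the write and the rename, only an orphaned `.part` file remains, which a startup sweep can safely delete.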
Summary of Changes (Gemini Code Assist)

This pull request introduces a fundamental shift in how blob sidecars and data columns are persisted, moving from a LevelDB-based key-value store to direct filesystem-based flat file storage. This architectural change is designed to address performance bottlenecks, particularly around pruning and write amplification, and to streamline data handling for large, ephemeral data types. By leveraging the filesystem directly, the system gains efficiency in storage, retrieval, and lifecycle management, paving the way for improved node stability and responsiveness.
Activity
Code Review
This pull request introduces a major feature: flat file storage for blob sidecars and data columns, moving away from LevelDB for this data. The changes are extensive, including a detailed design document, the core implementation of the flat file store with atomic writes and a custom binary format for data columns, and integration into the existing chain logic, database layer, and network handlers. The implementation is well-structured, modular, and includes a comprehensive set of unit and integration tests. My review found one area for optimization in the existence cache rebuilding logic. Overall, this is a high-quality contribution.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4865fb1d93
```ts
if (wrapper) {
  return wrapper.slice(BLOB_SIDECARS_IN_WRAPPER_INDEX);
}
return null;
```
Add LevelDB fallback when flat file lookup misses
When flatFileStore is enabled, this path returns early and never falls back to blobSidecars/blobSidecarsArchive (and the same pattern is used for data columns), so upgraded nodes with pre-existing sidecars in LevelDB but no flat-file copies will suddenly serve null for historical data. Because flatFileStorage is now default-on, this creates an upgrade regression unless a migration or fallback read path is added.
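A sketch of the fallback read path this review suggests: try the flat file store first, and on a miss fall back to the legacy LevelDB buckets so upgraded nodes still serve pre-existing historical data. All interfaces and names here are illustrative, not the PR's actual types.

```typescript
// Minimal common read interface assumed for both stores (hypothetical).
interface SidecarSource {
  get(blockRoot: Uint8Array): Promise<Uint8Array | null>;
}

// Flat-file-first read with LevelDB fallback, per the review suggestion.
async function getBlobSidecarsWithFallback(
  blockRoot: Uint8Array,
  flatFileStore: SidecarSource,
  levelDbBuckets: SidecarSource[] // e.g. blobSidecars, then blobSidecarsArchive
): Promise<Uint8Array | null> {
  const fromFlatFile = await flatFileStore.get(blockRoot);
  if (fromFlatFile !== null) return fromFlatFile;
  // Fallback: data written before the flat file store was enabled may only
  // exist in the old LevelDB buckets.
  for (const bucket of levelDbBuckets) {
    const fromDb = await bucket.get(blockRoot);
    if (fromDb !== null) return fromDb;
  }
  return null;
}
```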
```ts
if (blobRoots) {
  for (const root of blobRoots) return root;
}
```
Resolve slot lookups to canonical root, not arbitrary root
getAnyRootForSlot() returns the first root in an in-memory set/map, which is insertion-order dependent and not canonical-aware. Finalized by-slot handlers use this helper via getBinaryBySlot() / getColumnsBinaryBySlot(), so if a slot temporarily contains both canonical and orphan files (e.g., after reorgs or restart before cleanup), the node can serve sidecars for the wrong block root.
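One way to address this, sketched below: prefer the canonical root for the slot (e.g. resolved via fork choice or a finalized index) over whatever root happens to come first in the per-slot set. Function and parameter names are hypothetical, not the PR's API.

```typescript
// Canonical-aware slot lookup: only fall back to an arbitrary root when it
// is unambiguous (exactly one root stored for the slot).
function getCanonicalRootForSlot(
  rootsAtSlot: Set<string>, // hex roots that have files stored for this slot
  canonicalRootBySlot: (slot: number) => string | null,
  slot: number
): string | null {
  const canonical = canonicalRootBySlot(slot);
  if (canonical !== null && rootsAtSlot.has(canonical)) return canonical;
  // No known canonical root: a single stored root is unambiguous; with
  // several (e.g. orphans left over after a reorg) refuse rather than guess.
  if (rootsAtSlot.size === 1) return rootsAtSlot.values().next().value ?? null;
  return null;
}
```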
**Performance Report**: ✔️ no performance regression detected. Full benchmark results
twoeths left a comment:
With the current approach, we store all DataColumnSidecars per block. This is against #8114, so we need to reconsider: should we store each DataColumnSidecar separately, or store all DataColumnSidecars in one file per block as in this approach?

- Store all DataColumnSidecars per block
  - pros:
    - archive is fast
    - almost never run out of file handles
  - cons:
    - to read a single DataColumnSidecar, we need to load the whole block file
    - writes are slower per block (sequential), which may affect range sync
- Store each DataColumnSidecar separately, under the same root hex/slot
  - pros:
    - read each DataColumnSidecar separately, compliant with the current approach
    - write DataColumnSidecars in parallel
  - cons:
    - archive is a bit slower because we need to prune the whole folder, but it should still be way faster than leveldb?
    - when peers ask for a lot of DataColumnSidecars we may run out of file handles (not sure how often; @nflaig said it used to be an issue in Prysm)

It seems to me the current approach focuses too much on "archiveBlocks" and not on the read/write flow, which happens very frequently. I'm curious what others think? I tend to be in favor of storing them separately to be compliant with the current approach.
A mitigation for the current approach would be to not load the whole file every time: parse the header first. A mitigation for storing each DataColumnSidecar separately would be to put the getColumn() operation in a job queue if too many peers request us.
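The job-queue mitigation mentioned here can be sketched as a small concurrency limiter: cap how many getColumn()-style reads run at once so a burst of peer requests cannot exhaust file handles. This is illustrative only; a production queue would also add timeouts and a bound on queue length.

```typescript
// Concurrency-limited job queue (hypothetical, not the PR's implementation).
class JobQueue {
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(private readonly maxConcurrency: number) {}

  async run<T>(job: () => Promise<T>): Promise<T> {
    // Wait until a slot frees up; re-check after waking because another
    // caller may have taken the slot first.
    while (this.active >= this.maxConcurrency) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.active++;
    try {
      return await job();
    } finally {
      this.active--;
      this.waiters.shift()?.(); // wake one waiting job, if any
    }
  }
}
```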
leaning towards this and mitigate cons by implementing

why are the writes slower in that case? Sure, it's one large file, but it's only one file vs. 128 files per block with the other approach (for supernodes).
Each column is independently Snappy block-compressed with an offset table for O(1) random access. Targeted fd.read() with position offsets reads only header + offset table + requested columns instead of the entire file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
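The offset-table scheme this commit describes can be sketched as follows: a fixed-size header, then N+1 uint32 offsets, then the compressed column payloads, so reading column i needs only two adjacent table entries plus a positioned `fd.read()` of that column's bytes. The 149-byte header size comes from the PR description; the rest of the layout here is an assumption, not the exact `.dcol` format.

```typescript
const HEADER_SIZE = 149; // per the PR description

// Size of the offset table: N+1 uint32 column boundaries.
function offsetTableSize(numColumns: number): number {
  return (numColumns + 1) * 4;
}

// Absolute byte range of column i within the file, given the decoded offset
// table (offsets are relative to the start of the compressed-column payload).
function columnByteRange(
  offsets: Uint32Array, // length numColumns + 1
  i: number,
  numColumns: number
): {start: number; length: number} {
  const payloadStart = HEADER_SIZE + offsetTableSize(numColumns);
  return {start: payloadStart + offsets[i], length: offsets[i + 1] - offsets[i]};
}
```

With this layout, serving one column touches the header, the small offset table, and only that column's compressed bytes, rather than the whole ~1 MB file.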
I read the code wrong: it does not write each column one by one, as also clarified offline. We'll go with storing one data columns file per block.
```ts
const hi = view.getUint32(SLOT_OFFSET, false);
const lo = view.getUint32(SLOT_OFFSET + 4, false);
const slot = hi * 0x100000000 + lo;
```
We always read slot as 4 bytes in sszBytes.ts, maybe make it consistent everywhere?
```ts
/**
 * Read only the first 4 bytes of Slot, max value is 4,294,967,295 will be reached 1634 years after genesis
 *
 * If the high bytes are not zero, return null
 */
function getSlotFromOffset(data: Uint8Array, offset: number): Slot | null {
  return checkSlotHighBytes(data, offset) ? getSlotFromOffsetTrusted(data, offset) : null;
}
```

Address PR review feedback:

- Replace inline (N+1)*4 with offsetTableSize(N) in readColumn, readAllColumns, getColumnByteRange, and encodeDcolFile
- Document readColumn as a test/merge convenience (production uses getColumnByteRange + fd.read for targeted I/O)
- Add comment explaining popcount semantics in getColumnByteRange

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e-storage

Resolve conflicts:

- writeBlockInputToDb.ts: take upstream's refactored structure (writeBlockAndBlobsToDb + writeDataColumnsToDb), re-add flat file store paths for both blobs and columns
- blobSidecarsByRange.ts, dataColumnSidecarsByRange.ts: take upstream's getHead() + payloadStatus API, keep flat file store paths
- blockArchiver.test.ts: add root + payloadStatus to checkpoint
- flatFileStoreHandlers.test.ts: update mock forkChoice to use getHead() instead of getHeadRoot()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move blob sidecar and data column storage from LevelDB to the filesystem, inspired by Prysm's approach. Files are stored as:

    blob_sidecars/<padSlot>/0x<rootHex>.ssz
    data_columns/<padSlot>/0x<rootHex>.dcol

Key design decisions

- Atomic writes via `.part` temp file + datasync + rename for crash safety
- `.dcol` binary format with 149-byte header, 128-bit bitmap, per-column Snappy compression, and offset table for O(1) random column access
- Targeted `fd.read()` for column reads: only reads the header (149B) + offset table (~516B) + requested columns (~5-8 KB each) instead of the full ~1 MB file
- In-memory `ExistenceCache` tracks (slot, root) presence for blobs and 128-bit column bitmaps, doubling as the slot→root index for finalized canonical lookups
- Feature flag `--chain.flatFileStorage` (default: `true`)

`.dcol` format layout

Each column is independently Snappy block-compressed (~40-50% size reduction). Reading a single column decompresses only that column's ~5-8 KB, not the full file.
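The popcount-based index calculation mentioned in the design decisions can be sketched like this: with a 128-bit presence bitmap, the position of column `index` within the stored offset table is the number of set bits below `index`. The bitmap encoding assumed here (16 bytes, LSB-first within each byte) is an illustration, not necessarily the exact `.dcol` layout.

```typescript
// Count set bits in one byte.
function popcount8(b: number): number {
  let count = 0;
  for (let v = b & 0xff; v !== 0; v >>= 1) count += v & 1;
  return count;
}

// Rank of column `index` among stored columns, or null if the column is
// absent. bitmap bit i = (bitmap[i >> 3] >> (i & 7)) & 1 (assumed encoding).
function storedPositionOf(bitmap: Uint8Array, index: number): number | null {
  if (((bitmap[index >> 3] >> (index & 7)) & 1) === 0) return null; // absent
  let rank = 0;
  for (let byte = 0; byte < index >> 3; byte++) rank += popcount8(bitmap[byte]);
  // Add set bits below `index` within its own byte.
  rank += popcount8(bitmap[index >> 3] & ((1 << (index & 7)) - 1));
  return rank;
}
```

Because the rank is computed from at most 16 bytes of bitmap, locating any column in the offset table takes constant time regardless of how many columns the file holds.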
Disk usage impact
Files

- `packages/beacon-node/src/db/flatFileStore/`: all flat file storage implementation
- `packages/beacon-node/test/unit/db/flatFileStore/`: unit tests
- `docs/design-flat-file-storage.md`: full design document

🤖 Generated with Claude Code