|
| 1 | +--- |
| 2 | +fip: "0108" |
| 3 | +title: "Filecoin Snapshot Format" |
| 4 | +author: "Hailong Mu (@hanabi1224)" |
| 5 | +discussions-to: https://github.com/filecoin-project/go-f3/issues/480 |
| 6 | +status: Draft |
| 7 | +type: "FRC" |
| 8 | +created: 2025-06-25 |
| 9 | +--- |
| 10 | + |
| 11 | +# FRC-0108: Filecoin Snapshot Format |
| 12 | + |
| 13 | +## Simple Summary |
| 14 | +<!--"If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the FIP.--> |
| 15 | + |
| 16 | +Downloading F3 finality certificates from scratch takes a long time and increases p2p network bandwidth usage.
| 17 | +This FRC proposes including an F3 snapshot in the Filecoin CAR snapshot to reduce F3 catch-up time and p2p network bandwidth usage when bootstrapping
| 18 | +a Filecoin node from a Filecoin snapshot.
| 19 | + |
| 20 | +## Abstract |
| 21 | +<!--A short (~200 words) description of the technical issue being addressed.--> |
| 22 | + |
| 23 | +The Filecoin ecosystem existed for years without specifying the snapshot format. That was fine until the advent of F3 and the resulting need to update the format in a coordinated way. |
| 24 | + |
| 25 | +This document outlines: |
| 26 | +- "v1": the original accepted format found in implementations through 2025, and
| 27 | +- "v2": the extension to v1 that adds an F3 snapshot as a raw data block and changes the CAR roots to a single CID that points to a CBOR-encoded Filecoin snapshot header struct.
| 28 | + |
| 29 | +## Motivation |
| 30 | +<!--The motivation is critical for FIPs that want to change the Filecoin protocol. It should clearly explain why the existing protocol specification is inadequate to address the problem that the FIP solves. FIP submissions without sufficient motivation may be rejected outright.--> |
| 31 | + |
| 32 | +The time cost and the network bandwidth usage for a new Filecoin node to catch up with all F3 finality certificates grow over time, which delays the readiness of the F3-aware V2 RPC APIs. By embedding an F3 snapshot into the current Filecoin CAR snapshot, both can be vastly reduced at the cost of a slightly increased Filecoin CAR snapshot size. |
| 33 | + |
| 34 | +## V1 Specification |
| 35 | + |
| 36 | +We define the existing Filecoin snapshot format as v1 here for future reference. |
| 37 | + |
| 38 | +Filecoin snapshot v1 is in [CARv1](https://ipld.io/specs/transport/car/carv1/) format. |
| 39 | + |
| 40 | +The roots array in the `CarHeader` stores the tipset keys of the chain head in the snapshot. |
| 41 | + |
| 42 | +The data blocks are chain IPLD blocks generated in a deterministic depth-first traversal order during chain export. Thus a snapshot is deterministic for a given chain head and the number of state trees to include. (The details of the chain traversal algorithm can be found in Filecoin node implementations.) |
| 43 | + |
| 44 | +```ipldsch
| 45 | +type CarHeader struct { |
| 46 | + version Int |
| 47 | + roots [&Any] |
| 48 | +} |
| 49 | +``` |
| 50 | + |
| 51 | +## V2 Specification |
| 52 | +<!--The technical specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations for any current Filecoin implementations. --> |
| 53 | + |
| 54 | +We propose the below changes to the [V1 Filecoin CAR snapshot format](#v1-specification). |
| 55 | + |
| 56 | +- Change the CAR roots to a single CID that points to a CBOR-encoded [`SnapshotMetadata`](#snapshotmetadata) struct, which is stored as the first data block in the CAR.
| 57 | +- Store the raw [`F3Data`](#f3data) bytes as the second data block in the CAR when `F3Data != nil` in the metadata. |
| 58 | + |
| 59 | +### SnapshotMetadata |
| 60 | + |
| 61 | +```go |
| 62 | +type SnapshotMetadata struct {
| 63 | + Version uint64 // Required, format version for SnapshotMetadata. |
| 64 | + // Only "2" is supported since "v1" was implied in the original format that predates `SnapshotMetadata`. |
| 65 | + HeadTipsetKey []Cid // Required |
| 66 | + F3Data *Cid // Optional, points to F3Data structure. The only supported codec is "RAW" (0x55). |
| 67 | +} |
| 68 | +``` |
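The placement rules above can be sketched as a small check. This is a minimal illustration, not an implementation from any node: `blockRef` is a hypothetical stand-in for a real CID (an opaque identifier plus its multicodec code), `checkV2Layout` is an illustrative name, and treating an empty `HeadTipsetKey` as an error is an assumption drawn from the field being marked "Required".

```go
package main

import (
	"errors"
	"fmt"
)

// rawCodec is the multicodec code for raw bytes, the only codec the FRC
// allows for the F3Data CID.
const rawCodec = 0x55

// blockRef is a simplified, hypothetical stand-in for a CID: an opaque
// identifier plus its multicodec code.
type blockRef struct {
	ID    string
	Codec uint64
}

// SnapshotMetadata mirrors the struct above, with blockRef in place of Cid.
type SnapshotMetadata struct {
	Version       uint64
	HeadTipsetKey []blockRef
	F3Data        *blockRef
}

// checkV2Layout checks the v2 placement rules. order lists the CAR data
// blocks in file order, and meta is the decoded content of the first block.
func checkV2Layout(roots, order []blockRef, meta SnapshotMetadata) error {
	if len(roots) != 1 {
		return fmt.Errorf("v2 expects exactly one CAR root, got %d", len(roots))
	}
	if len(order) == 0 || order[0] != roots[0] {
		return errors.New("first data block must be the SnapshotMetadata block referenced by the root")
	}
	if meta.Version != 2 {
		return fmt.Errorf("unsupported SnapshotMetadata version %d", meta.Version)
	}
	if len(meta.HeadTipsetKey) == 0 {
		return errors.New("HeadTipsetKey is required")
	}
	if meta.F3Data == nil {
		return nil // no F3 snapshot; chain blocks follow the metadata block
	}
	if meta.F3Data.Codec != rawCodec {
		return fmt.Errorf("F3Data codec must be raw (0x55), got %#x", meta.F3Data.Codec)
	}
	if len(order) < 2 || order[1] != *meta.F3Data {
		return errors.New("second data block must be the F3Data block")
	}
	return nil
}
```

A reader would typically run such a check once, right after decoding the first block and before walking the chain blocks that follow.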
| 69 | + |
| 70 | +### F3Data |
| 71 | + |
| 72 | +An F3 snapshot contains one header block and N data blocks (where N>0) in the below [CARv1](https://ipld.io/specs/transport/car/carv1)-like format: |
| 73 | + |
| 74 | +`[Header block] [Data block] [Data block] [Data block] ...` |
| 75 | + |
| 76 | +A header block is a CBOR-encoded [`F3SnapshotHeader`](#f3snapshotheader) with a length prefix in the below format: |
| 77 | + |
| 78 | +`[varint-encoded byte length of "CBOR-encoded F3SnapshotHeader"] [CBOR-encoded F3SnapshotHeader]` |
| 79 | + |
| 80 | +A data block is a CBOR-encoded [`FinalityCertificate`](#finalitycertificate) with a length prefix in the below format: |
| 81 | + |
| 82 | +`[varint-encoded byte length of "CBOR-encoded FinalityCertificate"] [CBOR-encoded FinalityCertificate]` |
| 83 | + |
| 84 | +Notes: |
| 85 | +- `FinalityCertificate`s should be ordered by `GPBFTInstance` in ascending order for sequential validation and intermediate power table calculation. This also ensures deterministic generation of an F3 snapshot from a given F3 finality certificate chain.
| 86 | +- The `GPBFTInstance` of the first and last `FinalityCertificate` should match `FirstInstance` and `LatestInstance` in the [F3SnapshotHeader](#f3snapshotheader), respectively.
| 87 | +- This [CARv1](https://ipld.io/specs/transport/car/carv1)-like format is ideal for dumping blocks via streaming reads as the [F3SnapshotHeader](#f3snapshotheader) can be loaded first and minimal state is required for ongoing parsing. |
| 88 | +- This is "CARv1-like" but not true CARv1 because data blocks are not content addressed by CIDs. |
| 89 | +- The "varint-encoded byte length" prefixes follow the CARv1 format. This is an implementation detail of the CARv1 format that bleeds through here.
| 90 | +- To enable streaming with lower RAM requirements, F3Data is not CBOR-encoded as a whole. Node implementations are already experienced at streaming CARs, and we didn't want them to have to properly configure/use CBOR encoding/decoding in a streaming fashion.
| 91 | +- A Filecoin node should delegate the F3 data to the underlying F3 package (e.g. `go-f3`) for importing, and the F3 package should provide an API for exporting the F3 snapshot bytes in the same format. Changes in the F3 data format should only bump the version in `F3SnapshotHeader` instead of the version in `SnapshotMetadata`. Backward compatibility for importing and exporting F3 snapshots should be maintained by the underlying F3 package (e.g. `go-f3`), hence transparent to Filecoin nodes.
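The length-prefixed framing described above can be read one frame at a time. The following is a minimal sketch using only the Go standard library, not the `go-f3` implementation: the function names and the `maxFrameSize` cap are hypothetical, payloads are treated as opaque bytes (CBOR decoding is left to the caller), and it assumes the CARv1 length prefix is the unsigned LEB128 varint that `encoding/binary` implements.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxFrameSize is a hypothetical safety cap on a single frame; the FRC does
// not define one.
const maxFrameSize = 64 << 20

// appendFrame appends [uvarint length][payload] to dst.
func appendFrame(dst, payload []byte) []byte {
	var lenBuf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(lenBuf[:], uint64(len(payload)))
	return append(append(dst, lenBuf[:n]...), payload...)
}

// readFrame reads one frame. It returns io.EOF only when the stream ends
// cleanly on a frame boundary.
func readFrame(r *bufio.Reader) ([]byte, error) {
	size, err := binary.ReadUvarint(r)
	if err != nil {
		return nil, err
	}
	if size > maxFrameSize {
		return nil, fmt.Errorf("frame of %d bytes exceeds the %d byte cap", size, maxFrameSize)
	}
	payload := make([]byte, size)
	if _, err := io.ReadFull(r, payload); err != nil {
		if errors.Is(err, io.EOF) {
			err = io.ErrUnexpectedEOF // length prefix present, payload missing
		}
		return nil, err
	}
	return payload, nil
}

// forEachFrame streams frames to visit, holding one frame in memory at a
// time. Frame 0 is the header block; later frames are finality certificates.
// The caller is expected to enforce "one header plus N>0 data blocks".
func forEachFrame(src io.Reader, visit func(index int, payload []byte) error) error {
	r := bufio.NewReader(src)
	for i := 0; ; i++ {
		payload, err := readFrame(r)
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return fmt.Errorf("frame %d: %w", i, err)
		}
		if err := visit(i, payload); err != nil {
			return err
		}
	}
}

// collectFrames is a convenience wrapper for small in-memory inputs.
func collectFrames(data []byte) ([][]byte, error) {
	var frames [][]byte
	err := forEachFrame(bytes.NewReader(data), func(_ int, p []byte) error {
		frames = append(frames, p)
		return nil
	})
	return frames, err
}
```

For real snapshots, `forEachFrame` would be the natural entry point, since it never holds more than one certificate in memory; `collectFrames` is only meant for tests and small inputs.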
| 92 | + |
| 93 | +### F3SnapshotHeader |
| 94 | + |
| 95 | +```go |
| 96 | +type F3SnapshotHeader struct { |
| 97 | + Version uint64 |
| 98 | + FirstInstance uint64 |
| 99 | + LatestInstance uint64 |
| 100 | + InitialPowerTable gpbft.PowerEntries |
| 101 | +} |
| 102 | +``` |
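The ordering notes in the F3Data section can be checked incrementally while certificates stream in. This is a hedged sketch: `certOrderChecker` and its methods are hypothetical names, `InitialPowerTable` is omitted because it plays no part in the ordering check, and only strict ascent is enforced (whether instances must also be contiguous is not stated here, so it is not assumed).

```go
package main

import "fmt"

// F3SnapshotHeader mirrors the struct above; InitialPowerTable is omitted
// because this sketch only looks at instance numbers.
type F3SnapshotHeader struct {
	Version        uint64
	FirstInstance  uint64
	LatestInstance uint64
}

// certOrderChecker tracks the GPBFTInstance values seen so far.
type certOrderChecker struct {
	header F3SnapshotHeader
	seen   uint64 // number of certificates observed
	last   uint64 // instance of the most recent certificate
}

// Observe must be called once per certificate, in stream order.
func (c *certOrderChecker) Observe(instance uint64) error {
	if c.seen == 0 {
		if instance != c.header.FirstInstance {
			return fmt.Errorf("first certificate is instance %d, header says %d",
				instance, c.header.FirstInstance)
		}
	} else if instance <= c.last {
		return fmt.Errorf("instance %d does not ascend past %d", instance, c.last)
	}
	if instance > c.header.LatestInstance {
		return fmt.Errorf("instance %d is beyond LatestInstance %d",
			instance, c.header.LatestInstance)
	}
	c.last = instance
	c.seen++
	return nil
}

// Finish must be called after the last certificate has been observed.
func (c *certOrderChecker) Finish() error {
	if c.seen == 0 {
		return fmt.Errorf("snapshot contains no certificates, expected N>0")
	}
	if c.last != c.header.LatestInstance {
		return fmt.Errorf("last certificate is instance %d, header says %d",
			c.last, c.header.LatestInstance)
	}
	return nil
}
```

Because the checker keeps only two counters, it fits the streaming import model: call `Observe` as each certificate is decoded and `Finish` once the stream ends.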
| 103 | + |
| 104 | +### FinalityCertificate |
| 105 | + |
| 106 | +```go |
| 107 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/certs/certs.go#L34> |
| 108 | +// |
| 109 | +// FinalityCertificate represents a single finalized GPBFT instance. |
| 110 | +type FinalityCertificate struct { |
| 111 | + // The GPBFT instance to which this finality certificate corresponds. |
| 112 | + GPBFTInstance uint64 |
| 113 | + // The ECChain finalized during this instance, starting with the last tipset finalized in |
| 114 | + // the previous instance. |
| 115 | + ECChain *gpbft.ECChain |
| 116 | + // Additional data signed by the participants in this instance. Currently used to certify |
| 117 | + // the power table used in the next instance. |
| 118 | + SupplementalData gpbft.SupplementalData |
| 119 | + // Indexes in the base power table of the certifiers (bitset) |
| 120 | + Signers bitfield.BitField |
| 121 | + // Aggregated signature of the certifiers |
| 122 | + Signature []byte |
| 123 | + // Changes between the power table used to validate this finality certificate and the power |
| 124 | + // used to validate the next finality certificate. Sorted by ParticipantID, ascending. |
| 125 | + PowerTableDelta PowerTableDiff `json:"PowerTableDelta,omitempty"` |
| 126 | +} |
| 127 | + |
| 128 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/chain.go#L194> |
| 129 | +// |
| 130 | +// A chain of tipsets comprising a base (the last finalised tipset from which the chain extends). |
| 131 | +// and (possibly empty) suffix. |
| 132 | +// Tipsets are assumed to be built contiguously on each other, |
| 133 | +// though epochs may be missing due to null rounds. |
| 134 | +// The zero value is not a valid chain, and represents a "bottom" value |
| 135 | +// when used in a Granite message. |
| 136 | +type ECChain struct { |
| 137 | + TipSets []*TipSet |
| 138 | + |
| 139 | + key ECChainKey `cborgen:"ignore"` |
| 140 | + keyLazyLoader sync.Once `cborgen:"ignore"` |
| 141 | +} |
| 142 | + |
| 143 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/types.go#L123> |
| 144 | +type SupplementalData struct { |
| 145 | + // Commitments is the Merkle-tree of instance-specific commitments. Currently |
| 146 | + // empty but this will eventually include things like snark-friendly power-table |
| 147 | + // commitments. |
| 148 | + Commitments [32]byte `cborgen:"maxlen=32"` |
| 149 | + // PowerTable is the DagCBOR-blake2b256 CID of the power table used to validate |
| 150 | + // the next instance, taking lookback into account. |
| 151 | + PowerTable cid.Cid // []PowerEntry |
| 152 | +} |
| 153 | + |
| 154 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/certs/certs.go#L31> |
| 155 | +type PowerTableDiff []PowerTableDelta |
| 156 | + |
| 157 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/certs/certs.go#L16> |
| 158 | +// |
| 159 | +// PowerTableDelta represents a single power table change between GPBFT instances. If the resulting |
| 160 | +// power is 0 after applying the delta, the participant is removed from the power table. |
| 161 | +type PowerTableDelta struct { |
| 162 | + // Participant with changed power |
| 163 | + ParticipantID gpbft.ActorID |
| 164 | + // Change in power from base (signed). |
| 165 | + PowerDelta gpbft.StoragePower |
| 166 | + // New signing key if relevant (else empty) |
| 167 | + SigningKey gpbft.PubKey `cborgen:"maxlen=48"` |
| 168 | +} |
| 169 | + |
| 170 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/types.go#L15> |
| 171 | +type ActorID uint64 |
| 172 | + |
| 173 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/types.go#L17> |
| 174 | +type StoragePower = big.Int |
| 175 | + |
| 176 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/types.go#L19> |
| 177 | +type PubKey []byte |
| 178 | + |
| 179 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/chain.go#L52> |
| 180 | +// |
| 181 | +// TipSet represents a single EC tipset. |
| 182 | +type TipSet struct { |
| 183 | + // The EC epoch (strictly increasing). |
| 184 | + Epoch int64 |
| 185 | + // The tipset's key (canonically ordered concatenated block-header CIDs). |
| 186 | + Key TipSetKey `cborgen:"maxlen=760"` // 20 * 38B |
| 187 | + // Blake2b256-32 CID of the CBOR-encoded power table. |
| 188 | + PowerTable cid.Cid |
| 189 | + // Keccak256 root hash of the commitments merkle tree. |
| 190 | + Commitments [32]byte `cborgen:"maxlen=32"` |
| 191 | +} |
| 192 | + |
| 193 | +// Defined at <https://github.com/filecoin-project/go-f3/blob/v0.8.7/gpbft/chain.go#L20> |
| 194 | +// |
| 195 | +// TipSetKey is the canonically ordered concatenation of the block CIDs in a tipset. |
| 196 | +type TipSetKey = []byte |
| 197 | +``` |
| 198 | + |
| 199 | +## Backwards Compatibility |
| 200 | +<!--All FIPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The FIP must explain how the author proposes to deal with these incompatibilities. FIP submissions without a sufficient backwards compatibility treatise may be rejected outright.--> |
| 201 | + |
| 202 | +- A Filecoin node should first try to read a snapshot CAR in the [v2 format](#v2-specification). If the block referenced by the CAR's single root cannot be decoded using the schema presented above, a snapshot reader may fall back to the [v1 format](#v1-specification) to maintain backward compatibility. A subsequent failure to decode the original (v1) snapshot format indicates a fatal error.
| 203 | +- CLI options for implementations like Forest, Lotus and Venus should remain unchanged to make it transparent to the node users. |
| 204 | +- The code change in all Filecoin nodes should be shipped with a network upgrade, and Filecoin snapshot providers should only start publishing the new format after the mainnet upgrade finishes, to avoid potential errors during snapshot import for node users. That is to say:
| 205 | + - Before NV27,
| 206 | + - node implementations can always read v1 snapshots, and v2 snapshot reading is rolled out ahead of the upgrade
| 207 | + - node implementations generate v1 snapshots by default |
| 208 | + - node implementation providers publish and host v1 snapshots |
| 209 | + - After NV27, |
| 210 | + - node implementations can read both v1 and v2 snapshots |
| 211 | + - node implementations generate v2 snapshots by default |
| 212 | + - node implementation providers publish and host v2 snapshots |
| 213 | +- Backward compatibility for importing and exporting F3 snapshots should be maintained by the underlying F3 package (e.g. `go-f3`), hence transparent to Filecoin nodes.
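The v2-then-v1 reading order in the first bullet above can be sketched as follows. This is an illustration under stated assumptions, not code from Lotus, Forest or Venus: strings stand in for CIDs, `decodeMeta` and `checkV1Head` are hypothetical hooks for the CBOR decoder and the v1 head-tipset loader, and rejecting a successfully decoded metadata block with a version other than 2 is an assumption rather than something the text above prescribes.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotMetadata is a sentinel a decodeMeta hook can return when the root
// block does not match the SnapshotMetadata schema.
var errNotMetadata = errors.New("block is not SnapshotMetadata")

// SnapshotMetadata mirrors the v2 struct; strings stand in for CIDs.
type SnapshotMetadata struct {
	Version       uint64
	HeadTipsetKey []string
	F3Data        *string
}

// snapshotInfo is what an importer needs before walking the chain.
type snapshotInfo struct {
	Version uint64
	Head    []string
	F3Data  *string
}

// detectSnapshot tries v2 first and falls back to v1 when the single root
// block cannot be decoded as SnapshotMetadata.
func detectSnapshot(
	roots []string,
	rootBlock []byte,
	decodeMeta func([]byte) (*SnapshotMetadata, error),
	checkV1Head func([]string) error,
) (snapshotInfo, error) {
	if len(roots) == 1 {
		meta, err := decodeMeta(rootBlock)
		if err == nil {
			if meta.Version != 2 {
				// Assumption: an unknown metadata version is an error,
				// not a trigger for the v1 fallback.
				return snapshotInfo{}, fmt.Errorf("unsupported SnapshotMetadata version %d", meta.Version)
			}
			return snapshotInfo{Version: 2, Head: meta.HeadTipsetKey, F3Data: meta.F3Data}, nil
		}
	}
	// v1: the CAR roots are the head tipset key. A failure here is fatal.
	if err := checkV1Head(roots); err != nil {
		return snapshotInfo{}, fmt.Errorf("neither v2 nor v1 snapshot: %w", err)
	}
	return snapshotInfo{Version: 1, Head: roots}, nil
}
```

Keeping detection inside the importer in this way is what lets the CLI options stay unchanged: the user passes a snapshot file and the reader decides which format it is.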
| 214 | + |
| 215 | +## Test Cases |
| 216 | +<!--Test cases for an implementation are mandatory for FIPs affecting consensus changes. Other FIPs can choose to include links to test cases if applicable.--> |
| 217 | + |
| 218 | +## Security Considerations |
| 219 | +<!--All FIPs must contain a section that discusses the security implications/considerations relevant to the proposed change. Include information that might be important for security discussions, surfaces risks and can be used throughout the life cycle of the proposal. E.g. include security-relevant design decisions, concerns, important discussions, implementation-specific guidance and pitfalls, an outline of threats and risks and how they are being addressed. FIP submissions missing the "Security Considerations" section will be rejected. A FIP cannot proceed to status "Final" without a Security Considerations discussion deemed sufficient by the reviewers.--> |
| 220 | + |
| 221 | +This change has minimal security implications as the additional F3 data are also stored in the node database, unencrypted. Key considerations: |
| 222 | + |
| 223 | +- **Integrity**: The F3 snapshot can be validated. Finality certificates are stored in ascending `GPBFTInstance` order so that they can be validated sequentially, starting from the `InitialPowerTable` in the header.
| 224 | +- **Performance**: The F3 snapshot data blocks can be read, validated and imported as a stream, without holding the entire finality certificate chain in RAM. Facilitating this might require new API(s) in the CAR reader package.
| 225 | +- **Cyclic structure**: Not applicable. The F3 snapshot does not build any cyclic graph during import and export; only a single block (certificate) needs to be held in RAM at a time.
| 226 | + |
| 227 | +The change does not introduce new attack vectors or modify existing security properties of the protocol. |
| 228 | + |
| 229 | +## Incentive Considerations |
| 230 | +<!--All FIPs must contain a section that discusses the incentive implications/considerations relative to the proposed change. Include information that might be important for incentive discussion. A discussion on how the proposed change will incentivize reliable and useful storage is required. FIP submissions missing the "Incentive Considerations" section will be rejected. An FIP cannot proceed to status "Final" without a Incentive Considerations discussion deemed sufficient by the reviewers.--> |
| 231 | + |
| 232 | +Node users should experience faster F3 bootstrapping time and less network bandwidth usage. |
| 233 | + |
| 234 | +## Product Considerations |
| 235 | +<!--All FIPs must contain a section that discusses the product implications/considerations relative to the proposed change. Include information that might be important for product discussion. A discussion on how the proposed change will enable better storage-related goods and services to be developed on Filecoin. FIP submissions missing the "Product Considerations" section will be rejected. An FIP cannot proceed to status "Final" without a Product Considerations discussion deemed sufficient by the reviewers.--> |
| 236 | + |
| 237 | +### Start-up without initial F3 data |
| 238 | + |
| 239 | +Nodes starting from a snapshot should not rely on the certificate exchange protocol to catch up with the F3 data because we expect this will get slower over time. One outcome of a slow F3 catchup time is a delay in the readiness of F3-aware RPC APIs. |
| 240 | + |
| 241 | +### CAR format expectations |
| 242 | + |
| 243 | +This change introduces a relatively novel use of the CAR format in that it contains one very large block, much larger than typical blocks found in most CAR containers for use with IPLD data. At the time of this proposal, this block size would be approximately 100 MiB and this will only grow over time. While this is not disallowed by the CAR specification, many CAR processing utilities are built on an assumption of classic IPFS-style blocks of no more than approximately 1 MiB each. Some CAR tooling may struggle to deal with the new proposed format, although handling CAR data outside of the narrow use case of snapshot imports on Filecoin nodes is not typical or necessarily recommended.
| 244 | + |
| 245 | +## Implementation |
| 246 | +<!--The implementations must be completed before any core FIP is given status "Final", but it need not be completed before the FIP is accepted. While there is merit to the approach of reaching consensus on the specification and rationale before writing code, the principle of "rough consensus and running code" is still useful when it comes to resolving many discussions of API details.--> |
| 247 | + |
| 248 | +- Lotus: https://github.com/filecoin-project/lotus/issues/13129 |
| 249 | +- Forest: |
| 250 | + |
| 251 | +## Future Work |
| 252 | +<!--A section that lists any unresolved issues or tasks that are part of the FIP proposal. Examples of these include performing benchmarking to know gas fees, validate claims made in the FIP once the final implementation is ready, etc. A FIP can only move to a "Last Call" status once all these items have been resolved.--> |
| 253 | + |
| 254 | +## Copyright |
| 255 | +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |