Commit 5aa10b7

Add documentation of the boostrapping process
1 parent fe4a971 commit 5aa10b7

common/src/snapshot/NOTES.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
# Bootstrapping from a Snapshot file
We can boot an Acropolis node either from genesis, replaying all of the blocks up to
some point, or from a snapshot file. This module provides the components
needed to boot from a snapshot file. See [snapshot_bootstrapper](../../../modules/snapshot_bootstrapper/src/snapshot_bootstrapper.rs) for the process that references and runs with these helpers.

Booting from a snapshot takes minutes instead of the hours it takes to boot from
genesis. It also allows booting from a given epoch, which makes it possible to create tests
that rely only on that epoch of data. We're also skipping some of the problematic
eras and will typically boot from Conway around epochs 305, 306, and 307. It takes
three epochs to have enough context to correctly calculate the rewards.

The required data for bootstrapping are:
- snapshot files (each has an associated epoch number and point)
- nonces
- headers
## Snapshot Files
The snapshots come from the Amaru project. In their words,
"the snapshots we generated are different [from a Mithril snapshot]: they're
the actual ledger state; i.e. the in-memory state that is constructed by iterating over each block up to a specific point. So, it's all the UTxOs, the set of pending governance actions, the account balance, etc.
If you get this from a trusted source, you don't need to do any replay, you can just start up and load this from disk.
The format of these is completely non-standard; we just forked the haskell node and spit out whatever we needed to in CBOR."

Snapshot files are referenced by their epoch number in the config.json file below.

See [Amaru snapshot format](../../../docs/amaru-snapshot-structure.md)
## Configuration files
There is a path for each network's bootstrap configuration files. The network should
be one of 'mainnet', 'preprod', 'preview' or 'testnet_<magic>', where
`magic` is a 32-bit unsigned value denoting a particular testnet.

The bootstrapper will be given a path to a directory that is expected to contain,
for each network, the following files: config.json, snapshots.json, nonces.json, and headers.json. The path will
be used as a prefix to resolve the per-network configuration files
needed for bootstrapping. Given a source directory `data` and
a network name of `preview`, the expected layout for configuration files would be:

* `data/preview/config.json`: a list of epochs to load.
* `data/preview/snapshots.json`: a list of `Snapshot` values (epoch, point, url).
* `data/preview/nonces.json`: a list of `InitialNonces` values.
* `data/preview/headers.json`: a list of `Point`s.

These files are loaded by [snapshot_bootstrapper](../../../modules/snapshot_bootstrapper/src/snapshot_bootstrapper.rs) during bootup.
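The network naming rule and the directory layout above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Network` enum, `BootstrapPaths` struct, and `resolve_paths` function are hypothetical names invented for this note and are not taken from the actual bootstrapper code.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical representation of the network names listed above.
/// `Testnet` carries the 32-bit unsigned magic value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Network {
    Mainnet,
    Preprod,
    Preview,
    Testnet(u32),
}

impl Network {
    /// Parse 'mainnet', 'preprod', 'preview' or 'testnet_<magic>'.
    /// Returns `None` for anything else, including a non-numeric magic.
    pub fn parse(name: &str) -> Option<Network> {
        match name {
            "mainnet" => Some(Network::Mainnet),
            "preprod" => Some(Network::Preprod),
            "preview" => Some(Network::Preview),
            other => other
                .strip_prefix("testnet_")
                .and_then(|magic| magic.parse::<u32>().ok())
                .map(Network::Testnet),
        }
    }

    /// Name of the per-network subdirectory under the source directory.
    pub fn dir_name(&self) -> String {
        match self {
            Network::Mainnet => "mainnet".to_string(),
            Network::Preprod => "preprod".to_string(),
            Network::Preview => "preview".to_string(),
            Network::Testnet(magic) => format!("testnet_{}", magic),
        }
    }
}

/// Hypothetical holder for the four per-network configuration files.
#[derive(Debug, PartialEq, Eq)]
pub struct BootstrapPaths {
    pub config: PathBuf,
    pub snapshots: PathBuf,
    pub nonces: PathBuf,
    pub headers: PathBuf,
}

/// Use the source directory as a prefix and resolve the per-network files.
pub fn resolve_paths(source_dir: &Path, network: &Network) -> BootstrapPaths {
    let base = source_dir.join(network.dir_name());
    BootstrapPaths {
        config: base.join("config.json"),
        snapshots: base.join("snapshots.json"),
        nonces: base.join("nonces.json"),
        headers: base.join("headers.json"),
    }
}
```

With a source directory of `data` and the name `preview`, `resolve_paths(Path::new("data"), &Network::Preview)` yields the four `data/preview/*.json` paths listed above.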
## Bootstrapping sequence

The bootstrapper will be started with an argument that specifies a network,
e.g. "mainnet". From the network, it will build a path to the configuration
and snapshot files as shown above, then load the data contained or described
in those files. config.json holds a list of typically three epochs that can be
used to index into snapshots.json to find the corresponding URLs and metadata
for each of the three snapshot files. Loading occurs in this order:

* publish `SnapshotMessage::Startup`
* download the snapshots (on demand; may have already been done externally)
* parse each snapshot and publish its data on the message bus
* read nonces and publish
* read headers and publish
* publish `CardanoMessage::GenesisComplete(GenesisCompleteMessage {...})`

Modules in the system will have subscribed to the `Startup` message and also
to individual structural data update messages before the
bootstrapper runs the above sequence. Upon receiving the `Startup` message,
they will use data messages to populate their state, history (for BlockFrost),
and any other state required to be ready to operate on reception of
the `GenesisCompleteMessage`.
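The ordering described in this section, including the lookup of config.json epochs in snapshots.json, might look roughly like the sketch below. The types here are simplified stand-ins and are assumptions of this note: `BootMessage` flattens the real `SnapshotMessage` and `CardanoMessage` types into one hypothetical enum, `Snapshot` uses plain strings for the point and URL, and the download and parse steps are reduced to a comment.

```rust
/// Simplified stand-in for an entry in snapshots.json (epoch, point, url).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Snapshot {
    pub epoch: u64,
    pub point: String,
    pub url: String,
}

/// Hypothetical, flattened stand-in for the real message types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BootMessage {
    Startup,
    SnapshotData { epoch: u64 },
    Nonces,
    Headers,
    GenesisComplete,
}

/// Look up each epoch from config.json in the snapshots.json entries,
/// preserving the order given in the config. Fails on a missing epoch.
pub fn select_snapshots(epochs: &[u64], all: &[Snapshot]) -> Result<Vec<Snapshot>, String> {
    epochs
        .iter()
        .map(|epoch| {
            all.iter()
                .find(|s| s.epoch == *epoch)
                .cloned()
                .ok_or_else(|| format!("no snapshot listed for epoch {}", epoch))
        })
        .collect()
}

/// Run the loading sequence, handing each message to `publish` in order.
pub fn run_sequence(
    epochs: &[u64],
    all: &[Snapshot],
    mut publish: impl FnMut(BootMessage),
) -> Result<(), String> {
    let selected = select_snapshots(epochs, all)?;
    publish(BootMessage::Startup);
    for snapshot in &selected {
        // Downloading (if not already done) and parsing would happen here.
        publish(BootMessage::SnapshotData { epoch: snapshot.epoch });
    }
    publish(BootMessage::Nonces);
    publish(BootMessage::Headers);
    publish(BootMessage::GenesisComplete);
    Ok(())
}
```

Passing a closure that pushes into a `Vec` makes the ordering easy to check in a test: the first message is `Startup`, the last is `GenesisComplete`, and an epoch missing from snapshots.json produces an error before anything is published.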
## Data update messages

The bootstrapper will publish data as it parses the snapshot files, nonces, and
headers. Snapshot parsing is done while streaming the data to keep the memory
footprint low. As elements of the file are parsed, callbacks provide the data
to the bootstrapper, which publishes the data on the message bus.

There are TODO markers in [snapshot_bootstrapper](../../../modules/snapshot_bootstrapper/src/snapshot_bootstrapper.rs) that show where to add the
publishing of the parsed snapshot data.
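The callback arrangement described in this section can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `Element` enum, the `SnapshotCallbacks` trait, and `parse_streaming` are hypothetical names, the real snapshot is CBOR with far richer data, and an iterator stands in for the streaming decoder.

```rust
/// Hypothetical parsed elements; real snapshot data is much richer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Element {
    Utxo(String),
    Account(String),
}

/// Callbacks invoked by the parser as each element is decoded.
pub trait SnapshotCallbacks {
    fn on_utxo(&mut self, utxo: String);
    fn on_account(&mut self, account: String);
}

/// Stand-in for a streaming parser: it handles one element at a time and
/// hands it to the callbacks, so the full data set is never held here.
/// Returns the number of elements processed.
pub fn parse_streaming<I, C>(elements: I, callbacks: &mut C) -> usize
where
    I: IntoIterator<Item = Element>,
    C: SnapshotCallbacks,
{
    let mut count = 0;
    for element in elements {
        match element {
            Element::Utxo(utxo) => callbacks.on_utxo(utxo),
            Element::Account(account) => callbacks.on_account(account),
        }
        count += 1;
    }
    count
}

/// Bootstrapper-side callbacks; pushing to `published` stands in for
/// publishing a data update message on the message bus.
pub struct PublishingCallbacks {
    pub published: Vec<String>,
}

impl SnapshotCallbacks for PublishingCallbacks {
    fn on_utxo(&mut self, utxo: String) {
        self.published.push(format!("utxo:{}", utxo));
    }

    fn on_account(&mut self, account: String) {
        self.published.push(format!("account:{}", account));
    }
}
```

Keeping the parser generic over the callback trait lets the parsing helpers stay independent of the message bus, which matches the split between this module and the bootstrapper described above.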
