| 1 | +# State Trie |
| 2 | + |
| 3 | +The state trie stores a blockchain's entire world state and is typically organized as a Merkle Patricia Trie (MPT). In Ethereum, each account's basic state (balance, nonce, code_hash, storage_root) is stored in a leaf node of the state trie, while each contract account's storage data is kept in its own separate MPT. |
| 4 | + |
| 5 | +Conflux's state storage differs from Ethereum's in several ways: |
| 6 | + |
| 7 | +1. Account basic states and contract storage data are stored in a single MPT. |
| 8 | +2. Core Space accounts and eSpace accounts are also stored in that same MPT. |
| 9 | +3. The MPT additionally stores Core Space account VoteList and DepositList data (both currently no longer used). |
| 10 | +4. Contract account code is also stored in the MPT. |
| 11 | + |
| 12 | +In summary, Conflux uses one massive MPT to store all global state data, including account basic information, code, storage, VoteList, and DepositList. |
| 13 | + |
| 14 | +## StorageKey |
| 15 | + |
| 16 | +The core function of an MPT is key/value storage and retrieval. Conflux stores different types of data in the same MPT by encoding the key for each type differently. |
| 17 | +The StorageKey data type is defined as follows: |
| 18 | + |
| 19 | +```rust |
| 20 | +pub enum StorageKey<'a> { |
| 21 | + AccountKey(&'a [u8]), |
| 22 | + StorageRootKey(&'a [u8]), |
| 23 | + StorageKey { |
| 24 | + address_bytes: &'a [u8], |
| 25 | + storage_key: &'a [u8], |
| 26 | + }, |
| 27 | + CodeRootKey(&'a [u8]), |
| 28 | + CodeKey { |
| 29 | + address_bytes: &'a [u8], |
| 30 | + code_hash_bytes: &'a [u8], |
| 31 | + }, |
| 32 | + DepositListKey(&'a [u8]), |
| 33 | + VoteListKey(&'a [u8]), |
| 34 | +} |
| 35 | +``` |
| 36 | + |
| 37 | +Every key above carries the account address as a byte array. Each key type is encoded into a distinct MPT key and stores a different kind of data: |
| 38 | + |
| 39 | +- AccountKey: Used to store account basic information such as nonce, balance, code_hash, etc. |
| 40 | +- StorageRootKey: Used to store StorageLayout information, currently has no practical use |
| 41 | +- StorageKey: Used to store contract storage data |
| 42 | +- CodeRootKey: Used to store contract account code hash |
| 43 | +- CodeKey: Used to store contract account code data |
| 44 | +- DepositListKey: Used to store Core Space account DepositList information (currently no longer used) |
| 45 | +- VoteListKey: Used to store Core Space account VoteList information (currently no longer used) |
| 46 | + |
| 47 | +### Encoding |
| 48 | + |
| 49 | +Assuming there's an account address `0x8fb79782e14c082bfbb91692bf071187866007d2`, let's see what different types of keys look like after encoding: |
| 50 | + |
| 51 | +```sh |
| 52 | +# AccountKey directly uses the address itself |
| 53 | +8fb79782e14c082bfbb91692bf071187866007d2 |
| 54 | + |
| 55 | +# StorageRootKey adds b"data"(64617461) after the address |
| 56 | +8fb79782e14c082bfbb91692bf071187866007d2 + 64617461 |
| 57 | + |
| 58 | +# StorageKey adds b"data" after the address, then adds the contract storage key |
| 59 | +# Assuming the storage key is 0000000000000000000000000000000000000000000000000000000000000008 |
| 60 | +8fb79782e14c082bfbb91692bf071187866007d2 + 64617461 + 0000000000000000000000000000000000000000000000000000000000000008 |
| 61 | + |
| 62 | +# CodeRootKey adds b"code"(636f6465) after the address |
| 63 | +8fb79782e14c082bfbb91692bf071187866007d2 + 636f6465 |
| 64 | + |
| 65 | +# CodeKey adds b"code" after the address, then adds the code hash |
| 66 | +# Assuming the code hash is 0x405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 67 | +8fb79782e14c082bfbb91692bf071187866007d2 + 636f6465 + 405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 68 | + |
| 69 | +# DepositListKey adds b"deposit"(6465706f736974) after the address |
| 70 | +8fb79782e14c082bfbb91692bf071187866007d2 + 6465706f736974 |
| 71 | + |
| 72 | +# VoteListKey adds b"vote"(766f7465) after the address |
| 73 | +8fb79782e14c082bfbb91692bf071187866007d2 + 766f7465 |
| 74 | +``` |
| 75 | + |
| 76 | +The encoding above applies to Core Space accounts. eSpace keys differ slightly: the byte b"\x81" (81) is inserted immediately after the address bytes: |
| 77 | + |
| 78 | +```sh |
| 79 | +# AccountKey |
| 80 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 |
| 81 | + |
| 82 | +# StorageRootKey |
| 83 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 64617461 |
| 84 | + |
| 85 | +# StorageKey |
| 86 | +# Assuming the storage key is 0000000000000000000000000000000000000000000000000000000000000008 |
| 87 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 64617461 + 0000000000000000000000000000000000000000000000000000000000000008 |
| 88 | + |
| 89 | +# CodeRootKey |
| 90 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 636f6465 |
| 91 | + |
| 92 | +# CodeKey |
| 93 | +# Assuming the code hash is 0x405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 94 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 636f6465 + 405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 95 | + |
| 96 | +# DepositListKey |
| 97 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 6465706f736974 |
| 98 | + |
| 99 | +# VoteListKey |
| 100 | +8fb79782e14c082bfbb91692bf071187866007d2 + 81 + 766f7465 |
| 101 | +``` |
| 102 | + |
| 103 | +For the specific encoding implementation, refer to the `StorageKeyWithSpace::to_key_bytes` method. |
| 104 | + |
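The encoding rules listed above can be sketched in a few lines of Rust. This is a minimal illustration, not the actual Conflux code: `KeyKind`, `encode_key`, and `to_hex` are hypothetical names introduced here, and the authoritative logic remains in `StorageKeyWithSpace::to_key_bytes`.

```rust
/// Hypothetical mirror of the `StorageKey` variants (illustrative only).
pub enum KeyKind<'a> {
    Account,
    StorageRoot,
    Storage(&'a [u8]),
    CodeRoot,
    Code(&'a [u8]),
    DepositList,
    VoteList,
}

/// Marker byte inserted after the address for eSpace accounts.
const ESPACE_MARKER: u8 = 0x81;

/// Builds the flat key: address [+ 0x81 for eSpace] [+ type tag] [+ sub-key].
pub fn encode_key(address: &[u8], is_espace: bool, kind: &KeyKind<'_>) -> Vec<u8> {
    let mut key = address.to_vec();
    if is_espace {
        key.push(ESPACE_MARKER);
    }
    match kind {
        KeyKind::Account => {}
        KeyKind::StorageRoot => key.extend_from_slice(b"data"),
        KeyKind::Storage(slot) => {
            key.extend_from_slice(b"data");
            key.extend_from_slice(slot);
        }
        KeyKind::CodeRoot => key.extend_from_slice(b"code"),
        KeyKind::Code(code_hash) => {
            key.extend_from_slice(b"code");
            key.extend_from_slice(code_hash);
        }
        KeyKind::DepositList => key.extend_from_slice(b"deposit"),
        KeyKind::VoteList => key.extend_from_slice(b"vote"),
    }
    key
}

/// Lowercase hex rendering, for comparison with the listings above.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Calling `encode_key` with the example address, `is_espace = true`, and `KeyKind::StorageRoot` should reproduce the concatenation address + 81 + 64617461 from the eSpace listing.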
| 105 | +## DeltaMpt and IntermediaMpt |
| 106 | + |
| 107 | +In terms of implementation, Conflux's state tree consists of three trees: |
| 108 | + |
| 109 | +1. DeltaMpt: An incremental Merkle Patricia Trie used to store incremental data of state changes. |
| 110 | +2. IntermediaMpt: An intermediate state Merkle Patricia Trie that represents intermediate states between snapshots. |
| 111 | +3. Snapshot: Data state snapshots. |
| 112 | + |
| 113 | +A state lookup walks these layers in order, stopping at the first one that holds the key: DeltaMpt (current changes) → IntermediaMpt (intermediate states) → Snapshot (snapshot states). |
| 114 | + |
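The lookup order described above can be modeled with a toy structure. This is only a sketch under strong assumptions: plain hash maps stand in for the two tries and the snapshot, the same key bytes are used in every layer, and deletions are not modeled. `LayeredState` and its fields are hypothetical names, not Conflux types.

```rust
use std::collections::HashMap;

/// Plain key/value map standing in for one storage layer.
type Layer = HashMap<Vec<u8>, Vec<u8>>;

/// Toy model of the three layers (illustrative names only).
pub struct LayeredState {
    pub delta: Layer,
    pub intermediate: Layer,
    pub snapshot: Layer,
}

impl LayeredState {
    /// Returns the value from the first layer that contains `key`,
    /// checking delta, then intermediate, then snapshot.
    pub fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.delta
            .get(key)
            .or_else(|| self.intermediate.get(key))
            .or_else(|| self.snapshot.get(key))
    }
}
```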
| 115 | +Keys in DeltaMpt and IntermediaMpt are encoded slightly differently from regular MPT keys: an additional padding step is applied. |
| 116 | +The base of a regular MPT key is the 20-byte account address, while the base of a delta MPT key is 32 bytes long. It is derived as follows: |
| 117 | + |
| 118 | +1. Start from a padding value of 32 bytes. |
| 119 | +2. Concatenate the first 12 bytes of the padding with the 20-byte address to form 32 bytes. |
| 120 | +3. Compute the keccak hash of the result of step 2. |
| 121 | +4. Concatenate the first 12 bytes of the hash with the address to form the final 32-byte base key. |
| 122 | + |
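| 123 | +The suffixes that extend the base key are encoded the same way as for regular keys. The examples below use the address `0x0888000000000000000000000000000000000004` with the eSpace marker: |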
| 124 | + |
| 125 | +```sh |
| 126 | +# AccountKey |
| 127 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 |
| 128 | + |
| 129 | +# StorageRootKey |
| 130 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 64617461 |
| 131 | + |
| 132 | +# StorageKey |
| 133 | +# Assuming the storage key is 0000000000000000000000000000000000000000000000000000000000000008 |
| 134 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 64617461 + 0000000000000000000000000000000000000000000000000000000000000008 |
| 135 | + |
| 136 | +# CodeRootKey |
| 137 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 636f6465 |
| 138 | + |
| 139 | +# CodeKey |
| 140 | +# Assuming the code hash is 0x405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 141 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 636f6465 + 405787fa12a823e0f2b7631cc41b3ba8828b3321ca811111fa75cd3aa3bb5acf |
| 142 | + |
| 143 | +# DepositListKey |
| 144 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 6465706f736974 |
| 145 | + |
| 146 | +# VoteListKey |
| 147 | +b41eca2cce25321f5ecf85540888000000000000000000000000000000000004 + 81 + 766f7465 |
| 148 | +``` |
| 149 | + |
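The four padding steps can be sketched as follows, reading the 12-unit prefix as 12 bytes, which is what the 32-byte base length and the example keys above imply. The Rust standard library has no keccak, so the hash is passed in as a closure; `delta_key_base` and the constant names are hypothetical, and a real caller would supply a keccak-256 implementation from a crate of its choice.

```rust
const ADDRESS_BYTES: usize = 20;
const PADDING_BYTES: usize = 12;
const KEY_BASE_BYTES: usize = PADDING_BYTES + ADDRESS_BYTES; // 32

/// Derives the 32-byte base of a delta MPT key.
/// `hash32` stands in for keccak-256 (an assumption of this sketch).
pub fn delta_key_base<H>(
    padding: &[u8; KEY_BASE_BYTES],
    address: &[u8; ADDRESS_BYTES],
    hash32: H,
) -> [u8; KEY_BASE_BYTES]
where
    H: Fn(&[u8]) -> [u8; 32],
{
    // Step 2: first 12 bytes of the padding, followed by the 20-byte address.
    let mut preimage = [0u8; KEY_BASE_BYTES];
    preimage[..PADDING_BYTES].copy_from_slice(&padding[..PADDING_BYTES]);
    preimage[PADDING_BYTES..].copy_from_slice(address);

    // Step 3: hash the 32-byte preimage.
    let digest = hash32(&preimage);

    // Step 4: first 12 bytes of the hash, followed by the address.
    let mut base = [0u8; KEY_BASE_BYTES];
    base[..PADDING_BYTES].copy_from_slice(&digest[..PADDING_BYTES]);
    base[PADDING_BYTES..].copy_from_slice(address);
    base
}
```

Because the address occupies the last 20 bytes unchanged, it stays readable in the final key, as the `0888...0004` tail of the example keys shows.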
| 150 | +## Considerations |
| 151 | + |
| 152 | +1. Following Ethereum's approach and storing account basic data and contract storage data in separate tries would greatly reduce the size of the state trie and significantly speed up traversal. |
| 153 | +2. If the special flag (0x81) that distinguishes the two spaces were placed at the front of the key, a prefix search could be restricted to a single space, which should also improve search speed. |
| 154 | +3. Conflux currently searches state by prefix: given 0x01, it returns all addresses that start with 0x01. Geth and Reth instead position a cursor at an arbitrary address, such as 0x0888000000000000000000000000000000000004, and then iterate through all account addresses greater than that address. |
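
The difference between the two search styles in the last point can be illustrated on an ordered map. This is a sketch only: a `BTreeMap` stands in for the state store, `prefix_scan` and `scan_from` are hypothetical names, and the cursor here starts at the first key at or after the given position.

```rust
use std::collections::BTreeMap;

/// Prefix search: every key that starts with `prefix`, in sorted order.
pub fn prefix_scan<'a, V>(state: &'a BTreeMap<Vec<u8>, V>, prefix: &[u8]) -> Vec<&'a [u8]> {
    state
        // Keys sharing a prefix are contiguous, starting at the prefix itself.
        .range(prefix.to_vec()..)
        .map(|(key, _)| key.as_slice())
        .take_while(|key| key.starts_with(prefix))
        .collect()
}

/// Cursor-style search: seek to `start`, then walk every later key.
pub fn scan_from<'a, V>(state: &'a BTreeMap<Vec<u8>, V>, start: &[u8]) -> Vec<&'a [u8]> {
    state
        .range(start.to_vec()..)
        .map(|(key, _)| key.as_slice())
        .collect()
}
```

With the cursor style, the start position does not need to share a prefix with any stored key, so a scan can resume from wherever a previous one stopped.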