|
| 1 | +package follower |
| 2 | + |
| 3 | +import ( |
| 4 | + "github.com/onflow/flow-go/model/flow" |
| 5 | + "github.com/onflow/flow-go/module" |
| 6 | +) |
| 7 | + |
| 8 | +// complianceCore interface describes the follower's compliance core logic. Slightly simplified, the |
| 9 | +// compliance layer ingests incoming untrusted blocks from the network, filters out all invalid blocks,
| 10 | +// extends the protocol state with the valid blocks, and lastly pipes the valid blocks to the HotStuff |
| 11 | +// follower. Conceptually, the algorithm proceeds as follows: |
| 12 | +// |
| 13 | +// 1. _light_ validation of the block header: |
| 14 | +// - check that the block's proposer is the legitimate primary for the respective view |
| 15 | +// - verify the primary's signature |
| 16 | +// - verify QC within the block |
| 17 | +// - verify whether TC should be included and check the TC |
| 18 | +// |
| 19 | +// Optimization for fast catchup: |
| 20 | +// Honest nodes that we synchronize blocks from supply those blocks in sequentially connected order. |
| 21 | +// This allows us to only validate the highest QC of such a sequence. A QC proves validity of the |
| 22 | +// referenced block as well as all its ancestors. The only other detail we have to verify is that the |
| 23 | +// block hashes match with the ParentID in their respective child. |
| 24 | +// To utilize this optimization, we require that the input `connectedRange` is a continuous sequence |
| 25 | +// of blocks, i.e. connectedRange[i] is the parent of connectedRange[i+1]. |
| 26 | +// |
| 27 | +// 2. All blocks that pass the light validation go into a size-limited cache with random ejection policy. |
| 28 | +// Under happy operations this cache should not run full, as we prune it by finalized view. |
| 29 | +// |
| 30 | +// 3. Only certified blocks pass the cache [Note: this is the reason why we need to validate the QC]. |
| 31 | +// This caching strategy provides the first line of defence: |
| 32 | +// - Broken blocks from malicious primaries do not pass this cache, as they will never get certified. |
| 33 | +// - Hardening [heuristic] against spam via block synchronization: |
| 34 | +// TODO: implement |
| 35 | +// We differentiate between two scenarios: (i) the blocks are _all_ already known, i.e. a no-op from |
| 36 | +// the cache's perspective vs (ii) there were some previously unknown blocks in the batch. If and only |
| 37 | +// if there is new information (case ii), we pass the certified blocks to step 4. In case of (i), |
| 38 | +// this is completely redundant information (idempotent), and hence we just exit early. |
| 39 | +// Thereby, the only way for a spamming node to load our higher-level logic is to include |
| 40 | +// valid pending yet previously unknown blocks (very few generally exist in the system). |
| 41 | +// |
| 42 | +// 4. All certified blocks are passed to the PendingTree, which constructs a graph of all blocks |
| 43 | +// with view greater than the latest finalized block [Note: graph-theoretically this is a forest]. |
| 44 | +// |
| 45 | +// 5. In a nutshell, the PendingTree tracks which blocks have already been connected to the latest finalized |
| 46 | +// block. When adding certified blocks to the PendingTree, it detects additional blocks now connecting |
| 47 | +// the latest finalized block. More formally, the PendingTree locally tracks the tree of blocks rooted |
| 48 | +// on the latest finalized block. When new vertices (i.e. certified blocks) are added to the tree, they |
| 49 | +// move on to step 6. Blocks entering step 6 are guaranteed to be in 'parent-first order', i.e. they |
| 50 | +// connect to already known blocks. Disconnected blocks remain in the PendingTree, until they are pruned |
| 51 | +// by latest finalized view. |
| 52 | +// |
| 53 | +// 6. All blocks entering this step are guaranteed to be valid (as they are confirmed to be certified in |
| 54 | +// step 3). Furthermore, we know they connect to previously processed blocks. |
| 55 | +// |
| 56 | +// On the one hand, step 1 includes CPU-intensive cryptographic checks. On the other hand, it is very well |
| 57 | +// parallelizable. In comparison, steps 2 and 3 are negligible. Therefore, we can have multiple worker |
| 58 | +// routines: a worker takes a batch of blocks and runs it through steps 1, 2, 3. The blocks that come |
| 59 | +// out of step 3 are queued in a channel for further processing. |
| 60 | +// |
| 61 | +// The PendingTree (step 4) requires very little CPU. Step 5 is a database write populating many indices, |
| 62 | +// to extend the protocol state. Step 6 is only a queuing operation, with vanishing cost. There is little |
| 63 | +// benefit to parallelizing state extension, because under normal operations forks are rare and knowing |
| 64 | +// the full ancestry is required for the protocol state. Therefore, we have a single thread to extend |
| 65 | +// the protocol state with new certified blocks, executing |
| 66 | +// |
| 67 | +// Notes: |
| 68 | +// - At the moment, this interface exists to facilitate testing. Specifically, it allows us to |
| 69 | +// test the ComplianceEngine with a mock of complianceCore. Higher level business logic does not |
| 70 | +// interact with complianceCore, because complianceCore is wrapped inside the ComplianceEngine. |
| 71 | +// - At the moment, we utilize this interface to also document the algorithmic design. |
| 72 | +type complianceCore interface { |
| 73 | + module.Startable |
| 74 | + module.ReadyDoneAware |
| 75 | + |
| 76 | + // OnBlockRange consumes an *untrusted* range of connected blocks (part of a fork). The originID parameter |
| 77 | + // identifies the node that sent the batch of blocks. The input `connectedRange` must be sequentially ordered |
| 78 | + // blocks that form a chain, i.e. connectedRange[i] is the parent of connectedRange[i+1]. Submitting a |
| 79 | + // disconnected batch results in an `ErrDisconnectedBatch` error and the batch is dropped (no-op). |
| 80 | + // Implementors need to ensure that this function is safe for use in a concurrent environment. |
| 81 | + // Caution: this method is allowed to block. |
| 82 | + // Expected errors during normal operations: |
| 83 | + // - cache.ErrDisconnectedBatch |
| 84 | + OnBlockRange(originID flow.Identifier, connectedRange []*flow.Block) error |
| 85 | + |
| 86 | + // OnFinalizedBlock prunes all blocks below the finalized view from the compliance layer's Cache |
| 87 | + // and PendingTree. |
| 88 | + // Caution: this method is allowed to block. |
| 89 | + // Implementors need to ensure that this function is safe for use in a concurrent environment. |
| 90 | + OnFinalizedBlock(finalized *flow.Header) |
| 91 | +} |