Tree column

Introduce a new column type: `MultiTree` intended for more efficient storage of blockchain state tries. 

### Design goals:

Allow storing arbitrary trie structures in the database. Each tree is identified by a root key. There may be arbitrary number of roots. The column should optimize disk layout for tree traversal.

### The API.

``` rust
type NodeAddress = u64;
type Children = Vec<NodeAddress>;
```

`NodeAddress` is an opaque ID that allows for quick node retrieval. These IDs are defined by the database and will most likely be just `Table::Address` 
 
`Operation` enum is extended with two additional tree operations.
```rust
pub enum Operation<Key, Value> {
    InsertTree { root: Key, root: NewNode }, 
    RemoveTree { root: Key },
//...
}

pub enum NodeRef {
    New(NewNode),
    Existing(NodeAddress),
}

struct NewNode {
    data: Vec<u8>,
    children: Vec<NodeRef>, // May reference the node in the same commit or an existing node.
}
```
Querying a root:
``` rust
fn get_root(col: ColId, key: Key) -> Result<Option<(Vec<u8>, Children)>>;
```
Querying non-root nodes:
``` rust
fn get_node(col: ColId, node_address:  NodeAddress) -> Result<Option<(Vec<u8>, Children)>>;
```

### Internal structure.

Hash index will be used for root nodes only. Nodes are stored in value tables. For Each node we store
* A list of up to `N` pairs of `(NodeKey, NodeAddress)`. `N` is a compile time constant = 16.
* User data.

### Removing trees.

There are two possible approches to implementing tree removal

1. Journaling. For each root the database records all added value `NodeAddresses`. On deletion these are dereferenced and the journal record is deleted as well. These records may be stored in a separate value table.
This should be faster overall, but potentially waste space for large tree sets.
2. Reference counting. For each node we store a counter of all the parent nodes that reference it directly. When a node is deleted, we read all direct children and dereference them. The downside here is that inserting or removing a node would require updating up to 16 children nodes, potentially blowing up the IO.

### Background optimization.

The database will periodically resettle nodes within value tables to optimize on-disk tree layout. The idea here that iterating a recent trie root should result in sequential IO.

### Potential issues:

Data duplicaton. The design as proposed here has no node index. If the same value node is inserted into 2 different paths of the same tree, it will be stored twice. If this turns out to be an issue we'd need to add each node to the index and use reference counting.
We need to collect statistics for existing chains to estimate the impact for this.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tree column #199

Design goals:

The API.

Internal structure.

Removing trees.

Background optimization.

Potential issues:

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Tree column #199

Description

Design goals:

The API.

Internal structure.

Removing trees.

Background optimization.

Potential issues:

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions