Store to wire streaming. #683

@evoskuil

I've also given some thought to wire serialization of blocks/txs from the store. While performance is excellent, it's astounding how much translation is taking place. The store spans 8 tables and two hashmap headers (header & txs) to retrieve a block by hash, and 6 tables and one hashmap header (tx) to retrieve a tx by hash independent of a block. It then deserializes from store encoding to the native chain:: object model, with all objects, down to op.data, stored on std::shared_ptr<> instances. Then, to serialize, the entire object model is walked on a system::ostream to create a single buffer. For text (hex) encoding the buffer is then translated to hex using system::base16_encode. For json the object model is walked directly by boost::json.

For wire/text there is really no need whatsoever to translate through the normal form (object model). The wire serialization can be trivially produced directly onto the stream from store serialization. This would bypass countless allocations and copies. Scripts and witnesses, for example, which make up the bulk of all data, do not even require parsing; they can just be written to the stream. (Though it would certainly be cleaner if not for the tragic segwit decision to serialize witnesses independently from inputs.) I'd estimate that this would cut large block deserialization time in half, and on my old machine that would put it under 20ms.
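As a rough illustration of writing a script to the stream without parsing it, here is a minimal sketch. It assumes the store hands back the raw script body as a byte vector (the actual store encoding is not described here), and `write_compact_size` and `stream_script` are hypothetical names, not libbitcoin API. The only wire detail relied on is that a script is serialized as a variable-length integer size prefix followed by its bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write a Bitcoin variable-length integer (size prefix), little-endian.
inline void write_compact_size(std::ostream& out, uint64_t value)
{
    // Emit the low 'bytes' bytes of v, least significant first.
    const auto put_le = [&out](uint64_t v, int bytes)
    {
        for (int i = 0; i < bytes; ++i)
            out.put(static_cast<char>((v >> (8 * i)) & 0xff));
    };

    if (value < 0xfd)
    {
        put_le(value, 1);
    }
    else if (value <= 0xffff)
    {
        out.put(static_cast<char>(0xfd));
        put_le(value, 2);
    }
    else if (value <= 0xffffffff)
    {
        out.put(static_cast<char>(0xfe));
        put_le(value, 4);
    }
    else
    {
        out.put(static_cast<char>(0xff));
        put_le(value, 8);
    }
}

// Hypothetical helper: 'stored' is assumed to be the raw script body as read
// from a store record. The bytes are copied through without being parsed into
// operations and without any intermediate object.
inline void stream_script(std::ostream& out, const std::vector<uint8_t>& stored)
{
    write_compact_size(out, stored.size());
    out.write(reinterpret_cast<const char*>(stored.data()),
        static_cast<std::streamsize>(stored.size()));
}
```

The same pass-through shape would apply to witness stack elements, each of which is also a size-prefixed byte string on the wire.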

Implementation for this is just a set of store queries that operate over the stream, and swapping out the get block/tx calls that then serialize to wire with ones that just return the stream buffer. Blocks are stored with their serialized size (witnessed if enabled on the node) in the txs association table, so system::stream can be used over a preallocated contiguous buffer. But eventually this could even just stream write out to the port in chunks. It's possible to write the message heading without reading the object, since we always have the size indexed. So stream directly from store into the port with zero translation of scripts, witnesses, and point hashes, and the rest is just organizing the pieces. I'd estimate 2-3 days of work for the whole thing to serialize blocks and txs directly from store to wire, with the store queries taking no more than 1 day.
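To make the preallocated buffer idea concrete, here is a minimal sketch of a writer over a buffer allocated once from a known serialized size. `fixed_writer` is a hypothetical stand-in for illustration only; it is not system::stream, and the pieces passed to `write` stand in for whatever the store queries would return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal writer over a buffer allocated once from a known serialized size.
class fixed_writer
{
public:
    explicit fixed_writer(std::size_t size)
      : buffer_(size), offset_(0), valid_(true)
    {
    }

    // Append raw bytes. On overflow, mark the writer invalid and write nothing.
    void write(const uint8_t* data, std::size_t length)
    {
        if (!valid_ || length > buffer_.size() - offset_)
        {
            valid_ = false;
            return;
        }

        if (length != 0)
            std::memcpy(buffer_.data() + offset_, data, length);

        offset_ += length;
    }

    // True only if every write fit and the buffer was filled exactly.
    bool complete() const
    {
        return valid_ && offset_ == buffer_.size();
    }

    const std::vector<uint8_t>& buffer() const
    {
        return buffer_;
    }

private:
    std::vector<uint8_t> buffer_;
    std::size_t offset_;
    bool valid_;
};
```

Checking `complete()` after the last piece is written gives a cheap consistency test: if the indexed size and the bytes actually produced disagree, the result is rejected rather than sent.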
