[refactor] Simplify block structure by removing unnecessary serialization/deserialization

<html><head></head><body><h1>Simplify block structure by removing unnecessary serialization/deserialization</h1>
<h2>Summary</h2>
<p>The current block structure carries several fields as serialized byte arrays that are then deserialized on every access. This introduces unnecessary CPU overhead and code complexity. We should simplify the block structure by using typed fields directly and removing unused fields.</p>
<h2>Current Block Structure in Fabric</h2>
<pre><code class="language-protobuf">message Block {
  BlockHeader header = 1;
  BlockData data = 2;
  BlockMetadata metadata = 3;
}
message BlockHeader {
  uint64 number = 1;
  bytes previous_hash = 2;
  bytes data_hash = 3;
}
message BlockData {
  repeated bytes data = 1;  // each entry is a serialized Envelope
}
message Envelope {
  bytes payload = 1;    // serialized Payload
  bytes signature = 2;
}
message Payload {
  Header header = 1;
  bytes data = 2;
}
message Header {
  bytes channel_header = 1;    // serialized ChannelHeader
  bytes signature_header = 2;  // serialized SignatureHeader
}
message ChannelHeader {
  int32 type = 1;
  int32 version = 2;
  google.protobuf.Timestamp timestamp = 3;
  string channel_id = 4;
  string tx_id = 5;
  uint64 epoch = 6;
  bytes extension = 7;
  bytes tls_cert_hash = 8;
}
message SignatureHeader {
  bytes creator = 1;
  bytes nonce = 2;
}
</code></pre>
<p>Note how the structure is deeply nested with serialized bytes at every level: <code>Block.data</code> → <code>Envelope.payload</code> → <code>Payload.header.channel_header</code> / <code>Payload.header.signature_header</code>. Each layer requires a separate <code>proto.Unmarshal</code> call to access the typed fields within.</p>
<h2>Problem</h2>
<p>In the current implementation, various components of the block are stored as serialized <code>[]byte</code> within the protobuf message and must be deserialized each time they are accessed. For example:</p>
<ul>
<li><strong>Signature Header</strong>: The <code>SignatureHeader</code> is embedded as raw bytes in the <code>signature_header</code> field of the payload's <code>Header</code>. Every consumer of this field must unmarshal it before use, even though it could be stored as a typed struct directly.</li>
<li><strong>Channel Header</strong>: Similarly, the <code>ChannelHeader</code> is serialized into the <code>channel_header</code> bytes field of the same <code>Header</code>. Accessing the channel ID, transaction type, or timestamp requires deserializing it again at each call site.</li>
<li><strong>Envelope payload</strong>: The <code>Payload</code> inside each <code>Envelope</code> is carried as bytes, requiring deserialization at each processing stage (validation, commit, indexing, etc.).</li>
</ul>
<p>This pattern leads to:</p>
<ol>
<li><strong>Redundant CPU work</strong> — the same fields are deserialized multiple times across the transaction lifecycle (endorsement, ordering, validation, commit).</li>
<li><strong>Verbose boilerplate</strong> — every call site needs error-handling logic around <code>proto.Unmarshal</code> for what are conceptually direct field accesses.</li>
<li><strong>Increased GC pressure</strong> — repeated deserialization creates short-lived objects that add to garbage collection overhead.</li>
</ol>
<h2>Proposal</h2>
<ol>
<li><strong>Use typed structs instead of byte slices</strong> — Replace serialized byte fields (e.g., <code>SignatureHeader</code>, <code>ChannelHeader</code>) with their corresponding typed protobuf message fields in the internal block representation.</li>
<li><strong>Deserialize once at ingress</strong> — Parse the block fully when it is first received (e.g., at the orderer or committer ingress) and pass the fully typed structure through the pipeline.</li>
<li><strong>Remove unused fields</strong> — Audit the block and envelope structures for fields that are never read or are redundant, and remove them from the internal representation.</li>
<li><strong>Serialize only at egress</strong> — Re-serialize to the wire format only when the block needs to be persisted or transmitted over the network.</li>
</ol>
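<p>The ingress/egress split in steps 2 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not Fabric code: the <code>TypedEnvelope</code> type and the <code>ParseAtIngress</code> and <code>SerializeAtEgress</code> helpers are hypothetical names, and <code>encoding/json</code> stands in for protobuf only to keep the example dependency-free. A counter makes the number of wire decodes visible.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical typed, in-memory form of an envelope. The names mirror the
// Fabric messages but are illustrative, not the real generated types.
type SignatureHeader struct {
	Creator []byte
	Nonce   []byte
}

type ChannelHeader struct {
	ChannelID string
	TxID      string
}

type TypedEnvelope struct {
	ChannelHeader   ChannelHeader
	SignatureHeader SignatureHeader
	Data            []byte
	Signature       []byte
}

// decodeCalls counts wire decodes so the effect of parsing once is visible.
var decodeCalls int

// ParseAtIngress decodes the wire bytes exactly once.
// JSON is a stand-in for protobuf here (assumption made for this sketch).
func ParseAtIngress(wire []byte) (*TypedEnvelope, error) {
	decodeCalls++
	env := &TypedEnvelope{}
	if err := json.Unmarshal(wire, env); err != nil {
		return nil, fmt.Errorf("ingress decode: %w", err)
	}
	return env, nil
}

// SerializeAtEgress re-encodes only when the value is persisted or sent.
func SerializeAtEgress(env *TypedEnvelope) ([]byte, error) {
	return json.Marshal(env)
}

// Pipeline stages receive the typed value and never touch wire bytes.
func validate(env *TypedEnvelope) bool {
	return len(env.SignatureHeader.Creator) > 0
}

func index(env *TypedEnvelope) string {
	return env.ChannelHeader.ChannelID + "/" + env.ChannelHeader.TxID
}

func main() {
	wire, err := SerializeAtEgress(&TypedEnvelope{
		ChannelHeader:   ChannelHeader{ChannelID: "ch1", TxID: "tx42"},
		SignatureHeader: SignatureHeader{Creator: []byte("org1-peer")},
	})
	if err != nil {
		panic(err)
	}
	env, err := ParseAtIngress(wire)
	if err != nil {
		panic(err)
	}
	fmt.Println(validate(env), index(env), "decodes:", decodeCalls)
}
```

<p>Both stages run against the same parsed value, so the decode count stays at one regardless of how many stages are added.</p>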
<h2>Example</h2>
<p><strong>Before:</strong></p>
<pre><code class="language-go">// Every call site repeats this pattern
payload := &amp;cb.Payload{}
if err := proto.Unmarshal(envelope.Payload, payload); err != nil {
    return err
}
// ...
header := &amp;cb.SignatureHeader{}
if err := proto.Unmarshal(payload.Header.SignatureHeader, header); err != nil {
    return err
}
// ... finally use header.Creator, header.Nonce
</code></pre>
<p><strong>After:</strong></p>
<pre><code class="language-go">// Direct typed access — deserialized once at ingress
creator := envelope.Payload.Header.SignatureHeader.Creator
nonce := envelope.Payload.Header.SignatureHeader.Nonce
</code></pre>
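<p>One caveat with chained field access is that any unset intermediate message makes the expression panic. Go code generated by <code>protoc-gen-go</code> provides nil-safe <code>GetX()</code> accessors for this reason. The sketch below hand-writes that convention on stand-in types (the struct definitions here are assumptions for illustration, not the real generated code) to show how a typed access chain can stay safe on partially populated envelopes.</p>

```go
package main

import "fmt"

// Hand-written stand-ins for the typed messages in the "After" example.
type SignatureHeader struct {
	Creator []byte
	Nonce   []byte
}

type Header struct {
	SignatureHeader *SignatureHeader
}

type Payload struct {
	Header *Header
}

type Envelope struct {
	Payload *Payload
}

// Nil-safe getters in the style emitted by protoc-gen-go: each returns the
// zero value when the receiver is nil, so chains never dereference nil.
func (e *Envelope) GetPayload() *Payload {
	if e == nil {
		return nil
	}
	return e.Payload
}

func (p *Payload) GetHeader() *Header {
	if p == nil {
		return nil
	}
	return p.Header
}

func (h *Header) GetSignatureHeader() *SignatureHeader {
	if h == nil {
		return nil
	}
	return h.SignatureHeader
}

func (s *SignatureHeader) GetCreator() []byte {
	if s == nil {
		return nil
	}
	return s.Creator
}

func main() {
	full := &Envelope{Payload: &Payload{Header: &Header{
		SignatureHeader: &SignatureHeader{Creator: []byte("org1")},
	}}}
	empty := &Envelope{} // no payload set

	fmt.Printf("%q\n", full.GetPayload().GetHeader().GetSignatureHeader().GetCreator())
	fmt.Println(empty.GetPayload().GetHeader().GetSignatureHeader().GetCreator() == nil)
}
```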
<h2>Expected Benefits</h2>
<ul>
<li>Reduced CPU usage from eliminating redundant <code>proto.Unmarshal</code> calls across the transaction pipeline</li>
<li>Simpler, more readable code with direct field access</li>
<li>Lower GC pressure from fewer intermediate allocations</li>
<li>Smaller internal block representation after removing unused fields</li>
</ul>
<h2>Backward-Compatible Evolution via <code>oneof</code></h2>
<p>If we need to introduce a new <code>SignatureHeader</code> type (e.g., one with typed fields or additional metadata), we could potentially leverage protobuf's <code>oneof</code> to do so without breaking backward compatibility:</p>
<pre><code class="language-protobuf">// Current Header definition (see above)
message Header {
  bytes channel_header = 1;    // serialized ChannelHeader
  bytes signature_header = 2;  // serialized SignatureHeader
}
// Proposed Header with oneof for backward-compatible evolution
// Since Header only has fields 1 and 2, field 3 is safe to use.
// ⚠️  For messages with more fields, always verify the next unused field number.
message Header {
  bytes channel_header = 1;    // existing: serialized ChannelHeader
  oneof signature_header_type {
    bytes signature_header = 2;                // existing: serialized SignatureHeader
    SignatureHeaderV2 signature_header_v2 = 3; // new: typed, uses next available field number
  }
}
</code></pre>
<p><strong>How this could work:</strong></p>
<ul>
<li>Existing nodes that only understand the old format would continue reading field 2 (<code>signature_header</code> bytes) and deserializing as before.</li>
<li>Newer nodes would populate and read field 3 (<code>signature_header_v2</code>) directly, avoiding the serialize/deserialize overhead.</li>
<li>Since <code>oneof</code> fields are mutually exclusive, only one representation is present on the wire at a time — no storage overhead from carrying both.</li>
<li>The same <code>oneof</code> pattern could also be applied to <code>channel_header</code> (field 1) in the future if needed.</li>
</ul>
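<p>On the reading side, a <code>oneof</code> surfaces in Go as an interface-typed field with one wrapper type per member, which a consumer resolves with a type switch. The sketch below hand-writes that shape for the proposed <code>Header</code>. All type names, <code>ResolveSignatureHeader</code>, and the <code>toyDecode</code> placeholder (standing in for <code>proto.Unmarshal</code> of the legacy bytes) are assumptions for illustration.</p>

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// SignatureHeaderV2 is the hypothetical typed replacement.
type SignatureHeaderV2 struct {
	Creator []byte
	Nonce   []byte
}

// isSignatureHeaderType mirrors the marker interface generated for a oneof.
type isSignatureHeaderType interface{ isSignatureHeaderType() }

// One wrapper per oneof member: field 2 (legacy bytes) and field 3 (typed).
type HeaderSignatureHeader struct{ SignatureHeader []byte }
type HeaderSignatureHeaderV2 struct{ SignatureHeaderV2 *SignatureHeaderV2 }

func (*HeaderSignatureHeader) isSignatureHeaderType()   {}
func (*HeaderSignatureHeaderV2) isSignatureHeaderType() {}

type Header struct {
	ChannelHeader       []byte
	SignatureHeaderType isSignatureHeaderType
}

var ErrNoSignatureHeader = errors.New("header carries no signature header")

// ResolveSignatureHeader returns a typed header whichever member is set.
// Only the legacy branch pays the deserialization cost.
func ResolveSignatureHeader(
	h *Header,
	decodeLegacy func([]byte) (*SignatureHeaderV2, error),
) (*SignatureHeaderV2, error) {
	if h == nil {
		return nil, ErrNoSignatureHeader
	}
	switch v := h.SignatureHeaderType.(type) {
	case *HeaderSignatureHeaderV2:
		if v.SignatureHeaderV2 == nil {
			return nil, ErrNoSignatureHeader
		}
		return v.SignatureHeaderV2, nil
	case *HeaderSignatureHeader:
		return decodeLegacy(v.SignatureHeader)
	default:
		return nil, ErrNoSignatureHeader
	}
}

// toyDecode is a placeholder for proto.Unmarshal; it parses "creator|nonce".
func toyDecode(b []byte) (*SignatureHeaderV2, error) {
	creator, nonce, ok := bytes.Cut(b, []byte("|"))
	if !ok {
		return nil, errors.New("malformed legacy signature header")
	}
	return &SignatureHeaderV2{Creator: creator, Nonce: nonce}, nil
}

func main() {
	legacy := &Header{SignatureHeaderType: &HeaderSignatureHeader{
		SignatureHeader: []byte("org1|n0"),
	}}
	typed := &Header{SignatureHeaderType: &HeaderSignatureHeaderV2{
		SignatureHeaderV2: &SignatureHeaderV2{Creator: []byte("org2")},
	}}
	for _, h := range []*Header{legacy, typed} {
		sh, err := ResolveSignatureHeader(h, toyDecode)
		fmt.Printf("%q %v\n", sh.Creator, err)
	}
}
```

<p>Funnelling every consumer through one resolver like this keeps the dual-format handling in a single place during the transition.</p>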
<p><strong>⚠️ Critical: New oneof members must use unused field numbers.</strong> If the existing message has additional fields beyond the <code>oneof</code> candidates, the new oneof member <strong>cannot</strong> reuse an occupied field number. Protobuf will either reject the descriptor outright ("duplicate field number") or, when old and new nodes run different schema versions, cause <strong>silent data corruption</strong> — the old node would misinterpret the serialized embedded message bytes as the original field's data. The safe approach is to always use the next available unused field number.</p>
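<p>The difference between a safe and a colliding field number can be reproduced at the wire level without any protobuf library. The sketch below hand-encodes length-delimited fields (tag = field number shifted left by 3, OR-ed with wire type 2) and parses them the way a node compiled against the current <code>Header</code> schema would. It is deliberately simplified: it handles only length-delimited fields, and the layout of <code>SignatureHeaderV2</code> (creator = 1, nonce = 2) is an assumption.</p>

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// appendBytesField appends one length-delimited field (wire type 2).
func appendBytesField(dst []byte, fieldNum int, value []byte) []byte {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], uint64(fieldNum)<<3|2)
	dst = append(dst, buf[:n]...)
	n = binary.PutUvarint(buf[:], uint64(len(value)))
	dst = append(dst, buf[:n]...)
	return append(dst, value...)
}

// OldHeader is what a node built against the current schema sees.
type OldHeader struct {
	ChannelHeader   []byte // field 1
	SignatureHeader []byte // field 2
}

// parseOldHeader decodes with the old schema. Unknown field numbers are
// skipped, and a repeated occurrence of a known field overwrites the
// earlier one, matching protobuf's last-one-wins rule for scalar fields.
func parseOldHeader(wire []byte) (OldHeader, error) {
	var h OldHeader
	for len(wire) > 0 {
		tag, n := binary.Uvarint(wire)
		if n <= 0 || tag&7 != 2 {
			return h, errors.New("sketch handles only length-delimited fields")
		}
		wire = wire[n:]
		size, n := binary.Uvarint(wire)
		if n <= 0 || size > uint64(len(wire)-n) {
			return h, errors.New("truncated field")
		}
		value := wire[n : n+int(size)]
		wire = wire[n+int(size):]
		switch tag >> 3 {
		case 1:
			h.ChannelHeader = value
		case 2:
			h.SignatureHeader = value
		}
	}
	return h, nil
}

func main() {
	// Assumed V2 layout: creator = 1, nonce = 2.
	v2 := appendBytesField(appendBytesField(nil, 1, []byte("org1")), 2, []byte("n0"))
	channel := []byte("serialized-channel-header")

	safe := appendBytesField(appendBytesField(nil, 1, channel), 3, v2)  // unused number
	clash := appendBytesField(appendBytesField(nil, 1, channel), 1, v2) // reuses field 1

	for _, wire := range [][]byte{safe, clash} {
		h, err := parseOldHeader(wire)
		fmt.Printf("channel=%q signature=%q err=%v\n", h.ChannelHeader, h.SignatureHeader, err)
	}
}
```

<p>With field 3 the old parser keeps the channel header intact and reports an empty signature header. With the reused field 1 it returns the V2 bytes as the channel header and raises no error, which is the silent corruption described above.</p>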
<p><strong>Considerations:</strong></p>
<ul>
<li><strong>Wire compatibility is confirmed for the old → new direction</strong>: Old nodes produce data using field 2 (bytes), and new nodes recognize it by its field number, since a <code>oneof</code> has no separate discriminator on the wire. New nodes can also continue writing the old bytes format for full backward compatibility.</li>
<li><strong>New V2 → old nodes requires gating</strong>: When a new node writes using <code>signature_header_v2</code> (field 3), old nodes see an <strong>empty</strong> <code>signature_header</code> — the new field is unknown to them and silently ignored. This means a capability flag or channel configuration update is <strong>required</strong> to gate when the new format is used during rolling upgrades.</li>
<li><strong>Field number safety varies by message</strong>: For <code>Header</code> specifically, field 3 is safe since only fields 1 and 2 exist today. When applying this pattern to other messages (e.g., <code>ChannelHeader</code> which has fields up to 8), care must be taken to use a truly unused field number.</li>
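<li><strong>Ledger persistence</strong>: If blocks are persisted to the ledger in the new format, any binary that reads the ledger after the upgrade must handle both variants, because historical blocks still carry field 2 while newer blocks carry field 3. Older binaries cannot interpret field 3 at all. This may require a migration strategy or dual-write period.</li>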
</ul>
<p>This approach is viable but requires a clear upgrade path and capability gating before it can be adopted.</p>
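<p>The gating requirement can be reduced to a single decision point on the write path. The following is a minimal sketch, assuming a hypothetical <code>TypedSignatureHeader</code> capability flag derived from channel configuration. The flag name and the types are illustrative and do not correspond to an existing Fabric capability.</p>

```go
package main

import "fmt"

// SignatureHeaderFormat names the two wire representations of the oneof.
type SignatureHeaderFormat int

const (
	FormatLegacyBytes SignatureHeaderFormat = iota // field 2, serialized bytes
	FormatTypedV2                                  // field 3, typed message
)

func (f SignatureHeaderFormat) String() string {
	if f == FormatTypedV2 {
		return "typed-v2"
	}
	return "legacy-bytes"
}

// ChannelCapabilities is a hypothetical view of the channel configuration.
// The flag would be enabled only after every node has been upgraded.
type ChannelCapabilities struct {
	TypedSignatureHeader bool
}

// ChooseWriteFormat defaults to the legacy format so that a mixed-version
// network never sees field 3 before the capability is switched on.
func ChooseWriteFormat(caps ChannelCapabilities) SignatureHeaderFormat {
	if caps.TypedSignatureHeader {
		return FormatTypedV2
	}
	return FormatLegacyBytes
}

func main() {
	fmt.Println(ChooseWriteFormat(ChannelCapabilities{}))
	fmt.Println(ChooseWriteFormat(ChannelCapabilities{TypedSignatureHeader: true}))
}
```

<p>Readers do not need the flag, since they can accept both members of the <code>oneof</code> at all times. Only writers are gated.</p>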
<h3>Compatibility Matrix</h3>

<table>
<thead>
<tr><th>Scenario</th><th>Result</th></tr>
</thead>
<tbody>
<tr><td>Old node writes → New node reads</td><td>✅ Fully compatible: new node reads bytes via <code>oneof</code></td></tr>
<tr><td>New node writes V2 → Old node reads</td><td>⚠️ Old node sees empty <code>signature_header</code>: V2 field is unknown</td></tr>
<tr><td>New node writes old format → Old node reads</td><td>✅ Fully compatible: new node can still produce old wire format</td></tr>
<tr><td>Oneof mutual exclusivity</td><td>✅ Setting one field correctly clears the other</td></tr>
<tr><td>Field number collision (reusing existing field number for oneof)</td><td>❌ Silent data corruption: old node misinterprets V2 bytes as the original field</td></tr>
<tr><td>Safe approach (unused field number for oneof)</td><td>✅ All existing fields preserved, V2 invisible to old nodes</td></tr>
</tbody>
</table>

[refactor] Simplify block structure by removing unnecessary serialization/deserialization #76
