Stream state entries to service endpoints instead of materializing all in StartMessage #4344

@tillrohrmann

Problem

The StartMessage in the service protocol requires all state entries to be sent as part of the message. This means the server must materialize all state entries into memory before sending the message to the service endpoint.

With the recent changes in #275, we now read state lazily from storage (using a lazy iterator/stream). However, we still have to collect all entries into a Vec<StateEntry> before constructing the StartMessage:

// In service_protocol_runner.rs write_start()
let state_entries: Vec<StateEntry> = state_stream
    .map_ok(|(key, value)| StateEntry { key, value })
    .try_collect()
    .await?;

// All entries must be in memory before creating the message
new_start_message(..., state_entries, ...)
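For contrast, here is a minimal sketch of forwarding entries as they arrive instead of collecting them first. It is not the runner's actual code: a synchronous iterator stands in for the async state stream so the example needs only the standard library, `StateEntry` is a simplified stand-in type, and `forward_state` and the length-prefixed framing are hypothetical.

```rust
use std::io::{self, Write};

/// Simplified stand-in for the protocol's state entry type (assumption).
pub struct StateEntry {
    pub key: Vec<u8>,
    pub value: Vec<u8>,
}

/// Hypothetical helper: writes each entry to `sink` as soon as it is read,
/// so at most one entry is held in memory at a time.
/// Returns the number of entries forwarded.
pub fn forward_state<I, W>(entries: I, sink: &mut W) -> io::Result<usize>
where
    I: IntoIterator<Item = io::Result<StateEntry>>,
    W: Write,
{
    let mut count = 0;
    for entry in entries {
        // A storage error aborts the transfer instead of being buffered.
        let entry = entry?;
        write_len_prefixed(sink, &entry.key)?;
        write_len_prefixed(sink, &entry.value)?;
        count += 1;
    }
    Ok(count)
}

/// Illustrative framing only: a big-endian u32 length followed by the bytes.
fn write_len_prefixed<W: Write>(sink: &mut W, bytes: &[u8]) -> io::Result<()> {
    let len = u32::try_from(bytes.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "field too large"))?;
    sink.write_all(&len.to_be_bytes())?;
    sink.write_all(bytes)
}
```

With this shape, peak memory depends on the largest single entry rather than on the total state size. The current StartMessage layout rules this out, because the entries live inside the message itself.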

Impact

For services with large state (many keys or large values), this causes:

  1. Memory spikes - All state entries must be held in memory simultaneously
  2. Increased latency - Cannot start sending to the service until all state is loaded from storage
  3. Missed pipelining opportunity - Could otherwise stream from storage → network without buffering

Potential Solutions

Option 1: Protocol Extension - Separate State Messages

Add a new message type for streaming state entries after the initial StartMessage:

StartMessage (partial: true, state_count: N)
StateEntryMessage (key, value)  // repeated N times
...then journal entries...

This would allow:

  • StartMessage sent immediately with metadata
  • State entries streamed one-by-one as they're read from storage
  • True pipelining from storage to network
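A rough sketch of the frame sequence Option 1 implies, assuming a hypothetical `Frame` enum and `option1_frames` helper. Neither is part of the existing protocol; the field names mirror the outline above.

```rust
/// Hypothetical frame types for Option 1 (not part of the current protocol).
#[derive(Debug, PartialEq)]
pub enum Frame {
    /// Sent first; carries metadata only and announces how many entries follow.
    Start { partial: bool, state_count: u64 },
    /// One frame per state entry, produced as entries are read from storage.
    StateEntry { key: Vec<u8>, value: Vec<u8> },
}

/// Lazily yields the Start frame followed by one StateEntry frame per entry.
/// Nothing is buffered: each entry is pulled from `entries` only when the
/// consumer asks for the next frame.
pub fn option1_frames<I>(state_count: u64, entries: I) -> impl Iterator<Item = Frame>
where
    I: IntoIterator<Item = (Vec<u8>, Vec<u8>)>,
{
    std::iter::once(Frame::Start { partial: true, state_count }).chain(
        entries
            .into_iter()
            .map(|(key, value)| Frame::StateEntry { key, value }),
    )
}
```

One thing the sketch makes visible: `state_count` has to be known before the first entry is read. If counting is expensive, an explicit end-of-state marker could serve the same purpose. That is a design choice this sketch does not settle.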

Option 2: Chunked State in StartMessage

Allow multiple StartMessage frames, each containing a batch of state entries:

StartMessage (state_chunk: 1/3, entries: [...])
StartMessage (state_chunk: 2/3, entries: [...])
StartMessage (state_chunk: 3/3, entries: [...])
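The batching in Option 2 could look roughly like the sketch below. `StartChunk` and `Chunker` are hypothetical names. As an assumption, the sketch uses a chunk index plus an `is_last` flag instead of the `1/3` total shown above, since the total is not known while streaming.

```rust
use std::iter::Peekable;

/// Hypothetical chunked StartMessage payload (not part of the current protocol).
#[derive(Debug, PartialEq)]
pub struct StartChunk {
    pub index: usize,
    pub is_last: bool,
    pub entries: Vec<(Vec<u8>, Vec<u8>)>,
}

/// Groups a lazy entry iterator into chunks of at most `batch_size` entries.
pub struct Chunker<I: Iterator> {
    inner: Peekable<I>,
    batch_size: usize,
    next_index: usize,
    done: bool,
}

impl<I: Iterator<Item = (Vec<u8>, Vec<u8>)>> Chunker<I> {
    pub fn new(inner: I, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self { inner: inner.peekable(), batch_size, next_index: 0, done: false }
    }
}

impl<I: Iterator<Item = (Vec<u8>, Vec<u8>)>> Iterator for Chunker<I> {
    type Item = StartChunk;

    fn next(&mut self) -> Option<StartChunk> {
        if self.done {
            return None;
        }
        // Only one batch is held in memory at a time.
        let entries: Vec<_> = self.inner.by_ref().take(self.batch_size).collect();
        // Peek one entry ahead to learn whether this is the final chunk.
        let is_last = self.inner.peek().is_none();
        self.done = is_last;
        let chunk = StartChunk { index: self.next_index, is_last, entries };
        self.next_index += 1;
        Some(chunk)
    }
}
```

Memory use is bounded by `batch_size` entries plus the one peeked entry. An empty state still yields a single chunk with `is_last` set, so the receiver always sees at least one StartMessage.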

Considerations

  • Requires protocol version bump
  • SDK implementations would need updates to handle streaming state
  • Need to maintain backward compatibility with existing protocol versions
  • Weigh the added protocol complexity against how often large-state scenarios actually occur
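For backward compatibility, the server could choose the delivery mode per invocation from the negotiated protocol version. The sketch below is illustrative only: `StateDelivery` and `choose_delivery` are hypothetical, and the minimum streaming version is passed in as a parameter because no concrete version number has been decided here.

```rust
/// Hypothetical delivery modes for state entries.
#[derive(Debug, PartialEq)]
pub enum StateDelivery {
    /// Existing behaviour: all entries inside a single StartMessage.
    Inline,
    /// New behaviour: entries streamed after (or across) StartMessage frames.
    Streamed,
}

/// Falls back to inline delivery for endpoints that negotiated a protocol
/// version older than the one that introduces streaming.
pub fn choose_delivery(negotiated_version: u32, min_streaming_version: u32) -> StateDelivery {
    if negotiated_version >= min_streaming_version {
        StateDelivery::Streamed
    } else {
        StateDelivery::Inline
    }
}
```

Older SDKs would keep receiving the fully materialized StartMessage, so the memory and latency benefits would only apply to endpoints that have been upgraded.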
