Skip to content

Gossip: Refactor stake ci to better handle Gossip delta updates #7391

@ravyu-jump

Description

@ravyu-jump

fd_stake_ci is used to track leader schedules with a weighted sampler. Gossip's role is to update the destination address for staked nodes, and also keep track of unstaked nodes active in the cluster. Frankendancer/Agave (for which stake_ci's APIs were written for) would publish all destinations in a single broadcast at a set interval. stake_ci would generate a new fd_shred_dest object any time a change to unstaked nodes was detected between two broadcasts. This is a costly operation that involves:

  • sorting unstaked pubkeys in place
  • initializing a fd_wsample for all staked keys, an nlog5(n) operation
    Making it an O(nlog(n) + mlog(m)) operation where n is the number of staked nodes and m is unstaked.

Firedancer's Gossip edge-triggered update model publishes updates to individual contact info entries (both insertion and removal) as they happen. Time constraints led us to retrofit this regime into the existing fd_stake_ci APIs with little modifications, introducing two shims, to deal with new and dropped contact infos. Both trigger the (expensive) regeneration of fd_shred_dest. This means anytime a new unstaked contact info is added or dropped, we undergo a O(nlog(n) + mlog(m)) operation.

This problem is exacerbated by the fact that gossip publishes contact info updates from bootup, even before stake information is extracted from snapshot loading. So there is a window where all nodes are "unstaked" in the beginning. Therefore all new contact infos observed in this window will trigger the regeneration. Therefore this O(mlog(m)) operation (and note, all nodes are "unstaked" nodes) will be called m times during this window.

#7393 further exacerbates this with a surge of contact info remove updates (which also trigger a regeneration) when stake info is received

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions