feat: Multi-Worker Support — PR 1: Configuration #554

@grantkee

Description

Problem

The node currently hardcodes a single worker per validator (worker_id = 0). The EpochManager, network layer, and several initialization paths assume exactly one worker. This blocks the ability to run independent fee markets, specialized transaction pools, or any form of worker-level parallelism.

Goal

Refactor the node to support N independent workers per validator. Each worker operates as a standalone unit with its own:

  • libp2p swarm (dedicated gossip topics, listen address, network key)
  • RPC server (unique port)
  • Transaction pool
  • Batch builder + batch validator
  • LocalNetwork instance for primary communication

Workers share only the Primary (consensus) and the execution engine (block production). The num_workers count is a consensus-level parameter — all validators must agree on it.

Why

The immediate motivation is multiple fee markets. Once multi-worker is in place, a follow-up (Phase 2) spawns 2 workers by default:

  • Worker 0 (General): accepts all transactions, standard EIP-1559 fee market
  • Worker 1 (Whitelisted Transfers): accepts only whitelisted ERC-20 transfer/transferFrom calls, operates with a reduced base fee

This architecture also enables future process separation — workers can be extracted into standalone processes communicating with the primary over RPC. They can even support multiple execution environments.

Design Constraints

  1. Workers are fully independent — no cross-worker shared state. Each worker has its own network identity, pool, and gossip topics.
  2. Per-worker gossip topics — tn-worker-{id} and tn-txn-{id} replace the current global tn-worker and tn-txn topics. This provides network-level isolation.
  3. Per-worker LocalNetwork — each worker gets its own LocalNetwork instance for primary communication. The primary registers as the handler on every worker's LocalNetwork. This is the seam for future process separation.
  4. num_workers is a consensus parameter — changing it requires a coordinated upgrade across all validators. Defaults to 1 for backward compatibility.
  5. Execution engine is shared — batches from all workers are processed sequentially by the same engine. Worker ID is already encoded in the block difficulty field.
  6. Faucet on worker 0 only — the testnet faucet attaches to the general-purpose worker.

Current State

Much of the infrastructure already supports N workers but is only called with worker_id = 0:

  • ExecutionNodeInner.workers: Vec<WorkerComponents> — vec exists, only 1 element
  • GasAccumulator — supports N workers internally, but initialized with new(1)
  • BatchValidator — already stores worker_id and rejects mismatched batches
  • adjust_base_fees() — loops over num_workers() but is a no-op
  • Block difficulty field — already encodes batch_index << 16 | worker_id
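The encoding in the last bullet can be sketched with a pair of hypothetical helpers (the real value lives in the block header, not a bare u64; names here are illustrative):

```rust
/// Illustrative helpers for the existing difficulty encoding described above:
/// the low 16 bits carry the worker id, the higher bits carry the batch index.
pub fn encode_difficulty(batch_index: u64, worker_id: u16) -> u64 {
    (batch_index << 16) | worker_id as u64
}

pub fn decode_difficulty(difficulty: u64) -> (u64, u16) {
    (difficulty >> 16, (difficulty & 0xFFFF) as u16)
}
```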

Hardcoded locations that block multi-worker:

| Location | Current | Fix |
| --- | --- | --- |
| manager.rs spawn_worker_node_components() | let worker_id = 0; | Loop over 0..num_workers |
| manager.rs GasAccumulator::new(1) | Hardcoded 1 worker | Use num_workers |
| manager.rs catchup_accumulator() | gas_accumulator.base_fee(0) | Restore per-worker base fees |
| manager.rs EpochManager struct | Singular worker_network_handle | Vec<WorkerNetworkHandle> |
| manager.rs create_consensus() | Returns (PrimaryNode, WorkerNode) | Returns (PrimaryNode, Vec<WorkerNode>) |
| config/genesis.rs NodeP2pInfo | Single worker: NetworkInfo | workers: Vec<NetworkInfo> |
| config/node.rs Parameters | No num_workers field | Add num_workers: u16 (default 1) |
| config/network.rs | Global topics tn-worker, tn-txn | Per-worker tn-worker-{id}, tn-txn-{id} |
| config/consensus.rs | Single LocalNetwork | Vec<LocalNetwork> |
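Most of the manager.rs rows reduce to replacing the hardcoded id with a loop. A minimal sketch, with WorkerComponents standing in for the real per-worker bundle (swarm, RPC server, pool, batch builder):

```rust
// Hypothetical stand-in for the real per-worker component bundle.
pub struct WorkerComponents {
    pub worker_id: u16,
}

// Sketch of the spawn_worker_node_components() fix: iterate over
// 0..num_workers instead of `let worker_id = 0;`.
pub fn spawn_worker_node_components(num_workers: u16) -> Vec<WorkerComponents> {
    (0..num_workers)
        .map(|worker_id| WorkerComponents { worker_id })
        .collect()
}
```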

This PR: Configuration (foundation — must land first)

This PR lays the configuration foundation that all subsequent multi-worker PRs build on. It changes data structures and configuration to support multiple workers while defaulting to 1 (preserving current behavior).

Scope

crates/config/src/genesis.rs — NodeP2pInfo:

  • Change NodeP2pInfo.worker from NetworkInfo to workers: Vec<NetworkInfo> (one entry per worker)
  • Update NodeInfo::worker_network_address() to take a WorkerId parameter
  • Update NodeInfo::worker_network_key() to take a WorkerId parameter
  • Add NodeInfo::num_workers() -> usize that returns self.p2p_info.workers.len()
  • Update Default impl to create a single-element vec (backward compatible)
  • Update NetworkGenesis::validate() to ensure all validators have the same number of workers
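A rough sketch of the reshaped types following the bullets above, with NetworkInfo reduced to a placeholder (the real struct carries addresses and keys) and the validation check shown as a free function:

```rust
pub type WorkerId = u16;

// Placeholder for the real per-worker address/key bundle.
pub struct NetworkInfo;

pub struct NodeP2pInfo {
    // One entry per worker (previously a single `worker: NetworkInfo` field).
    pub workers: Vec<NetworkInfo>,
}

impl Default for NodeP2pInfo {
    // A single-element vec keeps old single-worker configs valid.
    fn default() -> Self {
        Self { workers: vec![NetworkInfo] }
    }
}

pub struct NodeInfo {
    pub p2p_info: NodeP2pInfo,
}

impl NodeInfo {
    pub fn num_workers(&self) -> usize {
        self.p2p_info.workers.len()
    }

    // Accessors like worker_network_address()/worker_network_key() now take
    // a WorkerId; sketched here as a single lookup.
    pub fn worker_network_info(&self, worker_id: WorkerId) -> Option<&NetworkInfo> {
        self.p2p_info.workers.get(worker_id as usize)
    }
}

// Sketch of the NetworkGenesis::validate() addition: every validator must
// expose the same number of workers.
pub fn validate_same_worker_count(nodes: &[NodeInfo]) -> bool {
    let mut counts = nodes.iter().map(|n| n.num_workers());
    match counts.next() {
        Some(first) => counts.all(|c| c == first),
        None => true,
    }
}
```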

crates/config/src/node.rs — Parameters:

  • Add num_workers: u16 field with #[serde(default = "default_num_workers")] where default_num_workers() -> u16 { 1 }
  • All nodes must agree on this value; changing it requires a coordinated upgrade

crates/config/src/node.rs — Config:

  • Add convenience method Config::num_workers() -> u16 delegating to self.parameters.num_workers
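A sketch of the new field and delegate, with the serde attribute shown as a comment (derive machinery omitted for brevity):

```rust
// In the real code the field carries #[serde(default = "default_num_workers")]
// so configs written before this change still deserialize.
pub fn default_num_workers() -> u16 {
    1
}

pub struct Parameters {
    pub num_workers: u16,
}

impl Default for Parameters {
    fn default() -> Self {
        Self { num_workers: default_num_workers() }
    }
}

pub struct Config {
    pub parameters: Parameters,
}

impl Config {
    // Convenience delegate described in the bullet above.
    pub fn num_workers(&self) -> u16 {
        self.parameters.num_workers
    }
}
```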

crates/config/src/network.rs — LibP2pConfig topics:

  • Change worker_batch_topic() to worker_batch_topic(worker_id: WorkerId) -> String returning format!("tn-worker-{worker_id}")
  • Change worker_txn_topic() to worker_txn_topic(worker_id: WorkerId) -> String returning format!("tn-txn-{worker_id}")
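The topic helpers after the change, sketched from the two bullets above:

```rust
pub type WorkerId = u16;

// Per-worker topic names replacing the global `tn-worker` / `tn-txn` topics.
pub fn worker_batch_topic(worker_id: WorkerId) -> String {
    format!("tn-worker-{worker_id}")
}

pub fn worker_txn_topic(worker_id: WorkerId) -> String {
    format!("tn-txn-{worker_id}")
}
```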

Genesis / test fixtures:

  • Update YAML genesis and committee files to use the new vec format for workers
  • Update crates/test-utils-committee/src/builder.rs to generate per-worker network info
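The fixture change might look like the following YAML excerpt (field names and values are illustrative; the real files carry full multiaddrs and keys):

```yaml
# Hypothetical excerpt of one validator entry in a genesis/committee file.
p2p_info:
  # before: a single `worker:` mapping
  # after: a `workers:` sequence, one entry per worker
  workers:
    - network_address: "/ip4/127.0.0.1/udp/49584/quic-v1"
      network_key: "..."
```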
