Non-validator node stuck in bootstrap loop — P2P state transfer always times out (829MB, 60s timeout) #143

@Oldman-beginer

Problem

My non-validator node crashed with a "too many blocks to request" panic and lost visor_abci_state.json. It is now stuck
in an infinite bootstrap loop: every peer times out before completing the 829MB state transfer.

Environment

  • Ubuntu 24.04 LTS, 64GB RAM, NVMe SSD
  • EU-based server (94.x.x.x)
  • Internet: >100 Mbps (not the bottleneck)
  • hl-visor: 4a5921208e0c8d246d26aeaa3958557ee538d48b (2026-02-28)
  • hl-node: 105cb1dc0129b99e8a81d16b8cc4118749a0ed37 (2026-03-21)
  • try_new_peers: true, 63 peers configured

What happens

  1. Node connects to a peer and starts downloading abci_state (829MB)
  2. Transfer speed is ~1-1.3 MB/s from all peers
  3. After ~60 seconds → tcp_read_exact: deadline has elapsed (downloaded only ~60MB of 829MB)
  4. Next attempt → same peer says Rate limited by peer
  5. Moves to next peer → same timeout
  6. Loop continues indefinitely across 100+ peers

Logs

WARN >>> hl-node @@ considering local abci state as stale @@ [stale_reason: "local height is too far behind"]
WARN >>> hl-node @@ connecting to peer: Ip(x.x.x.x)                                                                          
WARN >>> hl-node @@ reading bytes for abci_stream recv greeting: 52000000/829967179
WARN >>> hl-node @@ could not read abci state from x.x.x.x: lu::timeout tcp_read_exact: deadline has elapsed                 
WARN >>> hl-node @@ Rate limited by peer                                                                                     
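
To make the progress line above easier to read at a glance, here is a minimal sketch that extracts the byte counters and prints the fraction received before the timeout. The regular expression and the parse_progress helper are my own illustration based only on the log format shown here; they are not part of hl-node or hl-visor.

```python
import re

# Matches the "<received>/<total>" counters in the progress line shown above.
# The pattern is an assumption derived from this one log sample.
PROGRESS_RE = re.compile(r"recv greeting: (\d+)/(\d+)")


def parse_progress(line):
    """Return (received_bytes, total_bytes), or None if the line has no counters."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "WARN >>> hl-node @@ reading bytes for abci_stream "
    "recv greeting: 52000000/829967179"
)
received, total = parse_progress(sample)
print(f"received {received / total:.1%} of {total / 1e6:.0f} MB")
```

Running this over the full log makes it easy to confirm that no attempt gets much past the first few percent before the deadline.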

What I've tried

  • 63 different peers (EU, US, JP, SG): all give ~1 MB/s, all time out
  • Helsinki Hetzner peers (65.109.x.x): always return early eof, never serve state
  • Restoring from periodic_abci_states (height 939600000): node says "too far behind" and forces a full bootstrap
  • override_gossip_config.json with try_new_peers: true: discovers 100+ peers, none fast enough
  • Running for 10+ hours: no successful bootstrap

The math

829MB ÷ 60s timeout = ~14 MB/s required. All peers serve at ~1-1.3 MB/s, which would need roughly 11-14 minutes for the
full transfer. This means bootstrap cannot complete from my location unless:

  • A peer serves faster than 14 MB/s, or
  • The timeout is increased, or
  • State can be downloaded via HTTP/S3 instead of P2P
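
The arithmetic above can be checked with a short sketch. It uses only the figures reported in this issue (829MB state, 60s timeout, ~1-1.3 MB/s observed); the function names are hypothetical helpers, not anything from hl-node.

```python
def required_rate_mb_s(size_mb, timeout_s):
    """Minimum sustained throughput needed to finish within the timeout."""
    return size_mb / timeout_s


def transfer_time_s(size_mb, rate_mb_s):
    """Time needed to move size_mb at a constant rate_mb_s."""
    return size_mb / rate_mb_s


SIZE_MB = 829     # abci_state size reported in the logs
TIMEOUT_S = 60    # observed stream deadline

print(f"required: {required_rate_mb_s(SIZE_MB, TIMEOUT_S):.1f} MB/s")

# Observed peer speeds from "What happens", step 2.
for rate in (1.0, 1.3):
    minutes = transfer_time_s(SIZE_MB, rate) / 60
    print(f"at {rate} MB/s the transfer needs about {minutes:.1f} minutes")
```

At the observed speeds the timeout would have to be on the order of 15 minutes, not 60 seconds, for any peer to finish serving the state.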

Questions

  1. Is there a way to download abci_state.rmp directly (HTTP/S3 snapshot)?
  2. Can the 60-second stream timeout be configured?
  3. Is there a plan to provide state snapshots for bootstrap (like other L1 chains do)?
