Non-validator node stuck in bootstrap loop: P2P state transfer always times out (829MB, 60s timeout) #143

  ### Problem                                                                                                                  
                                                     
  My non-validator node crashed with a `"too many blocks to request"` panic and lost `visor_abci_state.json`. It is now stuck
  in an infinite bootstrap loop: every peer times out before the 829MB state transfer completes.
                                                                                                                               
  ### Environment                                    

  - Ubuntu 24.04 LTS, 64GB RAM, NVMe SSD                                                                                       
  - EU-based server (94.x.x.x)
  - Internet: >100 Mbps (not the bottleneck)                                                                                   
  - hl-visor: `4a5921208e0c8d246d26aeaa3958557ee538d48b` (2026-02-28)                                                          
  - hl-node: `105cb1dc0129b99e8a81d16b8cc4118749a0ed37` (2026-03-21)                                                           
  - `try_new_peers: true`, 63 peers configured                                                                                 
                                                                                                                               
  ### What happens                                                                                                             
                                                                                                                               
  1. Node connects to a peer and starts downloading `abci_state` (829MB)                                                       
  2. Transfer speed is ~1-1.3 MB/s from all peers
  3. After ~60 seconds → `tcp_read_exact: deadline has elapsed` (downloaded only ~60MB of 829MB)                               
  4. Next attempt → same peer says `Rate limited by peer`                                                                      
  5. Moves to next peer → same timeout                                                                                         
  6. Loop continues indefinitely across 100+ peers                                                                             
                                                                                                                               
  ### Logs                                           
                                                                                                                               
  ```                                                
  WARN >>> hl-node @@ considering local abci state as stale @@ [stale_reason: "local height is too far behind"]
  WARN >>> hl-node @@ connecting to peer: Ip(x.x.x.x)                                                                          
  WARN >>> hl-node @@ reading bytes for abci_stream recv greeting: 52000000/829967179
  WARN >>> hl-node @@ could not read abci state from x.x.x.x: lu::timeout tcp_read_exact: deadline has elapsed                 
  WARN >>> hl-node @@ Rate limited by peer                                                                                     
  ```                                                                                                                          
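To track how far each attempt gets before the deadline, the progress line above can be parsed for its `done/total` byte counts. This is a minimal sketch that assumes the log format is exactly as shown in the excerpt; `parse_progress` is a hypothetical helper name, not part of hl-node.

```python
import re

# Matches the "<bytes received>/<bytes total>" pair at the end of the
# "reading bytes for abci_stream recv greeting" line shown in the logs.
PROGRESS_RE = re.compile(r"recv greeting: (\d+)/(\d+)")


def parse_progress(line):
    """Return (done_bytes, total_bytes, fraction), or None if the line has no progress info."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        return None
    return done, total, done / total


if __name__ == "__main__":
    sample = (
        "WARN >>> hl-node @@ reading bytes for abci_stream "
        "recv greeting: 52000000/829967179"
    )
    done, total, fraction = parse_progress(sample)
    print(f"{done / 1e6:.0f} MB of {total / 1e6:.0f} MB ({fraction:.1%})")
```

For the sample line this reports that only about 6% of the state had arrived when the read deadline elapsed.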
                                                                                                                               
  ### What I've tried                                
                                                                                                                               
  - **63 different peers** (EU, US, JP, SG): all give ~1 MB/s, all time out
  - **Helsinki Hetzner peers** (65.109.x.x): always `early eof`, never serve state
  - **Restoring from `periodic_abci_states`** (height 939600000): node says "too far behind" and forces a full bootstrap
  - **`override_gossip_config.json`** with `try_new_peers: true`: discovers 100+ peers, none fast enough
  - Running for **10+ hours**: no successful bootstrap
                                                                                                                               
  ### The math                                                                                                                 
                                                                                                                               
  829MB ÷ 60s timeout = **~14 MB/s required**. All peers serve at ~1-1.3 MB/s, so a full transfer would take roughly
  11-14 minutes. Bootstrap therefore **cannot complete** from my location unless:
  - A peer serves faster than 14 MB/s, or                                                                                      
  - The timeout is increased, or                                                                                               
  - State can be downloaded via HTTP/S3 instead of P2P
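The arithmetic above can be reproduced with a short script. It is a sketch under two assumptions: the 60-second deadline covers the whole transfer (as the timeouts suggest), and the state size is the 829967179 bytes reported in the log. The function names are illustrative only.

```python
STATE_BYTES = 829_967_179          # total size from the "recv greeting" log line
TIMEOUT_S = 60                     # observed deadline before tcp_read_exact fails
OBSERVED_RATES_MB_S = (1.0, 1.3)   # throughput range seen across peers


def required_rate_mb_s(size_bytes, timeout_s):
    """Minimum sustained throughput (MB/s) needed to finish within the timeout."""
    return size_bytes / timeout_s / 1e6


def transfer_time_s(size_bytes, rate_mb_s):
    """Seconds needed to move size_bytes at the given throughput (MB/s)."""
    return size_bytes / (rate_mb_s * 1e6)


if __name__ == "__main__":
    print(f"required: {required_rate_mb_s(STATE_BYTES, TIMEOUT_S):.1f} MB/s")
    for rate in OBSERVED_RATES_MB_S:
        minutes = transfer_time_s(STATE_BYTES, rate) / 60
        print(f"at {rate} MB/s: {minutes:.1f} min")
```

At the observed rates the transfer needs more than ten times the available window, which is why no peer can finish.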
                                                                                                                               
  ### Questions                                                                                                                
                                                                                                                               
  1. Is there a way to download `abci_state.rmp` directly (HTTP/S3 snapshot)?                                                  
  2. Can the 60-second stream timeout be configured? 
  3. Is there a plan to provide state snapshots for bootstrap (like other L1 chains do)?                                       
                                                                                                                               
  ### Related issues                                                                                                           
                                                                                                                               
  - #141 — same "too many blocks to request" panic                                                                             
  - #73 — same "early eof" during bootstrap (resolved by moving to Tokyo)
