gap_sync/fix: Close gap and peer banning after warp sync on parachains by lexnv · Pull Request #11309 · paritytech/polkadot-sdk

lexnv · 2026-03-09T10:55:06Z

This PR fixes an issue where parachain collators get stuck in a ban loop after starting with warp sync, effectively ending up with 0 connected peers.

The root causes are:

Warp sync now imports header-only blocks before the gap is created.
The DB never updates the gap.

Warp Sync

Gap sync fails to advance, causing the sync engine to request the same block multiple times. Considering that the gap is old, no other collator can provide a response and close it. The downstream effect is that after 3 identical requests, the peer gets banned.

For non-archive nodes (i.e. block_data.block.body.is_none()), the gap sync requests blocks without bodies.
When a warp-synced block is provided, the block is already in the chain. The import therefore returns AlreadyInChain before it has a chance to advance the gap start.
Then, because the gap start is never advanced, the same request is sent to peers again, causing a ban loop and disconnects.
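
The ban loop described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the sync engine's actual code: `GapSync`, `ImportOutcome`, `rounds_until_ban`, the peer id, and the `advance_on_already_in_chain` switch are hypothetical names introduced for illustration. The limit of 3 identical requests is taken from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical import outcome, loosely modelled on the cases named in this PR.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImportOutcome {
    Imported,
    AlreadyInChain,
}

/// Number of identical requests after which a peer is banned (from the PR text).
const MAX_IDENTICAL_REQUESTS: u32 = 3;

/// Toy gap-sync state (illustrative only, not the real sync engine type).
struct GapSync {
    /// Next block number that still has to be fetched.
    gap_start: u64,
    /// Last block number of the gap (inclusive).
    gap_end: u64,
    /// How often the same (peer, block number) request was sent.
    repeats: HashMap<(u32, u64), u32>,
    /// If true, `AlreadyInChain` also advances the gap start.
    advance_on_already_in_chain: bool,
}

impl GapSync {
    fn new(gap_start: u64, gap_end: u64, advance_on_already_in_chain: bool) -> Self {
        Self { gap_start, gap_end, repeats: HashMap::new(), advance_on_already_in_chain }
    }

    /// Record a request for `gap_start` to `peer`.
    /// Returns true once the peer has received too many identical requests.
    fn request(&mut self, peer: u32) -> bool {
        let count = self.repeats.entry((peer, self.gap_start)).or_insert(0);
        *count += 1;
        *count >= MAX_IDENTICAL_REQUESTS
    }

    /// Apply the import result for the block at `gap_start`.
    fn on_import(&mut self, outcome: ImportOutcome) {
        let advance = match outcome {
            ImportOutcome::Imported => true,
            ImportOutcome::AlreadyInChain => self.advance_on_already_in_chain,
        };
        if advance && self.gap_start <= self.gap_end {
            self.gap_start += 1;
        }
    }
}

/// Run request/import rounds against a single peer where every block in the
/// gap is already in the chain. Returns the round in which the peer is banned.
fn rounds_until_ban(advance_on_already_in_chain: bool, rounds: u32) -> Option<u32> {
    let mut sync = GapSync::new(10, 20, advance_on_already_in_chain);
    for round in 1..=rounds {
        if sync.gap_start > sync.gap_end {
            return None; // gap closed, nothing left to request
        }
        if sync.request(7) {
            return Some(round);
        }
        sync.on_import(ImportOutcome::AlreadyInChain);
    }
    None
}
```

In this model, `rounds_until_ban(false, 5)` bans the peer on the third round because the gap start never moves, while `rounds_until_ban(true, 5)` never repeats a request.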

DB never closes the gap

The DB gap stalls across node restarts because the gap stored in the DB is never updated. Even when a warp-synced block is skipped by advancing the gap's best queued number, the change is not reflected in the DB.

The next block is imported at number == gap start + 1
DB only handles exact match at number == gap start

To fix this, a while loop is added to the MissingHeaderAndBody case. It advances the gap start past blocks whose headers were already imported, closing the gap once none remain.
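
A minimal sketch of the gap-advancing loop, assuming a simplified gap representation. `BlockGap`, `advance_gap`, and the `known_headers` set are hypothetical stand-ins introduced here; the real client/db code works on its own types and storage.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the persisted gap: an inclusive range of
/// block numbers that still have to be filled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockGap {
    start: u64,
    end: u64,
}

/// Advance the gap start past every block whose header is already stored.
/// Returns `None` when no block is left in the gap, i.e. the gap is closed.
fn advance_gap(gap: BlockGap, known_headers: &BTreeSet<u64>) -> Option<BlockGap> {
    let mut start = gap.start;
    // Skip all consecutive block numbers that are already present.
    while start <= gap.end && known_headers.contains(&start) {
        start += 1;
    }
    if start > gap.end {
        None
    } else {
        Some(BlockGap { start, ..gap })
    }
}
```

Unlike an exact match on number == gap start, the loop also moves the start forward when several consecutive headers are already present, and it leaves the gap unchanged when the header at the gap start itself is missing.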

The issue has surfaced after introducing the following optimization:

Block import improvements #10373

Closes:

master/regression: Parachain Gap cannot be filled #11299

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

skunert · 2026-03-09T17:07:25Z

Can you explain a bit more clearly what exactly leads to this bug?
Is it only for nodes with existing DB or also fresh ones?

I remember reviewing @lrubasze's PR and thought the case where gap sync encounters a sparse warp block was handled 🤔

bkchr

This requires a regression test.

bkchr · 2026-03-10T09:24:31Z

Can you explain a bit more clearly what exactly leads to this bug?
Is it only for nodes with existing DB or also fresh ones?

My AI assistant told me that this gap already existed before, with the old nodes. I have not verified it yet. So maybe there was a bug before that now triggers something with the latest node. Clearly, we need a test that exactly reproduces the problem.

skunert · 2026-03-10T10:03:09Z

My main point of confusion is whether this bug triggers for freshly warp-synced nodes. If it only triggers when gap sync is in progress and you upgrade mid-sync, then ignoring this is also an option.

lexnv · 2026-03-10T15:03:27Z

Closing this in favor of:

lexnv added 4 commits March 9, 2026 10:28

gap_sync/fix: Close gap and peer banning after warp sync on parachains

dce99c8

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

client/db: Advance the start of the gap on already imported headers

d70e9a9

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

sync: Dont close the gap on bad blocks

0ad1042

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chain_sync: Clear the gap entirely if hashes are in the db

d0465c8

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

lexnv requested review from lrubasze and skunert March 9, 2026 16:38

bkchr requested changes Mar 10, 2026

lexnv closed this Mar 10, 2026

bkchr deleted the lexnv/unblock-gap-sync branch March 10, 2026 21:57

lexnv mentioned this pull request Mar 11, 2026

aura/import: Skip block execution when collators have no parent block state #11330

Merged

lexnv wants to merge 4 commits into master from lexnv/unblock-gap-sync
