fix(shared-log): harden rebalance repair under churn #617

Open
marcus-pousette wants to merge 3 commits into master from codex/shared-log-force-fresh-rebalance-on-removed

@marcus-pousette
Member

Summary

  • force a fresh rebalance pass on removal-driven churn by allowing toRebalance(..., { forceFresh: true })
  • proactively evict remote replicator ranges on unsubscribe to avoid stale-leader selection when unsubscribe races replication reset messages
  • add a full repair sweep and bounded retry scheduling after removal churn
  • chunk force-fresh repair dispatches so churn recovery uses deterministic simple-sync batches instead of a single large IBLT pass
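
As a rough illustration of the chunking and bounded-retry ideas in the list above, here is a minimal sketch. It is not the code from this PR: `chunk`, `retryDelays`, the batch size, and the doubling-with-cap delay policy are all hypothetical names and assumptions chosen for the example, and the actual shared-log implementation may differ.

```typescript
// Hypothetical helper (not from the PR): split pending repair entries into
// fixed-size batches so each dispatch is small and the order is deterministic.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical bounded retry schedule (an assumption, not the PR's policy):
// at most `maxAttempts` delays, doubling from `baseMs` and capped at `capMs`.
function retryDelays(baseMs: number, maxAttempts: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(Math.min(baseMs * Math.pow(2, attempt), capMs));
  }
  return delays;
}

// Example: five entries repaired in batches of two, with four retry slots.
const batches = chunk(["a", "b", "c", "d", "e"], 2);
const delays = retryDelays(100, 4, 500);
console.log(batches.length, delays.length);
```

Because the number of retries is fixed up front, a repair sweep that keeps failing stops after a known number of attempts instead of rescheduling itself indefinitely.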

Tests

  • pnpm -C packages/programs/data/shared-log test -- --grep "maturity will retrigger rebalance|handles peer joining and leaving multiple times" --reporter=min
  • stress loop: 8x pnpm -C packages/programs/data/shared-log test -- --grep "handles peer joining and leaving multiple times" --reporter=min
  • pnpm -C packages/programs/data/shared-log test -- --grep "u64-iblt" --reporter=min
