This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Commit 3a569fb

Fix sharded federation sender sometimes using 100% CPU.
We pull all destinations requiring catchup from the DB in batches. However, if all those destinations get filtered out (due to the federation sender being sharded), then the `last_processed` destination doesn't get updated, and we keep requesting the same set repeatedly.
1 parent 48d44ab commit 3a569fb

2 files changed: 5 additions and 2 deletions

changelog.d/9770.bugfix

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Fix bug where sharded federation senders could get stuck repeatedly querying the DB in a loop, using lots of CPU.

synapse/federation/sender/__init__.py

Lines changed: 4 additions & 2 deletions
@@ -734,16 +734,18 @@ async def _wake_destinations_needing_catchup(self) -> None:
                 self._catchup_after_startup_timer = None
                 break
 
+            last_processed = destinations_to_wake[-1]
+
             destinations_to_wake = [
                 d
                 for d in destinations_to_wake
                 if self._federation_shard_config.should_handle(self._instance_name, d)
             ]
 
-            for last_processed in destinations_to_wake:
+            for destination in destinations_to_wake:
                 logger.info(
                     "Destination %s has outstanding catch-up, waking up.",
                     last_processed,
                 )
-                self.wake_destination(last_processed)
+                self.wake_destination(destination)
                 await self.clock.sleep(CATCH_UP_STARTUP_INTERVAL_SEC)
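
The behaviour described in the commit message can be reproduced outside Synapse with a small, synchronous sketch. Everything here is a stand-in and not Synapse code: `get_page` is a hypothetical replacement for the DB query (it assumes destinations are returned in sorted order, in pages of `PAGE_SIZE`, strictly after a cursor), `should_handle` is a plain callable instead of the shard config, and `max_queries` exists only so the pre-fix variant terminates.

```python
from typing import Callable, List, Optional, Tuple

PAGE_SIZE = 2  # assumed page size, chosen only to keep the example small


def get_page(all_destinations: List[str], after: Optional[str]) -> List[str]:
    """Hypothetical stand-in for the DB query: next page after the cursor."""
    remaining = sorted(d for d in all_destinations if after is None or d > after)
    return remaining[:PAGE_SIZE]


def wake_all(
    all_destinations: List[str],
    should_handle: Callable[[str], bool],
    advance_before_filter: bool,
    max_queries: int = 20,
) -> Tuple[List[str], int]:
    """Page through destinations, returning (woken destinations, query count)."""
    woken: List[str] = []
    last_processed: Optional[str] = None
    queries = 0
    while queries < max_queries:
        batch = get_page(all_destinations, last_processed)
        queries += 1
        if not batch:
            break  # nothing left to wake
        if advance_before_filter:
            # Post-fix behaviour: move the cursor using the unfiltered batch.
            last_processed = batch[-1]
        for destination in [d for d in batch if should_handle(d)]:
            if not advance_before_filter:
                # Pre-fix behaviour: the cursor only moves if something survives
                # the shard filter, so a fully filtered batch is fetched again.
                last_processed = destination
            woken.append(destination)
    return woken, queries


DESTINATIONS = ["a.example", "b.example", "c.example", "d.example"]


def handles(destination: str) -> bool:
    """Pretend this shard only owns the last two destinations."""
    return destination in {"c.example", "d.example"}


fixed_woken, fixed_queries = wake_all(DESTINATIONS, handles, advance_before_filter=True)
buggy_woken, buggy_queries = wake_all(DESTINATIONS, handles, advance_before_filter=False)
print(fixed_woken, fixed_queries)
print(buggy_woken, buggy_queries)
```

With the cursor advanced before filtering, the loop finishes in three queries and wakes both destinations owned by this shard. With the cursor advanced only inside the filtered loop, the first page is entirely filtered out, the cursor never moves, and the loop re-requests the same page until the `max_queries` guard stops it.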
