
Commit c4fa4f3

Fix perf of fetching the same events many times. (#10703)
The code to deduplicate repeated fetches of the same set of events was N^2 (over the number of events requested), which could lead to a process being completely wedged. The main fix is to deduplicate the returned deferreds so we only await each deferred once rather than many times. Separately, when handling the events returned from the deferred, we only add the events we care about to the event map to be returned (so that we don't pay the price of inserting extraneous events into the dict).
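The shape of the fix generalizes beyond Twisted. Below is a minimal asyncio sketch of the same deduplication pattern; the names (current_fetches, get_events) and the use of asyncio.Task in place of Synapse's ObservableDeferred are illustrative assumptions, not the actual Synapse code:

import asyncio
from typing import Dict, Set

# Hypothetical map of in-flight batch fetches: several event IDs can point
# at the *same* task. (Names are illustrative, not Synapse's real structures.)
current_fetches: Dict[str, "asyncio.Task[Dict[str, str]]"] = {}

async def get_events(wanted_ids: Set[str]) -> Dict[str, str]:
    # Deduplicate: many IDs may share one in-flight task, so collect the
    # unique tasks instead of awaiting one task per requested ID (which is
    # what made the old code N^2).
    already_fetching_ids: Set[str] = set()
    unique_tasks: Set["asyncio.Task[Dict[str, str]]"] = set()
    for event_id in wanted_ids:
        task = current_fetches.get(event_id)
        if task is not None:
            already_fetching_ids.add(event_id)
            unique_tasks.add(task)

    results: Dict[str, str] = {}
    for batch in await asyncio.gather(*unique_tasks):
        # Each batch may contain many events we never asked for; copy only
        # the requested ones rather than merging the whole dict every time.
        results.update(
            (eid, ev) for eid, ev in batch.items()
            if eid in already_fetching_ids
        )
    return results

Gathering each unique task once keeps the work linear in the number of distinct in-flight fetches, mirroring the deferred-set change in the diff below.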
Parent: 1800aab · Commit: c4fa4f3

File tree

2 files changed (+24, -6 lines)

changelog.d/10703.bugfix

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Fix a regression introduced in v1.41.0 which affected the performance of concurrent fetches of large sets of events, in extreme cases causing the process to hang.

synapse/storage/databases/main/events_worker.py

Lines changed: 23 additions & 6 deletions
@@ -520,16 +520,26 @@ async def _get_events_from_cache_or_db(
         # We now look up if we're already fetching some of the events in the DB,
         # if so we wait for those lookups to finish instead of pulling the same
         # events out of the DB multiple times.
-        already_fetching: Dict[str, defer.Deferred] = {}
+        #
+        # Note: we might get the same `ObservableDeferred` back for multiple
+        # events we're already fetching, so we deduplicate the deferreds to
+        # avoid extraneous work (if we don't do this we can end up in a n^2 mode
+        # when we wait on the same Deferred N times, then try and merge the
+        # same dict into itself N times).
+        already_fetching_ids: Set[str] = set()
+        already_fetching_deferreds: Set[
+            ObservableDeferred[Dict[str, _EventCacheEntry]]
+        ] = set()

         for event_id in missing_events_ids:
             deferred = self._current_event_fetches.get(event_id)
             if deferred is not None:
                 # We're already pulling the event out of the DB. Add the deferred
                 # to the collection of deferreds to wait on.
-                already_fetching[event_id] = deferred.observe()
+                already_fetching_ids.add(event_id)
+                already_fetching_deferreds.add(deferred)

-        missing_events_ids.difference_update(already_fetching)
+        missing_events_ids.difference_update(already_fetching_ids)

         if missing_events_ids:
             log_ctx = current_context()
@@ -569,18 +579,25 @@ async def _get_events_from_cache_or_db(
             with PreserveLoggingContext():
                 fetching_deferred.callback(missing_events)

-        if already_fetching:
+        if already_fetching_deferreds:
             # Wait for the other event requests to finish and add their results
             # to ours.
             results = await make_deferred_yieldable(
                 defer.gatherResults(
-                    already_fetching.values(),
+                    (d.observe() for d in already_fetching_deferreds),
                     consumeErrors=True,
                 )
             ).addErrback(unwrapFirstError)

             for result in results:
-                event_entry_map.update(result)
+                # We filter out events that we haven't asked for as we might get
+                # a *lot* of superfluous events back, and there is no point
+                # going through and inserting them all (which can take time).
+                event_entry_map.update(
+                    (event_id, entry)
+                    for event_id, entry in result.items()
+                    if event_id in already_fetching_ids
+                )

         if not allow_rejected:
             event_entry_map = {
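To see why the filtered update in the second hunk pays off, the micro-benchmark below compares re-inserting a whole batch N times against inserting only the requested keys. It is an illustrative sketch with made-up sizes, not a measurement from this commit:

import timeit

# Illustrative setup (hypothetical numbers): one batch of 10,000 fetched
# events, of which only 100 were actually requested by this caller.
batch = {f"$event{i}": object() for i in range(10_000)}
wanted = {f"$event{i}" for i in range(100)}

def merge_everything(n: int = 50) -> None:
    acc: dict = {}
    for _ in range(n):
        acc.update(batch)  # re-inserts all 10,000 entries each iteration

def merge_filtered(n: int = 50) -> None:
    acc: dict = {}
    for _ in range(n):
        # Still scans the batch, but only inserts the 100 wanted entries.
        acc.update((k, v) for k, v in batch.items() if k in wanted)

print("merge everything:", timeit.timeit(merge_everything, number=100))
print("merge filtered:  ", timeit.timeit(merge_filtered, number=100))

Both variants still iterate the full batch; the saving comes from skipping the dict insertions for entries the caller never asked for, which matches the comment in the diff above.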
