Open
Changes from 25 commits
58 commits
fc6000c
Refactor heroes to not be added to room state
erikjohnston Nov 18, 2025
087f6eb
Always return all memberships for non-limited syncs
erikjohnston Nov 19, 2025
49fa7eb
Make _required_state_changes return struct
erikjohnston Nov 18, 2025
8cba313
Track lazy loaded members in SSS separately.
erikjohnston Nov 15, 2025
6303bb1
Update tests
erikjohnston Nov 18, 2025
5c48983
Newsfile
erikjohnston Nov 20, 2025
4984858
Fix check delta script
erikjohnston Nov 20, 2025
7a0a8a2
Rename required_user_state
erikjohnston Nov 24, 2025
8a3ec20
Reword the cache comments on the schema
erikjohnston Nov 24, 2025
fc01740
Rename RoomLazyMembershipChanges fields
erikjohnston Nov 24, 2025
ae3f569
Add RoomLazyMembershipChanges last_seen_ts comment
erikjohnston Nov 24, 2025
027b422
Clean up comments
erikjohnston Nov 24, 2025
abee4db
Always include lazy_members_previously_returned and lazy_members_prev…
erikjohnston Nov 24, 2025
99855ba
Fixup comment
erikjohnston Nov 24, 2025
0b1ecf1
Use duration constants
erikjohnston Nov 24, 2025
2090d14
Update tests/handlers/test_sliding_sync.py
erikjohnston Nov 25, 2025
113f6ce
Rename previously_returned_user_state param
erikjohnston Nov 25, 2025
ec45e00
Fix bug where we didn't correctly filter lazy members by room
erikjohnston Nov 25, 2025
f8f6dc9
Lint
erikjohnston Nov 25, 2025
5604d3a
Add test for forked position
erikjohnston Nov 25, 2025
815b852
Ensure that the last_seen_ts is correctly updated
erikjohnston Nov 25, 2025
2e844aa
Expand comment why is fine this is a cache
erikjohnston Nov 25, 2025
deaf995
Add tests for state reset and lazy loading
erikjohnston Nov 25, 2025
cdeebc8
Merge remote-tracking branch 'origin/develop' into erikj/sss_better_m…
erikjohnston Nov 25, 2025
65aebf4
Fix limited sync lazy members
erikjohnston Nov 26, 2025
4d4c1b8
Merge remote-tracking branch 'origin/develop' into erikj/sss_better_m…
erikjohnston Dec 2, 2025
e6939e7
Use Duration
erikjohnston Dec 2, 2025
69fc61d
Fix bug where lazy members were shared between connections
erikjohnston Dec 2, 2025
56ead16
Fixup lazy_members_previously_returned
erikjohnston Dec 2, 2025
2d2047d
Update synapse/storage/databases/main/sliding_sync.py
erikjohnston Dec 3, 2025
da08203
Update synapse/storage/databases/main/sliding_sync.py
erikjohnston Dec 3, 2025
2546ca6
Split state_key_expand_lazy_keep_previous_memberships
erikjohnston Dec 3, 2025
ba59391
Update LAZY_MEMBERS_UPDATE_INTERVAL comment
erikjohnston Dec 3, 2025
0a68e12
Fixup RoomLazyMembershipChanges comment
erikjohnston Dec 3, 2025
45d1bfa
Fixup returned_user_id_to_last_seen_ts_map docs
erikjohnston Dec 3, 2025
4070326
Update synapse/handlers/sliding_sync/__init__.py
erikjohnston Dec 3, 2025
e2b4fe8
Update tests/rest/client/sliding_sync/test_rooms_required_state.py
erikjohnston Dec 3, 2025
b75b3cb
Update tests/rest/client/sliding_sync/test_rooms_required_state.py
erikjohnston Dec 3, 2025
6caacd1
Move 'lazy_members_previously_returned' definition
erikjohnston Dec 3, 2025
b1bc509
Remove spurious lazy load user in test
erikjohnston Dec 3, 2025
7ff3d2f
Don't add to lazy_members_previously_returned what we're lazy loading
erikjohnston Dec 3, 2025
008cb58
Add context to state reset comment
erikjohnston Dec 3, 2025
d3f3f98
Note it is a regression test
erikjohnston Dec 3, 2025
0ffb32a
Update comment on update ts test
erikjohnston Dec 3, 2025
17bf341
Only persist lazy members if we need to
erikjohnston Dec 3, 2025
85c6754
Apply suggestions from code review
erikjohnston Dec 5, 2025
1adcdaa
Fix wrapping
erikjohnston Dec 5, 2025
aa2c426
Explain why we don't use sliding_sync_connection_required_state
erikjohnston Dec 5, 2025
1d7b649
Add comment to has_updates
erikjohnston Dec 5, 2025
6d8950e
Update comment for which prev lazy members we fetch
erikjohnston Dec 5, 2025
bea19c4
s/changed_required_state_map/required_state_map_change/
erikjohnston Dec 5, 2025
91770fc
s/lazy_load_user_ids/request_lazy_load_user_ids/
erikjohnston Dec 5, 2025
31c913e
Expand why we always record move to LAZY
erikjohnston Dec 5, 2025
855b448
Expand comment about moving positions to NULL
erikjohnston Dec 5, 2025
6fc746c
Expand return doc comment for get_sliding_sync_connection_lazy_members
erikjohnston Dec 5, 2025
b63c8ad
s/mem/members/ in query
erikjohnston Dec 5, 2025
c1887b8
Test s/lazy_load_user_ids/request_lazy_load_user_ids/
erikjohnston Dec 5, 2025
bfe05de
s/lazy_members_previously_returned/users_to_add_to_lazy_cache/
erikjohnston Dec 5, 2025
1 change: 1 addition & 0 deletions changelog.d/19206.bugfix
@@ -0,0 +1 @@
Fix sliding sync performance slowdown for long-lived connections.
2 changes: 1 addition & 1 deletion scripts-dev/check_schema_delta.py
@@ -12,7 +12,7 @@

SCHEMA_FILE_REGEX = re.compile(r"^synapse/storage/schema/(.*)/delta/(.*)/(.*)$")
INDEX_CREATION_REGEX = re.compile(
- r"CREATE .*INDEX .*ON ([a-z_0-9]+)", flags=re.IGNORECASE
+ r"CREATE .*INDEX .*ON ([a-z_0-9]+)\s+\(", flags=re.IGNORECASE
)
INDEX_DELETION_REGEX = re.compile(r"DROP .*INDEX ([a-z_0-9]+)", flags=re.IGNORECASE)
TABLE_CREATION_REGEX = re.compile(
Expand Down
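The PR does not spell out why the index-creation pattern gained the trailing `\s+\(`, so the following is only a plausible reading: with `re.IGNORECASE`, the greedy `.*ON ` in the old pattern can latch onto the "on " at the end of `connection_position` in a partial index's `WHERE` clause. A minimal sketch, using the two patterns and the partial index statement from this PR (the helper `extract_table` is hypothetical):

```python
import re
from typing import Optional

# The two patterns as they appear in the check_schema_delta.py hunk.
OLD_INDEX_CREATION_REGEX = re.compile(
    r"CREATE .*INDEX .*ON ([a-z_0-9]+)", flags=re.IGNORECASE
)
NEW_INDEX_CREATION_REGEX = re.compile(
    r"CREATE .*INDEX .*ON ([a-z_0-9]+)\s+\(", flags=re.IGNORECASE
)

# The partial index created by this PR's schema delta.
PARTIAL_INDEX_SQL = (
    "CREATE INDEX sliding_sync_connection_lazy_members_pos_idx "
    "ON sliding_sync_connection_lazy_members (connection_key, connection_position) "
    "WHERE connection_position IS NOT NULL;"
)


def extract_table(pattern: "re.Pattern[str]", sql: str) -> Optional[str]:
    """Hypothetical helper: return the captured table name, if any."""
    match = pattern.search(sql)
    return match.group(1) if match else None


# The old pattern matches the last "on " in the statement, which is the tail
# of "connection_position", and so captures the following word.
old_table = extract_table(OLD_INDEX_CREATION_REGEX, PARTIAL_INDEX_SQL)

# The new pattern requires whitespace and "(" after the name, so only the
# real "ON <table> (" can match.
new_table = extract_table(NEW_INDEX_CREATION_REGEX, PARTIAL_INDEX_SQL)

print(old_table, new_table)
```

Requiring the opening parenthesis of the column list is a cheap way to anchor the capture without parsing SQL properly.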
353 changes: 291 additions & 62 deletions synapse/handlers/sliding_sync/__init__.py

Large diffs are not rendered by default.

172 changes: 171 additions & 1 deletion synapse/storage/databases/main/sliding_sync.py
@@ -14,7 +14,7 @@


import logging
- from typing import TYPE_CHECKING, Mapping, cast
+ from typing import TYPE_CHECKING, AbstractSet, Mapping, cast

import attr

@@ -26,13 +26,16 @@
DatabasePool,
LoggingDatabaseConnection,
LoggingTransaction,
make_in_list_sql_clause,
)
from synapse.storage.engines import PostgresEngine
from synapse.types import MultiWriterStreamToken, RoomStreamToken
from synapse.types.handlers.sliding_sync import (
HaveSentRoom,
HaveSentRoomFlag,
MutablePerConnectionState,
PerConnectionState,
RoomLazyMembershipChanges,
RoomStatusMap,
RoomSyncConfig,
)
@@ -52,6 +55,10 @@
logger = logging.getLogger(__name__)


# How often to update the last seen timestamp for lazy members. We don't want to
# update it too often as that causes DB writes.
LAZY_MEMBERS_UPDATE_INTERVAL_MS = ONE_HOUR_SECONDS * MILLISECONDS_PER_SECOND

# How often to update the `last_used_ts` column on
# `sliding_sync_connection_positions` when the client uses a connection
# position. We don't want to update it on every use to avoid excessive
@@ -378,6 +385,13 @@ def persist_per_connection_state_txn(
value_values=values,
)

self._persist_sliding_sync_connection_lazy_members_txn(
txn,
connection_key,
connection_position,
per_connection_state.room_lazy_membership,
)

return connection_position

@cached(iterable=True, max_entries=100000)
@@ -448,6 +462,19 @@ def _get_and_clear_connection_positions_txn(
"""
txn.execute(sql, (connection_key, connection_position))

# Move any lazy membership entries for this connection position to have
# `NULL` connection position, indicating that it applies to all future
# positions on this connection.
self.db_pool.simple_update_txn(
txn,
table="sliding_sync_connection_lazy_members",
keyvalues={
"connection_key": connection_key,
"connection_position": connection_position,
},
updatevalues={"connection_position": None},
)

# Fetch and create a mapping from required state ID to the actual
# required state for the connection.
rows = self.db_pool.simple_select_list_txn(
@@ -527,8 +554,146 @@ def _get_and_clear_connection_positions_txn(
receipts=RoomStatusMap(receipts),
account_data=RoomStatusMap(account_data),
room_configs=room_configs,
room_lazy_membership={},
)

async def get_sliding_sync_connection_lazy_members(
self,
connection_position: int,
room_id: str,
user_ids: AbstractSet[str],
) -> Mapping[str, int]:
"""Get which user IDs in the room we have previously sent lazy
membership for.

Args:
connection_position: The sliding sync connection position.
room_id: The room ID to get lazy members for.
user_ids: The user IDs to check for lazy membership.

Returns:
The mapping of user IDs to the last seen timestamp for those user
IDs.
"""

def get_sliding_sync_connection_lazy_members_txn(
txn: LoggingTransaction,
) -> Mapping[str, int]:
user_clause, user_args = make_in_list_sql_clause(
txn.database_engine, "user_id", user_ids
)

sql = f"""
SELECT user_id, connection_position, last_seen_ts
FROM sliding_sync_connection_lazy_members AS pos
WHERE room_id = ? AND {user_clause}
"""

txn.execute(sql, (room_id, *user_args))

# Filter out any cache entries that only apply to forked connection
# positions. Entries with `NULL` connection position apply to all
# positions on the connection.
return {
user_id: last_seen_ts
for user_id, db_connection_position, last_seen_ts in txn
if db_connection_position == connection_position
or db_connection_position is None
Comment on lines +612 to +613
Contributor

Why not do this in the query itself?

Member Author

@erikjohnston erikjohnston Dec 5, 2025

I think mainly to avoid having confusion over different positions in the query. The vast majority of the time the query won't return any extra rows (as that only happens when there has been a forked position, which is rare).

If/when we just pass in the connection_key we could more easily move it into the query (other discussion).


}

return await self.db_pool.runInteraction(
"sliding_sync_connection_lazy_members",
get_sliding_sync_connection_lazy_members_txn,
db_autocommit=True, # Avoid transaction for single read
)
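The read-side filtering discussed in the review thread above can be sketched in isolation. This is a simplified stand-in rather than the Synapse code: rows are plain tuples instead of a database cursor, and `filter_lazy_member_rows` is a hypothetical name.

```python
from typing import Dict, Iterable, Optional, Tuple

# (user_id, connection_position, last_seen_ts), mirroring the SELECT above.
Row = Tuple[str, Optional[int], int]


def filter_lazy_member_rows(
    rows: Iterable[Row], connection_position: int
) -> Dict[str, int]:
    """Keep rows valid for this position: either pinned to it, or NULL
    (None), meaning valid for every position on the connection."""
    return {
        user_id: last_seen_ts
        for user_id, db_connection_position, last_seen_ts in rows
        if db_connection_position == connection_position
        or db_connection_position is None
    }


rows = [
    ("@alice:example.org", None, 1000),  # valid for all positions
    ("@bob:example.org", 7, 2000),       # only valid at position 7
    ("@carol:example.org", 8, 3000),     # belongs to a forked position
]

visible = filter_lazy_member_rows(rows, connection_position=7)
print(visible)
```

A row dropped by the filter only means the member may be sent again, which the schema comments describe as acceptable for a cache.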

def _persist_sliding_sync_connection_lazy_members_txn(
self,
txn: LoggingTransaction,
connection_key: int,
new_connection_position: int,
all_changes: dict[str, RoomLazyMembershipChanges],
) -> None:
"""Persist that we have sent lazy membership for the given user IDs."""

now = self.clock.time_msec()

# Figure out which cache entries to add or update.
#
# These are either a) new entries we've never sent before (i.e. with a
# None last_seen_ts), or b) where the `last_seen_ts` is old enough that
# we want to update it.
#
# We don't update the timestamp every time to avoid hammering the DB
# with writes, and we don't need the timestamp to be precise. It is used
# to evict old entries that haven't been used in a while.
to_update: list[tuple[str, str]] = []
for room_id, room_changes in all_changes.items():
for (
user_id,
last_seen_ts,
) in room_changes.returned_user_id_to_last_seen_ts_map.items():
if last_seen_ts is None:
# We've never sent this user before, so we need to record that
# we've sent it at the new connection position.
to_update.append((room_id, user_id))
elif last_seen_ts + LAZY_MEMBERS_UPDATE_INTERVAL_MS < now:
# We last saw this user over
# `LAZY_MEMBERS_UPDATE_INTERVAL_MS` ago, so we update the
# timestamp (c.f. comment above).
to_update.append((room_id, user_id))

if to_update:
# Upsert the new/updated entries.
#
# Ignore conflicts where the existing entry has a different
# connection position (i.e. from a forked connection position). This
# may mean that we lose some updates, but that's acceptable as this
# is a cache and it's fine for it to *not* include rows. (Downstream
# this will cause us to maybe send a few extra lazy members down
# sync, but we're allowed to send extra members).
sql = """
INSERT INTO sliding_sync_connection_lazy_members
(connection_key, connection_position, room_id, user_id, last_seen_ts)
VALUES {value_placeholder}
ON CONFLICT (connection_key, room_id, user_id)
DO UPDATE SET last_seen_ts = EXCLUDED.last_seen_ts
WHERE sliding_sync_connection_lazy_members.connection_position IS NULL
OR sliding_sync_connection_lazy_members.connection_position = EXCLUDED.connection_position
"""

args = [
(connection_key, new_connection_position, room_id, user_id, now)
for room_id, user_id in to_update
]

if isinstance(self.database_engine, PostgresEngine):
sql = sql.format(value_placeholder="?")
txn.execute_values(sql, args, fetch=False)
else:
sql = sql.format(value_placeholder="(?, ?, ?, ?, ?)")
txn.execute_batch(sql, args)

# Remove any invalidated entries.
to_remove: list[tuple[str, str]] = []
for room_id, room_changes in all_changes.items():
for user_id in room_changes.invalidated_user_ids:
to_remove.append((room_id, user_id))

if to_remove:
# We don't try and match on connection position here: it's fine to
# remove it from all forks. This is a cache so it's fine to expire
# arbitrary entries, the worst that happens is we send a few extra
# lazy members down sync.
self.db_pool.simple_delete_many_batch_txn(
txn,
table="sliding_sync_connection_lazy_members",
keys=("connection_key", "room_id", "user_id"),
values=[
(connection_key, room_id, user_id) for room_id, user_id in to_remove
],
)
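The write-throttling rule in `_persist_sliding_sync_connection_lazy_members_txn` can be sketched on its own. This is an illustration under assumptions: the interval is taken to be one hour in milliseconds (inferred from the constant names `ONE_HOUR_SECONDS` and `MILLISECONDS_PER_SECOND`), the input is a plain nested mapping rather than `RoomLazyMembershipChanges`, and `select_entries_to_upsert` is a hypothetical name.

```python
from typing import List, Mapping, Optional, Tuple

# Assumption: one hour in milliseconds.
LAZY_MEMBERS_UPDATE_INTERVAL_MS = 60 * 60 * 1000


def select_entries_to_upsert(
    returned_by_room: Mapping[str, Mapping[str, Optional[int]]],
    now_ms: int,
) -> List[Tuple[str, str]]:
    """Return (room_id, user_id) pairs that need a DB write: entries never
    recorded before (None), or whose timestamp is older than the interval."""
    to_update: List[Tuple[str, str]] = []
    for room_id, returned in returned_by_room.items():
        for user_id, last_seen_ts in returned.items():
            if last_seen_ts is None:
                to_update.append((room_id, user_id))
            elif last_seen_ts + LAZY_MEMBERS_UPDATE_INTERVAL_MS < now_ms:
                to_update.append((room_id, user_id))
    return to_update


now_ms = 10_000_000
changes = {
    "!room:example.org": {
        "@new:example.org": None,                                       # never sent
        "@recent:example.org": now_ms - 1,                              # seen just now
        "@stale:example.org": now_ms - LAZY_MEMBERS_UPDATE_INTERVAL_MS - 1,
        "@edge:example.org": now_ms - LAZY_MEMBERS_UPDATE_INTERVAL_MS,  # exactly on the boundary
    }
}

selected = select_entries_to_upsert(changes, now_ms)
print(selected)
```

Because the comparison is strict, an entry exactly one interval old is not rewritten yet; the timestamp only needs to be rough, since it is used for eviction.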

@wrap_as_background_process("delete_old_sliding_sync_connections")
async def delete_old_sliding_sync_connections(self) -> None:
"""Delete sliding sync connections that have not been used for a long time."""
@@ -556,6 +721,8 @@ class PerConnectionStateDB:
serialized to strings.

When persisting this *only* contains updates to the state.

The `room_lazy_membership` field is only used when persisting.
Contributor

We should move this as a """ comment on the attribute itself so people abuse it less

"""

last_used_ts: int | None
@@ -566,6 +733,8 @@ class PerConnectionStateDB:

room_configs: Mapping[str, "RoomSyncConfig"]

room_lazy_membership: dict[str, RoomLazyMembershipChanges]

@staticmethod
async def from_state(
per_connection_state: "MutablePerConnectionState", store: "DataStore"
@@ -620,6 +789,7 @@ async def from_state(
receipts=RoomStatusMap(receipts),
account_data=RoomStatusMap(account_data),
room_configs=per_connection_state.room_configs.maps[0],
room_lazy_membership=per_connection_state.room_lazy_membership,
)

async def to_state(self, store: "DataStore") -> "PerConnectionState":
@@ -0,0 +1,48 @@
--
-- This file is licensed under the Affero General Public License (AGPL) version 3.
--
-- Copyright (C) 2025 Element Creations Ltd
--
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU Affero General Public License as
-- published by the Free Software Foundation, either version 3 of the
-- License, or (at your option) any later version.
--
-- See the GNU Affero General Public License for more details:
-- <https://www.gnu.org/licenses/agpl-3.0.html>.


-- Tracks which member states have been sent to the client for lazy-loaded
-- members in sliding sync. This is a *cache* as it doesn't matter if we send
-- down members we've previously sent down, i.e. it's safe to delete any rows.
--
-- We track a *rough* `last_seen_ts` for each user in each room which indicates
-- when we last would've sent their member state to the client. This is used so
-- that we can remove members which haven't been seen for a while to save space.
--
-- Care must be taken when handling "forked" positions, i.e. we have responded
-- to a request with a position and then get another different request using the
-- previous position as a base. We track this by including a
-- `connection_position` for newly inserted rows. When we advance the position
-- we set this to NULL for all rows which were present at that position, and
-- delete all other rows. When reading rows we can then filter out any rows
-- which have a non-NULL `connection_position` which is not the current
-- position.
--
-- I.e. `connection_position` is NULL for rows which are valid for *all*
-- positions on the connection, and is non-NULL for rows which are only valid
-- for a specific position.
--
-- When invalidating rows, we can just delete them. Technically this could
-- invalidate for a forked position, but this is acceptable as equivalent to a
-- cache eviction.
CREATE TABLE sliding_sync_connection_lazy_members (
connection_key BIGINT NOT NULL REFERENCES sliding_sync_connections(connection_key) ON DELETE CASCADE,
connection_position BIGINT REFERENCES sliding_sync_connection_positions(connection_position) ON DELETE CASCADE,
room_id TEXT NOT NULL,
user_id TEXT NOT NULL,
last_seen_ts BIGINT NOT NULL
);

CREATE UNIQUE INDEX sliding_sync_connection_lazy_members_idx ON sliding_sync_connection_lazy_members (connection_key, room_id, user_id);
CREATE INDEX sliding_sync_connection_lazy_members_pos_idx ON sliding_sync_connection_lazy_members (connection_key, connection_position) WHERE connection_position IS NOT NULL;
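To make the forked-position handling concrete, here is a small SQLite sketch of the table's life cycle. It is an illustration only: it uses Python's `sqlite3` module directly instead of Synapse's `DatabasePool`, omits the foreign keys, the partial index and the deletion of rows for other positions, and assumes an SQLite build with upsert support (3.24 or later). The upsert statement follows the one in `_persist_sliding_sync_connection_lazy_members_txn`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sliding_sync_connection_lazy_members (
        connection_key BIGINT NOT NULL,
        connection_position BIGINT,
        room_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        last_seen_ts BIGINT NOT NULL
    );
    CREATE UNIQUE INDEX sliding_sync_connection_lazy_members_idx
        ON sliding_sync_connection_lazy_members (connection_key, room_id, user_id);
    """
)

UPSERT_SQL = """
    INSERT INTO sliding_sync_connection_lazy_members
        (connection_key, connection_position, room_id, user_id, last_seen_ts)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT (connection_key, room_id, user_id)
    DO UPDATE SET last_seen_ts = EXCLUDED.last_seen_ts
    WHERE sliding_sync_connection_lazy_members.connection_position IS NULL
        OR sliding_sync_connection_lazy_members.connection_position = EXCLUDED.connection_position
"""

ROOM, ALICE = "!room:example.org", "@alice:example.org"


def snapshot():
    """Return (connection_position, last_seen_ts) for every row."""
    return conn.execute(
        "SELECT connection_position, last_seen_ts"
        " FROM sliding_sync_connection_lazy_members"
    ).fetchall()


# 1. First send at position 5: a new row pinned to that position.
conn.execute(UPSERT_SQL, (1, 5, ROOM, ALICE, 100))

# 2. A forked position 6 tries to record the same member. The existing row is
#    pinned to position 5, so the conflict is ignored and nothing changes.
conn.execute(UPSERT_SQL, (1, 6, ROOM, ALICE, 200))
after_fork = snapshot()

# 3. The client comes back using position 5, so its rows become valid for all
#    future positions (connection_position is set to NULL).
conn.execute(
    "UPDATE sliding_sync_connection_lazy_members SET connection_position = NULL"
    " WHERE connection_key = ? AND connection_position = ?",
    (1, 5),
)
after_promote = snapshot()

# 4. A later upsert now refreshes the timestamp; the position stays NULL.
conn.execute(UPSERT_SQL, (1, 9, ROOM, ALICE, 300))
after_refresh = snapshot()

print(after_fork, after_promote, after_refresh)
```

The ignored write in step 2 is the "lost update" the code comments accept: at worst the member is sent again on the forked position.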
44 changes: 44 additions & 0 deletions synapse/types/handlers/sliding_sync.py
@@ -891,6 +891,43 @@ def __len__(self) -> int:
return len(self.rooms) + len(self.receipts) + len(self.room_configs)


@attr.s(auto_attribs=True)
class RoomLazyMembershipChanges:
"""Changes to lazily-loaded room memberships for a given room.

Attributes:
returned_user_id_to_last_seen_ts_map: Map from user ID to timestamp for
users whose membership we have lazily loaded. The timestamp indicates
the time we previously saw the membership if we have sent it down
previously, or None if we sent it down for the first time.

Note: this will include users whose membership we would have sent
down but didn't due to us having previously sent them.
invalidated_user_ids: Set of user IDs whose latest membership we have
*not* sent down.
"""

# A map from user ID -> timestamp. Indicates that those memberships have
# been lazily loaded. I.e. that either a) we sent those memberships down, or
# b) we did so previously. The timestamp indicates the time we previously
# saw the membership.
#
# We track a *rough* `last_seen_ts` for each user in each room which
# indicates when we last would've sent their member state to the client.
# This is used so that we can remove members which haven't been seen for a
# while to save space.
returned_user_id_to_last_seen_ts_map: Mapping[str, int | None] = attr.Factory(dict)

# A set of user IDs whose membership change we have *not* sent
# down
invalidated_user_ids: AbstractSet[str] = attr.Factory(set)

def __bool__(self) -> bool:
return bool(
self.returned_user_id_to_last_seen_ts_map or self.invalidated_user_ids
)
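The truthiness rule defined by `__bool__` can be tried out with a stand-in class. This sketch uses the standard-library `dataclasses` module instead of `attrs` so that it runs without third-party packages; `RoomLazyMembershipChangesSketch` is a hypothetical name, and only the two fields and `__bool__` from the class above are reproduced.

```python
from dataclasses import dataclass, field
from typing import AbstractSet, Mapping, Optional


@dataclass
class RoomLazyMembershipChangesSketch:
    """Stand-in for RoomLazyMembershipChanges (dataclass instead of attrs)."""

    # user ID -> last seen timestamp, or None if sent for the first time.
    returned_user_id_to_last_seen_ts_map: Mapping[str, Optional[int]] = field(
        default_factory=dict
    )
    # user IDs whose latest membership has *not* been sent down.
    invalidated_user_ids: AbstractSet[str] = field(default_factory=set)

    def __bool__(self) -> bool:
        return bool(
            self.returned_user_id_to_last_seen_ts_map or self.invalidated_user_ids
        )


empty = RoomLazyMembershipChangesSketch()
first_send = RoomLazyMembershipChangesSketch(
    returned_user_id_to_last_seen_ts_map={"@alice:example.org": None}
)
invalidated = RoomLazyMembershipChangesSketch(
    invalidated_user_ids={"@bob:example.org"}
)

print(bool(empty), bool(first_send), bool(invalidated))
```

Note that a first-time send still counts as a change even though its timestamp is `None`, because truthiness depends on the map being non-empty rather than on its values.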


@attr.s(auto_attribs=True)
class MutablePerConnectionState(PerConnectionState):
"""A mutable version of `PerConnectionState`"""
@@ -903,12 +940,19 @@ class MutablePerConnectionState(PerConnectionState):

room_configs: typing.ChainMap[str, RoomSyncConfig]

# A map from room ID -> user ID -> timestamp. Indicates that those
# memberships have been lazily loaded. I.e. that either a) we sent those
# memberships down, or b) we did so previously. The timestamp indicates the
Contributor

Wording here needs to be more precise. Both scenarios seem like past tense.

I assume it's supposed to be a) we are going to send those memberships down this time

Member Author

I've merged all these docs into the docstring of RoomLazyMembershipChanges, as there was duplication.

Contributor

-> 45d1bfa

# time we previously saw the membership.
room_lazy_membership: dict[str, RoomLazyMembershipChanges] = attr.Factory(dict)

def has_updates(self) -> bool:
return (
bool(self.rooms.get_updates())
or bool(self.receipts.get_updates())
or bool(self.account_data.get_updates())
or bool(self.get_room_config_updates())
or bool(self.room_lazy_membership)
)

def get_room_config_updates(self) -> Mapping[str, RoomSyncConfig]: