-
Notifications
You must be signed in to change notification settings - Fork 405
Implement MSC3871: Gappy timelines - indicate gaps
in /messages
#18873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
MadLittleMods
wants to merge
5
commits into
develop
Choose a base branch
from
madlittlemods/msc3871-gappy-timeline
base: develop
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from 4 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
a14808e
First stab at `gaps` in `/messages`
MadLittleMods 50d8337
Add `test_gaps_going_backwards`
MadLittleMods d353cfc
Add changelog
MadLittleMods b0b9b4e
Temp: Implement `/messages?backfill=true/false`
MadLittleMods e835134
Picking backfill points: `current_depth` should be padded for approxi…
MadLittleMods File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Implement experimental [MSC3871](https://github.com/matrix-org/matrix-spec-proposals/pull/3871) to indicate `gaps` in the `/messages` timeline. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -34,6 +34,7 @@ | |
Mapping, | ||
MutableMapping, | ||
Optional, | ||
Sequence, | ||
Set, | ||
Tuple, | ||
cast, | ||
|
@@ -42,6 +43,7 @@ | |
|
||
import attr | ||
from prometheus_client import Gauge | ||
from typing_extensions import assert_never | ||
|
||
from twisted.internet import defer | ||
|
||
|
@@ -83,13 +85,17 @@ | |
LoggingTransaction, | ||
make_tuple_in_list_sql_clause, | ||
) | ||
|
||
# from synapse.storage.databases.main.stream import ( | ||
# generate_next_token, | ||
# ) | ||
from synapse.storage.types import Cursor | ||
from synapse.storage.util.id_generators import ( | ||
AbstractStreamIdGenerator, | ||
MultiWriterIdGenerator, | ||
) | ||
from synapse.storage.util.sequence import build_sequence_generator | ||
from synapse.types import JsonDict, get_domain_from_id | ||
from synapse.types import JsonDict, RoomStreamToken, get_domain_from_id | ||
from synapse.types.state import StateFilter | ||
from synapse.types.storage import _BackgroundUpdates | ||
from synapse.util import unwrapFirstError | ||
|
@@ -100,6 +106,7 @@ | |
from synapse.util.cancellation import cancellable | ||
from synapse.util.iterutils import batch_iter | ||
from synapse.util.metrics import Measure | ||
from synapse.util.tokens import generate_next_token | ||
|
||
if TYPE_CHECKING: | ||
from synapse.server import HomeServer | ||
|
@@ -214,6 +221,34 @@ class EventRedactBehaviour(Enum): | |
block = auto() | ||
|
||
|
||
@attr.s(slots=True, frozen=True, auto_attribs=True) | ||
class EventGapEntry: | ||
""" | ||
Represents a gap in the timeline. | ||
|
||
From MSC3871: Gappy timeline | ||
""" | ||
|
||
event_id: str | ||
""" | ||
The target event ID which we see a gap before or after. | ||
""" | ||
|
||
prev_token: RoomStreamToken | ||
""" | ||
The token position before the target `event_id` | ||
|
||
Remember: tokens are positions between events | ||
""" | ||
|
||
next_token: RoomStreamToken | ||
""" | ||
The token position after the target `event_id` | ||
|
||
Remember: tokens are positions between events | ||
""" | ||
|
||
|
||
class EventsWorkerStore(SQLBaseStore): | ||
# Whether to use dedicated DB threads for event fetching. This is only used | ||
# if there are multiple DB threads available. When used will lock the DB | ||
|
@@ -2315,15 +2350,24 @@ def is_event_next_to_backward_gap_txn(txn: LoggingTransaction) -> bool: | |
is_event_next_to_backward_gap_txn, | ||
) | ||
|
||
async def is_event_next_to_forward_gap(self, event: EventBase) -> bool: | ||
"""Check if the given event is next to a forward gap of missing events. | ||
The gap in front of the latest events is not considered a gap. | ||
async def is_event_next_to_forward_gap( | ||
self, event: EventBase, *, ignore_gap_after_latest: bool = True | ||
) -> bool: | ||
""" | ||
Check if the given event is next to a forward gap of missing events. | ||
|
||
By default when `ignore_gap_after_latest = True`, the gap in front of the | ||
latest events is not considered a gap. | ||
|
||
<latest messages> A(False)--->B(False)--->C(False)---> <gap, unknown events> <oldest messages> | ||
<latest messages> A(False)--->B(False)---> <gap, unknown events> --->D(True)--->E(False) <oldest messages> | ||
|
||
When `ignore_gap_after_latest = False`, `A` would be considered next to a gap. | ||
|
||
Args: | ||
room_id: room where the event lives | ||
event: event to check (can't be an `outlier`) | ||
ignore_gap_after_latest: Whether the gap after the latest events (forward | ||
extremeties) in the room should be considered as an actual gap. | ||
Comment on lines
+2369
to
+2370
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per matrix-org/matrix-spec-proposals#3871 (comment), we should revert the The default of "omit the gap after the latest messages in the room" is the correct choice |
||
|
||
Returns: | ||
Boolean indicating whether it's an extremity | ||
|
@@ -2335,38 +2379,39 @@ async def is_event_next_to_forward_gap(self, event: EventBase) -> bool: | |
) | ||
|
||
def is_event_next_to_gap_txn(txn: LoggingTransaction) -> bool: | ||
# If the event in question is a forward extremity, we will just | ||
# consider any potential forward gap as not a gap since it's one of | ||
# the latest events in the room. | ||
# | ||
# `event_forward_extremities` does not include backfilled or outlier | ||
# events so we can't rely on it to find forward gaps. We can only | ||
# use it to determine whether a message is the latest in the room. | ||
# | ||
# We can't combine this query with the `forward_edge_query` below | ||
# because if the event in question has no forward edges (isn't | ||
# referenced by any other event's prev_events) but is in | ||
# `event_forward_extremities`, we don't want to return 0 rows and | ||
# say it's next to a gap. | ||
forward_extremity_query = """ | ||
SELECT 1 FROM event_forward_extremities | ||
WHERE | ||
room_id = ? | ||
AND event_id = ? | ||
LIMIT 1 | ||
""" | ||
if ignore_gap_after_latest: | ||
# If the event in question is a forward extremity, we will just | ||
# consider any potential forward gap as not a gap since it's one of | ||
# the latest events in the room. | ||
# | ||
# `event_forward_extremities` does not include backfilled or outlier | ||
# events so we can't rely on it to find forward gaps. We can only | ||
# use it to determine whether a message is the latest in the room. | ||
# | ||
# We can't combine this query with the `forward_edge_query` below | ||
# because if the event in question has no forward edges (isn't | ||
# referenced by any other event's prev_events) but is in | ||
# `event_forward_extremities`, we don't want to return 0 rows and | ||
# say it's next to a gap. | ||
forward_extremity_query = """ | ||
SELECT 1 FROM event_forward_extremities | ||
WHERE | ||
room_id = ? | ||
AND event_id = ? | ||
LIMIT 1 | ||
""" | ||
|
||
# We consider any forward extremity as the latest in the room and | ||
# not a forward gap. | ||
# | ||
# To expand, even though there is technically a gap at the front of | ||
# the room where the forward extremities are, we consider those the | ||
# latest messages in the room so asking other homeservers for more | ||
# is useless. The new latest messages will just be federated as | ||
# usual. | ||
txn.execute(forward_extremity_query, (event.room_id, event.event_id)) | ||
if txn.fetchone(): | ||
return False | ||
# We consider any forward extremity as the latest in the room and | ||
# not a forward gap. | ||
# | ||
# To expand, even though there is technically a gap at the front of | ||
# the room where the forward extremities are, we consider those the | ||
# latest messages in the room so asking other homeservers for more | ||
# is useless. The new latest messages will just be federated as | ||
# usual. | ||
txn.execute(forward_extremity_query, (event.room_id, event.event_id)) | ||
if txn.fetchone(): | ||
return False | ||
|
||
# Check to see whether the event in question is already referenced | ||
# by another event. If we don't see any edges, we're next to a | ||
|
@@ -2398,6 +2443,61 @@ def is_event_next_to_gap_txn(txn: LoggingTransaction) -> bool: | |
is_event_next_to_gap_txn, | ||
) | ||
|
||
async def get_events_next_to_gaps( | ||
self, events: Sequence[EventBase], direction: Direction | ||
) -> Sequence[EventGapEntry]: | ||
""" | ||
Find all of the events that have gaps next to them. | ||
|
||
When going backwards, we look for backward gaps (i.e. missing prev_events). | ||
|
||
When going forwards, we look for forward gaps (i.e. events that aren't | ||
referenced by any other events). | ||
|
||
Args: | ||
events: topological ordered list of events | ||
direction: which side of the events to check for gaps. This should match the | ||
direction we're paginating in. | ||
""" | ||
|
||
gaps = [] | ||
for event in events: | ||
# FIXME: We should use a bulk look-up instead of N+1 queries. | ||
if direction == Direction.BACKWARDS: | ||
is_next_to_gap = await self.is_event_next_to_backward_gap(event) | ||
elif direction == Direction.FORWARDS: | ||
is_next_to_gap = await self.is_event_next_to_forward_gap( | ||
event, ignore_gap_after_latest=False | ||
) | ||
else: | ||
assert_never(direction) | ||
|
||
if not is_next_to_gap: | ||
continue | ||
|
||
stream_ordering = event.internal_metadata.stream_ordering | ||
assert stream_ordering is not None, ( | ||
"persisted events should have stream_ordering" | ||
) | ||
|
||
gaps.append( | ||
EventGapEntry( | ||
prev_token=generate_next_token( | ||
direction=Direction.BACKWARDS, | ||
last_topo_ordering=event.depth, | ||
last_stream_ordering=stream_ordering, | ||
), | ||
event_id=event.event_id, | ||
next_token=generate_next_token( | ||
direction=Direction.FORWARDS, | ||
last_topo_ordering=event.depth, | ||
last_stream_ordering=stream_ordering, | ||
), | ||
) | ||
) | ||
|
||
return gaps | ||
|
||
async def get_event_id_for_timestamp( | ||
self, room_id: str, timestamp: int, direction: Direction | ||
) -> Optional[str]: | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.