This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Commit 7392b63

committed
Updates from feedback and more concepts
1 parent 12c0b89 commit 7392b63

1 file changed

+61
-8
lines changed

Lines changed: 61 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,17 +1,70 @@
11
# Room DAG concepts
22

3-
## What is an `outlier`?
3+
## Edges
44

5-
An `outlier` is an arbitrary floating event in the DAG (as opposed to being
6-
inline with the current DAG). It also means that we don't have the state events
7-
backfilled on the homeserver and we trust the events *claimed* auth events rather
8-
than those we calculate and verify to be correct.
5+
The word "edge" comes from graph theory lingo. An edge is just a connection
6+
between two events. In Synapse, we connect events by specifying their
7+
`prev_events`. A subsequent event points back at a previous event.
98

10-
An event can be unmarked as an `outlier` once we fetch all of its `prev_events` (you will see some `ex_outlier` code around this).
9+
```
10+
A (oldest) <---- B <---- C (most recent)
11+
```
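
As a rough illustration of the edges described above, here is a minimal sketch. The `Event` class is hypothetical and is not Synapse's real event class; only the `prev_events` field name comes from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for an event; not Synapse's real class."""

    event_id: str
    # IDs of the events this event points back at.
    prev_events: list[str] = field(default_factory=list)


a = Event("A")                      # oldest, no prev_events
b = Event("B", prev_events=["A"])   # B points back at A
c = Event("C", prev_events=["B"])   # C points back at B (most recent)

# Each edge is a (later event, earlier event) pair.
edges = [(e.event_id, prev) for e in (a, b, c) for prev in e.prev_events]
```

Here `edges` holds one pair per arrow in the diagram: B to A, and C to B.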
1112

1213

13-
## What is a `state_group`?
14+
## Depth and stream ordering
1415

15-
For every non-outlier event we need to know the state at that event. Instead of storing the full state for each event in the DB (i.e. a `event_id -> state` mapping), which is *very* space inefficient when state doesn't change, we instead assign each different set of state a "state group" and then have mappings of `event_id -> state_group` and `state_group -> state`.
16+
Events are sorted by `(topological_ordering, stream_ordering)`, where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto-incrementing integer, but for `backfilled=true` events it decrements.
1617

18+
`depth` is not re-calculated when messages are inserted into the DAG.
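
The ordering can be sketched with plain dictionaries. This is a toy example: the event IDs and values are made up, and it assumes backfilled events carry negative `stream_ordering` values as a result of decrementing.

```python
# Made-up events; only the depth/stream_ordering names come from the text.
events = [
    {"event_id": "$c", "depth": 3, "stream_ordering": 2},
    {"event_id": "$a", "depth": 1, "stream_ordering": -1},   # backfilled
    {"event_id": "$b", "depth": 2, "stream_ordering": 1},
    {"event_id": "$b2", "depth": 2, "stream_ordering": -2},  # backfilled
]

# Sort by (topological_ordering, stream_ordering), with topological_ordering = depth.
ordered = sorted(events, key=lambda e: (e["depth"], e["stream_ordering"]))
ordered_ids = [e["event_id"] for e in ordered]
```

Events with the same `depth` are tie-broken by `stream_ordering`, so in this sketch the backfilled `$b2` sorts before `$b`.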
1719

20+
21+
## Forward extremity
22+
23+
Most-recent-in-time events in the DAG which are not referenced by any `prev_events` yet.
24+
25+
The forward extremities of a room are used as the `prev_events` when the next event is sent.
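
A minimal sketch of this definition, assuming the DAG is held as a hypothetical in-memory mapping of `event_id -> prev_events` (Synapse itself tracks this in the database):

```python
from __future__ import annotations


def forward_extremities(prev_events_by_id: dict[str, list[str]]) -> set[str]:
    """Return the events that no other known event lists in its prev_events."""
    referenced = {prev for prevs in prev_events_by_id.values() for prev in prevs}
    return set(prev_events_by_id) - referenced


# A <- B, with C and D both pointing back at B (a fork).
dag = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
extremities = forward_extremities(dag)

# The next event uses the current forward extremities as its prev_events.
next_prev_events = sorted(extremities)
dag["E"] = next_prev_events
```

After `E` is added it references both `C` and `D`, so it becomes the only forward extremity in this toy DAG.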
26+
27+
28+
## Backwards extremity
29+
30+
The current marker of where we have backfilled up to.
31+
32+
A backwards extremity is one of the oldest-in-time events that we have in the DAG.
33+
34+
This is an event for which we haven't fetched all of the `prev_events`.
35+
36+
Once we have fetched all of its `prev_events`, it's unmarked as a backwards extremity
37+
and those `prev_events` become the new backwards extremities.
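
The backfill step can be sketched as follows. This follows the definition given in this section (an event whose `prev_events` we haven't all fetched) and uses a hypothetical in-memory mapping; it is not a description of how Synapse stores backwards extremities.

```python
from __future__ import annotations


def backwards_extremities(prev_events_by_id: dict[str, list[str]]) -> set[str]:
    """Return known events that have at least one prev_event we don't have."""
    known = set(prev_events_by_id)
    return {
        event_id
        for event_id, prevs in prev_events_by_id.items()
        if any(prev not in known for prev in prevs)
    }


# We have C and D, but not B (C's prev_event).
dag = {"C": ["B"], "D": ["C"]}
before_backfill = backwards_extremities(dag)

# Backfill B. Its own prev_event, A, is still missing.
dag["B"] = ["A"]
after_backfill = backwards_extremities(dag)
```

In this sketch `C` stops being a backwards extremity once `B` is fetched, and `B` takes its place because `A` is still unknown.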
38+
39+
40+
## Outliers
41+
42+
We mark an event as an `outlier` when we haven't figured out the state for the
43+
room at that point in the DAG yet.
44+
45+
We won't *necessarily* have the `prev_events` of an `outlier` in the database, but it's entirely possible that we *might*. The status of whether we have all of the `prev_events` is marked as
46+
a [backwards extremity](#backwards-extremity).
47+
48+
For example, when we fetch the event auth chain or state for a given event, we mark all of those
49+
claimed auth events as outliers because we haven't done the state calculation ourselves.
50+
51+
52+
### Floating outlier
53+
54+
A floating `outlier` is an arbitrary floating event in the DAG (as opposed to being
55+
inline with the current DAG). This happens when the event doesn't have any `prev_events`
56+
or has fake `prev_events` that don't exist.
57+
58+
59+
## State groups
60+
61+
For every non-outlier event we need to know the state at that event. Instead of
62+
storing the full state for each event in the DB (i.e. a `event_id -> state`
63+
mapping), which is *very* space inefficient when state doesn't change, we
64+
instead assign each different set of state a "state group" and then have
65+
mappings of `event_id -> state_group` and `state_group -> state`.
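
The two mappings can be illustrated with a toy in-memory version. Everything here is an assumption made for illustration: the helper name, the dictionaries, and the reuse of a group when the state is identical are not taken from Synapse's storage code.

```python
from __future__ import annotations

# State is modelled as {(event_type, state_key): event_id}.
State = dict

state_group_to_state: dict[int, State] = {}   # state_group -> state
event_to_state_group: dict[str, int] = {}     # event_id -> state_group
_group_for_state: dict[frozenset, int] = {}   # helper index for reuse


def assign_state_group(event_id: str, state: State) -> int:
    """Record the state at an event, reusing a group if the state is unchanged."""
    key = frozenset(state.items())
    if key not in _group_for_state:
        group = len(state_group_to_state) + 1
        _group_for_state[key] = group
        state_group_to_state[group] = dict(state)
    event_to_state_group[event_id] = _group_for_state[key]
    return event_to_state_group[event_id]


state = {("m.room.create", ""): "$create", ("m.room.member", "@alice:hs"): "$join"}
g1 = assign_state_group("$msg1", state)
g2 = assign_state_group("$msg2", state)  # same state, same group

new_state = {**state, ("m.room.name", ""): "$name"}
g3 = assign_state_group("$msg3", new_state)  # state changed, new group
```

Two messages sent while the room state is unchanged share one stored copy of the state; only the state change produces a second group.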
66+
67+
68+
### State group edges
69+
70+
TODO: `state_group_edges` is a further optimization...
