|
1 | 1 | # Room DAG concepts |
2 | 2 |
|
3 | | -## What is an `outlier`? |
| 3 | +## Edges |
4 | 4 |
|
5 | | -An `outlier` is an arbitrary floating event in the DAG (as opposed to being |
6 | | -inline with the current DAG). It also means that we don't have the state events |
7 | | -backfilled on the homeserver and we trust the events *claimed* auth events rather |
8 | | -than those we calculate and verify to be correct. |
| 5 | +The word "edge" comes from graph theory lingo. An edge is just a connection |
| 6 | +between two events. In Synapse, we connect events by specifying their |
| 7 | +`prev_events`. A subsequent event points back at a previous event. |
9 | 8 |
|
10 | | -An event can be unmarked as an `outlier` once we fetch all of its `prev_events` (you will see some `ex_outlier` code around this). |
| 9 | +``` |
| 10 | +A (oldest) <---- B <---- C (most recent) |
| 11 | +``` |
11 | 12 |
|
12 | 13 |
|
13 | | -## What is a `state_group`? |
| 14 | +## Depth and stream ordering |
14 | 15 |
|
15 | | -For every non-outlier event we need to know the state at that event. Instead of storing the full state for each event in the DB (i.e. a `event_id -> state` mapping), which is *very* space inefficient when state doesn't change, we instead assign each different set of state a "state group" and then have mappings of `event_id -> state_group` and `state_group -> state`. |
| 16 | +Events are sorted by `(topological_ordering, stream_ordering)`, where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto-incrementing integer, but for `backfilled=true` events it decrements instead.
16 | 17 |
|
| 18 | +`depth` is not re-calculated when messages are inserted into the DAG. |
17 | 19 |
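
The ordering described above can be sketched as a plain tuple sort. This is only an illustration, not Synapse's actual code: the `EventRow` type, the event IDs, and the specific numbers are hypothetical, and negative values are used here simply to show a `stream_ordering` that decrements for backfilled events.

```python
from typing import List, NamedTuple, Tuple


class EventRow(NamedTuple):
    # Hypothetical, simplified view of an event; not Synapse's real schema.
    event_id: str
    depth: int  # used as topological_ordering
    stream_ordering: int


def sort_key(event: EventRow) -> Tuple[int, int]:
    # Sort by (topological_ordering, stream_ordering), where
    # topological_ordering is just the event's depth.
    return (event.depth, event.stream_ordering)


events: List[EventRow] = [
    EventRow("$live_c", depth=3, stream_ordering=2),
    # Assumed for illustration: backfilled events count downwards.
    EventRow("$backfilled_a", depth=1, stream_ordering=-2),
    EventRow("$live_b", depth=2, stream_ordering=1),
    EventRow("$backfilled_b", depth=2, stream_ordering=-1),
]

ordered = [e.event_id for e in sorted(events, key=sort_key)]
print(ordered)
```

Under these assumptions, two events at the same `depth` are ordered by `stream_ordering`, so the backfilled one sorts before the live one.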
|
| 20 | + |
| 21 | +## Forward extremity |
| 22 | + |
| 23 | +The most-recent-in-time events in the DAG, which are not yet referenced by the `prev_events` of any other event.
| 24 | + |
| 25 | +The forward extremities of a room are used as the `prev_events` when the next event is sent. |
| 26 | + |
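
As a rough sketch of the definition above, forward extremities can be computed from a map of each event to its `prev_events`. The function name and the tiny example DAG are hypothetical and only meant to illustrate the idea; this is not how Synapse stores or computes them.

```python
from typing import Dict, List, Set


def forward_extremities(prev_events_by_event: Dict[str, List[str]]) -> Set[str]:
    """Return the events that no other event points at via prev_events."""
    referenced = {
        prev_id
        for prev_ids in prev_events_by_event.values()
        for prev_id in prev_ids
    }
    return set(prev_events_by_event) - referenced


# A (oldest) <---- B <---- C
#                    <---- D   (C and D were both sent on top of B)
dag = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
}

extremities = forward_extremities(dag)
print(sorted(extremities))  # C and D are not referenced by anything yet

# The next event would use the current forward extremities as its prev_events.
dag["E"] = sorted(extremities)
print(sorted(forward_extremities(dag)))
```

In this toy example, sending `E` on top of both `C` and `D` leaves `E` as the only forward extremity.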
| 27 | + |
| 28 | +## Backwards extremity |
| 29 | + |
| 30 | +The current marker of where we have backfilled up to. |
| 31 | + |
| 32 | +A backwards extremity is found at the oldest-in-time events of the DAG that we have.
| 33 | + |
| 34 | +This is an event for which we haven't yet fetched all of the `prev_events`.
| 35 | + |
| 36 | +Once we have fetched all of its `prev_events`, it's unmarked as a backwards extremity
| 37 | +and those `prev_events` become the new backwards extremities.
| 38 | + |
| 39 | + |
| 40 | +## Outliers |
| 41 | + |
| 42 | +We mark an event as an `outlier` when we haven't figured out the state for the |
| 43 | +room at that point in the DAG yet. |
| 44 | + |
| 45 | +We won't *necessarily* have the `prev_events` of an `outlier` in the database, but it's entirely possible that we *might*. Whether we have all of the `prev_events` is tracked separately, by marking the event as
| 46 | +a [backwards extremity](#backwards-extremity).
| 47 | + |
| 48 | +For example, when we fetch the event auth chain or state for a given event, we mark all of those
| 49 | +claimed auth events as outliers because we haven't done the state calculation ourselves.
| 50 | + |
| 51 | + |
| 52 | +### Floating outlier |
| 53 | + |
| 54 | +A floating `outlier` is an arbitrary floating event in the DAG (as opposed to being
| 55 | +inline with the current DAG). This happens when the event doesn't have any `prev_events`,
| 56 | +or has fake `prev_events` that don't exist.
| 57 | + |
| 58 | + |
| 59 | +## State groups |
| 60 | + |
| 61 | +For every non-outlier event we need to know the state at that event. Instead of |
| 62 | +storing the full state for each event in the DB (i.e. an `event_id -> state`
| 63 | +mapping), which is *very* space-inefficient when state doesn't change, we
| 64 | +instead assign each different set of state a "state group" and then have |
| 65 | +mappings of `event_id -> state_group` and `state_group -> state`. |
| 66 | + |
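
The two mappings described above can be sketched with plain dictionaries. Everything here is illustrative: the event IDs, the group numbers, and the representation of state as `(event_type, state_key) -> event_id` are assumptions for the example, not Synapse's actual tables.

```python
from typing import Dict, Tuple

StateKey = Tuple[str, str]  # (event type, state key)
StateMap = Dict[StateKey, str]  # -> ID of the state event

# event_id -> state_group: many events can share one group.
event_to_state_group: Dict[str, int] = {
    "$msg1": 1,
    "$msg2": 1,  # state did not change, so the same group is reused
    "$msg3": 2,  # sent after the topic changed, so a different group
}

# state_group -> state: each distinct set of state is stored once.
state_group_to_state: Dict[int, StateMap] = {
    1: {("m.room.topic", ""): "$topic_v1"},
    2: {("m.room.topic", ""): "$topic_v2"},
}


def state_at(event_id: str) -> StateMap:
    """Look up the state at an event via its state group."""
    return state_group_to_state[event_to_state_group[event_id]]


print(state_at("$msg2"))
print(state_at("$msg3"))
```

With this layout, a long run of messages that leaves the state unchanged costs one small `event_id -> state_group` entry per event rather than a full copy of the state.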
| 67 | + |
| 68 | +### State group edges
| 69 | + |
| 70 | +TODO: `state_group_edges` is a further optimization... |