63 changes: 1 addition & 62 deletions site/content/3.12/deploy/architecture/replication.md
@@ -50,65 +50,4 @@
In addition to the replication factor, there is a **writeConcern** that
specifies the minimum number of in-sync followers required for write operations.
If you specify the `writeConcern` parameter with a value greater than `1`, the
collection's leader shards are locked down for writing as soon as too few
followers are available.
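As a rough sketch of this rule, the model below shows a leader shard that refuses writes once too few in-sync copies remain. The class and method names are illustrative, and the assumption that `writeConcern` counts the leader itself among the in-sync copies is mine, not taken from this page:

```python
# Illustrative model of the writeConcern rule above -- not ArangoDB's
# actual implementation. Assumes writeConcern counts in-sync replicas
# including the leader itself.

class LeaderShard:
    def __init__(self, replication_factor: int, write_concern: int):
        # replicationFactor counts the leader plus its followers
        self.in_sync_followers = replication_factor - 1
        self.write_concern = write_concern

    def follower_lost(self) -> None:
        self.in_sync_followers = max(0, self.in_sync_followers - 1)

    def can_write(self) -> bool:
        # Lock down for writing as soon as too few in-sync copies remain.
        return self.in_sync_followers + 1 >= self.write_concern

shard = LeaderShard(replication_factor=3, write_concern=2)
assert shard.can_write()      # 2 in-sync followers: writes allowed
shard.follower_lost()
assert shard.can_write()      # 1 follower left: still enough copies
shard.follower_lost()
assert not shard.can_write()  # too few copies: shard locks down
```

In ArangoDB itself, `writeConcern` is set per collection alongside `replicationFactor`.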

## Asynchronous replication

When using asynchronous replication, _Followers_ connect to a _Leader_ and locally
apply all events from the _Leader's_ log in the same order. As a result, the
_Followers_ end up with the same state of data as the _Leader_.

_Followers_ are only eventually consistent with the _Leader_.

Transactions are honored in replication, i.e. transactional write operations
become visible on _Followers_ atomically.
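The ordering and atomicity guarantees can be sketched as follows, with the leader's log modeled as a list of transactions. All names here are illustrative, not ArangoDB's internals:

```python
# Sketch of the guarantees described above: followers replay the
# leader's log in order, and a transaction's writes are applied as one
# unit, so they become visible atomically. Purely illustrative.

leader_log = [
    [("a", 1)],            # single-document write
    [("b", 2), ("c", 3)],  # one transaction with two writes
]

def apply_log(log, upto):
    """Replay the first `upto` log entries, transaction by transaction."""
    state = {}
    for txn in log[:upto]:
        for key, value in txn:  # all-or-nothing per transaction
            state[key] = value
    return state

# A follower is only ever at a transaction boundary: it shows either
# none of a transaction's writes or all of them.
assert apply_log(leader_log, 1) == {"a": 1}
assert apply_log(leader_log, 2) == {"a": 1, "b": 2, "c": 3}
```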

All write operations are handled by the _Leader_ and logged to its
_write-ahead log_. Asynchronous replication in ArangoDB therefore cannot be
used for write-scaling. The main purposes of this type of replication are to
provide read-scalability and hot standby servers.

It is possible to connect multiple _Followers_ to the same _Leader_. _Followers_
should be used as read-only instances; no user-initiated write operations
should be carried out on them. Otherwise, data conflicts may occur that cannot
be resolved automatically, causing the replication to stop.

In an asynchronous replication scenario, _Followers_ _pull_ changes
from the _Leader_. _Followers_ need to know which _Leader_ to connect to,
but a _Leader_ is not aware of the _Followers_ that replicate from it.
When the network connection between the _Leader_ and a _Follower_ goes down, write
operations on the _Leader_ can continue normally. When the network is up again, _Followers_
can reconnect to the _Leader_ and transfer the remaining changes. This
happens automatically, provided _Followers_ are configured appropriately.
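A minimal sketch of this pull-and-catch-up behavior, assuming the follower tracks how much of the leader's log it has already applied (illustrative names, not ArangoDB's actual protocol):

```python
# Sketch of pull-based catch-up: the follower remembers its "applied"
# position in the leader's log, so after an outage it transfers only
# the remaining changes. Illustrative, not ArangoDB's implementation.

class PullFollower:
    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries already applied

    def sync(self, leader_log):
        # Ask only for entries that have not been applied yet.
        for key, value in leader_log[self.applied:]:
            self.state[key] = value
        self.applied = len(leader_log)

leader_log = [("a", 1)]
f = PullFollower()
f.sync(leader_log)

# Network goes down; the leader keeps accepting writes normally.
leader_log += [("b", 2), ("a", 3)]

# On reconnect, the follower transfers only the remaining changes.
f.sync(leader_log)
assert f.state == {"a": 3, "b": 2} and f.applied == 3
```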

### Replication lag

As described above, write operations are applied on the _Leader_ first and only
then on the _Followers_.

For example, let's assume a write operation is executed on the _Leader_
at point in time _t0_. To apply the same operation, a _Follower_ must first
fetch the write operation's data from the _Leader's_ write-ahead log, then parse it and
apply it locally. This happens at some point in time after _t0_, let's say _t1_.

The difference between _t1_ and _t0_ is called the _replication lag_, and it is unavoidable
in asynchronous replication. The amount of replication lag depends on many factors, a
few of which are:

- the network capacity between the _Followers_ and the _Leader_
- the load of the _Leader_ and the _Followers_
- the frequency with which _Followers_ poll the _Leader_ for updates

Between _t0_ and _t1_, the state of data on the _Leader_ is newer than the state of data
on the _Followers_. At point in time _t1_, the state of data on the _Leader_ and _Followers_
is consistent again (provided no new data modifications happened on the _Leader_ in
between). Thus, the replication leads to an _eventually consistent_ state of data.
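As a toy model of one of these factors, the snippet below shows how the follower's poll interval alone bounds the lag: a write executed at _t0_ becomes visible at the next poll, at _t1_. It assumes a fixed poll interval and ignores network and load; the numbers are made up, not measurements:

```python
# Toy model: with a fixed poll interval, a write at t0 becomes visible
# on the follower at the next poll, at t1. Network transfer time and
# server load are ignored; purely illustrative.
import math

def visible_at(t0: float, poll_interval: float) -> float:
    # The follower polls at 0, poll_interval, 2 * poll_interval, ...
    return math.ceil(t0 / poll_interval) * poll_interval

t0 = 1.3  # write executed on the leader at t0 (seconds)
for interval in (1.0, 0.5, 0.1):
    t1 = visible_at(t0, interval)
    print(f"poll every {interval}s -> lag {t1 - t0:.1f}s")
```

Shorter poll intervals shrink this component of the lag, at the cost of more read requests against the _Leader_.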

### Replication overhead

As the _Leader_ servers log every write operation in the _write-ahead log_
anyway, replication does not cause any extra write overhead on the _Leader_.
It does cause some overhead for the _Leader_ to serve the incoming read
requests of its _Followers_, but returning the requested data is a trivial
task for the _Leader_ and should not result in notable performance
degradation in production.