Commit ccc0826: WIP
1 parent 81332e9


modules/ROOT/pages/clustering/disaster-recovery.adoc

Lines changed: 82 additions & 56 deletions
@@ -6,48 +6,48 @@
 A database can become unavailable due to issues on different system levels.
 For example, a data center failover may lead to the loss of multiple servers, which may cause a set of databases to become unavailable.
 
-This section contains a step-by-step guide on how to recover _unavailable databases_ that are incapable of serving writes, while still may be able to serve reads.
+This section contains a step-by-step guide on how to recover *unavailable databases* that are incapable of serving writes, while possibly still being able to serve reads.
 However, if a database is not performing as expected for other reasons, this section cannot help.
-By following the steps outlined here, you can recover the unavailable databases and make them fully operational with minimal impact on the other databases in the cluster.
+By following the steps outlined here, you can recover the unavailable databases and make them fully operational, with minimal impact on the other databases in the cluster.
 
-[NOTE]
+[CAUTION]
 ====
-If *all* servers in a Neo4j cluster are lost in a data center failover, it is not possible to recover the current cluster.
-You have to create a new cluster and restore the databases.
-See xref:clustering/setup/deploy.adoc[Deploy a basic cluster] and xref:clustering/databases.adoc#cluster-seed[Seed a database] for more information.
+If *all* servers in a Neo4j cluster are lost in a disaster, it is not possible to recover the current cluster.
+You have to create a new cluster and restore the databases; see xref:clustering/setup/deploy.adoc[Deploy a basic cluster] and xref:clustering/databases.adoc#cluster-seed[Seed a database] for more information.
 ====
 
 == Faults in clusters
 
 Databases in clusters follow an allocation strategy.
 This means that they are allocated differently within the cluster and may also have different numbers of primaries and secondaries.
-Furthermore, some databases may not be allowed to be allocated to some servers because of user defined strategies.
-The consequence of this is that all servers may be different in which databases they are hosting and are allowed to host.
+The consequence of this is that servers may differ in which databases they are hosting.
 Losing a server in a cluster may cause some databases to lose a member while others are unaffected.
 Therefore, in a disaster where one or more servers go down, some databases may keep running with little to no impact, while others may lose all their allocated resources.
 
 == Guide structure
+[NOTE]
+====
+In this guide, an _offline_ server is a server that is not running but may be restartable.
+A _lost_ server, however, is a server that is currently not running and cannot be restarted.
+A _write available_ database is able to serve writes, while a _write unavailable_ database is not.
+====
+
 There are three main steps to recovering a cluster from a disaster.
-First, ensure the `system` database is write available i.e. able to accept writes.
-Then, detach any potential lost servers and replace them by new ones.
-Finish disaster recovery by starting or continuing to manage databases and verify that they are available.
+First, ensure the `system` database is write available.
+Then, detach any lost servers from the cluster and replace them with new ones.
+Finish disaster recovery by starting or continuing to manage databases and verifying that they are write available.
 
-Every step consists of the following four sections:
+Every step consists of the following three sections:
 
-. State that needs to be verified.
-. Example of how the state can be verified.
-. Motivation for why this state is necessary.
-. Path to correct state.
+. A state that needs to be verified, with optional motivation.
+. An example of how the state can be verified.
+. A proposed series of steps to get to the correct state.
 
 [CAUTION]
 ====
 Verifying each state before continuing to the next step, regardless of the disaster scenario, is recommended to ensure the cluster is fully operational.
-
 ====
 
-In this section, an _offline_ server is a server that is not running but may be _restartable_.
-A _lost_ server, however, is a server that is currently not running and cannot be restarted.
-
 
 == Guide to disaster recovery
 
@@ -68,14 +68,14 @@ See xref:clustering/setup/routing.adoc#clustering-routing[Server-side routing] f
 
 ==== State
 ====
-The `system` database is write available, i.e. able to accept writes.
+The `system` database is write available.
 ====
 
-==== Motivation
-The `system` database contains the view of the cluster. This includes which servers and databases are present and how they are configured.
-During a disaster, the goal is to change the view of the cluster, for example by removing and adding servers or recreating databases.
-In order for the view to be updated, the `system` database needs to be write available.
-Therefore, it is vital to ensure it is available so that the next steps are possible to execute.
+The `system` database contains the view of the cluster.
+This includes which servers and databases are present, where they are allocated, and how they are configured.
+During a disaster, the view of the cluster might need to change to reflect a new reality, for example by removing lost servers.
+Databases might also need to be recreated to regain write availability.
+Because both of these steps are executed by writing to the `system` database, making it write available is a vital first step during disaster recovery.
 
 ==== Example verification
 The `system` database's write availability can be verified by using the xref:clustering/monitoring/status-check.adoc#monitoring-replication[Status check] procedure.
@@ -93,7 +93,7 @@ CALL dbms.cluster.statusCheck(["system"]);
 ==== Path to correct state
 The following steps can be used to regain write availability for the `system` database if it has been lost.
 They create a new `system` database from the most up-to-date copy of the `system` database that can be found in the cluster.
-It is important to get a `system` database that is as up-to-date as possible, so that future commands operate on state that is as correct as possible.
+It is important to get a `system` database that is as up-to-date as possible, so that it corresponds as closely as possible to the view of the cluster before the disaster.
 
 .Guide
 [%collapsible]
@@ -110,13 +110,14 @@ This causes downtime for all databases in the cluster until the processes are st
 . On each server, run `bin/neo4j-admin database info system` and compare the `lastCommittedTransaction` to find out which server has the most up-to-date copy of the `system` database.
 . On the most up-to-date server, run `bin/neo4j-admin database dump system --to-path=[path-to-dump]` to take a dump of the current `system` database and store it in an accessible location.
 . For every _lost_ server, add a new *unconstrained* one according to xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster].
-It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers was added.
+It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
 +
 [NOTE]
 =====
-While recommended to avoid cluster overload, it is not strictly necessary to add servers in this step.
+While recommended, it is not strictly necessary to add new servers in this step.
 There is also an option to change the `system` database mode (`server.cluster.system_database_mode`) on secondary allocations to make them primary allocations for the new `system` database.
 The number of primary allocations needed is defined by `dbms.cluster.minimum_initial_system_primaries_count`, see the xref:configuration/configuration-settings.adoc#config_dbms.cluster.minimum_initial_system_primaries_count[Configuration settings] for more information.
+Not replacing servers can cause cluster overload when databases are moved from lost servers to available ones in the next step of this guide.
 =====
 +
 . On each server, run `bin/neo4j-admin database load system --from-path=[path-to-dump] --overwrite-destination=true` to load the current `system` database dump.
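The "most up-to-date copy" comparison in the steps above can be scripted once the `lastCommittedTransaction` values have been collected from each server. A minimal sketch, assuming the values are passed in as `server:transaction-id` pairs (the server names and ids below are illustrative placeholders, not output of the real tooling):

```shell
# Hypothetical helper: feed it "server:lastCommittedTransaction" pairs collected
# by running `bin/neo4j-admin database info system` on every server; it prints
# the server holding the highest (most up-to-date) transaction id.
pick_most_up_to_date() {
  # Sort numerically on the second colon-separated field, keep the last
  # (highest) line, and print only the server name.
  printf '%s\n' "$@" | sort -t ':' -k 2 -n | tail -n 1 | cut -d ':' -f 1
}

# Illustrative values only; real ids come from `neo4j-admin database info`.
pick_most_up_to_date "server-a:1200" "server-b:1534" "server-c:980"  # prints "server-b"
```

The dump is then taken on the server this prints, as described in the step above.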
@@ -133,11 +134,10 @@ The amount of primary allocations needed is defined by `dbms.cluster.minimum_ini
 All servers in the cluster's view are available and enabled.
 ====
 
-==== Motivation
-// different stuffs here
-Following the loss of one or more servers, the cluster's view of servers must be updated.
-This includes moving allocations on the lost servers onto servers which are actually in the cluster
-This includes identifying the lost servers and replacing them by new ones.
+A lost server will still be in the `system` database's view of the cluster, but in an unavailable state.
+According to the view of the cluster, these lost servers are still hosting the databases they had before they became lost.
+Therefore, removing lost servers is not as simple as informing the `system` database that they are lost.
+It also includes moving requested allocations on the lost servers onto servers that are actually in the cluster, so that those databases' topologies are still satisfied.
 
 ==== Example verification
 The cluster's view of servers can be seen by listing the servers, see xref:clustering/servers.adoc#_listing_servers[Listing servers] for more information.
@@ -149,7 +149,9 @@ SHOW SERVERS;
 ----
 
 ==== Path to correct state
-Detach lost servers and add new ones to the cluster
+The following steps can be used to remove lost servers and add new ones to the cluster.
+They include moving any potential database allocations from lost servers to available servers in the cluster.
+These steps might also recreate some databases, since a database which has lost a majority of its primary allocations cannot be moved from one server to another.
 
 .Guide
 [%collapsible]
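Put together, the cordon, deallocate, and drop sequence described in the following guide can be sketched in Cypher. This is a sketch, not official procedure text: the server id is a placeholder taken from `SHOW SERVERS` output, and the `dbms.cluster.cordonServer` procedure is assumed to be available in your Neo4j version:

[source, cypher]
----
// Find the unavailable server's id first.
SHOW SERVERS;
// Placeholder id; use the one reported by SHOW SERVERS.
CALL dbms.cluster.cordonServer('25a7efc7-d063-44b8-bdee-f23357f89f01');
DEALLOCATE DATABASES FROM SERVER '25a7efc7-d063-44b8-bdee-f23357f89f01';
// Only after the server reports Deallocated:
DROP SERVER '25a7efc7-d063-44b8-bdee-f23357f89f01';
----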
@@ -158,16 +160,19 @@ Detach lost servers and add new ones to the cluster
 This prevents new database allocations from being moved to this server.
 . For each `CORDONED` server, make sure a new *unconstrained* server has been added to the cluster to take its place, see xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster] for more information.
 If servers were added in the 'System database write availability' step of this guide, additional servers might not be needed here.
+It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
 
 +
 [NOTE]
 =====
 While recommended, it is not strictly necessary to add new servers in this step.
-However, not adding new servers reduces the capacity of the cluster to handle work and might require the topology for a database to be altered to make deallocations and recreations possible.
+However, not adding new servers reduces the capacity of the cluster to handle work.
+Furthermore, it might require the topology for a database to be altered to make deallocating servers and recreating databases possible.
 =====
 
 . For each `CORDONED` server, run `DEALLOCATE DATABASES FROM SERVER cordoned-server-id` on one of the available servers.
-This will try to move all database allocations from this server to another server in the cluster.
+This will try to move all database allocations from this server to an available server in the cluster.
 Once a server is `DEALLOCATED`, all allocated user databases on this server have been moved successfully.
 +
 [NOTE]
@@ -178,6 +183,7 @@ Therefore, an allocation with `currentStatus` = `DEALLOCATING` should reach the
 . If any deallocations failed, make them possible by executing the following steps:
 .. Run `SHOW DATABASES`. If a database shows `currentStatus` = `offline`, this database has been stopped.
 .. For each stopped database that has at least one allocation on any of the `CORDONED` servers, start it by running `START DATABASE stopped-db WAIT`.
+This is necessary since stopped databases cannot be moved from one server to another.
 +
 [NOTE]
 =====
@@ -188,7 +194,7 @@ A database can be set to `READ-ONLY` before it is started to avoid updates on a
 Depending on the environment, consider extending the timeout for this procedure.
 If any of the primary allocations for a database report `replicationSuccessful` = `TRUE`, this database is write available.
 
-.. For each database that is not write available, recreate it to regain write availability.
+.. For each database that is not write available, recreate it to move it off the lost servers and regain write availability.
 Go to xref:clustering/databases.adoc#recreate-databases[Recreate databases] for more information about recreate options.
 Remember to make sure there are recent backups for the databases before recreating them, see xref:backup-restore/online-backup.adoc[Online backup] for more information.
 +
@@ -199,42 +205,62 @@ Otherwise, recreating with xref:clustering/databases.adoc#uri-seed[Backup as see
 =====
 .. Return to step 3 to retry deallocating all servers.
 . For each deallocated server, run `DROP SERVER deallocated-server-id`.
-This safely removes the server from the cluster view.
+This safely removes the server from the cluster's view.
 
 ====
 
 
 [[recover-databases]]
 === Database availability
 
-Once the `system` database and all servers are available, manage and verify that all databases are in the desired state.
-
-. Run `CALL dbms.cluster.statusCheck([])` on all servers, see xref:clustering/monitoring/status-check.adoc#monitoring-replication[Monitoring replication] for more information.
-Depending on the environment, consider extending the timeout for this procedure.
-If any of the primary allocations for a database report `replicationSuccessful` = `TRUE`, this database is write available.
-If all databases are write available, disaster recovery is complete.
-+
-[NOTE]
+==== State
 ====
-Remember that previously stopped databases might have been started during this process.
+All databases are write available.
 ====
 
-. Recreate every database that is not write available and has not been recreated previously, see xref:clustering/databases.adoc#recreate-databases[Recreate databases] for more information.
-Remember to make sure there are recent backups for the databases before recreating them, see xref:backup-restore/online-backup.adoc[Online backup] for more information.
-. Run `SHOW DATABASES` and check any recreated databases which are not write available.
+Once this state is verified, disaster recovery is complete.
+However, remember that previously stopped databases might have been started during this process.
+If they should remain stopped, return them to the stopped state by running `STOP DATABASE started-db WAIT`.
 
-+
 [NOTE]
 ====
-Remember, recreating a database can take an unbounded amount of time since it may involve copying the store to a new server, as described in xref:clustering/databases.adoc#recreate-databases[Recreate databases].
+Remember, recreating a database can take an unbounded amount of time since it may involve copying the store to a new server, as described in xref:clustering/databases.adoc#recreate-databases[Recreate databases].
 Therefore, an allocation with `currentStatus` = `STARTING` might reach the `requestedStatus` given some time.
 ====
+
+==== Example verification
+All databases' write availability can be verified by using the xref:clustering/monitoring/status-check.adoc#monitoring-replication[Status check] procedure.
+The procedure should be called on all servers in the cluster, in order to provide the correct view.
+The status check procedure writes a dummy transaction, and therefore the correctness of the result depends on the given timeout.
+The default timeout is 1 second, but depending on the network latency in the environment it might need to be extended.
+If any of the primary allocations for a database report `replicationSuccessful` = `TRUE`, this database is write available.
+Therefore, the desired state has been verified when this is true for all databases.
+
+[source, shell]
+----
+CALL dbms.cluster.statusCheck([]);
+----
+
+A stricter verification can be performed to check that all databases are in their desired states on all servers.
+For the stricter check, run `SHOW DATABASES` and verify that `requestedStatus` = `currentStatus` for all database allocations on all servers.
+
+==== Path to correct state
+The following steps can be used to make all databases in the cluster write available again.
+They include recreating any databases that are not write available, as well as identifying any recreations which will not complete.
+Recreations might fail for different reasons; one example is that the checksums do not match for the same transaction on different copies.
+
+.Guide
+[%collapsible]
+====
+. Run `CALL dbms.cluster.statusCheck([])` on all servers to identify write unavailable databases, see xref:clustering/monitoring/status-check.adoc#monitoring-replication[Monitoring replication] for more information.
+. Recreate every database that is not write available and has not been recreated previously, see xref:clustering/databases.adoc#recreate-databases[Recreate databases] for more information.
+Remember to make sure there are recent backups for the databases before recreating them, see xref:backup-restore/online-backup.adoc[Online backup] for more information.
+. Run `SHOW DATABASES` and check any recreated databases which are not write available.
 Recreating a database will not complete if one of the following messages is displayed in the message field:
 ** `Seeders ServerId1 and ServerId2 have different checksums for transaction TransactionId. All seeders must have the same checksum for the same append index.`
 ** `Seeders ServerId1 and ServerId2 have incompatible storeIds. All seeders must have compatible storeIds.`
 ** `No store found on any of the seeders ServerId1, ServerId2...`
-+
-
 . For each database which will not complete recreation, recreate it from backup using xref:clustering/databases.adoc#uri-seed[Backup as seed] or define seeding servers in the recreate procedure using xref:clustering/databases.adoc#specified-servers[Specified seeders] so that problematic allocations are excluded.
-. Return to step 1 to make sure all databases are in their desired state.
 
+====
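The stricter `SHOW DATABASES` check described above can be narrowed to just the mismatching allocations. A sketch (assuming `name`, `address`, `currentStatus`, and `requestedStatus` are among the returned columns, as used elsewhere in this guide):

[source, cypher]
----
// List only allocations whose current status differs from the requested one.
SHOW DATABASES YIELD name, address, currentStatus, requestedStatus
WHERE currentStatus <> requestedStatus;
----

An empty result on every server indicates that all allocations have reached their requested state.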
