Commit 78f1701

Re-writing parts of the system db docs in disaster recovery.
1 parent ffa2618 commit 78f1701

1 file changed

+21
-27
lines changed

modules/ROOT/pages/clustering/disaster-recovery.adoc

Lines changed: 21 additions & 27 deletions
@@ -58,48 +58,42 @@ See xref:clustering/setup/routing.adoc#clustering-routing[Server-side routing] f
5858
The first step of recovery is to ensure that the `system` database is available.
5959
The `system` database is required for clusters to function properly.
6060

61-
. Start all servers that are _offline_.
62-
(If a server is unable to start, inspect the logs and contact support personnel.
63-
The server may have to be considered indefinitely lost.)
64-
. *Validate the `system` database's availability.*
65-
.. Run `SHOW DATABASE system`.
66-
If the response does not contain a writer, the `system` database is unavailable and needs to be recovered, continue to step 3.
67-
.. Optionally, you can create a temporary user to validate the `system` database's writability by running `CREATE USER 'temporaryUser' SET PASSWORD 'temporaryPassword'`.
68-
.. Confirm that the temporary user is created as expected, by running `SHOW USERS`, then continue to xref:clustering/disaster-recovery.adoc#recover-servers[Recover servers].
69-
If not, continue to step 3.
61+
. *Start all servers that are _offline_*.
62+
If a server is unable to start, inspect the logs and contact support personnel.
63+
The server may have to be considered indefinitely lost.
64+
. *Validate the `system` database's availability.* Use one of the following options:
65+
** Run `SHOW DATABASE system`.
66+
If the response contains a writer, the `system` database is write available and does not need to be recovered; skip to xref:clustering/disaster-recovery.adoc#recover-servers[Recover servers].
67+
** Create a temporary user by running `CREATE USER 'temporaryUser' SET PASSWORD 'temporaryPassword'`.
68+
Check whether the temporary user was created by running `SHOW USERS`. If it was created as expected, the `system` database is write available and does not need to be recovered; skip to xref:clustering/disaster-recovery.adoc#recover-servers[Recover servers].
69+
7070
+
7171
. *Restore the `system` database.*
7272
+
7373
[NOTE]
7474
====
75-
Only do the steps below if the `system` database's availability cannot be validated by the first two steps in this section.
76-
====
77-
+
78-
[NOTE]
79-
====
80-
Recall that the cluster remains fault tolerant, and thus able to serve both reads and writes, as long as a majority of the primaries are available.
81-
In case of a disaster affecting one or more server(s), but where the majority of servers are still available, it is possible to add a new server to the cluster and recover the `system` database (and any other affected user databases) on it by copying the `system` database (and affected user databases) from one of the available servers.
82-
This method prevents downtime for the other databases in the cluster.
83-
If this is the case, ie. if a majority of servers are still available, follow the instructions in <<recover-servers>>.
75+
Only do the steps below if the `system` database's write availability cannot be validated by the first two steps in this section.
8476
====
8577
+
78+
8679
The following steps create a new `system` database from a backup of the current `system` database.
87-
This is required since the current `system` database has lost too many members in the server failover.
80+
This is required since the current `system` database has lost too many members to be able to accept writes.
8881

8982
.. Shut down the Neo4j process on all servers.
9083
Note that this causes downtime for all databases in the cluster.
9184
.. On each server, run the following `neo4j-admin` command `bin/neo4j-admin dbms unbind-system-db` to reset the `system` database state on the servers.
92-
See xref:tools/neo4j-admin/index.adoc#neo4j-admin-commands[`neo4j-admin` commands] for more information.
85+
See xref:tools/neo4j-admin/index.adoc#neo4j-admin-commands[neo4j-admin commands] for more information.
9386
.. On each server, run the following `neo4j-admin` command `bin/neo4j-admin database info system` to find out which server is most up-to-date, i.e. has the highest last-committed transaction id.
9487
.. On the most up-to-date server, take a dump of the current `system` database by running `bin/neo4j-admin database dump system --to-path=[path-to-dump]` and store the dump in an accessible location.
95-
See xref:tools/neo4j-admin/index.adoc#neo4j-admin-commands[`neo4j-admin` commands] for more information.
96-
.. Ensure there are enough `system` database primaries to create the new `system` database with fault tolerance.
97-
Either:
98-
... Add completely new servers (see xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster]) or
99-
... Change the `system` database mode (`server.cluster.system_database_mode`) on the current `system` database's secondary servers to allow them to be primaries for the new `system` database.
88+
See xref:tools/neo4j-admin/index.adoc#neo4j-admin-commands[neo4j-admin commands] for more information.
89+
.. Ensure there are enough `system` database primaries to create the new `system` database.
90+
The number of primaries needed is equal to or greater than the value of the `dbms.cluster.minimum_initial_system_primaries_count` setting; see xref:tools/neo4j-admin/index.adoc#neo4j-admin-commands[fix link] for more information.
91+
Use one of the following options:
92+
** Add completely new servers, see xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster].
93+
** Change the `system` database mode (`server.cluster.system_database_mode`) on the current `system` database's secondary servers to allow them to be primaries for the new `system` database.
10094
.. On each server, run `bin/neo4j-admin database load system --from-path=[path-to-dump] --overwrite-destination=true` to load the current `system` database dump.
101-
.. Ensure that `dbms.cluster.discovery.endpoints` are set correctly on all servers, see xref:clustering/setup/discovery.adoc[Cluster server discovery] for more information.
102-
.. Return to step 1.
95+
.. Ensure that the discovery settings are correct on all servers, see xref:clustering/setup/discovery.adoc[Cluster server discovery] for more information.
96+
.. Return to step 1 to start all servers and confirm that the `system` database is now available.
10397

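The writer check in step 2 can be scripted once the rows returned by `SHOW DATABASE system` have been fetched.
The following is a minimal sketch and not part of the Neo4j tooling: it assumes each row is available as a dictionary with a boolean `writer` column, and the helper name `system_db_has_writer` and the sample addresses are hypothetical.

```python
from typing import Iterable, Mapping


def system_db_has_writer(rows: Iterable[Mapping[str, object]]) -> bool:
    """Return True if any row of `SHOW DATABASE system` reports a writer.

    Assumption: each row is a mapping with a boolean `writer` column,
    one row per server hosting the `system` database.
    """
    return any(row.get("writer") is True for row in rows)


# Illustrative rows only; the addresses are made up.
sample_rows = [
    {"address": "server1:7687", "writer": False},
    {"address": "server2:7687", "writer": True},
]

if system_db_has_writer(sample_rows):
    print("system database is write available; continue to Recover servers")
else:
    print("no writer found; the system database needs to be restored")
```

If no row reports a writer, the `system` database needs to be restored as described in step 3.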
10498

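Two of the restore sub-steps are plain comparisons: picking the server with the highest last-committed transaction id, and checking that enough primaries exist for the new `system` database.
The sketch below assumes the transaction ids have been read manually from the `neo4j-admin database info system` output on each server; it does not parse that output, and the function names and server names are hypothetical.

```python
from typing import Mapping


def most_up_to_date(last_committed_tx: Mapping[str, int]) -> str:
    """Return the server name with the highest last-committed transaction id.

    Ties are resolved by picking the alphabetically first server name,
    so the result is deterministic.
    """
    if not last_committed_tx:
        raise ValueError("no servers given")
    return min(last_committed_tx, key=lambda name: (-last_committed_tx[name], name))


def enough_primaries(candidate_primaries: int, minimum_initial_primaries: int) -> bool:
    """Check the candidate count against the configured minimum.

    `minimum_initial_primaries` stands for the value of
    `dbms.cluster.minimum_initial_system_primaries_count`.
    """
    return candidate_primaries >= minimum_initial_primaries


# Illustrative values only, as if copied by hand from each server.
tx_ids = {"server1": 1042, "server2": 1057, "server3": 1057}
print(most_up_to_date(tx_ids))
print(enough_primaries(candidate_primaries=3, minimum_initial_primaries=3))
```

The dump of the `system` database is then taken on the server returned by `most_up_to_date`.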
10599
[[recover-servers]]
