// modules/ROOT/pages/clustering/disaster-recovery.adoc

A database can become unavailable due to issues on different system levels.
For example, a data center failover may lead to the loss of multiple servers, which may cause a set of databases to become unavailable.

This section contains a step-by-step guide on how to recover *unavailable databases* that are incapable of serving writes, while possibly still being able to serve reads.
However, if a database is not performing as expected for other reasons, this section cannot help.
By following the steps outlined here, you can recover the unavailable databases and make them fully operational, with minimal impact on the other databases in the cluster.

[CAUTION]
====
If *all* servers in a Neo4j cluster are lost in a disaster, it is not possible to recover the current cluster.
You have to create a new cluster and restore the databases, see xref:clustering/setup/deploy.adoc[Deploy a basic cluster] and xref:clustering/databases.adoc#cluster-seed[Seed a database] for more information.
====

== Faults in clusters

Databases in clusters follow an allocation strategy.
This means that they are allocated differently within the cluster and may also have different numbers of primaries and secondaries.
The consequence of this is that servers may differ in which databases they are hosting.
Losing a server in a cluster may cause some databases to lose a member while others are unaffected.
Therefore, in a disaster where one or more servers go down, some databases may keep running with little to no impact, while others may lose all their allocated resources.
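
For illustration, the requested number of primaries and secondaries is part of a database's topology, which is set when the database is created or altered.
The following is only a sketch, using a hypothetical `sales` database with three primaries and two secondaries:

[source, shell]
----
CREATE DATABASE sales TOPOLOGY 3 PRIMARIES 2 SECONDARIES;
----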

== Guide structure

[NOTE]
====
In this guide, an _offline_ server is a server that is not running but may be restartable.
A _lost_ server, however, is a server that is currently not running and cannot be restarted.
A _write available_ database is able to serve writes, while a _write unavailable_ database is not.
====

There are three main steps to recovering a cluster from a disaster.
First, ensure the `system` database is write available.
Then, detach any lost servers from the cluster and replace them with new ones.
Finish disaster recovery by managing the databases and verifying that they are write available.

Every step consists of the following three sections:

. A state that needs to be verified, with optional motivation.
. An example of how the state can be verified.
. A proposed series of steps to get to the correct state.

[CAUTION]
====
Verifying each state before continuing to the next step, regardless of the disaster scenario, is recommended to ensure the cluster is fully operational.
====

== Guide to disaster recovery

=== System database write availability

==== State
====
The `system` database is write available.
====

The `system` database contains the view of the cluster.
This includes which servers and databases are present, where they live, and how they are configured.
During a disaster, the view of the cluster might need to change to reflect a new reality, for example by removing lost servers.
Databases might also need to be recreated to regain write availability.
Because both of these steps are executed by writing to the `system` database, this is a vital first step during disaster recovery.

==== Example verification
The `system` database's write availability can be verified by using the xref:clustering/monitoring/status-check.adoc#monitoring-replication[Status check] procedure.
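
A minimal sketch of such a check follows, assuming the procedure accepts a list of database names to check (an empty list checks all databases):

[source, shell]
----
CALL dbms.cluster.statusCheck(["system"]);
----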

==== Path to correct state
The following steps can be used to regain write availability for the `system` database if it has been lost.
They create a new `system` database from the most up-to-date copy of the `system` database that can be found in the cluster.
It is important to get a `system` database that is as up-to-date as possible, so that it corresponds as closely as possible to the cluster's view before the disaster.

.Guide
[%collapsible]

. On each server, run `bin/neo4j-admin database info system` and compare the `lastCommittedTransaction` to find out which server has the most up-to-date copy of the `system` database.
. On the most up-to-date server, run `bin/neo4j-admin database dump system --to-path=[path-to-dump]` to take a dump of the current `system` database and store it in an accessible location.
. For every _lost_ server, add a new *unconstrained* one according to xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster].
It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
+
[NOTE]
=====
While recommended, it is not strictly necessary to add new servers in this step.
There is also an option to change the `system` database mode (`server.cluster.system_database_mode`) on secondary allocations to make them primary allocations for the new `system` database.
The number of primary allocations needed is defined by `dbms.cluster.minimum_initial_system_primaries_count`, see the xref:configuration/configuration-settings.adoc#config_dbms.cluster.minimum_initial_system_primaries_count[Configuration settings] for more information.
Not replacing servers can cause cluster overload when databases are moved from lost servers to available ones in the next step of this guide.
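
For example, assuming a secondary allocation of the `system` database should act as a primary for the new `system` database, the mode could be changed in _neo4j.conf_ on that server; the value shown is an assumption to adapt to your deployment:

[source, shell]
----
server.cluster.system_database_mode=PRIMARY
----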
=====
+
. On each server, run `bin/neo4j-admin database load system --from-path=[path-to-dump] --overwrite-destination=true` to load the current `system` database dump.
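+
Taken together, the `neo4j-admin` commands in these steps look roughly as follows; `/tmp/system-dump` is only a placeholder path:
+
[source, shell]
----
# On each server, compare lastCommittedTransaction.
bin/neo4j-admin database info system
# On the most up-to-date server, dump the system database.
bin/neo4j-admin database dump system --to-path=/tmp/system-dump
# On each server, load the dump, overwriting the local system database.
bin/neo4j-admin database load system --from-path=/tmp/system-dump --overwrite-destination=true
----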

All servers in the cluster's view are available and enabled.
====

A lost server will still be in the `system` database's view of the cluster, but in an unavailable state.
According to the view of the cluster, these lost servers are still hosting the databases they had before they became lost.
Removing lost servers is therefore not as simple as informing the `system` database that they are lost.
It also involves moving the requested allocations on the lost servers onto servers which are actually in the cluster, so that those databases' topologies are still satisfied.

==== Example verification
The cluster's view of servers can be seen by listing the servers, see xref:clustering/servers.adoc#_listing_servers[Listing servers] for more information.

[source, shell]
----
SHOW SERVERS;
----

==== Path to correct state
The following steps can be used to remove lost servers and add new ones to the cluster.
They include moving any potential database allocations from lost servers to available servers in the cluster.
These steps might also recreate some databases, since a database which has lost a majority of its primary allocations cannot be moved from one server to another.

.Guide
[%collapsible]

This prevents new database allocations from being moved to this server.
. For each `CORDONED` server, make sure a new *unconstrained* server has been added to the cluster to take its place, see xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster] for more information.
If servers were added in the 'System database write availability' step of this guide, additional servers might not be needed here.
It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
+
[NOTE]
=====
While recommended, it is not strictly necessary to add new servers in this step.
However, not adding new servers reduces the capacity of the cluster to handle work.
Furthermore, it might require the topology for a database to be altered to make deallocating servers and recreating databases possible.
=====

. For each `CORDONED` server, run `DEALLOCATE DATABASES FROM SERVER cordoned-server-id` on one of the available servers.
This will try to move all database allocations from this server to an available server in the cluster.
Once a server is `DEALLOCATED`, all allocated user databases on this server have been moved successfully.
+
[NOTE]
. If any deallocations failed, make them possible by executing the following steps:
.. Run `SHOW DATABASES`. If a database shows `currentStatus` = `offline`, this database has been stopped.
.. For each stopped database that has at least one allocation on any of the `CORDONED` servers, start it by running `START DATABASE stopped-db WAIT`.
This is necessary since stopped databases cannot be moved from one server to another.
+
[NOTE]
=====
Depending on the environment, consider extending the timeout for this procedure.
If any of the primary allocations for a database report `replicationSuccessful` = `TRUE`, this database is write available.

.. For each database that is not write available, recreate it to move it from lost servers and regain write availability.
Go to xref:clustering/databases.adoc#recreate-databases[Recreate databases] for more information about recreate options.
Remember to make sure there are recent backups for the databases before recreating them, see xref:backup-restore/online-backup.adoc[Online backup] for more information.
+
=====
.. Return to step 3 to retry deallocating all servers.
. For each deallocated server, run `DROP SERVER deallocated-server-id`.
This safely removes the server from the cluster's view.
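+
Taken together, the server-management commands used in these steps look roughly as follows; the server IDs are placeholders, and the commands are run against the `system` database:
+
[source, shell]
----
DEALLOCATE DATABASES FROM SERVER "cordoned-server-id";
DROP SERVER "deallocated-server-id";
----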
====

[[recover-databases]]
=== Database availability

==== State
====
All databases are write available.
====

Once this state is verified, disaster recovery is complete.
However, remember that previously stopped databases might have been started during this process.
If they should remain stopped, run `STOP DATABASE started-db WAIT`.

[NOTE]
====
Remember, recreating a database can take an unbounded amount of time since it may involve copying the store to a new server, as described in xref:clustering/databases.adoc#recreate-databases[Recreate databases].
Therefore, an allocation with `currentStatus` = `STARTING` might reach the `requestedStatus` given some time.
====

==== Example verification
All databases' write availability can be verified by using the xref:clustering/monitoring/status-check.adoc#monitoring-replication[Status check] procedure.
The procedure should be called on all servers in the cluster, in order to provide the correct view.
The status check procedure writes a dummy transaction, and therefore the correctness of the procedure depends on the given timeout.
The default timeout is 1 second, but depending on the network latency in the environment it might need to be extended.
If any of the primary allocations for a database report `replicationSuccessful` = `TRUE`, this database is write available.
Therefore, the desired state has been verified when this is true for all databases.

[source, shell]
----
CALL dbms.cluster.statusCheck([]);
----
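
If the default timeout needs to be extended, the procedure also takes a timeout argument; the following is only a sketch, assuming the timeout is given in milliseconds:

[source, shell]
----
CALL dbms.cluster.statusCheck([], 10000);
----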

A stricter verification is to check that all databases are in their desired state on all servers.
For the stricter check, run `SHOW DATABASES` and verify that `requestedStatus` = `currentStatus` for all database allocations on all servers.
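
As a sketch, this check can also be expressed with the `YIELD` and `WHERE` filtering available on `SHOW` commands, listing only allocations that have not yet reached their requested status:

[source, shell]
----
SHOW DATABASES YIELD name, address, currentStatus, requestedStatus
WHERE currentStatus <> requestedStatus;
----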

==== Path to correct state
The following steps can be used to make all databases in the cluster write available again.
They include recreating any databases that are not write available, as well as identifying any recreations which will not complete.
Recreations might fail for different reasons, but one example is that the checksums do not match for the same transaction on different copies.

.Guide
[%collapsible]
====
. Run `CALL dbms.cluster.statusCheck([])` on all servers to identify write unavailable databases, see xref:clustering/monitoring/status-check.adoc#monitoring-replication[Monitoring replication] for more information.
. Recreate every database that is not write available and has not been recreated previously, see xref:clustering/databases.adoc#recreate-databases[Recreate databases] for more information.
Remember to make sure there are recent backups for the databases before recreating them, see xref:backup-restore/online-backup.adoc[Online backup] for more information.
. Run `SHOW DATABASES` and check any recreated databases which are not write available.
Recreating a database will not complete if one of the following messages is displayed in the message field:
** `Seeders ServerId1 and ServerId2 have different checksums for transaction TransactionId. All seeders must have the same checksum for the same append index.`
** `Seeders ServerId1 and ServerId2 have incompatible storeIds. All seeders must have compatible storeIds.`
** `No store found on any of the seeders ServerId1, ServerId2...`
. For each database which will not complete recreation, recreate it from backup using xref:clustering/databases.adoc#uri-seed[Backup as seed] or define seeding servers in the recreate procedure using xref:clustering/databases.adoc#specified-servers[Specified seeders] so that problematic allocations are excluded.
. Return to step 1 to make sure all databases are in their desired state.
====