Commit 2119158

Add more images
1 parent 6598d6c commit 2119158

8 files changed

+714
-64
lines changed

modules/ROOT/images/disaster.svg

Lines changed: 52 additions & 50 deletions

modules/ROOT/images/fully-recovered-cluster.svg

Lines changed: 97 additions & 0 deletions

modules/ROOT/images/healthy-cluster.svg

Lines changed: 2 additions & 0 deletions

modules/ROOT/images/servers-cordoned-databases-moved.svg

Lines changed: 156 additions & 0 deletions

modules/ROOT/images/servers-cordoned.svg

Lines changed: 135 additions & 0 deletions

modules/ROOT/images/servers-deallocated.svg

Lines changed: 135 additions & 0 deletions

modules/ROOT/images/system-db-restored.svg

Lines changed: 117 additions & 0 deletions

modules/ROOT/pages/clustering/multi-region-deployment/disaster-recovery.adoc

Lines changed: 20 additions & 14 deletions
@@ -152,9 +152,6 @@ Use the following steps to regain write availability for the `system` database i
152152
They create a new `system` database from the most up-to-date copy of the `system` database that can be found in the cluster.
153153
It is important to get a `system` database that is as up-to-date as possible, so that it closely corresponds to the cluster's view before the disaster.
154154

155-
.Guide
156-
[%collapsible]
157-
====
158155

159156
[NOTE]
160157
=====
@@ -180,10 +177,12 @@ Be aware that not replacing servers can cause cluster overload when databases ar
180177
=====
181178
+
182179
. On each server, run `bin/neo4j-admin database load system --from-path=[path-to-dump] --overwrite-destination=true` to load the current `system` database dump.
180+
+
181+
image::system-db-restored.svg[width="400", title="The `system` database is restored and unconstrained servers are added", role=popup]
182+
+
183183
. On each server, ensure that the discovery settings are correct.
184184
See xref:clustering/setup/discovery.adoc[Cluster server discovery] for more information.
185185
. Start the Neo4j process on all servers.
186-
====
187186

188187

189188
[[make-servers-available]]
@@ -217,16 +216,17 @@ This is done in two different steps:
217216
* Any allocations that cannot move by themselves require the database to be recreated so that they are forced to move.
218217
* Any allocations that can move will be instructed to do so by deallocating the server.
219218

220-
.Guide
221-
[%collapsible]
222-
====
219+
223220
. For each `Unavailable` server, run `CALL dbms.cluster.cordonServer("unavailable-server-id")` on one of the available servers.
224221
This prevents new database allocations from being moved to this server.
222+
+
223+
image::servers-cordoned.svg[width="400", title="Cordon unavailable servers", role=popup]
224+
225225
. For each `Cordoned` server, make sure a new *unconstrained* server has been added to the cluster to take its place.
226226
See xref:clustering/servers.adoc#cluster-add-server[Add a server to the cluster] for more information.
227227
+
228-
If servers were added in the <<make-the-system-database-write-available, Make the `system` database write-available>> step of this guide, additional servers might not be needed here.
229-
It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
228+
If servers were added in the <<make-the-system-database-write-available, Make the `system` database write-available>> step of this guide (as is done in the current disaster recovery example), additional servers might not be needed here.
229+
It is important that the new servers are unconstrained, or deallocating servers might be blocked even though enough servers were added.
230230
+
231231
[NOTE]
232232
=====
@@ -266,10 +266,14 @@ If any database has `currentStatus` = `quarantined` on an available server, recr
266266
=====
267267
If you recreate databases using xref:database-administration/standard-databases/recreate-database.adoc#undefined-servers[undefined servers] or xref:database-administration/standard-databases/recreate-database.adoc#undefined-servers-backup[undefined servers with fallback backup], the store might not be recreated as up-to-date as possible in certain edge cases where the `system` database has been restored.
268268
=====
269+
+
270+
image::servers-cordoned-databases-moved.svg[width="400", title="Recreate databases", role=popup]
269271

270272
. For each `Cordoned` server, run `DEALLOCATE DATABASES FROM SERVER cordoned-server-id` on one of the available servers.
271273
This will move all database allocations from this server to an available server in the cluster.
272274
+
275+
image::servers-deallocated.svg[width="400", title="Deallocate databases from unavailable servers", role=popup]
276+
+
273277
[NOTE]
274278
=====
275279
This operation might fail if not enough unconstrained servers were added to the cluster to replace lost servers.
@@ -278,7 +282,7 @@ Another reason is that some available servers are also `Cordoned`.
278282

279283
. For each deallocating or deallocated server, run `DROP SERVER deallocated-server-id`.
280284
This removes the server from the cluster's view.
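
The per-server statements in the steps above can be sketched as a small Python helper. This is an illustrative sketch only, not part of the commit or of Neo4j: the function name `recovery_statements` and the sample server IDs are hypothetical, and passing the server ID as a quoted string literal to `DEALLOCATE DATABASES FROM SERVER` and `DROP SERVER` is an assumption. The helper only builds the statements in step order; adding replacement servers and recreating databases still have to happen between cordoning and deallocating.

```python
from typing import List


def recovery_statements(unavailable_server_ids: List[str]) -> List[str]:
    """Build the cordon, deallocate, and drop statements for lost servers.

    Statements are grouped by step: all cordons, then all deallocations,
    then all drops.
    """
    for server_id in unavailable_server_ids:
        # Reject IDs that would break out of the quoted string literal.
        if '"' in server_id or "\\" in server_id:
            raise ValueError(f"unexpected character in server ID: {server_id!r}")

    cordon = [f'CALL dbms.cluster.cordonServer("{sid}")' for sid in unavailable_server_ids]
    # Assumption: the server ID is passed as a quoted string literal.
    deallocate = [f'DEALLOCATE DATABASES FROM SERVER "{sid}"' for sid in unavailable_server_ids]
    drop = [f'DROP SERVER "{sid}"' for sid in unavailable_server_ids]
    return cordon + deallocate + drop


# Hypothetical server IDs, for illustration only.
statements = recovery_statements(["server-a", "server-b"])
for statement in statements:
    print(statement)
```

Grouping by step rather than by server mirrors the order of the guide: every unavailable server is cordoned before any deallocation is attempted.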
281-
====
285+
282286

283287

284288
[[make-databases-write-available]]
@@ -318,13 +322,12 @@ A stricter verification can be done to verify that all databases are in their de
318322
For the stricter check, run `SHOW DATABASES` and verify that `requestedStatus` = `currentStatus` for all database allocations on all servers.
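
The stricter check can be sketched as a filter over the rows returned by `SHOW DATABASES`. This is a minimal sketch under stated assumptions: the rows are assumed to have already been fetched and converted to dictionaries keyed by column name, and the function name `find_status_mismatches` and the sample rows are hypothetical.

```python
from typing import Dict, List


def find_status_mismatches(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the allocations whose currentStatus differs from requestedStatus."""
    return [
        row for row in rows
        if row.get("requestedStatus") != row.get("currentStatus")
    ]


# Hypothetical rows shaped like SHOW DATABASES output, for illustration only.
sample_rows = [
    {"name": "system", "address": "server1:7687",
     "requestedStatus": "online", "currentStatus": "online"},
    {"name": "foo", "address": "server2:7687",
     "requestedStatus": "online", "currentStatus": "quarantined"},
]

mismatches = find_status_mismatches(sample_rows)
for row in mismatches:
    print(f'{row["name"]} on {row["address"]}: '
          f'requested={row["requestedStatus"]}, current={row["currentStatus"]}')
```

An empty result means every allocation is in its requested state; any returned row identifies a database and server that still need attention.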
319323

320324
==== Path to correct state
325+
321326
Use the following steps to make all databases in the cluster write-available again.
322327
They include recreating any databases that are not write-available and identifying any recreations that will not complete.
323328
Recreations might fail for different reasons, but one example is that the checksums do not match for the same transaction on different servers.
324329

325-
.Guide
326-
[%collapsible]
327-
====
330+
328331
. Identify all write-unavailable databases by running `CALL dbms.cluster.statusCheck([])` as described in the <<#example-verification, Example verification>> part of this disaster recovery step.
329332
Filter out all databases that are meant to be stopped, so that they are not recreated unnecessarily.
330333
. Recreate every database that is not write-available and has not been recreated previously.
@@ -345,4 +348,7 @@ Recreating a database will not complete if one of the following messages is disp
345348
** `No store found on any of the seeders ServerId1, ServerId2...`
346349
. For each database that will not complete recreation, recreate it from backup using xref:database-administration/standard-databases/recreate-database.adoc#uri-seed[Backup as seed].
347350

348-
====
351+
image::fully-recovered-cluster.svg[width="400", title="Fully recovered cluster", role="popup"]
352+
353+
354+
