Commit 107a686

Remove the note and redesign the structure of the Monitor replication status page (#1970) (#1972)
1 parent 0371b2f commit 107a686

1 file changed

+25
-28
lines changed

modules/ROOT/pages/clustering/monitoring/status-check.adoc

Lines changed: 25 additions & 28 deletions
@@ -4,42 +4,44 @@
 [[monitoring-replication]]
 = Monitor replication status

-Neo4j 5.24 introduces the xref:reference/procedures.adoc#procedure_dbms_cluster_statusCheck[`dbms.cluster.statusCheck()`] procedure, which can be used to monitor the ability to replicate in clustered databases.
-In most cases this means a clustered database is write available.
+Neo4j 5.24 introduces the xref:reference/procedures.adoc#procedure_dbms_cluster_statusCheck[`dbms.cluster.statusCheck()`] procedure to monitor the ability to replicate in clustered databases.
+
 The procedure identifies which members of a clustered database are up-to-date and can participate in successful replication.
 Therefore, it is useful in determining the fault tolerance of a clustered database.
 Additionally, you can use the procedure to identify the leader of a clustered database within the cluster.

 [NOTE]
 ====
-The member on which the procedure is called replicates a dummy transaction in the same cluster as the real transactions, and verifies that it can be replicated and applied.
-
-Since the status check doesn't replicate an actual transaction, it's not guaranteed that the database is write available even though the status check reports that it can replicate.
-Apart from replication there are other stops in the write path that can potentially block a transaction from being applied, e.g. issues in the database.
-However, it tells that the cluster is healthy and in most cases that means that the database is write available.
+The procedure replicates a dummy transaction within the cluster and verifies that it can be replicated and applied.
+Since the status check does not replicate an actual transaction, it does not guarantee write availability, as other factors in the write path (e.g., database issues) may block transactions.
+However, a healthy status typically indicates write availability in most cases.
 ====

 [[cluster-status-check]]
 == Cluster status check

-*Syntax:*
+[procedure-status-check-syntax]
+=== Syntax
+
 [source, shell]
 ----
 CALL dbms.cluster.statusCheck(databases :: LIST<STRING>, timeoutMilliseconds = null :: INTEGER)
 ----

-*Arguments:*
+[status-check-input-arguments]
+=== Input arguments

 [options="header", cols="m,a,a"]
 |===
 | Name | Type | Description
 | databases | List<String> | Databases for which the status check should run.
-Providing an empty list runs the status check for all *clustered* databases on that server, i.e. it won't run on singles or secondaries.
+Providing an empty list runs the status check for all *clustered* databases on that server, i.e. it does not run on singles or secondaries.
 | timeoutMilliseconds | Integer | How long to allow for replication, before returning it was unsuccessful.
 Default value is 1000 milliseconds.
 |===

-*Returns:*
+[status-check-return-arguments]
+=== Return arguments

 The procedure returns a row for all primary members of all the requested databases where each row consists of:

@@ -60,42 +62,37 @@ If the members report different leaders, the one with the highest term should be
 An example of an error is that one or more of the requested databases do not exist on the requester.
 |===

-=== Possible values of `replicationSuccessful`
+[replication-successful-values]
+==== Possible values of `replicationSuccessful`
+
 * `TRUE` -- if this server managed to replicate the dummy transaction to a majority of cluster members within the given timeout.
 * `FALSE` -- if it failed to replicate within the timeout.
 The value is the same column-wise.
 A failed replication can either indicate a real issue in the cluster (e.g., no leader) or that this server is too far behind in applying updates and can't replicate.

-=== Possible values of `memberStatus`
+[member-status-values]
+==== Possible values of `memberStatus`
+
 * `APPLYING` means that the member can replicate and is actively applying transactions.
-* `REPLICATING` means that the member can participate in replicating, but can't apply.
+* `REPLICATING` means that the member can participate in replicating but cannot apply.
 This state is uncommon, but may happen while waiting for the database to start and accept transactions.
 * `UNAVAILABLE` means that the member is either too far behind the leader or unreachable.
+They are unhealthy and cannot add to the fault-tolerance.
+
+[requester-values]
+==== Possible values of `requester`

-=== Possible values of `requester`
 * `TRUE` -- for the server on which the procedure is run.
 * `FALSE` -- on the remaining servers.

 In general, you can use the `replicationSuccessful` field to determine overall write-availability, whereas the `memberStatus` field can be checked in order to see whether the database is fault-tolerant or not.

-[NOTE]
-====
-Members that are `REPLICATING` are good from a data safety point of view.
-They can participate in replication and keep the data durably until application.
-They are also up-to-date and therefore eligible leaders.
-So they add to the fault-tolerance.
-
-Members that are `APPLYING` have all the qualities of `REPLICATING` members, so they too add to the fault-tolerance.
-But they are also applying to the database, which is a requirement for writing transactions and reading with bookmarks in a timely manner.
-
-Lastly, `UNAVAILABLE` members are either too far behind or unreachable.
-They are unhealthy and cannot add to the fault-tolerance.
-====

 [[status-check-example]]
 == Example

 === Running the status check
+
 When running the cluster status check against a server, expect similar output to the following:

 [source,queryresults,role=noplay]
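
The diff view is cut off at this point. As a reading aid for the guidance in the changed page (use `replicationSuccessful` for write availability and `memberStatus` for fault tolerance), the following is a minimal Python sketch of how a client might summarize the rows returned for a single database. It is not part of the commit or of any Neo4j API: the helper name `summarize_status_check`, the dictionary-shaped rows, and the "healthy members minus a strict majority" formula for tolerable failures are all assumptions made for illustration. Only the field names `replicationSuccessful`, `memberStatus`, and `requester` and their documented values come from the page.

```python
from typing import Dict, List

# Statuses the page describes as able to take part in replication.
HEALTHY_STATUSES = {"APPLYING", "REPLICATING"}


def summarize_status_check(rows: List[Dict]) -> Dict:
    """Summarize status-check rows for ONE database.

    Hypothetical helper: each row is assumed to be a dict holding the
    documented `replicationSuccessful` and `memberStatus` fields.
    """
    total = len(rows)
    healthy = sum(1 for row in rows if row["memberStatus"] in HEALTHY_STATUSES)
    # Assumption: replication needs a strict majority of the primaries.
    majority = total // 2 + 1
    return {
        # The page says this value is the same in every row of a call.
        "replication_ok": total > 0 and all(r["replicationSuccessful"] for r in rows),
        "healthy_members": healthy,
        # Assumption: members that can be lost while a majority stays healthy.
        "tolerable_failures": max(healthy - majority, 0),
    }


# Made-up rows for a three-primary database with one member unavailable.
example_rows = [
    {"memberStatus": "APPLYING", "replicationSuccessful": True, "requester": True},
    {"memberStatus": "APPLYING", "replicationSuccessful": True, "requester": False},
    {"memberStatus": "UNAVAILABLE", "replicationSuccessful": True, "requester": False},
]
summary = summarize_status_check(example_rows)
print(summary)
```

In this made-up case replication still succeeds, but under the stated majority assumption the database could not lose another member, which is the distinction between write availability and fault tolerance that the page draws.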
