Commit 4b7e44f

DOC-5567 RS: Added lag-aware checks to DB availability doc
1 parent 7b1751d commit 4b7e44f

2 files changed: +51 -1 lines changed

content/operate/rs/monitoring/db-availability.md

Lines changed: 50 additions & 0 deletions
@@ -45,6 +45,56 @@ Returns HTTP status code 200 OK if all primary (master) shards are reachable fro
If the local database endpoint is unavailable, returns an error status code and a JSON object that contains [`error_code` and `description` fields]({{<relref "/operate/rs/references/rest-api/requests/bdbs/availability#get-endpoint-error-codes">}}).

## Use lag-aware availability checks for disaster recovery {#lag-aware}

The database availability API supports lag-aware availability checks that consider replication lag tolerance. You can reduce the risk of data inconsistencies during disaster recovery by incorporating lag-aware availability checks into your disaster recovery solution and ensuring failover-failback flows only occur when databases are accessible and sufficiently synchronized.
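
As a rough illustration of gating failover on a lag-aware check, the following sketch maps two HTTP status codes to a decision. The function name `decide_failover` and the three outcomes are hypothetical, and the sketch assumes that only a `200` response means the checked database is available and, for the lag-aware check, within the lag tolerance.

```sh
#!/bin/sh
# Hypothetical failover gate: decide what to do from two HTTP status codes.
#   $1: status of the active database's availability check
#   $2: status of the standby database's lag-aware availability check
# Assumption: 200 means available (and, for the lag-aware check, within
# the lag tolerance); any other code is treated as not safe.
decide_failover() {
    active_status=$1
    standby_status=$2

    if [ "$active_status" = "200" ]; then
        echo "stay"        # active database is healthy, nothing to do
    elif [ "$standby_status" = "200" ]; then
        echo "failover"    # standby is reachable and sufficiently in sync
    else
        echo "hold"        # standby is unreachable or lagging; do not switch
    fi
}

decide_failover 200 200   # prints "stay"
decide_failover 503 200   # prints "failover"
decide_failover 503 503   # prints "hold"
```

The `503` values are only sample non-200 inputs. A real disaster recovery flow would likely add retries and alerting before acting on a single check.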

### Adjust availability lag tolerance threshold

The lag tolerance threshold is 100 milliseconds by default. Depending on factors such as workload, network conditions, and throughput, you might want to adjust the lag tolerance threshold using one of the following methods:

- Change the default threshold for the entire cluster by setting `availability_lag_tolerance_ms` with an [update cluster]({{<relref "/operate/rs/references/rest-api/requests/cluster#put-cluster">}}) request.

```sh
PUT /v1/cluster
{ "availability_lag_tolerance_ms": 100 }
```

- Override the default threshold by adding the `availability_lag_tolerance_ms` query parameter to specific lag-aware [availability checks]({{<relref "/operate/rs/references/rest-api/requests/bdbs/availability">}}).

```sh
GET /v1/bdbs/<database_id>/availability?extend_check=lag&availability_lag_tolerance_ms=100
```
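
The cluster-wide update could be scripted with `curl` along these lines. This is a minimal sketch, not a reference implementation: the helper names, the `CLUSTER_HOST`, `REST_USER`, and `REST_PASSWORD` environment variables, port `9443`, and the `-k` flag are all assumptions to adapt to your deployment, and the input validation is an illustrative guess at sensible values rather than a documented constraint.

```sh
#!/bin/sh
# Build the JSON body for PUT /v1/cluster.
# Assumption: only non-negative integers without leading zeros are
# accepted here, so the output is always a valid JSON number.
lag_tolerance_payload() {
    case $1 in
        ''|*[!0-9]*|0?*)
            echo "lag tolerance must be a non-negative integer (ms)" >&2
            return 1
            ;;
    esac
    printf '{ "availability_lag_tolerance_ms": %s }\n' "$1"
}

# Send the update request. CLUSTER_HOST, REST_USER, and REST_PASSWORD are
# assumed environment variables; port 9443 and -k (skip TLS verification)
# are assumptions, not values taken from this page.
set_cluster_lag_tolerance() {
    body=$(lag_tolerance_payload "$1") || return 1
    curl -s -k -u "$REST_USER:$REST_PASSWORD" \
        -X PUT -H "Content-Type: application/json" \
        -d "$body" "https://$CLUSTER_HOST:9443/v1/cluster"
}
```

For example, `set_cluster_lag_tolerance 250` would send `{ "availability_lag_tolerance_ms": 250 }` to the cluster endpoint.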

### Lag-aware database availability checks

To perform a lag-aware database availability check using the cluster's default lag tolerance threshold:

```sh
GET /v1/bdbs/<database_id>/availability?extend_check=lag
```

To perform a lag-aware database availability check and override the cluster's default lag tolerance threshold:

```sh
GET /v1/bdbs/<database_id>/availability?extend_check=lag&availability_lag_tolerance_ms=100
```
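
Both request forms differ only in the optional query parameter, so a script might build the path with a small helper. The function name `availability_path` is hypothetical; the path and parameter names come from the requests above.

```sh
#!/bin/sh
# Print the request path for a lag-aware database availability check.
#   $1: database ID
#   $2: optional lag tolerance override in milliseconds
availability_path() {
    path="/v1/bdbs/$1/availability?extend_check=lag"
    if [ -n "${2:-}" ]; then
        path="$path&availability_lag_tolerance_ms=$2"
    fi
    printf '%s\n' "$path"
}

availability_path 1        # /v1/bdbs/1/availability?extend_check=lag
availability_path 1 250    # /v1/bdbs/1/availability?extend_check=lag&availability_lag_tolerance_ms=250
```

When you pass the resulting URL to a command-line client, quote it so the shell does not interpret `&` as a background operator.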

### Lag-aware endpoint availability checks

To perform a lag-aware database endpoint availability check using the cluster's default lag tolerance threshold:

```sh
GET /v1/local/bdbs/<database_id>/endpoint/availability?extend_check=lag
```

To perform a lag-aware database endpoint availability check and override the cluster's default lag tolerance threshold:

```sh
GET /v1/local/bdbs/<database_id>/endpoint/availability?extend_check=lag&availability_lag_tolerance_ms=100
```
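
A monitoring script or load balancer hook might wrap the endpoint check so that the result is available as an exit status. The following is a sketch under stated assumptions: the function name `endpoint_lag_check`, the `REST_USER` and `REST_PASSWORD` environment variables, port `9443`, and the `-k` flag are not taken from this page, and the sketch treats any response other than `200` as a failed check.

```sh
#!/bin/sh
# Run a lag-aware endpoint availability check against one node and report
# the result through the exit status (0 = HTTP 200, 1 = anything else).
#   $1: node host name or IP
#   $2: database ID
#   $3: optional lag tolerance override in milliseconds
# Assumptions: REST API on port 9443, basic auth from REST_USER and
# REST_PASSWORD, and -k to accept a self-signed certificate.
endpoint_lag_check() {
    url="https://$1:9443/v1/local/bdbs/$2/endpoint/availability?extend_check=lag"
    if [ -n "${3:-}" ]; then
        url="$url&availability_lag_tolerance_ms=$3"
    fi

    # -o /dev/null discards the body; -w prints only the status code.
    status=$(curl -s -k -o /dev/null -w '%{http_code}' \
        -u "${REST_USER:-}:${REST_PASSWORD:-}" "$url")

    [ "$status" = "200" ]
}
```

A caller could then write `if endpoint_lag_check node1.example.com 1 250; then ...; fi`, where the host name and database ID are placeholders.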

## Availability by database status

The following table shows the relationship between a database's status and availability. For more details about the database status values, see [BDB status field]({{<relref "/operate/rs/references/rest-api/objects/bdb/status">}}).

content/operate/rs/references/rest-api/objects/cluster/_index.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ An API object that represents the cluster.
 | Name | Type/Value | Description |
 |------|------------|-------------|
 | alert_settings | [alert_settings]({{< relref "/operate/rs/references/rest-api/objects/cluster/alert_settings" >}}) object | Cluster and node alert settings |
-| <span class="break-all">availability_lag_tolerance_ms</span> | integer (default: 100) | The maximum replication lag in milliseconds tolerated between source and replicas during lag-aware [database availability checks]({{<relref "/operate/rs/monitoring/db-availability">}}). |
+| <span class="break-all">availability_lag_tolerance_ms</span> | integer (default: 100) | The maximum replication lag in milliseconds tolerated between source and replicas during [lag-aware database availability checks]({{<relref "/operate/rs/monitoring/db-availability#lag-aware">}}). |
 | bigstore_driver | 'speedb'<br />'rocksdb' | Storage engine for [Auto Tiering]({{<relref "/operate/rs/databases/auto-tiering">}}) |
 | <span class="break-all">cluster_ssh_public_key</span> | string | Cluster's autogenerated SSH public key |
 | cm_port | integer, (range: 1024-65535) | UI HTTPS listening port |
