docs/scalardb-cluster/remote-replication.mdx
Verify the primary site deployment:

```console
kubectl logs <PRIMARY_POD_NAME> -n <NAMESPACE>
```
Replace `<PRIMARY_POD_NAME>` with your actual Pod name. If there are no errors, you should see a message indicating that LogWriter is properly initialized:
```console
2025-07-03 08:56:10,162 [INFO com.scalar.db.cluster.replication.logwriter.LogWriterSnapshotHook] LogWriter is initialized
```
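If the Pod produces a lot of output, you can check for the initialization message programmatically. The following is a minimal sketch: the `kubectl logs` capture step is shown as a comment, and a sample log line stands in for the captured file so the check itself is runnable.

```shell
# Sketch: verify that the LogWriter startup message appears in the Pod logs.
# In a real cluster, capture the logs first (placeholders as in the step above):
#   kubectl logs <PRIMARY_POD_NAME> -n <NAMESPACE> > primary.log
# Here, a sample log line stands in for primary.log so the check is runnable.
printf '%s\n' '2025-07-03 08:56:10,162 [INFO com.scalar.db.cluster.replication.logwriter.LogWriterSnapshotHook] LogWriter is initialized' > primary.log

if grep -q "LogWriter is initialized" primary.log; then
  echo "LogWriter initialized successfully"
fi
```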
In this step, you'll monitor the replication status by using Replication CLI and Prometheus metrics.

#### Replication CLI

Replication CLI can get the status of LogApplier, including the number of partitions in the replication database that still contain unapplied write operations. This information is important because, if the number of partitions is zero, all write operations have been successfully replicated and applied to the backup site database. In that case, you can use the synchronized backup site database as a new primary site database.
Create a Kubernetes Pod to run Replication CLI for the backup site:
Replace `<BACKUP_CLUSTER_CONTACT_POINTS>` with your backup site cluster contact points (in the same format as [ScalarDB Cluster client configurations](scalardb-cluster-configurations.mdx#client-configurations)) and `<VERSION>` with the ScalarDB Cluster version that you're using.
Ensure no new writes are being made to the primary site database to get an accurate synchronization point. Then, apply and run Replication CLI, and check the output:
If there are no errors, you should see a JSON output that includes the number of partitions containing the remaining unapplied write operations in the replication database:
```json
{"remainingTransactionGroupPartitions":0}
```
If `remainingTransactionGroupPartitions` is greater than 0, unapplied write operations still remain, and you need to wait until it becomes 0 before using the backup site database as a new primary site database.
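The check above can also be scripted. The following is a minimal sketch that extracts the counter from the CLI's JSON output; the sample `status` value stands in for the actual CLI output, and the parsing assumes the single-field JSON shown above.

```shell
# Sketch: decide whether the backup site is fully synchronized based on the
# Replication CLI output. The sample JSON stands in for the real CLI output.
status='{"remainingTransactionGroupPartitions":0}'

# Extract the counter value from the single-field JSON output.
remaining=$(printf '%s' "$status" | sed -E 's/.*"remainingTransactionGroupPartitions":([0-9]+).*/\1/')

if [ "$remaining" -eq 0 ]; then
  echo "all write operations have been applied; the backup site can be promoted"
else
  echo "$remaining partition(s) still have unapplied writes; keep waiting"
fi
```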
#### Prometheus metrics

You can monitor LogApplier by using metrics. ScalarDB Cluster exposes many metrics in the Prometheus format, including LogApplier metrics, which can be monitored by using any tool that supports that format. For example, one option is [Prometheus Operator (kube-prometheus-stack)](helm-charts/getting-started-monitoring.mdx).
While LogApplier provides many metrics, the following metric is the most important for monitoring overall replication health:
- **scalardb_cluster_stats_transaction_group_repo_oldest_record_age_millis:** The age (in milliseconds) of the oldest transaction data in the replication database scanned by LogApplier. If this metric increases continuously, it indicates one of the following issues, which require immediate investigation:
- LogApplier is failing to process stored write operations (for example, the backup site database is down).
- LogApplier cannot keep up with the primary site's throughput.
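As a rough illustration of how this metric might be consumed, the sketch below checks a sample scrape line against an alert threshold. The sample value and the 60-second threshold are illustrative assumptions, not recommendations; in practice you would scrape the value from your metrics endpoint or alert on it in Prometheus.

```shell
# Sketch: compare the oldest-record-age metric against an alert threshold.
# The sample line stands in for a real scrape from the metrics endpoint.
sample='scalardb_cluster_stats_transaction_group_repo_oldest_record_age_millis 1200'

# The metric value is the second whitespace-separated field.
age=$(printf '%s\n' "$sample" | awk '{print $2}')
threshold=60000   # illustrative: alert if the oldest record is older than 60 s

if [ "$age" -gt "$threshold" ]; then
  echo "ALERT: LogApplier may be falling behind (${age} ms)"
else
  echo "replication lag looks healthy (${age} ms)"
fi
```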
## Additional details
Remote replication is currently in Private Preview. This feature and its documentation are subject to change. For more details, please [contact us](https://www.scalar-labs.com/contact) or wait for this feature to reach Public Preview or GA.