
Commit 0169e23

AUTO: Sync ScalarDB docs in English to docs site repo

1 parent: 55c42ee

File tree: 1 file changed (+72, −1)

docs/scalardb-cluster/remote-replication.mdx

Lines changed: 72 additions & 1 deletion
@@ -457,7 +457,12 @@ Verify the primary site deployment:
 kubectl logs <PRIMARY_POD_NAME> -n <NAMESPACE>
 ```
 
-Replace `<PRIMARY_POD_NAME>` with your actual Pod name. Ensure there are no errors.
+Replace `<PRIMARY_POD_NAME>` with your actual Pod name. If there are no errors, you should see a message indicating that LogWriter is properly initialized:
+
+```console
+2025-07-03 08:56:10,162 [INFO com.scalar.db.cluster.replication.logwriter.LogWriterSnapshotHook] LogWriter is initialized
+```
+
 
 #### 2.3 Create primary site tables
 

@@ -872,6 +877,72 @@ kubectl delete -f sql-cli-primary.yaml -n <NAMESPACE>

```bash
kubectl delete -f sql-cli-backup.yaml -n <NAMESPACE>
```

### Step 5: Monitor the replication state

In this step, you'll monitor the replication status by using Replication CLI and Prometheus metrics.

#### Replication CLI

Replication CLI can get the status of LogApplier, including the number of partitions in the replication database that still contain unapplied write operations. This information is important because, if there are zero such partitions, all write operations have been successfully replicated and applied to the backup site database. In this case, you can use the synchronized backup site database as a new primary site database.

Create a Kubernetes Pod to run Replication CLI for the backup site:

```yaml
# repl-cli-backup.yaml
apiVersion: v1
kind: Pod
metadata:
  name: repl-cli-backup
spec:
  restartPolicy: Never
  containers:
    - name: repl-cli-backup
      image: ghcr.io/scalar-labs/scalardb-cluster-replication-cli:<VERSION>
      args:
        - "--contact-points"
        - "<BACKUP_CLUSTER_CONTACT_POINTS>"
        - "status"
```

Replace `<BACKUP_CLUSTER_CONTACT_POINTS>` with your backup site cluster contact points (in the same format as [ScalarDB Cluster client configurations](scalardb-cluster-configurations.mdx#client-configurations)) and `<VERSION>` with the ScalarDB Cluster version that you're using.
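
As a hedged illustration only (not from the original docs): assuming the `indirect` client mode described on the linked configurations page, and a backup-site Service named `scalardb-cluster-envoy` (a hypothetical name), the `args` entries might look like this:

```yaml
# Hypothetical values; confirm the exact contact-point format for your environment.
args:
  - "--contact-points"
  - "indirect:scalardb-cluster-envoy.<NAMESPACE>.svc.cluster.local"
  - "status"
```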

Ensure that no new writes are being made to the primary site database so that you get an accurate synchronization point. Then, apply the Pod to run Replication CLI and check the output:

```bash
# Apply the Pod
kubectl apply -f repl-cli-backup.yaml -n <NAMESPACE>

# Check the status
kubectl get pod repl-cli-backup -n <NAMESPACE>

# Check the output from the Pod
kubectl logs repl-cli-backup -n <NAMESPACE>
```

If there are no errors, you should see JSON output that includes the number of partitions containing remaining unapplied write operations in the replication database:

```json
{"remainingTransactionGroupPartitions":0}
```

If `remainingTransactionGroupPartitions` is greater than 0, unapplied write operations still remain, and you need to wait until the value reaches 0 before using the backup site database as a new primary site database.
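
Because the Pod above uses `restartPolicy: Never`, the `status` command runs once and exits. As a minimal sketch (not from the original docs), you could re-run the Pod in a loop and parse its output until the count reaches 0. The loop below assumes that `jq` is installed locally and that the JSON status is the last line of the Pod's output:

```bash
# Hypothetical polling loop; re-runs Replication CLI until no partitions remain.
while true; do
  kubectl delete -f repl-cli-backup.yaml -n <NAMESPACE> --ignore-not-found
  kubectl apply -f repl-cli-backup.yaml -n <NAMESPACE>
  kubectl wait --for=jsonpath='{.status.phase}'=Succeeded \
    pod/repl-cli-backup -n <NAMESPACE> --timeout=120s
  # Assumes the JSON status is the last line of the logs.
  remaining=$(kubectl logs repl-cli-backup -n <NAMESPACE> | tail -n 1 \
    | jq -r '.remainingTransactionGroupPartitions')
  echo "Remaining partitions: ${remaining}"
  [ "${remaining}" = "0" ] && break
  sleep 10
done
```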

Clean up the Replication CLI Pod when done:

```bash
kubectl delete -f repl-cli-backup.yaml -n <NAMESPACE>
```

#### Prometheus metrics

You can monitor LogApplier by using metrics. ScalarDB Cluster exposes many metrics in the Prometheus format, including LogApplier metrics, which you can monitor by using any tool that supports that format. For example, one option is [Prometheus Operator (kube-prometheus-stack)](helm-charts/getting-started-monitoring.mdx).

While LogApplier provides many metrics, the following metric is the most important for monitoring overall replication health (an example alerting rule follows the list):

- **scalardb_cluster_stats_transaction_group_repo_oldest_record_age_millis:** The age (milliseconds) of the oldest transaction data in the replication database scanned by LogApplier. If this metric increases continuously, it indicates one of the following issues, which requires immediate investigation:
  - LogApplier is failing to process stored write operations (for example, the backup site database is down).
  - LogApplier cannot keep up with the primary site's throughput.
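
As one hedged example of how you might act on this metric (not from the original docs): if you scrape ScalarDB Cluster with kube-prometheus-stack, a PrometheusRule like the sketch below could alert when the oldest unapplied record keeps aging. The alert name, 5-minute threshold, and labels are arbitrary examples:

```yaml
# Hypothetical alerting rule; tune the threshold and labels for your environment.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: scalardb-replication-lag
spec:
  groups:
    - name: scalardb-replication
      rules:
        - alert: ScalarDbReplicationOldestRecordTooOld
          # Fires if the oldest unapplied record is older than 5 minutes
          # (300,000 ms) for 10 consecutive minutes.
          expr: scalardb_cluster_stats_transaction_group_repo_oldest_record_age_millis > 300000
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "LogApplier may be failing or falling behind the primary site."
```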

## Additional details

Remote replication is currently in Private Preview. This feature and documentation are subject to change. For more details, please [contact us](https://www.scalar-labs.com/contact) or wait for this feature to reach public preview or GA.
