Commit 201ce81

server compat and setup cluster pages (#913)
* server compat and setup cluster pages
* updated to reflect changes and style guide compliance
* style guide compliance in older docs page
* Update modules/deploy/pages/setting-up-dr-cluster.adoc
  Co-authored-by: Tor Colvin <tor.colvin@couchbase.com>
* Update modules/server-compatibility/pages/server-compatibility-xdcr.adoc
  Co-authored-by: Tor Colvin <tor.colvin@couchbase.com>
* Move SGW 4.0+ requirement to top of active-active DR section

---------

Co-authored-by: Tor Colvin <tor.colvin@couchbase.com>
1 parent b36a68a commit 201ce81

3 files changed, +106 -36 lines changed

modules/deploy/pages/setting-up-dr-cluster.adoc

Lines changed: 59 additions & 24 deletions
@@ -19,6 +19,7 @@ include::ROOT:partial$_set_page_context.adoc[]
 :image-sgw-xdcr-dr-same-regn-in-recovery: sgw-xdcr-dr-same-regn-in-recovery.png
 :image-sgw-xdcr-dr-diff-regn-setup: sgw-xdcr-dr-diff-regn-setup.png
 :image-sgw-xdcr-dr-diff-regn-in-recovery: sgw-xdcr-dr-diff-regn-in-recovery.png
+:image-xdcr-active-active-replication: xdcr-active-active-replication.png
 
 // end::page-attributes[]

@@ -35,38 +36,74 @@ include::ROOT:partial$_show_page_header_block.adoc[]
 == Introduction
 
 {server-xdcr--xref} (XDCR) replicates data between two or more autonomous Couchbase Server clusters.
-It serves an important role in supporting Disaster Recovery (DR) and Data Migration, even where Sync Gateway is the normal replicator of choice for mobile data.
+It plays an important role in supporting Disaster Recovery (DR) and Data Migration, even where Sync Gateway is the normal replicator of choice for mobile data.
 
 
 == Recommended Deployment Models
 
+
+=== Zero Downtime Active-Active Disaster Recovery
+// tag::zero-downtime-active-active[]
+
+This model provides zero-downtime disaster recovery using bi-directional XDCR between two active mobile clusters.
+This requires running Sync Gateway 4.0+ on both sides of the active-active XDCR setup.
+Both clusters remain operational, with seamless fail-over through load balancer switching.
+You must configure both clusters with `import_docs=true`.
+
+.Set Up
+To set up zero-downtime disaster recovery:
+
+. Configure bi-directional XDCR between the Primary and disaster recovery clusters. +
+Enable automatic filtering of cluster specific metadata.
+. Deploy Sync Gateway in active mode on both clusters.
+. Configure users, roles, and databases independently on both clusters. +
+XDCR replicates documents and attachments, but you must configure users, roles, and databases separately on each cluster.
+. Configure your load balancer to route traffic primarily to the Primary cluster.
+. Verify replication health between the two active clusters.
+
+[#fig-dr-active-active-setup]
+.DR Cluster Setup (Active-Active)
+image::ROOT:{image-xdcr-active-active-replication}[,{std-image-size}]
+
+
+.Activation
+To activate disaster recovery:
+
+. Update load balancer configuration to redirect traffic to the disaster recovery cluster. +
+This process requires no Sync Gateway service interruption.
+. Verify disaster recovery cluster is handling traffic properly.
+. Maintain bi-directional replication for recovery preparedness. +
+The original primary becomes the new DR cluster automatically and requires no manual XDCR reconfiguration.
+
+// end::zero-downtime-active-active[]
+
 === Clusters in Same Region
 // tag::clusters-in-same-region[]
 
-This model caters for situations where the Active and Disaster Recovery clusters are in the same region or data center -- see: <<fig-dr-same-regn>>.
-It includes an optional optimization step, which will ensure that there is no downtime during the activation stage.
+This model caters for situations where the Active and Disaster Recovery clusters are in the same region or datacenter -- see: <<fig-dr-same-regn>>.
+It includes an optional optimization step, which confirms that there is no downtime during the activation stage.
 
 .Set Up
 To set up and maintain a disaster recovery cluster:
 
-. [_Optional step -- for optimization_] Connect Sync Gateway to the Disaster Recovery cluster just long enough to create indexes.
-Having everything reindexed lowers switching costs. +
-If you skip this test, you will incur latency when Sync Gateway is switched to the Disaster Recovery cluster and Sync Gateway rebuilds its indexes.
+. [*Optional step -- for optimization*] Start Sync Gateway with `offline: true` in the Disaster Recovery cluster to asynchronously create indexes.
+Creating all indexes beforehand reduces switching costs. +
+If you skip this test, you'll incur latency when Sync Gateway switches to the Disaster Recovery cluster and Sync Gateway rebuilds its indexes.
 . Connect Sync Gateway to your Primary cluster.
-. Initiate the *unidirectional* XDCR from the Primary cluster to the Disaster Recovery cluster.
+. Start the *unidirectional* XDCR from the Primary cluster to the Disaster Recovery cluster.
 
 [#fig-dr-same-regn]
 .DR Cluster Setup (Clusters in Same Regions)
 image::ROOT:{image-sgw-xdcr-dr-same-regn-setup}[,{std-image-size}]
 
 .Activation
-When you are ready to switch to Disaster Recovery operations:
+When you're ready to switch to Disaster Recovery operations:
 
 . Stop the replication (XDCR) from the Primary cluster to Disaster Recovery cluster.
-. *When XDCR is stopped:* Switch the Load Balancer to point to the Sync Gateway on the Disaster Recovery cluster.
-This maintains the deployment of Sync Gateway at only one end of the XDCR replication.
+. *After you stop XDCR:* Switch the Load Balancer to point to the Sync Gateway on the Disaster Recovery cluster.
+This approach keeps the deployment of Sync Gateway at only 1 end of the XDCR replication.
 . Promote the Disaster Recovery cluster to Primary and the *old* Primary to Disaster Recovery.
-. Flush all replicated buckets in the Primary cluster; as a precaution against any spurious writes coming into the Primary cluster that had not been replicated when XDCR was stopped.
+. Flush all replicated buckets in the Primary cluster as a precaution against any spurious writes that enter the Primary cluster and XDCR fails to replicate when you stop it.
 . Reverse the XDCR to replicate from the newly promoted Primary to the old Primary to set up a new Backup.
 
 [#fig-dr-same-regn-in-recovery]
@@ -75,38 +112,36 @@ image::ROOT:{image-sgw-xdcr-dr-same-regn-in-recovery}[,{std-image-size}]
 
 // end::clusters-in-same-region[]
 
-
 === Clusters in Different Regions or Data Centers
 // tag::clusters-in-diff-region[]
 
 This model caters for situations where the Active and Disaster Recovery clusters are in different regions or data centers.
-Although the model has a separate Sync Gateway cluster attached to the Disaster Recovery cluster, it maintains the deployment of Sync Gateway at only one end of the XDCR replication.
-The optional optimization step will ensure that there is no downtime during the activation stage.
+Although the model has a separate Sync Gateway cluster attached to the Disaster Recovery cluster, this approach keeps the deployment of Sync Gateway at only 1 end of the XDCR replication.
+The optional optimization step confirms that there is no downtime during the activation stage.
 
 
 .Set Up
 To set up and maintain a disaster recovery cluster - see: <<fig-dr-diff-regn-setup>>:
 
-. [_Optional step -- for optimization_] Turn on _Sync Gateway_ in the Disaster Recovery cluster just long enough to create indexes.
-Having everything re-indexed lowers switching costs. +
-If you skip this test, you will incur latency when Sync Gateway is switched to the Disaster Recovery cluster and Sync Gateway rebuilds its indexes.
-. [_Critical step_] Turn off *all* the Sync Gateways in the Disaster Recovery cluster.
-. Initiate the *unidirectional* XDCR from the Primary cluster to the Disaster Recovery cluster.
+. [*Optional step -- for optimization*] Start Sync Gateway with `offline: true` in the Disaster Recovery cluster to asynchronously create indexes.
+If you skip this test, you'll incur latency when you switch Sync Gateway to the Disaster Recovery cluster and Sync Gateway rebuilds its indexes.
+. [*Critical step*] Turn off *all* the Sync Gateways in the Disaster Recovery cluster.
+. Start the *unidirectional* XDCR from the Primary cluster to the Disaster Recovery cluster.
 
 [#fig-dr-diff-regn-setup]
 .DR Cluster Setup (Clusters in Different Regions)
 image::ROOT:{image-sgw-xdcr-dr-diff-regn-setup}[,{std-image-size}]
 
 
 .Activation
-When you are ready to switch to Disaster Recovery operations -- see: <<fig-dr-diff-regn-in-recovery>>:
+When you're ready to switch to Disaster Recovery operations -- see: <<fig-dr-diff-regn-in-recovery>>:
 
 . Stop Sync Gateway on the Primary cluster
 . Stop the replication (XDCR) from the Primary cluster to the Disaster Recovery cluster.
-. Ensure that any and all Load Balancer(s) are updated to direct all traffic to the new Sync Gateway cluster(s).
-. Turn on the Sync Gateway cluster(s) in the Disaster Recovery cluster.
-. Promote the Disaster Recovery cluster to be the *new* Primary cluster, and make the *old* Primary cluster the *new* Disaster Recovery cluster
-. Flush all replicated buckets in the Primary cluster; as a precaution against any spurious writes coming into the Primary cluster that had not been replicated when XDCR was stopped.
+. Verify that any and all Load Balancer updates to direct all traffic to the new Sync Gateway clusters.
+. Turn on the Sync Gateway cluster in the Disaster Recovery cluster.
+. Assign the Disaster Recovery cluster to be the *new* Primary cluster, and make the *old* Primary cluster the *new* Disaster Recovery cluster.
+. Flush all replicated buckets in the Primary cluster as a precaution against any spurious writes coming into the Primary cluster that XDCR did not replicate when you stopped it.
 . Reverse the original XDCR to replicate from the newly promoted Primary to the old Primary, to set up a new Backup.
 
 [#fig-dr-diff-regn-in-recovery]
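
The activation order that the updated page lists for the different-regions model can be captured as data, so that a runbook script refuses to run a step out of sequence. The following is a minimal sketch under stated assumptions, not part of this commit or of Sync Gateway: `Step`, `DIFF_REGION_ACTIVATION`, and `next_step` are hypothetical names, and the step descriptions only paraphrase the numbered list in the diff above.

```python
from enum import Enum


class Step(Enum):
    """Activation steps for the different-regions model (illustrative names)."""
    STOP_PRIMARY_SGW = "Stop Sync Gateway on the Primary cluster"
    STOP_XDCR = "Stop XDCR from the Primary cluster to the Disaster Recovery cluster"
    SWITCH_LOAD_BALANCER = "Direct all load balancer traffic to the new Sync Gateway clusters"
    START_DR_SGW = "Turn on the Sync Gateway cluster in the Disaster Recovery cluster"
    SWAP_ROLES = "Make the Disaster Recovery cluster the new Primary cluster"
    FLUSH_BUCKETS = "Flush all replicated buckets in the Primary cluster"
    REVERSE_XDCR = "Reverse the original XDCR to set up a new Backup"


# Order taken from the numbered Activation list in the updated page.
DIFF_REGION_ACTIVATION = [
    Step.STOP_PRIMARY_SGW,
    Step.STOP_XDCR,
    Step.SWITCH_LOAD_BALANCER,
    Step.START_DR_SGW,
    Step.SWAP_ROLES,
    Step.FLUSH_BUCKETS,
    Step.REVERSE_XDCR,
]


def next_step(completed):
    """Return the next step to run, or None when the plan is finished.

    Raises ValueError if `completed` is not a prefix of the plan, which
    means a step was skipped or run out of order.
    """
    if completed != DIFF_REGION_ACTIVATION[:len(completed)]:
        raise ValueError("steps were skipped or run out of order")
    if len(completed) == len(DIFF_REGION_ACTIVATION):
        return None
    return DIFF_REGION_ACTIVATION[len(completed)]
```

Calling `next_step([])` returns the first step, and passing a list that starts with `Step.STOP_XDCR` raises `ValueError`, because Sync Gateway on the Primary cluster has to stop first.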

modules/server-compatibility/pages/server-compatibility-xdcr.adoc

Lines changed: 47 additions & 12 deletions
@@ -15,6 +15,7 @@ include::ROOT:partial$_set_page_context.adoc[]
 // BEGIN -- Page Attributes
 :image-xdcr-data-replication-to-read-only: xdcr-data-replication-to-read-only.png
 :image-icr-active-mobile-sync: icr-active-mobile-sync.png
+:image-xdcr-active-active-replication: xdcr-active-active-replication.png
 // END -- Page Attributes
 

@@ -43,18 +44,18 @@ Here we provide details on how XDCR feature relates to the {cbm} ecosystem.
 
 
 If you need to sync mobile clusters, you should use Inter-Sync Gateway replication -- see: {sync-inter-syncgateway-overview--xref}.
-It was designed to keep mobile clusters in different data centers in sync.
+It's designed to keep mobile clusters in different data centers in sync.
 The ideal use-case being the need to replicate edge clusters containing active Sync{nbsp}Gateway nodes between geographically separate cloud-based Sync Gateway deployments.
-A typical architecture for this use case is shown in <<icr-active-mobile>>.
+<<icr-active-mobile>> shows a typical architecture for this use case.
 
-Inter-Sync Gateway replication provides bi-directional read/write replications that ensure:
+Inter-Sync Gateway replication provides bi-directional read/write replications that make sure:
 
-* Cluster specific security is observed; by invoking the appropriate Sync Function.
-* The integrity of security history is maintained.
-Historical access rules are held and maintained in the Sync Gateway metadata.
+* Sync Gateway observes cluster-specific security by invoking the appropriate Sync Function.
+* Sync Gateway maintains the integrity of security history.
+Sync Gateway holds and maintains historical access rules in its metadata.
 This history is necessary to consistently handle the revocation of access grants.
-* A consistent Revision Id is used across all clusters, allowing clients to identify a revision regardless of the cluster it is on.
-* Cluster-specific _sync documents are not replicated to other mobile clusters
+* All clusters use a consistent Revision Id, allowing clients to identify a revision regardless of which cluster it's on.
+* Cluster-specific `_sync` documents are not replicated to other mobile clusters
 
 [[icr-active-mobile]]
 .Active-to-active mobile synchronization
@@ -70,7 +71,11 @@ XDCR replicates all of Sync Gateway’s metadata (_sync xattr) along with associ
 
 Your default preference for the replication of {cbm} changes should always be to use inter-Sync{nbsp}Gateway replication.
 
-XDCR can be useful though in use-cases where the entire dataset from a source bucket is replicated to a target bucket. This could include categories such as active standby, disaster recovery, data migration and _lift-and-shift_ cases in hybrid cloud.
+XDCR proves useful in use-cases where you replicate the entire dataset from a source bucket to a target bucket.
+These categories include active standby, disaster recovery, data migration and _lift-and-shift_ cases in hybrid cloud.
+
+
+=== Unidirectional XDCR
 
 In all these categories you should run XDCR in unidirectional mode, deploying Sync Gateway only at one end of the XDCR-replicated bucket (source, or target) -- see: <<ex-xdcr-data-repl>>.

@@ -79,22 +84,52 @@ In all these categories you should run XDCR in unidirectional mode, deploying Sy
 ====
 In this example XDCR deployment:
 
-* XDCR is run unidirectionally.
-It pushes data from the primary data center to the secondary data centers, where it is pulled by Sync Gateway for downstream clients.
+* XDCR runs unidirectionally.
+It pushes data from the primary datacenter to the secondary datacenter, where it's pulled by Sync Gateway for downstream clients.
 * Sync Gateway, although deployed at both ends of the XDCR replication, crucially is in *read-only* mode at the target end.
 
 .XDCR Replication
 image::ROOT:{image-xdcr-data-replication-to-read-only}[,std-image-size]
 ====
 
+=== Bi-directional Active-Active XDCR
+
+Sync Gateway supports bi-directional replication using XDCR between two active mobile clusters.
+This enables active-active deployments where both clusters can process writes simultaneously.
+
+TIP: Verify that you installed Couchbase Server 7.6.5 or later and configured your Sync Gateway nodes with `shared_bucket_access=true` and `import_docs=true`
+
+Bi-directional XDCR allows for both clusters to process writes simultaneously, with automatic conflict resolution.
+The setup supports different sync functions between clusters and provides unified replication for both mobile and non-mobile documents.
+
+[#ex-xdcr-bi-data-repl]
+.Bi-directional XDCR Replication
+====
+In this example XDCR deployment:
+
+* XDCR runs bi-directionally between two active mobile clusters.
+Both clusters can simultaneously process writes from Sync Gateway clients and replicate changes to each other.
+* Both clusters deploy Sync Gateway in active mode, enabling full read-write operations.
+Automatic conflict resolution ensures data consistency across clusters.
+* Clients can switch between clusters seamlessly without data loss, as both clusters maintain synchronized datasets.
+
+.XDCR Replication
+image::ROOT:{image-xdcr-active-active-replication}[,std-image-size]
+
+====
+
 [#lbl-prov-drc]
 == Disaster Recovery Scenario
 
 In a Disaster Recovery scenario XDCR serves an important role in setting-up target mobile clusters.
 
 Here XDCR supports Disaster Recovery and Data Migration.
-Even though Sync Gateway is the operational replicator of choice for mobile data, it is only ever deployed at one end of the unidirectional XDCR replication.
+Even though Sync Gateway is the operational replicator of choice for mobile data, it's only ever deployed at one end of the unidirectional XDCR replication.
+
+
+=== Zero Downtime Active-Active Disaster Recovery
 
+include::deploy:setting-up-dr-cluster.adoc[tags=zero-downtime-active-active]
 
 === Clusters in Same Region
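
The prerequisites that this commit states for bi-directional XDCR (Couchbase Server 7.6.5 or later, Sync Gateway 4.0+, `shared_bucket_access=true`, and `import_docs=true`) lend themselves to a pre-flight check. The sketch below is one possible shape for such a check, under stated assumptions: `parse_version` and `missing_prerequisites` are hypothetical helpers, the `settings` dict is a plain stand-in rather than the real Sync Gateway configuration schema, and version strings are assumed to be plain dotted numbers without build suffixes.

```python
def parse_version(text):
    """Parse a dotted numeric version such as "7.6.5" into a 3-tuple of ints.

    Missing components are treated as zero, so "4.0" becomes (4, 0, 0).
    Assumes no build suffix (for example "-1234"); such input raises ValueError.
    """
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def missing_prerequisites(server_version, sgw_version, settings):
    """Return a list of unmet prerequisites for active-active XDCR.

    `settings` is a hypothetical dict of Sync Gateway options; an empty
    result means every prerequisite named in the docs change is met.
    """
    problems = []
    if parse_version(server_version) < (7, 6, 5):
        problems.append("Couchbase Server 7.6.5 or later is required")
    if parse_version(sgw_version) < (4, 0, 0):
        problems.append("Sync Gateway 4.0 or later is required")
    for key in ("shared_bucket_access", "import_docs"):
        if settings.get(key) is not True:
            problems.append(f"{key} must be true")
    return problems
```

A deployment script could call `missing_prerequisites` once per cluster and stop before configuring XDCR if the returned list is not empty.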

0 commit comments
