* Various changes
Added a number of small changes from PCR overhaul doc and fixed broken links and typos
More various changes
More various changes - will elaborate as needed in comments
Moved info
Moved info to a more relevant section
Fixed broken link
Accidentally broke a link - fixed typo
Fixed more broken links
Lots of broken links from these updates- hopefully fixed the last ones
Minor fixes from review
A few minor changes based on Alicia's review
* Adjustments from review
Some changes based on a review from Michael Butler
* changed 'must' to 'may want to'
one more change from review
* Small change from review
Small reword from review
* Fixed broken links
Fixed broken links
* Apply suggestions from code review
Co-authored-by: Florence Morris <[email protected]>
* Changes from docs review
Changes from docs review
---------
Co-authored-by: Florence Morris <[email protected]>
src/current/v25.3/create-virtual-cluster.md
Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ To form a connection string similar to the example, include the following values
  Value | Description
  ----------------+------------
- `{replication user}` | The user on the primary cluster that has the `REPLICATION` system privilege. Refer to the [Create a replication user and password]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#create-a-replication-user-and-password) for more detail.
+ `{replication user}` | The user on the primary cluster that has the `REPLICATION` system privilege. Refer to [Create a user with replication privileges]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#create-a-user-with-replication-privileges) for more detail.
  `{password}` | The replication user's password.
  `{node ID or hostname}` | The node IP address or hostname of any node from the primary cluster.
  `options=ccluster=system` | The parameter to connect to the system virtual cluster on the primary cluster.
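
The table in this hunk lists the pieces of the connection string but the example itself falls outside the diff context. As a minimal sketch only, the values could be assembled as follows. The helper name `build_connection_string`, the `postgresql` scheme, and the sample user, password, and hostname are all assumptions made for illustration; the port and certificate parameters are left out because the hunk does not show them.

```python
from urllib.parse import quote, urlencode


def build_connection_string(user, password, host, params, scheme="postgresql"):
    """Assemble a connection URI from the table's values (hypothetical helper)."""
    # Percent-encode the credentials so characters such as "@" or ":" in a
    # password cannot be mistaken for URI delimiters.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    # Keep "=" literal inside parameter values so the options value appears
    # exactly as it is written in the table.
    query = urlencode(params, safe="=")
    return f"{scheme}://{userinfo}@{host}?{query}"


uri = build_connection_string(
    user="repl_user",                        # hypothetical replication user
    password="p@ss:word",                    # hypothetical password
    host="node1.example.com",                # hypothetical primary-cluster node
    params={"options": "ccluster=system"},   # value copied from the table row
)
print(uri)
```

Encoding the user and password separately is the main design point: a password containing `@` or `:` would otherwise change how the URI is parsed.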
src/current/v25.3/failover-replication.md
Lines changed: 7 additions & 4 deletions
@@ -38,12 +38,15 @@ To initiate a failover to the standby cluster, specify the point in time for its
  - [`LATEST`](#fail-over-to-the-most-recent-replicated-time): The most recent replicated timestamp. This minimizes any data loss from the replication lag in asynchronous replication.
  - [Point-in-time](#fail-over-to-a-point-in-time):
- - Past: A past timestamp within the [failover window]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process) of up to 4 hours in the past. Failing over to a past point in time is useful if you need to recover from a recent human error.
+ - Past: A past timestamp within the [failover window]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process) of up to 4 hours in the past.
+ {{site.data.alerts.callout_success}}
+ Failing over to a past point in time is useful if you need to recover from a recent human error
+ {{site.data.alerts.end}}
  - Future: A future timestamp for planning a failover.

  #### Fail over to the most recent replicated time

- To initiate a failover to the most recent replicated timestamp, specify `LATEST` when you start the failover. Due to [_replication lag_]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process), the latest replicated time may be behind the current actual time. Replication lag is the time between the most up-to-date replicated time and the actual time.
+ To initiate a failover to the most recent replicated timestamp, specify `LATEST`. Due to [_replication lag_]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process), the most recent replicated time may be behind the current actual time. Replication lag is the time difference between the most recent replicated time and the actual time.

  1. To view the current replication timestamp, use:
@@ -172,7 +175,7 @@ To enable PCR again, from the new primary to the original primary (or a complete
  After failing over to the standby cluster, you may want to return to your original configuration by failing back to the original primary-standby cluster setup. Depending on the configuration of the primary cluster in the original PCR stream, use one of the following workflows:

- - [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster). If this failback is initiated within 24 hours of the failover, PCR replicates the net-new changes from the standby cluster to the primary cluster, so you do not need to re-seed the primary cluster.
+ - [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster). If this failback is initiated within 24 hours of the failover, PCR replicates the net-new changes from the standby cluster to the primary cluster, rather than fully replacing the existing data in the primary cluster.
  - [After the PCR stream used an existing cluster as the primary cluster](#fail-back-after-replicating-from-an-existing-primary-cluster).

  {{site.data.alerts.callout_info}}
@@ -304,7 +307,7 @@ At this point, **Cluster A** has caught up to **Cluster B**. The clusters are en
  You can replicate data from an existing CockroachDB cluster that does not have [cluster virtualization]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) enabled to a standby cluster with cluster virtualization enabled. For instructions on setting up a PCR in this way, refer to [Set up PCR from an existing cluster]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster).

- After a [failover](#failover) to the standby cluster, you must set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are multiple ways to set up a new standby, and some considerations.
+ After a [failover](#failover) to the standby cluster, you may want to set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are multiple ways to set up a new standby, and some considerations.

  In the example, the clusters are named for reference:
src/current/v25.3/physical-cluster-replication-overview.md
Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ You can use PCR to:
  - **Transactional consistency**: Avoid conflicts in data after recovery; the replication completes to a transactionally consistent state.
  - **Improved RPO and RTO**: Depending on workload and deployment configuration, [replication lag]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}) between the primary and standby is generally in the tens-of-seconds range. The failover process from the primary cluster to the standby should typically happen within five minutes when completing a failover to the latest replicated time using [`LATEST`]({% link {{ page.version.version }}/alter-virtual-cluster.md %}#synopsis).
  - **Failover to a timestamp in the past or the future**: In the case of logical disasters or mistakes, you can [fail over]({% link {{ page.version.version }}/failover-replication.md %}) from the primary to the standby cluster to a timestamp in the past. This means that you can return the standby to a timestamp before the mistake was replicated to the standby. Furthermore, you can plan a failover by specifying a timestamp in the future.
- - **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event without reseeding data for an initial scan.
+ - **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event by replicating net-new changes rather than fully replacing existing data for an initial scan.
  - **Read from standby cluster**: You can configure PCR to allow `SELECT` queries on the standby cluster. For more details, refer to [Start a PCR stream with read from standby]({% link {{ page.version.version }}/create-virtual-cluster.md %}#start-a-pcr-stream-with-read-from-standby).
  - **Monitoring**: To monitor the replication's initial progress, current status, and performance, you can use metrics available in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}) and [Prometheus]({% link {{ page.version.version }}/monitor-cockroachdb-with-prometheus.md %}). For more details, refer to [Physical Cluster Replication Monitoring]({% link {{ page.version.version }}/physical-cluster-replication-monitoring.md %}).
@@ -70,7 +70,7 @@ Statement | Action
  ## Cluster versions and upgrades

  {{site.data.alerts.callout_info}}
- The entire standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster at the time of [failover]({% link {{ page.version.version }}/failover-replication.md %}).
+ The entire standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster.
  {{site.data.alerts.end}}

  When PCR is enabled, upgrade with the following procedure. This upgrades the standby cluster before the primary cluster. Within the primary and standby CockroachDB clusters, the system virtual cluster must be at a cluster version greater than or equal to the virtual cluster:
src/current/v25.3/physical-cluster-replication-technical-overview.md
Lines changed: 3 additions & 5 deletions
@@ -5,13 +5,11 @@ toc: true
  docs_area: manage
  ---

- [**Physical cluster replication (PCR)**]({% link {{ page.version.version }}/physical-cluster-replication-overview.md %}) continuously asynchronously replicates data from an active _primary_ CockroachDB cluster to a passive _standby_ cluster. When both clusters are virtualized, each cluster contains a _system virtual cluster_ and an application [virtual cluster]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) during the PCR stream:
+ [**Physical cluster replication (PCR)**]({% link {{ page.version.version }}/physical-cluster-replication-overview.md %}) continuously and asynchronously replicates data from an active _primary_ CockroachDB cluster to a passive _standby_ cluster. When both clusters are virtualized, each cluster contains a _system virtual cluster_ and an application [virtual cluster]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) during the PCR stream:

  {% include {{ page.version.version }}/physical-replication/interface-virtual-cluster.md %}

- If you utilize the read from standby feature in PCR, the standby cluster has an additional reader virtual cluster which is a copy of the application virtual cluster.
-
- This separation of controls and data means that the replication stream can operate without affecting work happening in a virtual cluster.
+ If you utilize the [read on standby](#start-up-sequence-with-read-on-standby) feature in PCR, the standby cluster has an additional reader virtual cluster that safely serves read requests on the replicating virtual cluster.

  ### PCR stream start-up sequence
@@ -22,7 +20,7 @@ This separation of controls and data means that the replication stream can opera
  The stream initialization proceeds as follows:

- 1. The standby's consumer job connects to the primary cluster via the standby's system virtual cluster and starts the primary cluster's physical stream producer job.
+ 1. The standby's consumer job connects to the primary cluster via the standby's system virtual cluster and starts the primary cluster's `REPLICATION STREAM PRODUCER` job.
  1. The primary cluster chooses a timestamp at which to start the physical replication stream. Data on the primary is protected from [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) until it is replicated to the standby using a [protected timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#protected-timestamps).
  1. The primary cluster returns the timestamp and a [job ID]({% link {{ page.version.version }}/show-jobs.md %}#response) for the replication job.
  1. The standby cluster retrieves a list of all nodes in the primary cluster. It uses this list to distribute work across all nodes in the standby cluster.
src/current/v25.3/set-up-physical-cluster-replication.md
Lines changed: 7 additions & 7 deletions
@@ -38,7 +38,7 @@ To set up PCR from an existing CockroachDB cluster, which will serve as the prim
  - You need two separate CockroachDB clusters (primary and standby), each with a minimum of three nodes. The standby cluster should be the same version or one version ahead of the primary cluster. The primary and standby clusters must be configured with similar hardware profiles, number of nodes, and overall size. Significant discrepancies in the cluster configurations may result in degraded performance.
  - To set up each cluster, you can follow [Deploy CockroachDB on Premises]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}). When you initialize the cluster with the [`cockroach init`]({% link {{ page.version.version }}/cockroach-init.md %}) command, you **must** pass the `--virtualized` or `--virtualized-empty` flag. Refer to the cluster creation steps for the [primary cluster](#initialize-the-primary-cluster) and for the [standby cluster](#initialize-the-standby-cluster) for details.
  - The [Deploy CockroachDB on Premises]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}) tutorial creates a self-signed certificate for each {{ site.data.products.core }} cluster. To create certificates signed by an external certificate authority, refer to [Create Security Certificates using OpenSSL]({% link {{ page.version.version }}/create-security-certificates-openssl.md %}).
- - All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Manage the cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
+ - All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Manage cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
  - The primary and standby clusters can have different [region topologies]({% link {{ page.version.version }}/topology-patterns.md %}). However, behavior for features that rely on multi-region primitives, such as Region by Row and Region by Table, may be affected.

  ## Step 1. Create the primary cluster
@@ -103,7 +103,7 @@ Connect to your primary cluster's system virtual cluster using [`cockroach sql`]
  Because this is the primary cluster rather than the standby cluster, the `data_state` of all rows is `ready`, rather than `replicating` or another [status]({% link {{ page.version.version }}/physical-cluster-replication-monitoring.md %}).

- ### Create a replication user and password
+ ### Create a user with replication privileges

  The standby cluster connects to the primary cluster's system virtual cluster using an identity with the `REPLICATIONSOURCE` [privilege]({% link {{ page.version.version }}/security-reference/authorization.md %}#supported-privileges). Connect to the primary cluster's system virtual cluster and create a user with a password:
@@ -223,7 +223,7 @@ Connect to your standby cluster's system virtual cluster using [`cockroach sql`]
  (1 rows)
  ~~~

- ### Create a replication user and password
+ ### Create a user with replication privileges on the standby cluster

  Create a user to run the PCR stream and access the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}) to observe the job:
@@ -285,7 +285,7 @@ The system virtual cluster in the standby cluster initializes and controls the r
  ~~~

  Otherwise, pass the connection string that contains:
- - The replication user and password that you [created for the primary cluster](#create-a-replication-user-and-password).
+ - The replication user and password that you [created for the primary cluster](#create-a-user-with-replication-privileges).
  - The node IP address or hostname of one node from the primary cluster.
  - The path to the primary node's certificate on the standby cluster.
@@ -355,7 +355,7 @@ You can set up PCR replication from an existing CockroachDB cluster that does no
  {{site.data.alerts.callout_info}}
  When you start PCR with an existing primary cluster that does **not** have [cluster virtualization]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) enabled, you will not be able to [_fail back_]({% link {{ page.version.version }}/failover-replication.md %}#failback) to the original primary cluster from the promoted, original standby.

- For more details on the failback process when you have started PCR with a non-virtualized primary, refer to [Fail back after PCR from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
+ For more details on the failback process when you have started PCR with a non-virtualized primary, refer to [Fail back after replicating from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
  {{site.data.alerts.end}}

  Before you begin, you will need:
@@ -396,7 +396,7 @@ Before you begin, you will need:
  (1 row)
  ~~~

- 1. To create the replication job, you will need a connection string for the **primary cluster** containing its CA certificate. For steps to generate a connection string with `cockroach encode-uri`, refer to [Step 3. Manage the cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
+ 1. To create the replication job, you will need a connection string for the **primary cluster** containing its CA certificate. For steps to generate a connection string with `cockroach encode-uri`, refer to [Step 3. Manage cluster certificates and generate connection strings](#step-3-manage-cluster-certificates-and-generate-connection-strings).

  1. If you would like to run a test workload on your existing **primary cluster**, you can use [`cockroach workload`]({% link {{ page.version.version }}/cockroach-workload.md %}) like the following:
@@ -441,7 +441,7 @@ At this point, your replication stream will be running.
  To _fail over_ to the standby cluster, follow the instructions on the [Fail Over from a Primary Cluster to a Standby Cluster]({% link {{ page.version.version }}/failover-replication.md %}) page.

- For details on how to _fail back_ after replicating a non-virtualized cluster, refer to [Fail back after PCR from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
+ For details on how to _fail back_ after replicating a non-virtualized cluster, refer to [Fail back after replicating from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).