Add small changes from PCR overhaul (#20304)

peachdawnleach · florence-crl · peachdawnleach · commit 73b9e2e6650b · 2025-09-16T16:18:44.000-04:00
* Various changes

Added a number of small changes from PCR overhaul doc and fixed broken links and typos

More various changes

More various changes - will elaborate as needed in comments

Moved info

Moved info to a more relevant section

Fixed broken link

Accidentally broke a link - fixed typo

Fixed more broken links

Lots of broken links from these updates; hopefully fixed the last ones

Minor fixes from review

A few minor changes based on Alicia's review

* Adjustments from review

Some changes based on a review from Michael Butler

* changed 'must' to 'may want to'

one more change from review

* Small change from review

Small reword from review

* Fixed broken links

Fixed broken links

* Apply suggestions from code review

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>

* Changes from docs review

Changes from docs review

---------

Co-authored-by: Florence Morris <58752716+florence-crl@users.noreply.github.com>
diff --git a/src/current/v25.3/create-virtual-cluster.md b/src/current/v25.3/create-virtual-cluster.md
@@ -62,7 +62,7 @@ To form a connection string similar to the example, include the following values
 
 Value | Description
 ----------------+------------
-`{replication user}` | The user on the primary cluster that has the `REPLICATION` system privilege. Refer to the [Create a replication user and password]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#create-a-replication-user-and-password) for more detail.
+`{replication user}` | The user on the primary cluster that has the `REPLICATION` system privilege. Refer to [Create a user with replication privileges]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#create-a-user-with-replication-privileges) for more detail.
 `{password}` | The replication user's password.
 `{node ID or hostname}` | The node IP address or hostname of any node from the primary cluster.
 `options=ccluster=system` | The parameter to connect to the system virtual cluster on the primary cluster.
diff --git a/src/current/v25.3/failover-replication.md b/src/current/v25.3/failover-replication.md
@@ -38,12 +38,15 @@ To initiate a failover to the standby cluster, specify the point in time for its
 
 - [`LATEST`](#fail-over-to-the-most-recent-replicated-time): The most recent replicated timestamp. This minimizes any data loss from the replication lag in asynchronous replication.
 - [Point-in-time](#fail-over-to-a-point-in-time):
-    - Past: A past timestamp within the [failover window]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process) of up to 4 hours in the past. Failing over to a past point in time is useful if you need to recover from a recent human error. 
+    - Past: A past timestamp within the [failover window]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process) of up to 4 hours in the past.
+    {{site.data.alerts.callout_success}}
+    Failing over to a past point in time is useful if you need to recover from a recent human error.
+    {{site.data.alerts.end}}
     - Future: A future timestamp for planning a failover.
 
 #### Fail over to the most recent replicated time
 
-To initiate a failover to the most recent replicated timestamp, specify `LATEST` when you start the failover. Due to [_replication lag_]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process), the latest replicated time may be behind the current actual time. Replication lag is the time between the most up-to-date replicated time and the actual time.
+To initiate a failover to the most recent replicated timestamp, specify `LATEST`. Due to [_replication lag_]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}#failover-and-promotion-process), the most recent replicated time may be behind the current actual time. Replication lag is the time difference between the most recent replicated time and the actual time.
 
 1. To view the current replication timestamp, use:
 
@@ -172,7 +175,7 @@ To enable PCR again, from the new primary to the original primary (or a complete
 
 After failing over to the standby cluster, you may want to return to your original configuration by failing back to the original primary-standby cluster setup. Depending on the configuration of the primary cluster in the original PCR stream, use one of the following workflows:
 
-- [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster). If this failback is initiated within 24 hours of the failover, PCR replicates the net-new changes from the standby cluster to the primary cluster, so you do not need to re-seed the primary cluster.
+- [From the original standby cluster (after it was promoted during failover) to the original primary cluster](#fail-back-to-the-original-primary-cluster). If this failback is initiated within 24 hours of the failover, PCR replicates the net-new changes from the standby cluster to the primary cluster, rather than fully replacing the existing data in the primary cluster.
 - [After the PCR stream used an existing cluster as the primary cluster](#fail-back-after-replicating-from-an-existing-primary-cluster).
 
 {{site.data.alerts.callout_info}}
@@ -304,7 +307,7 @@ At this point, **Cluster A** has caught up to **Cluster B**. The clusters are en
 
 You can replicate data from an existing CockroachDB cluster that does not have [cluster virtualization]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) enabled to a standby cluster with cluster virtualization enabled. For instructions on setting up a PCR in this way, refer to [Set up PCR from an existing cluster]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster).
 
-After a [failover](#failover) to the standby cluster, you must set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are multiple ways to set up a new standby, and some considerations.
+After a [failover](#failover) to the standby cluster, you may want to set up PCR from the original standby cluster, which is now the primary, to another cluster, which will become the standby. There are multiple ways to set up a new standby, along with some considerations to keep in mind.
 
 In the example, the clusters are named for reference:
 
diff --git a/src/current/v25.3/physical-cluster-replication-overview.md b/src/current/v25.3/physical-cluster-replication-overview.md
@@ -31,7 +31,7 @@ You can use PCR to:
 - **Transactional consistency**: Avoid conflicts in data after recovery; the replication completes to a transactionally consistent state.
 - **Improved RPO and RTO**: Depending on workload and deployment configuration, [replication lag]({% link {{ page.version.version }}/physical-cluster-replication-technical-overview.md %}) between the primary and standby is generally in the tens-of-seconds range. The failover process from the primary cluster to the standby should typically happen within five minutes when completing a failover to the latest replicated time using [`LATEST`]({% link {{ page.version.version }}/alter-virtual-cluster.md %}#synopsis).
 - **Failover to a timestamp in the past or the future**: In the case of logical disasters or mistakes, you can [fail over]({% link {{ page.version.version }}/failover-replication.md %}) from the primary to the standby cluster to a timestamp in the past. This means that you can return the standby to a timestamp before the mistake was replicated to the standby. Furthermore, you can plan a failover by specifying a timestamp in the future.
-- **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event without reseeding data for an initial scan.
+- **Fast failback**: Switch back from the promoted standby cluster to the original primary cluster after a failover event by replicating only the net-new changes, rather than fully replacing the existing data with an initial scan.
 - **Read from standby cluster**: You can configure PCR to allow `SELECT` queries on the standby cluster. For more details, refer to [Start a PCR stream with read from standby]({% link {{ page.version.version }}/create-virtual-cluster.md %}#start-a-pcr-stream-with-read-from-standby).
 - **Monitoring**: To monitor the replication's initial progress, current status, and performance, you can use metrics available in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}) and [Prometheus]({% link {{ page.version.version }}/monitor-cockroachdb-with-prometheus.md %}). For more details, refer to [Physical Cluster Replication Monitoring]({% link {{ page.version.version }}/physical-cluster-replication-monitoring.md %}).
 
@@ -70,7 +70,7 @@ Statement | Action
 ## Cluster versions and upgrades
 
 {{site.data.alerts.callout_info}}
-The entire standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster at the time of [failover]({% link {{ page.version.version }}/failover-replication.md %}).
+The entire standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster.
 {{site.data.alerts.end}}
 
 When PCR is enabled, upgrade with the following procedure. This upgrades the standby cluster before the primary cluster. Within the primary and standby CockroachDB clusters, the system virtual cluster must be at a cluster version greater than or equal to the virtual cluster:
diff --git a/src/current/v25.3/physical-cluster-replication-technical-overview.md b/src/current/v25.3/physical-cluster-replication-technical-overview.md
@@ -5,13 +5,11 @@ toc: true
 docs_area: manage
 ---
 
-[**Physical cluster replication (PCR)**]({% link {{ page.version.version }}/physical-cluster-replication-overview.md %}) continuously asynchronously replicates data from an active _primary_ CockroachDB cluster to a passive _standby_ cluster. When both clusters are virtualized, each cluster contains a _system virtual cluster_ and an application [virtual cluster]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) during the PCR stream:
+[**Physical cluster replication (PCR)**]({% link {{ page.version.version }}/physical-cluster-replication-overview.md %}) continuously and asynchronously replicates data from an active _primary_ CockroachDB cluster to a passive _standby_ cluster. When both clusters are virtualized, each cluster contains a _system virtual cluster_ and an application [virtual cluster]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) during the PCR stream:
 
 {% include {{ page.version.version }}/physical-replication/interface-virtual-cluster.md %}
 
-If you utilize the read from standby feature in PCR, the standby cluster has an additional reader virtual cluster which is a copy of the application virtual cluster. 
-
-This separation of controls and data means that the replication stream can operate without affecting work happening in a virtual cluster.
+If you use the [read on standby](#start-up-sequence-with-read-on-standby) feature in PCR, the standby cluster has an additional reader virtual cluster that safely serves read requests on the replicating virtual cluster.
 
 ### PCR stream start-up sequence
 
@@ -22,7 +20,7 @@ This separation of controls and data means that the replication stream can opera
 
 The stream initialization proceeds as follows:
 
-1. The standby's consumer job connects to the primary cluster via the standby's system virtual cluster and starts the primary cluster's physical stream producer job.
+1. The standby's consumer job connects to the primary cluster via the standby's system virtual cluster and starts the primary cluster's `REPLICATION STREAM PRODUCER` job.
 1. The primary cluster chooses a timestamp at which to start the physical replication stream. Data on the primary is protected from [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) until it is replicated to the standby using a [protected timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#protected-timestamps).
 1. The primary cluster returns the timestamp and a [job ID]({% link {{ page.version.version }}/show-jobs.md %}#response) for the replication job.
 1. The standby cluster retrieves a list of all nodes in the primary cluster. It uses this list to distribute work across all nodes in the standby cluster.
diff --git a/src/current/v25.3/set-up-physical-cluster-replication.md b/src/current/v25.3/set-up-physical-cluster-replication.md
@@ -38,7 +38,7 @@ To set up PCR from an existing CockroachDB cluster, which will serve as the prim
 - You need two separate CockroachDB clusters (primary and standby), each with a minimum of three nodes. The standby cluster should be the same version or one version ahead of the primary cluster. The primary and standby clusters must be configured with similar hardware profiles, number of nodes, and overall size. Significant discrepancies in the cluster configurations may result in degraded performance.
     - To set up each cluster, you can follow [Deploy CockroachDB on Premises]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}). When you initialize the cluster with the [`cockroach init`]({% link {{ page.version.version }}/cockroach-init.md %}) command, you **must** pass the `--virtualized` or `--virtualized-empty` flag. Refer to the cluster creation steps for the [primary cluster](#initialize-the-primary-cluster) and for the [standby cluster](#initialize-the-standby-cluster) for details.
     - The [Deploy CockroachDB on Premises]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}) tutorial creates a self-signed certificate for each {{ site.data.products.core }} cluster. To create certificates signed by an external certificate authority, refer to [Create Security Certificates using OpenSSL]({% link {{ page.version.version }}/create-security-certificates-openssl.md %}).
-- All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Manage the cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
+- All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Manage cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
 - The primary and standby clusters can have different [region topologies]({% link {{ page.version.version }}/topology-patterns.md %}). However, behavior for features that rely on multi-region primitives, such as Region by Row and Region by Table, may be affected.
 
 ## Step 1. Create the primary cluster
@@ -103,7 +103,7 @@ Connect to your primary cluster's system virtual cluster using [`cockroach sql`]
 
     Because this is the primary cluster rather than the standby cluster, the `data_state` of all rows is `ready`, rather than `replicating` or another [status]({% link {{ page.version.version }}/physical-cluster-replication-monitoring.md %}).
 
-### Create a replication user and password
+### Create a user with replication privileges
 
 The standby cluster connects to the primary cluster's system virtual cluster using an identity with the `REPLICATIONSOURCE` [privilege]({% link {{ page.version.version }}/security-reference/authorization.md %}#supported-privileges). Connect to the primary cluster's system virtual cluster and create a user with a password:
 
@@ -223,7 +223,7 @@ Connect to your standby cluster's system virtual cluster using [`cockroach sql`]
     (1 rows)
     ~~~
 
-### Create a replication user and password
+### Create a user with replication privileges on the standby cluster
 
 Create a user to run the PCR stream and access the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}) to observe the job:
 
@@ -285,7 +285,7 @@ The system virtual cluster in the standby cluster initializes and controls the r
     ~~~
 
     Otherwise, pass the connection string that contains:
-    - The replication user and password that you [created for the primary cluster](#create-a-replication-user-and-password).
+    - The replication user and password that you [created for the primary cluster](#create-a-user-with-replication-privileges).
     - The node IP address or hostname of one node from the primary cluster.
     - The path to the primary node's certificate on the standby cluster.
 
@@ -355,7 +355,7 @@ You can set up PCR replication from an existing CockroachDB cluster that does no
 {{site.data.alerts.callout_info}}
 When you start PCR with an existing primary cluster that does **not** have [cluster virtualization]({% link {{ page.version.version }}/cluster-virtualization-overview.md %}) enabled, you will not be able to [_fail back_]({% link {{ page.version.version }}/failover-replication.md %}#failback) to the original primary cluster from the promoted, original standby. 
 
-For more details on the failback process when you have started PCR with a non-virtualized primary, refer to [Fail back after PCR from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
+For more details on the failback process when you have started PCR with a non-virtualized primary, refer to [Fail back after replicating from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
 {{site.data.alerts.end}}
 
 Before you begin, you will need:
@@ -396,7 +396,7 @@ Before you begin, you will need:
     (1 row)
     ~~~
 
-1. To create the replication job, you will need a connection string for the **primary cluster** containing its CA certificate. For steps to generate a connection string with `cockroach encode-uri`, refer to [Step 3. Manage the cluster certificates](#step-3-manage-cluster-certificates-and-generate-connection-strings).
+1. To create the replication job, you will need a connection string for the **primary cluster** containing its CA certificate. For steps to generate a connection string with `cockroach encode-uri`, refer to [Step 3. Manage cluster certificates and generate connection strings](#step-3-manage-cluster-certificates-and-generate-connection-strings).
 
 1. If you would like to run a test workload on your existing **primary cluster**, you can use [`cockroach workload`]({% link {{ page.version.version }}/cockroach-workload.md %}) like the following:
 
@@ -441,7 +441,7 @@ At this point, your replication stream will be running.
 
 To _fail over_ to the standby cluster, follow the instructions on the [Fail Over from a Primary Cluster to a Standby Cluster]({% link {{ page.version.version }}/failover-replication.md %}) page.
 
-For details on how to _fail back_ after replicating a non-virtualized cluster, refer to [Fail back after PCR from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
+For details on how to _fail back_ after replicating a non-virtualized cluster, refer to [Fail back after replicating from an existing cluster]({% link {{ page.version.version }}/failover-replication.md %}#fail-back-after-replicating-from-an-existing-primary-cluster).
 
 ## Connection reference