modules/manage/partials/whole-cluster-restore.adoc
16 additions & 60 deletions
@@ -12,14 +12,14 @@ endif::[]
include::shared:partial$enterprise-license.adoc[]
====
-With xref:{link-tiered-storage}[Tiered Storage] enabled, you can use Whole Cluster Restore to restore data from a failed cluster (source cluster), including its metadata, onto a new cluster (target cluster). This is a simpler and cheaper alternative to active-active replication, for example with xref:migrate:data-migration.adoc[MirrorMaker 2]. Use this recovery method to restore your application to the latest functional state as quickly as possible.
+With xref:{link-tiered-storage}[Tiered Storage] enabled, you can use Whole Cluster Restore to restore data from a failed cluster (source cluster you are restoring from), including its metadata, onto a new cluster (target cluster you are restoring to). This is a simpler and cheaper alternative to active-active replication, for example with xref:migrate:data-migration.adoc[MirrorMaker 2]. Use this recovery method to restore your application to the latest functional state as quickly as possible.
[CAUTION]
====
Whole Cluster Restore is not a fully functional disaster recovery solution. It does not provide snapshot-style consistency. Some partitions in some topics will be more up-to-date than others. Committed transactions are not guaranteed to be atomic.
====
-TIP: If you need to restore only a subset of topic data, consider using xref:deploy:redpanda/manual/disaster-recovery/topic-recovery.adoc[topic recovery] instead of a Whole Cluster Restore.
+TIP: If you need to restore only a subset of topic data, consider using xref:manage:disaster-recovery/topic-recovery.adoc[topic recovery] instead of a Whole Cluster Restore.
The following metadata is included in a Whole Cluster Restore:
@@ -227,74 +227,34 @@ endif::[]
When the cluster restore is successfully completed, you can redirect your application workload to the new cluster. Make sure to update your application code to use the new addresses of your brokers.
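Before cutting production traffic over, it can help to confirm that the target cluster responds at its new addresses. The following is a minimal sketch, assuming `rpk` is installed and using a placeholder seed broker address for the target cluster:

[,bash]
----
# Placeholder address; substitute a seed broker of your target cluster.
rpk cluster info -X brokers=new-broker-0.example.com:9092
----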
-== (Advanced) Restore data when multiple clusters share data
+== Advanced: Restore data when multiple clusters share data
[CAUTION]
====
-This is an advanced use case and should be performed only after consulting with Redpanda support.
+This is an advanced use case that should be performed only by Redpanda support.
====
-Typically, there is a one-to-one mapping between a Redpanda cluster and its object storage bucket. However, you can also run multiple clusters that share the same bucket. This allows you to move tenants between clusters without moving data, as the data remains in the same bucket. For example, you can mount topics to multiple clusters in the same bucket.
+Typically, you have a one-to-one mapping between a Redpanda cluster and its object storage bucket. However, it's possible to run multiple clusters that share the same storage bucket. Sharing a bucket allows you to move tenants between clusters without moving data, because the data stays in the same bucket. For example, you might mount topics to multiple clusters in the same bucket.
-Running multiple clusters that share the same storage bucket presents unique challenges during Whole Cluster Restore operations. To manage these challenges, you must first understand how Redpanda uses <<the-role-of-cluster-uuids-in-whole-cluster-restore,UUIDs>> (universal unique identifiers) to identify clusters during Whole Cluster Restore.
+Running multiple clusters that share the same storage bucket presents unique challenges during Whole Cluster Restore operations. To manage these challenges, you must understand how Redpanda uses <<the-role-of-cluster-uuids-in-whole-cluster-restore,UUIDs>> (universally unique identifiers) to identify clusters during a Whole Cluster Restore.
=== The role of cluster UUIDs in Whole Cluster Restore
-Every time a Redpanda cluster (single node or more) starts, it is automatically assigned a random UUID. From that moment forward, all entities created by the cluster are identifiable using that cluster UUID. Such entities include:
+Each Redpanda cluster (single node or more) receives a unique UUID every time it starts. From that moment forward, all entities created by the cluster are identifiable using this cluster UUID. These entities include:
- Topic data
- Topic metadata
- Whole Cluster Restore manifests
- Controller log snapshots for Whole Cluster Restore
- Consumer offsets for Whole Cluster Restore
-However, not all entities _managed_ by the cluster are identifiable using this cluster UUID. In fact, Redpanda can recover a different cluster in lieu of the existing cluster, or mount topics from different clusters. For a cluster that has been running for some time, your object storage may look like this:
-
-[source,bash]
-----
-/
-+- cluster_metadata/
-   +- <uuid-a>/manifests/
-   |  +- 0/cluster_manifest.json
-   |  +- 1/cluster_manifest.json
-   |  +- 2/cluster_manifest.json
-   + <uuid-b>/manifests/
-   |  +- 3/cluster_manifest.json
-   |  +- 4/cluster_manifest.json
-   + <uuid-c>/manifests/ # Previously active but not restored.
-   |                     # Still, the manifest number starts at
-   |                     # highest found in the bucket plus one.
-   |  +- 5/cluster_manifest.json
-   |  +- 6/cluster_manifest.json
-   + <uuid-d>/manifests/ # active cluster (not restored)
-      +- 7/cluster_manifest.json
-      +- 8/cluster_manifest.json
-----
-
-Redpanda's algorithm lists all objects (cluster manifests) from object storage and during a Whole Cluster Restore, picks the object with the _highest ID available_, not the current UUID. In this case, if you attempt to restore you would recover `/cluster_metadata/<uuid-c>/manifests/6/cluster_manifest.json`, even though the active cluster is `<uuid-d>`.
-
-However, this algorithm does not work if you have multiple clusters sharing the same object storage bucket. For example, your object storage might look like:
-
-[source,bash]
-----
-/
-+- cluster_metadata/
-   + <uuid-a>/manifests/
-   |  +- 0/cluster_manifest.json
-   |  +- 1/cluster_manifest.json
-   |  +- 2/cluster_manifest.json
-   + <uuid-b>/manifests/
-      +- 0/cluster_manifest.json
-      +- 1/cluster_manifest.json (lost cluster)
-----
-
-Here, if you've lost the cluster `uuid-b` and wish to recover it, the recovery process will select the metadata for `uuid-a`, which will lead to a split-brain/data corruption scenario. For troubleshooting details, see <<resolve-repeated-recovery-failures,Resolve repeated recovery failures>>
+However, not all entities _managed_ by the cluster are identifiable using this cluster UUID. Each time a cluster uploads its metadata, the object name has two parts: the cluster UUID, which is unique each time you create a cluster (even after a restore, the cluster has a new UUID), and a metadata (sequence) ID. When performing a restore, Redpanda scans the bucket to find the highest sequence ID uploaded by the cluster. If the highest sequence ID was uploaded by another cluster, it can be ambiguous which metadata to restore, which can result in a split-brain scenario, where two independent clusters both believe they are the “rightful owner” of the same logical data.
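To see which cluster UUID currently owns the highest sequence ID, you can list the metadata prefix in the shared bucket before restoring. This is a minimal sketch, assuming an S3-compatible bucket with a placeholder name and a configured AWS CLI; the key layout follows the `cluster_metadata/<cluster-uuid>/manifests/<sequence-id>/cluster_manifest.json` structure described in this doc:

[,bash]
----
# Placeholder bucket name; substitute your shared bucket.
# The listing shows every cluster manifest, so you can compare
# cluster UUIDs and their sequence IDs before restoring.
aws s3 ls s3://redpanda-shared-bucket/cluster_metadata/ --recursive | grep cluster_manifest.json
----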
=== Configure cluster names for multiple source clusters
-To disambiguate cluster metadata from multiple clusters, use the xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`] property (off by default), which allows you to assign a unique name to each cluster sharing the same object storage bucket. This name must be unique within the bucket, 1-64 characters, and use only letters, numbers, underscores, and hyphens. Do not change this value once set. Once set, your object storage bucket may look like this:
+To disambiguate cluster metadata from multiple clusters, use the xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`] property (off by default), which allows you to assign a unique name to each cluster sharing the same object storage bucket. Redpanda uses this name to organize the cluster metadata within the shared object storage bucket, which keeps each cluster's data distinct and prevents conflicts during recovery operations. The name must be unique within the bucket, 1-64 characters, and use only letters, numbers, underscores, and hyphens. Do not change this value once set (a short sketch of setting the property follows the layout example below). After setting it, your object storage bucket organization may look like the following:
-[source,bash]
+[,bash]
----
/
+- cluster_metadata/
@@ -310,9 +270,9 @@ To disambiguate cluster metadata from multiple clusters, use the xref:reference:
+- rp-qux/uuid/<uuid-b>
----
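The following is a minimal sketch of assigning the example name `rp-qux` to a cluster with `rpk`; it assumes you run it against the cluster before that cluster uploads metadata to the shared bucket:

[,bash]
----
# Assign the example cluster name used in this section.
rpk cluster config set cloud_storage_cluster_name rp-qux

# Confirm the value.
rpk cluster config get cloud_storage_cluster_name
----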
-When a new cluster is created, and you have specified its `cloud_storage_cluster_name` (here, `rp-qux`), your object storage bucket may look like this:
+During a Whole Cluster Restore, Redpanda looks for the cluster name specified in `cloud_storage_cluster_name` and only considers manifests associated with that name. Because the name specified here is `rp-qux`, Redpanda only considers manifests for the clusters `<uuid-b>` and `<uuid-c>`, ignoring cluster `<uuid-a>` entirely. In this case, your object storage bucket may look like the following:
-[source,bash]
+[,bash]
----
+- cluster_metadata/
| + <uuid-a>/manifests/
@@ -332,15 +292,11 @@ When a new cluster is created, and you have specified its `cloud_storage_cluster
+- <uuid-c> # reference to new cluster
----
-During a Whole Cluster Restore, Redpanda will look for the cluster name specified in `cloud_storage_cluster_name` and only consider manifests associated with that name. In this example, if you start a cluster with `cloud_storage_cluster_name` set to `rp-qux`, Redpanda will only consider manifests under `<uuid-b>` and `<uuid-c>`, ignoring `<uuid-a>` entirely.
-
-Redpanda uses this name to organize the cluster metadata within the shared object storage bucket. This ensures that each cluster's data remains distinct and prevents conflicts during recovery operations.
-
=== Resolve repeated recovery failures
-If you are experiencing repeated failures when a cluster is lost and recreated, the automated recovery algorithm may have selected the manifest with the highest sequence number, which might be the most recent one with no data, instead of the original one that contains the data. Your object storage bucket might look like this:
+If you experience repeated failures when a cluster is lost and recreated, the automated recovery algorithm may have selected the manifest with the highest sequence number, which might be the most recent one with no data, instead of the original one containing the data. In such a scenario, your object storage bucket might be organized like the following:
-[source,bash]
+[,bash]
----
/
+- cluster_metadata/
@@ -356,11 +312,11 @@ If you are experiencing repeated failures when a cluster is lost and recreated,
In such cases, you can explicitly run a POST request using the Admin API: