ZDM-602, ZDM-614, ZDM-583, ZDM-576, DOC-5267, and more - Grab bag of small migration doc updates, old tickets, start detailed cleanup of some topics (#201)
* cluster compatibility updates
* add legacy app migration webinar link
* zdm proxy scaling
* system.peers and system.local troubleshooting tip
* token aware routing
* importance of timestamp preservation
* improved wording and some notes
* edit some text
* zdm-583
* ZDM-605
* zdm-613
* remove comment
* fix issues
* sme review round 1
* some async dual read edits
* async dual reads rewrite
* fix tabs
* latency statement
* cqlsh/credential rewrite
modules/ROOT/pages/cassandra-data-migrator.adoc (31 additions, 16 deletions)
@@ -6,25 +6,33 @@
 //This page was an exact duplicate of cdm-overview.adoc and the (now deleted) cdm-steps.adoc, they are just in different parts of the nav.

 // tag::body[]
-You can use {cass-migrator} ({cass-migrator-short}) to migrate and validate tables between {cass-short}-based clusters.
-It is designed to connect to your target cluster, compare it with the origin cluster, log any differences, and, optionally, automatically reconcile inconsistencies and missing data.
+You can use {cass-migrator} ({cass-migrator-short}) for data migration and validation between {cass-reg}-based databases.
+It supports important {cass} features and offers extensive configuration options:

-{cass-migrator-short} facilitates data transfer by creating multiple jobs that access the {cass-short} cluster concurrently, making it an ideal choice for migrating large datasets.
-It offers extensive configuration options, including logging, reconciliation, performance optimization, and more.
+* Logging and run tracking
+* Automatic reconciliation
+* Performance tuning
+* Record filtering
+* Support for advanced data types, including sets, lists, maps, and UDTs
+* Support for SSL, including custom cipher algorithms
+* Use `writetime` timestamps to maintain chronological write history
+* Use Time To Live (TTL) values to maintain data lifecycles

-{cass-migrator-short}features include the following:
+For more information and a complete list of features, see the {cass-migrator-repo}?tab=readme-ov-file#features[{cass-migrator-short} GitHub repository].

-* Validate migration accuracy and performance using examples that provide a smaller, randomized data set.
-* Preserve internal `writetime` timestamps and Time To Live (TTL) values.
-* Use advanced data types, including sets, lists, maps, and UDTs.
-* Filter records from the origin cluster's data, using {cass-short}'s internal `writetime` timestamp.
-* Use SSL Support, including custom cipher algorithms.
+== {cass-migrator} requirements

-For more features and information, see the {cass-migrator-repo}?tab=readme-ov-file#features[{cass-migrator-short} GitHub repository].
+To use {cass-migrator-short} successfully, your origin and target clusters must be {cass-short}-based databases with matching schemas.

-== {cass-migrator} requirements
+== {cass-migrator-short} with {product-proxy}
+
+You can use {cass-migrator-short} alone or with {product-proxy}.
+
+When using {cass-migrator-short} with {product-proxy}, {cass-short}'s last-write-wins semantics ensure that new, real-time writes accurately take precedence over historical writes.
+
+Last-write-wins compares the `writetime` of conflicting records, and then retains the most recent write.

-To use {cass-migrator-short} successfully, your origin and target clusters must have matching schemas.
+For example, if a new write occurs in your target cluster with a `writetime` of `2023-10-01T12:05:00Z`, and then {cass-migrator-short} migrates a record against the same row with a `writetime` of `2023-10-01T12:00:00Z`, the target cluster retains the data from the new write because it has the most recent `writetime`.

 == Install {cass-migrator}

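As a quick illustration of the last-write-wins comparison described in the new text, the following CQL sketch shows how to inspect the `writetime` values that are compared. The keyspace, table, and column names are hypothetical.

[source,cql]
----
-- Hypothetical keyspace, table, and column names, for illustration only.
-- WRITETIME() returns the write timestamp (microseconds since the epoch)
-- that last-write-wins uses to decide which value is retained.
SELECT id, status, WRITETIME(status) AS status_writetime
FROM my_keyspace.my_table
WHERE id = 1;
----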
@@ -124,6 +132,10 @@ For example, the 4.x series of {cass-migrator-short} isn't backwards compatible
 [#migrate]
 == Run a {cass-migrator-short} data migration job

+A data migration job copies data from a table in your origin cluster to a table with the same schema in your target cluster.
+
+To optimize large-scale migrations, {cass-migrator-short} can run multiple concurrent migration jobs on the same table.
+
 The following `spark-submit` command migrates one table from the origin to the target cluster, using the configuration in your properties file.
 The migration job is specified in the `--class` argument.

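For context on the command described above, here is a minimal `spark-submit` sketch. It assumes the `com.datastax.cdm.job.Migrate` class from the {cass-migrator-short} 4.x series; the properties file name, keyspace and table, and JAR version are placeholders to adapt to your environment.

[source,bash]
----
# Sketch only: cdm.properties, my_keyspace.my_table, and the JAR version
# are placeholders -- substitute the values for your environment.
spark-submit \
  --properties-file cdm.properties \
  --conf spark.cdm.schema.origin.keyspaceTable="my_keyspace.my_table" \
  --master "local[*]" \
  --class com.datastax.cdm.job.Migrate \
  cassandra-data-migrator-4.x.x.jar
----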
@@ -189,7 +201,9 @@ For additional modifications to this command, see <<advanced>>.
 [#cdm-validation-steps]
 == Run a {cass-migrator-short} data validation job

-After you migrate data, you can use {cass-migrator-short}'s data validation mode to find inconsistencies between the origin and target tables.
+After migrating data, use {cass-migrator-short}'s data validation mode to identify any inconsistencies between the origin and target tables, such as missing or mismatched records.
+
+Optionally, {cass-migrator-short} can automatically correct discrepancies in the target cluster during validation.

 . Use the following `spark-submit` command to run a data validation job using the configuration in your properties file.
 The data validation job is specified in the `--class` argument.
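A minimal sketch of the validation command follows, assuming the `com.datastax.cdm.job.DiffData` class from the {cass-migrator-short} 4.x series; file names, the keyspace and table, and the JAR version are placeholders.

[source,bash]
----
# Sketch only: placeholder file names and table. The DiffData class runs
# CDM in validation mode and logs mismatched or missing records.
spark-submit \
  --properties-file cdm.properties \
  --conf spark.cdm.schema.origin.keyspaceTable="my_keyspace.my_table" \
  --master "local[*]" \
  --class com.datastax.cdm.job.DiffData \
  cassandra-data-migrator-4.x.x.jar
----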
@@ -276,9 +290,10 @@ Optionally, you can run {cass-migrator-short} validation jobs in **AutoCorrect**
 +
 [IMPORTANT]
 ====
-`TIMESTAMP` has an effect on this function.
+Timestamps have an effect on this function.
+
+If the `writetime` of the origin record (determined with `.writetime.names`) is before the `writetime` of the corresponding target record, then the original write won't appear in the target cluster.

-If the `WRITETIME` of the origin record (determined with `.writetime.names`) is earlier than the `WRITETIME` of the target record, then the change doesn't appear in the target cluster.
 This comparative state can be challenging to troubleshoot if individual columns or cells were modified in the target cluster.
This topic explains how you can configure the {product-proxy} to route all reads to the target cluster instead of the origin cluster.
@@ -15,16 +15,12 @@ This operation is a configuration change that can be carried out as explained xr

 [TIP]
 ====
-If you performed the optional steps described in the prior topic, xref:enable-async-dual-reads.adoc[] -- to verify that your target cluster was ready and tuned appropriately to handle the production read load -- be sure to disable async dual reads when you're done testing.
-If you haven't already, revert `read_mode` in `vars/zdm_proxy_core_config.yml` to `PRIMARY_ONLY` when switching sync reads to the target cluster.
-Example:
+If you xref:enable-async-dual-reads.adoc[enabled asynchronous dual reads] to test your target cluster's performance, make sure that you disable asynchronous dual reads when you're done testing.

-[source,yml]
-----
-read_mode: PRIMARY_ONLY
-----
+To do this, edit the `vars/zdm_proxy_core_config.yml` file, and then set the `read_mode` variable to `PRIMARY_ONLY`.

-If you don't disable async dual reads, {product-proxy} instances continue to send async reads to the origin, which, although harmless, is unnecessary.
+If you don't disable asynchronous dual reads, {product-proxy} instances send asynchronous, duplicate read requests to your origin cluster.
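For reference, the end state of `vars/zdm_proxy_core_config.yml` after this edit matches the YAML block removed above:

[source,yml]
----
# Route reads only to the primary cluster; asynchronous dual reads are disabled.
read_mode: PRIMARY_ONLY
----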
modules/ROOT/pages/components.adoc (42 additions, 37 deletions)
@@ -17,63 +17,68 @@ The main component of the {company} {product} toolkit is {product-proxy}, which
 {product-proxy} is open-source software that is available from the {product-proxy-repo}[zdm-proxy GitHub repo].
 This project is open for public contributions.

-The {product-proxy} is an orchestrator for monitoring application activity and keeping multiple clusters in sync through dual writes.
+The {product-proxy} is an orchestrator for monitoring application activity and keeping multiple clusters (databases) in sync through dual writes.
 {product-proxy} isn't linked to the actual migration process.
 It doesn't perform data migrations and it doesn't have awareness of ongoing migrations.
 Instead, you use a data migration tool, like {sstable-sideloader}, {cass-migrator}, or {dsbulk-migrator}, to perform the data migration and validate migrated data.

-=== How {product-proxy} works
+{product-proxy} reduces risks to upgrades and migrations by decoupling the origin cluster from the target cluster and maintaining consistency between both clusters.
+You decide when you want to switch permanently to the target cluster.

-{company} created {product-proxy} to function between the application and both the origin and target databases.
-The databases can be any CQL-compatible data store, such as {cass-reg}, {dse}, and {astra-db}.
-The proxy always sends every write operation (Insert, Update, Delete) synchronously to both clusters at the desired Consistency Level:
+After migrating your data, changes to your application code are usually minimal, depending on your client's compatibility with the origin and target clusters.
+Typically, you only need to update the connection string.

-* If the write is successful in both clusters, it returns a successful acknowledgement to the client application.
-* If the write fails on either cluster, the failure is passed back to the client application so that it can retry it as appropriate, based on its own retry policy.
+[#how-zdm-proxy-handles-reads-and-writes]
+=== How {product-proxy} handles reads and writes
+
+{company} created {product-proxy} to orchestrate requests between a client application and both the origin and target clusters.
+These clusters can be any CQL-compatible data store, such as {cass-reg}, {dse}, and {astra-db}.
+
+During the migration process, you designate one cluster as the _primary cluster_, which serves as the source of truth for reads.
+For the majority of the migration process, this is typically the origin cluster.
+Towards the end of the migration process, when you are ready to read from your target cluster, you set the target cluster as the primary cluster.
+
+==== Writes
+
+{product-proxy} sends every write operation (`INSERT`, `UPDATE`, `DELETE`) synchronously to both clusters at the requested consistency level:
+
+* If the write is acknowledged in both clusters at the requested consistency level, then the operation returns a successful write acknowledgement to the client that issued the request.
+* If the write fails in either cluster, then {product-proxy} passes a write failure, originating from the primary cluster, back to the client.
+The client can then retry the request, if appropriate, based on the client's retry policy.

 This design ensures that new data is always written to both clusters, and that any failure on either cluster is always made visible to the client application.
-{product-proxy} also sends all reads to the primary cluster, and then returns the result to the client application.
-The primary cluster is initially the origin cluster, and you change it to the target cluster at the end of the migration process.

-{product-proxy} is designed to be highly available. It can be scaled horizontally, so typical deployments are made up of a minimum of 3 servers.
-{product-proxy} can be restarted in a rolling fashion, for example, to change configuration for different phases of the migration.
+For information about how {product-proxy} handles lightweight transactions (LWTs), see xref:feasibility-checklists.adoc#_lightweight_transactions_and_the_applied_flag[Lightweight Transactions and the applied flag].

-=== Key features of {product-proxy}
+==== Reads

-* Allows you to lift-and-shift existing application code from your origin cluster to your target cluster by changing only the connection string, if all else is compatible.
+By default, {product-proxy} sends all reads to the primary cluster, and then returns the result to the client application.

-* Reduces risks to upgrades and migrations by decoupling the origin cluster from the target cluster.
-You can determine an explicit cut-over point once you're ready to commit to using the target cluster permanently.
+If you enable _asynchronous dual reads_, {product-proxy} sends asynchronous read requests to the secondary cluster (typically the target cluster) in addition to the synchronous read requests that are sent to the primary cluster.

-* Bifurcates writes synchronously to both clusters during the migration process.
+This feature is designed to test the target cluster's ability to handle a production workload before you permanently switch to the target cluster at the end of the migration process.

-* Read operations return the response from the primary (origin) cluster, which is its designated source of truth.
-+
-During a migration, the primary cluster is typically the origin cluster.
-Near the end of the migration, you shift the primary cluster to be the target cluster.
+With or without asynchronous dual reads, the client application only receives results from synchronous reads on the primary cluster.
+The results of asynchronous reads aren't returned to the client because asynchronous reads are for testing purposes only.

-* Option to read asynchronously from the target cluster as well as the origin cluster
-This capability is called **Asynchronous Dual Reads** or **Read Mirroring**, and it allows you to observe what read latencies and throughput the target cluster can achieve under the actual production load.
-+
-** Results from the asynchronous reads executed on the target cluster are not sent back to the client application.
-** This design implies that a failure on asynchronous reads from the target cluster does not cause an error on the client application.
-** Asynchronous dual reads can be enabled and disabled dynamically with a rolling restart of the {product-proxy} instances.
+For more information, see xref:ROOT:enable-async-dual-reads.adoc[].

-[NOTE]
-====
-When using Asynchronous Dual Reads, any additional read load on the target cluster may impact its ability to keep up with writes.
-This behavior is expected and desired.
-The idea is to mimic the full read and write load on the target cluster so there are no surprises during the last migration phase; that is, after cutting over completely to the target cluster.
-====
+=== High availability and multiple {product-proxy} instances
+
+{product-proxy} is designed to be highly available and run in a clustered fashion to avoid a single point of failure.

-=== Run multiple {product-proxy} instances
+With the exception of local test environments, {company} recommends that all {product-proxy} deployments have multiple {product-proxy} instances.
+Deployments typically consist of three or more instances.

-{product-proxy} has been designed to run in a clustered fashion so that it is never a single point of failure.
-Unless it is for a demo or local testing environment, a {product-proxy} deployment should always comprise multiple {product-proxy} instances.
+[TIP]
+====
+Throughout the {product-short} documentation, the term _{product-proxy} deployment_ refers to the entire deployment, and _{product-proxy} instance_ refers to an individual proxy process in the deployment.
+====

-Throughout the documentation, the term _{product-proxy} deployment_ refers to the entire deployment, and _{product-proxy} instance_ refers to an individual proxy process in the deployment.
+You can scale {product-proxy} instances horizontally and vertically.
+To avoid downtime when applying configuration changes, you can perform rolling restarts on your {product-proxy} instances.

-You can use the {product-utility} and {product-automation} to set up and run Ansible playbooks that deploy and manage {product-proxy} and its monitoring stack.
+For simplicity, you can use the {product-utility} and {product-automation} to set up and run Ansible playbooks that deploy and manage {product-proxy} and its monitoring stack.
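As a sketch of the rolling-restart workflow mentioned above, the following command assumes the `rolling_update_zdm_proxy.yml` playbook and `zdm_ansible_inventory` inventory file names provided by {product-automation}; verify the names against your version of the automation before running it.

[source,bash]
----
# Run from the Ansible control host (assumed playbook and inventory names).
# Applies configuration changes by restarting proxy instances one at a time,
# so the deployment as a whole stays available.
ansible-playbook rolling_update_zdm_proxy.yml -i zdm_ansible_inventory
----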