Commit e3aaebe

beajohnson, msmygit, and KimberlyFields authored
DOC-4573: Updates to remove partition ranges and parameters (#185)
* updates to remove partition ranges and parameters
* divided paragraph for easier read
* updated spark version to 3.5.2 per Madhavan
* Further updates to sync docs up to date
* Remove CDM properties & associated files
* Clarify Spark cluster mode requirement
* Removed partition range partial doc
* Update sub-partitions documents
* Add page alias
* Update modules/ROOT/partials/cdm-prerequisites.adoc

---------

Co-authored-by: Madhavan Sridharan <[email protected]>
Co-authored-by: KimberlyFields <[email protected]>
1 parent 53e710d commit e3aaebe

13 files changed (+18, -339 lines)

modules/ROOT/nav.adoc

Lines changed: 0 additions & 1 deletion
@@ -41,7 +41,6 @@
 * {cstar-data-migrator}
 ** xref:cdm-overview.adoc[]
 ** xref:cdm-steps.adoc[Migrate data]
-** xref:cdm-parameters.adoc[Parameters]
 
 * {dsbulk-loader}
 ** https://docs.datastax.com/en/dsbulk/overview/dsbulk-about.html[Overview]

Lines changed: 4 additions & 51 deletions
@@ -1,4 +1,5 @@
 = {cstar-data-migrator}
+:page-aliases: cdm-parameters.adoc
 
 Use {cstar-data-migrator} to migrate and validate tables between origin and target Cassandra clusters, with available logging and reconciliation support.
 
@@ -42,55 +43,7 @@ include::partial$cdm-partition-ranges.adoc[]
 
 include::partial$cdm-guardrail-checks.adoc[]
 
+[[cdm-next-steps]]
+== Next steps
 
-[[cdm-reference]]
-== {cstar-data-migrator} references
-
-=== Common connection parameters for Origin and Target
-
-include::partial$common-connection-parameters.adoc[]
-
-=== Origin schema parameters
-
-include::partial$origin-schema-parameters.adoc[]
-
-=== Target schema parameters
-
-include::partial$target-schema-parameters.adoc[]
-
-=== Auto-correction parameters
-
-include::partial$auto-correction-parameters.adoc[]
-
-=== Performance and operations parameters
-
-include::partial$performance-and-operations-parameters.adoc[]
-
-=== Transformation parameters
-
-include::partial$transformation-parameters.adoc[]
-
-=== Cassandra filter parameters
-
-include::partial$cassandra-filter-parameters.adoc[]
-
-=== Java filter parameters
-
-include::partial$java-filter-parameters.adoc[]
-
-=== Constant column feature parameters
-
-include::partial$constant-column-feature-parameters.adoc[]
-
-=== Explode map feature parameters
-
-include::partial$explode-map-feature-parameters.adoc[]
-
-=== Guardrail feature parameter
-
-include::partial$guardrail-feature-parameters.adoc[]
-
-=== TLS (SSL) connection parameters
-
-include::partial$tls-ssl-connection-parameters.adoc[]
-
+For advanced operations, see documentation at https://github.com/datastax/cassandra-data-migrator[the repository].

modules/ROOT/pages/cdm-parameters.adoc

Lines changed: 0 additions & 70 deletions
This file was deleted.

modules/ROOT/partials/cdm-guardrail-checks.adoc

Lines changed: 1 addition & 1 deletion
@@ -9,5 +9,5 @@ Example:
 --conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
 --conf spark.cdm.feature.guardrail.colSizeInKB=10000 \
 --master "local[*]" --driver-memory 25G --executor-memory 25G \
---class com.datastax.cdm.job.GuardrailCheck cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
+--class com.datastax.cdm.job.GuardrailCheck cassandra-data-migrator-x.y.z.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ----

Lines changed: 4 additions & 30 deletions
@@ -1,35 +1,9 @@
-You can also use {cstar-data-migrator} to migrate or validate specific partition ranges. Use a **partition-file** with the name `./<keyspacename>.<tablename>_partitions.csv`.
-Use the following format in the CSV file, in the current folder as input.
-Example:
-
-[source,csv]
------
--507900353496146534,-107285462027022883
--506781526266485690,1506166634797362039
-2637884402540451982,4638499294009575633
-798869613692279889,8699484505161403540
------
-
-Each line in the CSV represents a partition-range (`min,max`).
-
-Alternatively, you can also pass the partition-file with a command-line parameter.
-Example:
+You can also use {cstar-data-migrator} to xref:cdm-steps.adoc#cdm-steps[migrate] or xref:cdm-steps.adoc#cdm-validation-steps[validate] specific partition ranges by passing the below additional parameters.
 
 [source,bash]
 ----
-./spark-submit --properties-file cdm.properties \
---conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
---conf spark.cdm.tokenrange.partitionFile.input="/<path-to-file>/<csv-input-filename>" \
---master "local[*]" --driver-memory 25G --executor-memory 25G \
---class com.datastax.cdm.job.<Migrate|DiffData> cassandra-data-migrator-x.y.z.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
+--conf spark.cdm.filter.cassandra.partition.min=<token-range-min>
+--conf spark.cdm.filter.cassandra.partition.max=<token-range-max>
 ----
 
-This mode is specifically useful to process a subset of partition-ranges that may have failed during a previous run.
-
-[NOTE]
-====
-In the format shown above, the migration and validation jobs autogenerate a file named `./<keyspacename>.<tablename>_partitions.csv`.
-The file contains any failed partition ranges.
-No file is created if there were no failed partitions.
-You can use the CSV as input to process any failed partition in a subsequent run.
-====
+This mode is specifically useful to process a subset of partition-ranges.
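
Reviewer sketch (not part of the commit): the new partial shows the two `--conf` flags on their own, without the rest of the command. The helper below is a minimal illustration of how the pair could be generated for one token range. Only the two `spark.cdm.filter.cassandra.partition.*` property names come from the diff. The function names `is_int` and `cdm_partition_args` are hypothetical, and the integer check is an assumption based on the integer tokens in the removed CSV sample.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Return 0 if the argument is an optionally negative run of decimal digits.
is_int() {
  case "$1" in
    ''|-) return 1 ;;           # empty string or a lone minus sign
    -*) set -- "${1#-}" ;;      # strip one leading minus sign
  esac
  case "$1" in
    *[!0-9]*) return 1 ;;       # anything other than digits remains
  esac
  return 0
}

# Print the two partition-filter flags for a token range: <min> <max>.
# Assumption: token bounds are plain integers, as in the removed CSV sample.
cdm_partition_args() {
  if ! is_int "$1" || ! is_int "$2"; then
    echo "token bounds must be integers" >&2
    return 1
  fi
  printf '%s %s\n' \
    "--conf spark.cdm.filter.cassandra.partition.min=$1" \
    "--conf spark.cdm.filter.cassandra.partition.max=$2"
}

# Example: the first range from the CSV sample removed in this commit.
cdm_partition_args -507900353496146534 -107285462027022883
```

If used, the output would be spliced unquoted into an existing `spark-submit` invocation as `$(cdm_partition_args "$min" "$max")`, so that the shell splits it into four separate arguments.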

modules/ROOT/partials/cdm-prerequisites.adoc

Lines changed: 5 additions & 5 deletions
@@ -2,15 +2,15 @@ Read the prerequisites below before using the Cassandra Data Migrator.
 
 * Install or switch to Java 11.
 The Spark binaries are compiled with this version of Java.
-* Select a single VM to run this job and install https://archive.apache.org/dist/spark/spark-3.5.1/[Spark 3.5.1] there.
-No cluster is necessary.
-* Optionally, install https://maven.apache.org/download.cgi[Maven] 3.9.x if you want to build the JAR for local development.
+* Select a single VM to run this job and install https://archive.apache.org/dist/spark/spark-3.5.3/[Spark 3.5.3] there.
+No cluster is necessary for most one-time migrations. However, Spark cluster mode is also supported for complex migrations.
+* Optionally, install https://maven.apache.org/download.cgi[Maven] `3.9.x` if you want to build the JAR for local development.
 
 Run the following commands to install Apache Spark:
 
 [source,bash]
 ----
-wget https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3-scala2.13.tgz
+wget https://archive.apache.org/dist/spark/spark-3.5.3/spark-3.5.3-bin-hadoop3-scala2.13.tgz
 
-tar -xvzf spark-3.5.1-bin-hadoop3-scala2.13.tgz
+tar -xvzf spark-3.5.3-bin-hadoop3-scala2.13.tgz
 ----

modules/ROOT/partials/cdm-validation-steps.adoc

Lines changed: 1 addition & 1 deletion
@@ -41,6 +41,6 @@ spark.cdm.autocorrect.mismatch false|true
 
 [IMPORTANT]
 ====
-The {cstar-data-migrator} validation job never deletes records from the target cluster.
+The {cstar-data-migrator} validation job never deletes records from the source or target clusters.
 The job only adds or updates data on the target cluster.
 ====

modules/ROOT/partials/constant-column-feature-parameters.adoc

Lines changed: 0 additions & 29 deletions
This file was deleted.

modules/ROOT/partials/explode-map-feature-parameters.adoc

Lines changed: 0 additions & 19 deletions
This file was deleted.

modules/ROOT/partials/guardrail-feature-parameters.adoc

Lines changed: 0 additions & 16 deletions
This file was deleted.
