Commit b5ca2fa
CDM-88: Added/updated SIT feature tests
1 parent: dd2a11c

File tree

12 files changed: +41 −16 lines
README.md

Lines changed: 8 additions & 8 deletions

@@ -34,7 +34,7 @@ tar -xvzf spark-3.3.1-bin-hadoop3.tgz
 
 ```
 ./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspace-name>.<table-name>" /
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
 --master "local[*]" /
 --class com.datastax.cdm.job.Migrate cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
@@ -44,7 +44,7 @@ Note:
 - Add option `--driver-memory 25G --executor-memory 25G` as shown below if the table migrated is large (over 100GB)
 ```
 ./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspace-name>.<table-name>" /
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
 --master "local[*]" --driver-memory 25G --executor-memory 25G /
 --class com.datastax.cdm.job.Migrate cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
@@ -55,7 +55,7 @@ Note:
 
 ```
 ./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspace-name>.<table-name>" /
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
 --master "local[*]" /
 --class com.datastax.cdm.job.DiffData cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
@@ -83,7 +83,7 @@ Note:
 - The validation job will never delete records from target i.e. it only adds or updates data on target
 
 # Migrating or Validating specific partition ranges
-- You can also use the tool to Migrate or Validate specific partition ranges by using a partition-file with the name `./<keyspace>.<tablename>_partitions.csv` in the below format in the current folder as input
+- You can also use the tool to Migrate or Validate specific partition ranges by using a partition-file with the name `./<keyspacename>.<tablename>_partitions.csv` in the below format in the current folder as input
 ```
 -507900353496146534,-107285462027022883
 -506781526266485690,1506166634797362039
@@ -94,21 +94,21 @@ Each line above represents a partition-range (`min,max`). Alternatively, you can
 
 ```
 spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspace-name>.<table-name>" /
---conf spark.tokenRange.partitionFile="/<path-to-file>.<csv-input-filename>" /
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
+--conf spark.tokenRange.partitionFile="/<path-to-file>/<csv-input-filename>" /
 --master "local[*]" /
 --class com.datastax.cdm.job.<Migrate|DiffData> cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
 This mode is specifically useful to processes a subset of partition-ranges that may have failed during a previous run.
 
 > **Note:**
-> A file named `./<keyspace>.<tablename>_partitions.csv` is auto generated by the Migration & Validation jobs in the above format containing any failed partition ranges. No file is created if there are no failed partitions. You can use this file as an input to process any failed partition in a following run.
+> A file named `./<keyspacename>.<tablename>_partitions.csv` is auto generated by the Migration & Validation jobs in the above format containing any failed partition ranges. No file is created if there are no failed partitions. You can use this file as an input to process any failed partition in a following run.
 
 # Perform large-field Guardrail violation checks
 - The tool can be used to identify large fields from a table that may break you cluster guardrails (e.g. AstraDB has a 10MB limit for a single large field) `--class com.datastax.cdm.job.GuardrailCheck` as shown below
 ```
 ./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspace-name>.<table-name>" /
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
 --conf spark.cdm.feature.guardrail.colSizeInKB=10000 /
 --master "local[*]" /
 --class com.datastax.cdm.job.GuardrailCheck cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
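
The partition-file described in the README changes above holds one `min,max` pair of signed 64-bit Murmur3 tokens per line. A pre-flight check of such a file can be sketched in shell; the function name is illustrative and not part of CDM:

```shell
# validate_partitions FILE
# Succeeds iff every line of FILE is "min,max" where both fields are
# (optionally negative) integers and min < max. Bash arithmetic is
# 64-bit, which covers the full Murmur3 token range.
validate_partitions() {
  local min max ok=0
  while IFS=, read -r min max; do
    if ! [[ "$min" =~ ^-?[0-9]+$ && "$max" =~ ^-?[0-9]+$ ]]; then
      echo "malformed line: '$min,$max'" >&2; ok=1; continue
    fi
    if (( min >= max )); then
      echo "empty or inverted range: $min,$max" >&2; ok=1
    fi
  done < "$1"
  return "$ok"
}
```

For example, `validate_partitions ./origin.table1_partitions.csv && echo ok` (the file name here is hypothetical) passes on the two sample ranges quoted in the README hunk.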
Lines changed: 4 additions & 2 deletions

@@ -1,2 +1,4 @@
-migrateData com.datastax.cdm.job.Migrate migrate.properties
-validateData com.datastax.cdm.job.DiffData migrate.properties
+migrateDataDefault com.datastax.cdm.job.Migrate migrate.properties
+validateDataDefault com.datastax.cdm.job.DiffData migrate.properties
+migrateData com.datastax.cdm.job.Migrate migrate_with_partitionfile.properties
+validateData com.datastax.cdm.job.DiffData migrate_with_partitionfile.properties
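
Each line of this scenario table (referenced as `cdm.txt` by the execute.sh scripts) maps a scenario name to a job class and a properties file, selected at run time with `/local/cdm.sh ... -s <scenario>`. The cdm.sh wrapper itself is not part of this commit, so the following is only a hypothetical sketch of how such a lookup could resolve a scenario into the README's spark-submit shape:

```shell
# scenario_cmd NAME TABLE
# Look NAME up in a cdm.txt-style table ("<scenario> <class> <props>"
# per line) and print the spark-submit command it implies. This mirrors
# the README's invocation shape; it is not the real cdm.sh.
scenario_cmd() {
  local name="$1" table="$2" scenario class props
  while read -r scenario class props; do
    if [[ "$scenario" == "$name" ]]; then
      printf 'spark-submit --properties-file %s --master "local[*]" --class %s cassandra-data-migrator-4.x.x.jar\n' \
        "$props" "$class"
      return 0
    fi
  done < "$table"
  echo "unknown scenario: $name" >&2
  return 1
}
```

With the table above, `scenario_cmd migrateData cdm.txt` would resolve to the `migrate_with_partitionfile.properties` run, while `migrateDataDefault` keeps the original file-less behaviour.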

SIT/features/06_partition_range/execute.sh

Lines changed: 2 additions & 0 deletions

@@ -3,9 +3,11 @@
 workingDir="$1"
 cd "$workingDir"
 
+/local/cdm.sh -f cdm.txt -s migrateDataDefault -d "$workingDir"
 /local/cdm.sh -f cdm.txt -s migrateData -d "$workingDir"
 
 cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_CLUSTER -f $workingDir/breakData.cql > $workingDir/other.breakData.out 2> $workingDir/other.breakData.err
 
+/local/cdm.sh -f cdm.txt -s validateDataDefault -d "$workingDir"
 /local/cdm.sh -f cdm.txt -s validateData -d "$workingDir"
 
SIT/features/06_partition_range/migrate.properties

Lines changed: 0 additions & 2 deletions

@@ -8,5 +8,3 @@ spark.cdm.perfops.numParts 1
 spark.cdm.autocorrect.missing true
 spark.cdm.autocorrect.mismatch true
 
-spark.tokenrange.partitionFile ./partitions.csv
-
Lines changed: 12 additions & 0 deletions

@@ -0,0 +1,12 @@
+spark.cdm.connect.origin.host cdm-sit-cass
+spark.cdm.connect.target.host cdm-sit-cass
+
+spark.cdm.schema.origin.keyspaceTable origin.feature_partition_range
+spark.cdm.schema.target.keyspaceTable target.feature_partition_range
+spark.cdm.perfops.numParts 1
+
+spark.cdm.autocorrect.missing true
+spark.cdm.autocorrect.mismatch true
+
+spark.tokenrange.partitionFile ./partitions.csv
+
Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@
+0,2000000000000000000
+8100000000000000000,8500000000000000000
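
Both ranges added here are ordered and sit inside the signed 64-bit Murmur3 token space (−2^63 to 2^63 − 1); those bounds come from Cassandra's Murmur3Partitioner, not from this commit. A quick shell check:

```shell
# Murmur3 token bounds. MIN_TOKEN is written as (-(2^63-1) - 1) because
# a literal -9223372036854775808 would be parsed as negation of an
# out-of-range positive number.
MIN_TOKEN=$(( -9223372036854775807 - 1 ))
MAX_TOKEN=9223372036854775807

# in_token_range LO HI -> success iff LO..HI is an ordered range
# within the Murmur3 token space (bash arithmetic is 64-bit).
in_token_range() {
  local lo="$1" hi="$2"
  (( lo >= MIN_TOKEN && hi <= MAX_TOKEN && lo < hi ))
}

in_token_range 0 2000000000000000000 && echo "range 1 ok"
in_token_range 8100000000000000000 8500000000000000000 && echo "range 2 ok"
```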

SIT/features/07_migrate_rows/cdm.txt

Lines changed: 2 additions & 1 deletion

@@ -1 +1,2 @@
-migrateData com.datastax.cdm.job.MigrateRowsFromFile migrate.properties
+migrateDataDefault com.datastax.cdm.job.MigrateRowsFromFile migrate.properties
+migrateData com.datastax.cdm.job.MigrateRowsFromFile migrate_with_pkrowsfile.properties

SIT/features/07_migrate_rows/execute.sh

Lines changed: 1 addition & 0 deletions

@@ -3,6 +3,7 @@
 workingDir="$1"
 cd "$workingDir"
 
+/local/cdm.sh -f cdm.txt -s migrateDataDefault -d "$workingDir"
 /local/cdm.sh -f cdm.txt -s migrateData -d "$workingDir"
 
 
SIT/features/07_migrate_rows/migrate.properties

Lines changed: 0 additions & 2 deletions

@@ -4,5 +4,3 @@ spark.cdm.connect.target.host cdm-sit-cass
 spark.cdm.schema.origin.keyspaceTable origin.feature_migrate_rows
 spark.cdm.schema.target.keyspaceTable target.feature_migrate_rows
 spark.cdm.perfops.numParts 1
-
-spark.tokenrange.partitionFile ./primary_key_rows.csv
Lines changed: 8 additions & 0 deletions

@@ -0,0 +1,8 @@
+spark.cdm.connect.origin.host cdm-sit-cass
+spark.cdm.connect.target.host cdm-sit-cass
+
+spark.cdm.schema.origin.keyspaceTable origin.feature_migrate_rows
+spark.cdm.schema.target.keyspaceTable target.feature_migrate_rows
+spark.cdm.perfops.numParts 1
+
+spark.tokenrange.partitionFile ./primary_key_rows.csv
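
These properties files use the layout that spark-submit's `--properties-file` option accepts: one whitespace-separated `key value` pair per line, blank lines ignored. The same pairs could equally be expanded into individual `--conf` flags; a hypothetical helper (not part of this commit) for that expansion:

```shell
# props_to_conf FILE
# Expand "key value" lines into "--conf key=value" arguments, skipping
# blank lines and # comments. Sketch only: values containing spaces
# are passed through as-is.
props_to_conf() {
  local key value
  while read -r key value; do
    [[ -z "$key" || "$key" == \#* ]] && continue
    printf -- '--conf %s=%s\n' "$key" "$value"
  done < "$1"
}
```

For example, `props_to_conf migrate.properties` would emit `--conf spark.cdm.perfops.numParts=1` among the rest.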
