Commit 35bf514

Removed deprecated partition-file feature (#274)
* Removed deprecated partition-file feature
* Updated release notes
* Enforcing improved coverage after the recent test additions & removal of deprecated functions
1 parent 2cd422e commit 35bf514

39 files changed (+202 -595 lines changed)

Dockerfile

Lines changed: 0 additions & 1 deletion
@@ -25,7 +25,6 @@ COPY ./src /assets/src
 COPY ./pom.xml /assets/pom.xml
 COPY ./src/resources/cdm.properties /assets/
 COPY ./src/resources/cdm-detailed.properties /assets/
-COPY ./src/resources/partitions.csv /assets/
 COPY ./scripts/get-latest-maven-version.sh ./get-latest-maven-version.sh

 RUN chmod +x ./get-latest-maven-version.sh && \

README.md

Lines changed: 8 additions & 4 deletions
@@ -49,6 +49,13 @@ Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.Statics.re
 Note:
 - Above command generates a log file `logfile_name_*.txt` to avoid log output on the console.
 - Update the memory options (driver & executor memory) based on your use-case
+- To track details of a run in the `target` keyspace, pass param `--conf spark.cdm.trackRun=true`
+- To filter and migrate data only in a specific token range, you can pass the below two additional params to the `Migration` or `Validation` jobs
+
+```
+--conf spark.cdm.filter.cassandra.partition.min=<token-range-min>
+--conf spark.cdm.filter.cassandra.partition.max=<token-range-max>
+```

 # Steps for Data-Validation:

@@ -84,7 +91,7 @@ Note:
 - The validation job will never delete records from target i.e. it only adds or updates data on target

 # Rerun (previously incomplete) Migration or Validation
-- You can rerun a Migration or Validation job to complete a previous run that could have stopped for any reasons. This mode will skip any token-ranges from previous run that were migrated or validated successfully. This is done by passing the `spark.cdm.trackRun.previousRunId` param as shown below
+- You can rerun/resume a Migration or Validation job to complete a previous run that could have stopped (or completed with some errors) for any reasons. This mode will skip any token-ranges from the previous run that were migrated (or validated) successfully. This is done by passing the `spark.cdm.trackRun.previousRunId` param as shown below

 ```
 ./spark-submit --properties-file cdm.properties \
@@ -93,8 +100,6 @@ Note:
 --master "local[*]" --driver-memory 25G --executor-memory 25G \
 --class com.datastax.cdm.job.<Migrate|DiffData> cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
-Note:
-- This feature replaces and improves upon an older similar feature (using param `spark.cdm.tokenrange.partitionFile`) that is now deprecated and will be removed soon.

 # Perform large-field Guardrail violation checks
 - The tool can be used to identify large fields from a table that may break you cluster guardrails (e.g. AstraDB has a 10MB limit for a single large field) `--class com.datastax.cdm.job.GuardrailCheck` as shown below
@@ -125,7 +130,6 @@ Note:
 - Validate migration accuracy and performance using a smaller randomized data-set
 - Supports adding custom fixed `writetime`
 - Track run information (start-time, end-time, status, etc.) in tables (`cdm_run_info` and `cdm_run_details`) on the target keyspace
-- Validation - Log partitions range level exceptions, use the exceptions file as input for rerun

 # Things to know
 - Each run (Migration or Validation) can be tracked (when enabled). You can find summary and details of the same in tables `cdm_run_info` and `cdm_run_details` in the target keyspace.
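
The token-range filter params added to the README in this diff take numeric token bounds. As a rough sketch of how a table could be covered by several filtered runs, the Python helper below splits the token range into contiguous slices and prints the two `--conf` flags for each slice. It assumes the origin cluster uses `Murmur3Partitioner` (tokens from -2^63 to 2^63-1) and that both bounds are treated as inclusive; neither is stated in this commit. `token_slices` and `filter_conf_args` are hypothetical helper names, not part of the tool.

```python
# Assumption: Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def token_slices(num_slices):
    """Split the full token range into contiguous, non-overlapping (min, max) pairs."""
    if num_slices < 1:
        raise ValueError("num_slices must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # number of tokens in the full range
    slices = []
    start = MIN_TOKEN
    for i in range(num_slices):
        # Spread any remainder over the first few slices so sizes differ by at most 1.
        size = total // num_slices + (1 if i < total % num_slices else 0)
        end = start + size - 1
        slices.append((start, end))
        start = end + 1
    return slices


def filter_conf_args(token_min, token_max):
    """Build the two filter flags shown in the README hunk for one slice."""
    return [
        "--conf", f"spark.cdm.filter.cassandra.partition.min={token_min}",
        "--conf", f"spark.cdm.filter.cassandra.partition.max={token_max}",
    ]


if __name__ == "__main__":
    # Print one pair of flags per slice; append them to the spark-submit command.
    for low, high in token_slices(4):
        print(" ".join(filter_conf_args(low, high)))
```

If the tool turns out to treat either bound as exclusive, the slice boundaries would need to be shifted by one token so that no rows are skipped or processed twice.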

RELEASE.md

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,7 @@
 # Release Notes
+## [4.3.3] - 2024-07-22
+- Removed deprecated functionality related to processing token-ranges via partition-file
+
 ## [4.3.2] - 2024-07-19
 - Removed deprecated functionality related to retry
