
Commit a5ff3c2

Merge branch 'main' into snyk-upgrade-dafe6fd4d1eb795d6504c32641f4b46f

2 parents: a876016 + b9b5e39

File tree

130 files changed, +1923 −100 lines changed


.all-contributorsrc

Lines changed: 11 additions & 1 deletion
@@ -139,6 +139,16 @@
       "contributions": [
         "review"
       ]
+    },
+    {
+      "login": "Jeremya",
+      "name": "Jeremy",
+      "avatar_url": "https://avatars.githubusercontent.com/u/576519?v=4",
+      "profile": "https://github.com/Jeremya",
+      "contributions": [
+        "code"
+      ]
     }
-  ]
+  ],
+  "commitType": "docs"
 }

.github/workflows/cdm-integrationtest.yml

Lines changed: 17 additions & 3 deletions
@@ -1,14 +1,28 @@
 name: Build and test jar with integration tests
-on: [push]
+on:
+  workflow_dispatch:
+  pull_request:
+  push:
+    branches:
+      - main
+
+concurrency:
+  group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
+  cancel-in-progress: true
 jobs:
   CDM-Integration-Test:
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        jdk: [ 8 ]
+        os: [ ubuntu-latest ]
+    runs-on: ${{ matrix.os }}
     steps:
       - uses: actions/checkout@v3
       - name: Set up JDK 8
         uses: actions/setup-java@v3
         with:
-          java-version: '8'
+          java-version: ${{ matrix.jdk }}
           distribution: 'temurin'
           cache: maven
       - name: Test SIT with cdm
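The `concurrency` group added in this workflow deduplicates CI runs: GitHub's `||` operator evaluates left to right and yields the first non-empty value, so a pull request groups by its head label while a plain push falls back to the ref. A minimal shell sketch of that fallback logic (the function and variable names are illustrative, not part of the workflow):

```shell
# Illustrative sketch only: emulates GitHub's `a || b || c` fallback with
# nested ${var:-default} expansions. An empty string counts as "falsy".
pick_group() {
  head_label="$1"   # stands in for github.event.pull_request.head.label
  head_ref="$2"     # stands in for github.head_ref
  ref="$3"          # stands in for github.ref
  echo "CI @ ${head_label:-${head_ref:-$ref}}"
}

pick_group "" "" "refs/heads/main"                        # push to main
pick_group "user:feature" "feature" "refs/pull/12/merge"  # pull request
```

On a push both PR-specific values are empty, so the group resolves to the ref; on a PR the head label wins, so every new push to the same PR cancels the in-flight run.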

.github/workflows/maven.yml

Lines changed: 16 additions & 3 deletions
@@ -8,19 +8,32 @@
 
 name: Java CI with Maven
 
-on: [push, pull_request]
+on:
+  workflow_dispatch:
+  pull_request:
+  push:
+    branches:
+      - main
+
+concurrency:
+  group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
+  cancel-in-progress: true
 
 jobs:
   build:
+    strategy:
+      matrix:
+        jdk: [ 8 ]
+        os: [ ubuntu-latest ]
 
-    runs-on: ubuntu-latest
+    runs-on: ${{ matrix.os }}
 
     steps:
       - uses: actions/checkout@v3
       - name: Set up JDK 8
         uses: actions/setup-java@v3
         with:
-          java-version: '8'
+          java-version: ${{ matrix.jdk }}
           distribution: 'temurin'
           cache: maven
       - name: Build with Maven
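The `strategy.matrix` introduced here currently pins a single combination, but the mechanism generalizes: GitHub expands the matrix into one job per `(jdk, os)` pair, which is why `runs-on` and `java-version` now read from `matrix.*`. Roughly, as an illustrative shell sketch (not GitHub's actual expansion engine):

```shell
# Sketch of matrix expansion: the Cartesian product of the axis values,
# one emitted job per combination. Extend either list and the loop scales.
for jdk in 8; do
  for os in ubuntu-latest; do
    echo "build (jdk=$jdk, os=$os)"
  done
done
```

Adding `11` and `17` to the `jdk` list would fan this out to three jobs with no other workflow edits.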

.github/workflows/snyk-cli-scan.yml

Lines changed: 6 additions & 0 deletions
@@ -10,5 +10,11 @@ on:
     branches: [ main ]
   workflow_dispatch:
 
+concurrency:
+  group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
+  #group: ${{ github.workflow }}-${{ github.ref }}-${{ github.job || github.run_id }}
+  cancel-in-progress: true
+
 env:
   SNYK_SEVERITY_THRESHOLD_LEVEL: critical
+
.github/workflows/snyk-pr-cleanup.yml

Lines changed: 5 additions & 0 deletions
@@ -9,3 +9,8 @@ on:
       - main
   workflow_dispatch:
 
+concurrency:
+  group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
+  #group: ${{ github.workflow }}-${{ github.ref }}-${{ github.job || github.run_id }}
+  cancel-in-progress: true
+

.settings/org.eclipse.core.resources.prefs

Lines changed: 0 additions & 5 deletions
This file was deleted.

.settings/org.eclipse.jdt.core.prefs

Lines changed: 0 additions & 8 deletions
This file was deleted.

.settings/org.eclipse.m2e.core.prefs

Lines changed: 0 additions & 4 deletions
This file was deleted.

CONTRIBUTING.md

Lines changed: 1 addition & 0 deletions
@@ -149,6 +149,7 @@ For recognizing contributions, please follow [this documentation](https://allcon
       <td align="center" valign="top" width="16.66%"><a href="https://github.com/vaishakhbn"><img src="https://avatars.githubusercontent.com/u/2619002?v=4?s=50" width="50px;" alt="Vaishakh Baragur Narasimhareddy"/><br /><sub><b>Vaishakh Baragur Narasimhareddy</b></sub></a><br /><a href="https://github.com/datastax/cassandra-data-migrator/commits?author=vaishakhbn" title="Code">💻</a> <a href="https://github.com/datastax/cassandra-data-migrator/commits?author=vaishakhbn" title="Tests">⚠️</a></td>
       <td align="center" valign="top" width="16.66%"><a href="https://github.com/mieslep"><img src="https://avatars.githubusercontent.com/u/5420540?v=4?s=50" width="50px;" alt="Phil Miesle"/><br /><sub><b>Phil Miesle</b></sub></a><br /><a href="https://github.com/datastax/cassandra-data-migrator/commits?author=mieslep" title="Code">💻</a></td>
       <td align="center" valign="top" width="16.66%"><a href="https://github.com/mfmaher2"><img src="https://avatars.githubusercontent.com/u/64795956?v=4?s=50" width="50px;" alt="mfmaher2"/><br /><sub><b>mfmaher2</b></sub></a><br /><a href="https://github.com/datastax/cassandra-data-migrator/pulls?q=is%3Apr+reviewed-by%3Amfmaher2" title="Reviewed Pull Requests">👀</a></td>
+      <td align="center" valign="top" width="16.66%"><a href="https://github.com/Jeremya"><img src="https://avatars.githubusercontent.com/u/576519?v=4?s=50" width="50px;" alt="Jeremy"/><br /><sub><b>Jeremy</b></sub></a><br /><a href="https://github.com/datastax/cassandra-data-migrator/commits?author=Jeremya" title="Code">💻</a></td>
     </tr>
   </tbody>
 </table>

README.md

Lines changed: 20 additions & 15 deletions
@@ -18,12 +18,17 @@ Migrate and Validate Tables between Origin and Target Cassandra Clusters.
 
 ### Prerequisite
 - Install Java8 as spark binaries are compiled with it.
-- Install Spark version [3.4.1](https://archive.apache.org/dist/spark/spark-3.4.1/) on a single VM (no cluster necessary) where you want to run this job. Spark can be installed by running the following: -
+- Install Spark version [3.4.1](https://archive.apache.org/dist/spark/spark-3.4.1/spark-3.4.1-bin-hadoop3-scala2.13.tgz) on a single VM (no cluster necessary) where you want to run this job. Spark can be installed by running the following: -
 ```
 wget https://archive.apache.org/dist/spark/spark-3.4.1/spark-3.4.1-bin-hadoop3-scala2.13.tgz
 tar -xvzf spark-3.4.1-bin-hadoop3-scala2.13.tgz
 ```
 
+> :warning: If the above Spark and Scala version is not properly installed, you'll then see a similar exception like below when running the CDM jobs,
+```
+Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.Statics.releaseFence()V
+```
+
 # Steps for Data-Migration:
 
 > :warning: Note that Version 4 of the tool is not backward-compatible with .properties files created in previous versions, and that package names have changed.
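The new README warning stems from mixing Scala builds: the `releaseFence()V` error appears when the installed Spark was compiled against a different Scala version than the CDM jar expects. Since the official tarball encodes the Scala version in its name, a cheap pre-flight check is possible; this is an illustrative sketch, not part of the documented procedure:

```shell
# Sanity-check the Spark download before unpacking (illustrative sketch):
# the official archive name carries the Scala version, e.g. "scala2.13".
tarball="spark-3.4.1-bin-hadoop3-scala2.13.tgz"
case "$tarball" in
  *scala2.13*) result="ok: Scala 2.13 build" ;;
  *)           result="warning: not a Scala 2.13 build; releaseFence errors likely" ;;
esac
echo "$result"   # prints: ok: Scala 2.13 build
```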
@@ -35,9 +40,9 @@ tar -xvzf spark-3.4.1-bin-hadoop3-scala2.13.tgz
 3. Run the below job using `spark-submit` command as shown below:
 
 ```
-./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
---master "local[*]" --driver-memory 25G --executor-memory 25G /
+./spark-submit --properties-file cdm.properties \
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
+--master "local[*]" --driver-memory 25G --executor-memory 25G \
 --class com.datastax.cdm.job.Migrate cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
 
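Two details of this hunk are easy to miss: it corrects the line-continuation character from `/` (which the shell would treat as an argument) to `\`, and it keeps the `$(date +%Y%m%d_%H_%M)` suffix so every run writes a distinct log. A sketch of that naming pattern:

```shell
# Each spark-submit run redirects to a timestamped log file, so reruns never
# clobber earlier logs. Example shape: logfile_name_20240131_14_05.txt
logfile="logfile_name_$(date +%Y%m%d_%H_%M).txt"
echo "$logfile"
```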
@@ -50,9 +55,9 @@ Note:
 - To run the job in Data validation mode, use class option `--class com.datastax.cdm.job.DiffData` as shown below
 
 ```
-./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
---master "local[*]" --driver-memory 25G --executor-memory 25G /
+./spark-submit --properties-file cdm.properties \
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
+--master "local[*]" --driver-memory 25G --executor-memory 25G \
 --class com.datastax.cdm.job.DiffData cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
 
@@ -89,10 +94,10 @@ Note:
 Each line above represents a partition-range (`min,max`). Alternatively, you can also pass the partition-file via command-line param as shown below
 
 ```
-./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
---conf spark.cdm.tokenRange.partitionFile="/<path-to-file>/<csv-input-filename>" /
---master "local[*]" --driver-memory 25G --executor-memory 25G /
+./spark-submit --properties-file cdm.properties \
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
+--conf spark.cdm.tokenRange.partitionFile="/<path-to-file>/<csv-input-filename>" \
+--master "local[*]" --driver-memory 25G --executor-memory 25G \
 --class com.datastax.cdm.job.<Migrate|DiffData> cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
 This mode is specifically useful to processes a subset of partition-ranges that may have failed during a previous run.
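The README describes the partition file as plain text with one `min,max` token range per line. A sketch of that format and of how a line-oriented consumer would read it (the file path and token values here are illustrative, not taken from a real run):

```shell
# Write an illustrative partition-range file: one "min,max" pair per line.
cat > /tmp/partitions.csv <<'EOF'
-9223372036854775808,-4611686018427387904
-4611686018427387903,0
EOF

# Read it back the way a range-based consumer would, splitting on the comma.
while IFS=, read -r min max; do
  echo "range: $min .. $max"
done < /tmp/partitions.csv
```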
@@ -103,10 +108,10 @@ This mode is specifically useful to processes a subset of partition-ranges that
 # Perform large-field Guardrail violation checks
 - The tool can be used to identify large fields from a table that may break you cluster guardrails (e.g. AstraDB has a 10MB limit for a single large field) `--class com.datastax.cdm.job.GuardrailCheck` as shown below
 ```
-./spark-submit --properties-file cdm.properties /
---conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" /
---conf spark.cdm.feature.guardrail.colSizeInKB=10000 /
---master "local[*]" --driver-memory 25G --executor-memory 25G /
+./spark-submit --properties-file cdm.properties \
+--conf spark.cdm.schema.origin.keyspaceTable="<keyspacename>.<tablename>" \
+--conf spark.cdm.feature.guardrail.colSizeInKB=10000 \
+--master "local[*]" --driver-memory 25G --executor-memory 25G \
 --class com.datastax.cdm.job.GuardrailCheck cassandra-data-migrator-4.x.x.jar &> logfile_name_$(date +%Y%m%d_%H_%M).txt
 ```
 
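Note the units in the guardrail flag: `spark.cdm.feature.guardrail.colSizeInKB=10000` is 10,000 KB, which lines up with the roughly 10 MB AstraDB single-field limit the README mentions. The arithmetic, as a small sketch (assuming 1 KB = 1024 bytes for the conversion):

```shell
# colSizeInKB is expressed in kilobytes; convert to bytes to see the scale.
col_size_in_kb=10000
echo "$(( col_size_in_kb * 1024 )) bytes"   # prints: 10240000 bytes
```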

0 commit comments
