Commit f5d5a00

Upgrade to leverage Spark 3.5.1 (#249)

1 parent f3db2a4

File tree: 6 files changed, +20 −13 lines

Dockerfile

Lines changed: 4 additions & 4 deletions

@@ -9,9 +9,9 @@ RUN mkdir -p /assets/ && cd /assets && \
 curl -OL https://downloads.datastax.com/enterprise/cqlsh-astra.tar.gz && \
 tar -xzf ./cqlsh-astra.tar.gz && \
 rm ./cqlsh-astra.tar.gz && \
-curl -OL https://archive.apache.org/dist/spark/spark-3.4.2/spark-3.4.2-bin-hadoop3-scala2.13.tgz && \
-tar -xzf ./spark-3.4.2-bin-hadoop3-scala2.13.tgz && \
-rm ./spark-3.4.2-bin-hadoop3-scala2.13.tgz
+curl -OL https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3-scala2.13.tgz && \
+tar -xzf ./spark-3.5.1-bin-hadoop3-scala2.13.tgz && \
+rm ./spark-3.5.1-bin-hadoop3-scala2.13.tgz

 RUN apt-get update && apt-get install -y openssh-server vim python3 --no-install-recommends && \
 rm -rf /var/lib/apt/lists/* && \

@@ -46,7 +46,7 @@ RUN chmod +x ./get-latest-maven-version.sh && \
 rm -rf "$USER_HOME_DIR/.m2"

 # Add all migration tools to path
-ENV PATH="${PATH}:/assets/dsbulk/bin/:/assets/cqlsh-astra/bin/:/assets/spark-3.4.2-bin-hadoop3-scala2.13/bin/"
+ENV PATH="${PATH}:/assets/dsbulk/bin/:/assets/cqlsh-astra/bin/:/assets/spark-3.5.1-bin-hadoop3-scala2.13/bin/"

 EXPOSE 22
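The Spark version now appears twice in the Dockerfile: once in the download URL and once in the ENV PATH entry. A minimal shell sketch of deriving both strings from a single version variable, as a Dockerfile ARG could; the variable names are illustrative and not part of this commit:

```shell
# Illustrative only: build both strings from one version value so a future
# bump is a single edit.
SPARK_VERSION=3.5.1
DIST="spark-${SPARK_VERSION}-bin-hadoop3-scala2.13"

# Download URL used by the curl step
echo "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${DIST}.tgz"

# Directory added to PATH by the ENV step
echo "/assets/${DIST}/bin/"
```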

README.md

Lines changed: 4 additions & 4 deletions

@@ -7,7 +7,7 @@
 
 Migrate and Validate Tables between Origin and Target Cassandra Clusters.
 
-> :warning: Please note this job has been tested with spark version [3.4.2](https://archive.apache.org/dist/spark/spark-3.4.2/)
+> :warning: Please note this job has been tested with spark version [3.5.1](https://archive.apache.org/dist/spark/spark-3.5.1/)
 
 ## Install as a Container
 - Get the latest image that includes all dependencies from [DockerHub](https://hub.docker.com/r/datastax/cassandra-data-migrator)

@@ -18,10 +18,10 @@ Migrate and Validate Tables between Origin and Target Cassandra Clusters.
 
 ### Prerequisite
 - Install **Java11** (minimum) as Spark binaries are compiled with it.
-- Install Spark version [`3.4.2`](https://archive.apache.org/dist/spark/spark-3.4.2/spark-3.4.2-bin-hadoop3-scala2.13.tgz) on a single VM (no cluster necessary) where you want to run this job. Spark can be installed by running the following: -
+- Install Spark version [`3.5.1`](https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3-scala2.13.tgz) on a single VM (no cluster necessary) where you want to run this job. Spark can be installed by running the following: -
 ```
-wget https://archive.apache.org/dist/spark/spark-3.4.2/spark-3.4.2-bin-hadoop3-scala2.13.tgz
-tar -xvzf spark-3.4.2-bin-hadoop3-scala2.13.tgz
+wget https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3-scala2.13.tgz
+tar -xvzf spark-3.5.1-bin-hadoop3-scala2.13.tgz
 ```
 
 > :warning: If the above Spark and Scala version is not properly installed, you'll then see a similar exception like below when running the CDM jobs,
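The README warning refers to a mismatch between the installed Spark build and the one CDM was tested with. One quick check is parsing the version out of `spark-submit --version` output; in the sketch below a sample string stands in for that output, since the exact banner format is an assumption here:

```shell
# Sketch: extract the version number from spark-submit's banner.
# "sample" stands in for: spark-submit --version 2>&1
sample="Welcome to Spark version 3.5.1"
ver=$(printf '%s\n' "$sample" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p')
[ "$ver" = "3.5.1" ] && echo "Spark version matches the tested 3.5.1"
```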

RELEASE.md

Lines changed: 3 additions & 0 deletions

@@ -1,4 +1,7 @@
 # Release Notes
+## [4.1.13] - 2024-02-27
+- Upgraded to use Spark `3.5.1`.
+
 ## [4.1.12] - 2024-01-22
 - Upgraded to use Spark `3.4.2`.
 - Added Java `11` as the minimally required pre-requisite to run CDM jobs.

pom.xml

Lines changed: 4 additions & 4 deletions

@@ -10,9 +10,9 @@
 <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
 <scala.version>2.13.12</scala.version>
 <scala.main.version>2.13</scala.main.version>
-<spark.version>3.4.1</spark.version>
-<connector.version>3.4.1</connector.version>
-<cassandra.version>5.0-alpha1</cassandra.version>
+<spark.version>3.5.1</spark.version>
+<connector.version>3.5.0</connector.version>
+<cassandra.version>5.0-beta1</cassandra.version>
 <junit.version>5.9.1</junit.version>
 <mockito.version>4.11.0</mockito.version>
 <java-driver.version>4.17.0</java-driver.version>

@@ -198,7 +198,7 @@
 <plugin>
 <groupId>net.alchim31.maven</groupId>
 <artifactId>scala-maven-plugin</artifactId>
-<version>4.8.0</version>
+<version>4.8.1</version>
 <executions>
 <execution>
 <phase>process-sources</phase>
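Note that the connector bump tracks the Spark bump: the Spark Cassandra Connector's major.minor series (3.5.x here) is generally expected to match Spark's. A small shell sketch of that sanity check; the pairing rule is a general expectation, not something this diff enforces:

```shell
# Sketch: verify spark.version and connector.version share a major.minor series.
spark_version=3.5.1
connector_version=3.5.0

# ${var%.*} strips the last dot-separated component, leaving "3.5"
if [ "${spark_version%.*}" = "${connector_version%.*}" ]; then
  echo "connector ${connector_version} matches the Spark ${spark_version%.*}.x series"
fi
```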

rat-excludes.txt

Lines changed: 4 additions & 0 deletions

@@ -6,6 +6,7 @@
 .github/workflows/maven.yml
 .github/workflows/snyk-cli-scan.yml
 .github/workflows/snyk-pr-cleanup.yml
+.github/workflows/dependabot.yml
 README.md
 rat-excludes.txt
 pom.xml

@@ -19,7 +20,9 @@ Dockerfile
 .snyk
 .snyk.ignore.example
 PERF/*
+PERF/*/*/output/*
 SIT/*
+SIT/*/*/output/*
 scripts/*
 test-backup/feature/*
 src/resources/partitions.csv

@@ -81,6 +84,7 @@ SIT/smoke/04_counters/cdm.validateData.assert
 SIT/smoke/04_counters/cdm.fixForce.assert
 SIT/smoke/05_reserved_keyword/cdm.txt
 SIT/smoke/05_reserved_keyword/expected.out
+SIT/smoke_inflight/06_vector/cdm.sh
 PERF/logs/scenario_20230523_162859_122.log
 PERF/logs/scenario_20230523_162126_056.log
 PERF/logs/scenario_20230523_162204_904.log

src/resources/migrate_data.sh

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@
 ###########################################################################################################################
 
 # Path to spark-submit
-SPARK_SUBMIT=/home/ubuntu/spark-3.4.2-bin-hadoop3-scala2.13/bin/spark-submit
+SPARK_SUBMIT=/home/ubuntu/spark-3.5.1-bin-hadoop3-scala2.13/bin/spark-submit
 
 # Path to spark configuration for the table
 PROPS_FILE=/home/ubuntu/sparkConf.properties
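For context, `SPARK_SUBMIT` and `PROPS_FILE` feed the job launch further down this script. A hypothetical assembly of that command from the two variables; the CDM job class and jar file name are illustrative assumptions, not taken from this diff:

```shell
# Hypothetical: compose the launch command from the script's variables.
# Class name and jar name below are placeholders for illustration.
SPARK_SUBMIT=/home/ubuntu/spark-3.5.1-bin-hadoop3-scala2.13/bin/spark-submit
PROPS_FILE=/home/ubuntu/sparkConf.properties
CMD="$SPARK_SUBMIT --properties-file $PROPS_FILE --master local[*] --class com.datastax.cdm.job.Migrate cassandra-data-migrator.jar"
echo "$CMD"
```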
