Purpose
Linked issue: close #2666
Add support for Apache Spark 4.1 by introducing a new fluss-spark-4.1 connector module. Spark 4.x drops Scala 2.12 and requires Scala 2.13, so this change also introduces Maven profile-based Scala version switching to build Spark 3.x modules with Scala 2.12 and the Spark 4.1 module with Scala 2.13.
Brief change log
New module fluss-spark/fluss-spark-4.1: Thin packaging module (mirroring fluss-spark-3.5) that depends on fluss-spark-common and produces a shaded connector JAR using Apache Spark 4.1.1.
Maven profile restructuring in fluss-spark/pom.xml: Moved version-specific modules into profiles. spark3 (active by default) includes fluss-spark-3.5 and fluss-spark-3.4. spark4 includes fluss-spark-4.1 and switches scala.binary.version to 2.13 and scala.version to ${scala213.version}.
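The profile shape this implies can be sketched as follows (a sketch only: the property and module names follow the description above, but the exact `pom.xml` contents may differ):

```xml
<!-- Sketch of the fluss-spark/pom.xml profiles; details may differ from the actual change. -->
<profiles>
  <profile>
    <id>spark3</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <modules>
      <module>fluss-spark-3.5</module>
      <module>fluss-spark-3.4</module>
    </modules>
  </profile>
  <profile>
    <id>spark4</id>
    <modules>
      <module>fluss-spark-4.1</module>
    </modules>
    <properties>
      <!-- Spark 4.x requires Scala 2.13 -->
      <scala.binary.version>2.13</scala.binary.version>
      <scala.version>${scala213.version}</scala.version>
    </properties>
  </profile>
</profiles>
```

Because `spark3` is active by default, a plain `mvn install` keeps building the Spark 3.x modules with Scala 2.12, while `mvn install -Pspark4` switches the whole Scala toolchain to 2.13 for the Spark 4.1 module.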
CI updates: Added spark4 stage to stage.sh and ci-template.yaml. The spark4 matrix job builds with -Pspark4 to activate Scala 2.13 and the Spark 4.1 module.
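A minimal sketch of how such a stage-to-profile mapping can be wired in a CI script (the function and variable names here are illustrative, not the actual contents of `stage.sh`):

```shell
#!/usr/bin/env bash
# Illustrative only: maps a CI stage name to the Maven profile flag it builds with.
# get_build_profile and STAGE_SPARK4 are hypothetical names, not the real stage.sh API.
STAGE_SPARK4="spark4"

get_build_profile() {
  local stage=$1
  case "${stage}" in
    "${STAGE_SPARK4}")
      # Spark 4.1 must be built with Scala 2.13, activated via the spark4 profile
      printf '%s\n' "-Pspark4"
      ;;
    *)
      # Spark 3.x modules build with the default profile (Scala 2.12)
      printf '%s\n' "-Pspark3"
      ;;
  esac
}

get_build_profile "spark4"   # prints -Pspark4
```

The matrix job then passes the returned flag to Maven so the Spark 4.1 module and Scala 2.13 are only activated in the `spark4` stage.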
Test coverage: Added test-spark4 profile to fluss-test-coverage/pom.xml for JaCoCo coverage aggregation.
Scala 2.13 compatibility fix: Removed the empty parentheses from doubleValue() calls in DataConverterTest, FlussAsSparkArrayTest, and FlussAsSparkRowTest. In Scala 2.13, BigDecimal.doubleValue is declared without a parameter list, so doubleValue() is parsed as doubleValue.apply() on the resulting Double and fails to compile. The parameterless form is backward-compatible with Scala 2.12.
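The incompatibility can be illustrated with a minimal sketch (simplified; the actual test code differs):

```scala
// Under Scala 2.12 both forms compile. Under Scala 2.13,
// BigDecimal.doubleValue is declared without a parameter list,
// so the parenthesized call is parsed as doubleValue.apply()
// on the resulting Double and is rejected by the compiler.
val d = BigDecimal("12.34")

val ok = d.doubleValue      // compiles on both 2.12 and 2.13
// val bad = d.doubleValue() // 2.13 error: Double does not take parameters
```

Dropping the parentheses is therefore the cross-version spelling, which is why the fix needs no Scala-version-specific source directories.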
Documentation: Updated getting-started.md to list Spark 4.1 as a supported version, added a note about the Scala 2.13 requirement, and added Spark 4.1 download/install instructions.
Tests
Existing tests in fluss-spark-ut run against both Spark 3.x (Scala 2.12) and Spark 4.1 (Scala 2.13) via the profile mechanism.
DataConverterTest — verifies type conversion including decimal handling
FlussAsSparkArrayTest — verifies array data access including decimal arrays
FlussAsSparkRowTest — verifies row data access including full row with all types
API and Format
No changes. This change does not affect any public API or storage format. It adds a new connector artifact (fluss-spark-4.1_2.13) alongside the existing ones.
Documentation
Updated website/docs/engine-spark/getting-started.md:
Added Spark 4.1 to the supported versions table
Added a note about the Scala 2.13 requirement for Spark 4.x
Added Spark 4.1 download and jar copy instructions