
[feat] spark 4.1 #2735

Draft

gezapeti wants to merge 1 commit into apache:main from gezapeti:spark4

Conversation


gezapeti commented Feb 26, 2026

Purpose

Linked issue: close #2666
Add support for Apache Spark 4.1 by introducing a new fluss-spark-4.1 connector module. Spark 4.x drops Scala 2.12 and requires Scala 2.13, so this change also introduces Maven profile-based Scala version switching to build Spark 3.x modules with Scala 2.12 and the Spark 4.1 module with Scala 2.13.

Brief change log

New module fluss-spark/fluss-spark-4.1: Thin packaging module (mirroring fluss-spark-3.5) that depends on fluss-spark-common and produces a shaded connector JAR using Apache Spark 4.1.1.
Maven profile restructuring in fluss-spark/pom.xml: Moved version-specific modules into profiles. spark3 (active by default) includes fluss-spark-3.5 and fluss-spark-3.4. spark4 includes fluss-spark-4.1 and switches scala.binary.version to 2.13 and scala.version to ${scala213.version}.
CI updates: Added spark4 stage to stage.sh and ci-template.yaml. The spark4 matrix job builds with -Pspark4 to activate Scala 2.13 and the Spark 4.1 module.
Test coverage: Added test-spark4 profile to fluss-test-coverage/pom.xml for JaCoCo coverage aggregation.
Scala 2.13 compatibility fix: Removed the empty parentheses from doubleValue() calls in DataConverterTest, FlussAsSparkArrayTest, and FlussAsSparkRowTest. In Scala 2.13, BigDecimal.doubleValue is defined without parentheses, so doubleValue() is interpreted as calling apply() on the resulting Double. The parenthesis-free form is backward-compatible with Scala 2.12.
Documentation: Updated getting-started.md to list Spark 4.1 as a supported version, added a note about the Scala 2.13 requirement, and added Spark 4.1 download/install instructions.
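The profile restructuring described above might look roughly like the following fragment of fluss-spark/pom.xml. The profile ids, module names, and the two Scala properties are taken from the change log; the surrounding structure is a sketch, not the PR's actual diff.

```xml
<!-- Sketch only: profile-based module and Scala-version switching.
     Exact layout of the real pom.xml may differ. -->
<profiles>
  <profile>
    <id>spark3</id>
    <!-- Spark 3.x modules build with Scala 2.12 by default. -->
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <modules>
      <module>fluss-spark-3.4</module>
      <module>fluss-spark-3.5</module>
    </modules>
  </profile>
  <profile>
    <id>spark4</id>
    <!-- Activated with -Pspark4; switches the build to Scala 2.13. -->
    <modules>
      <module>fluss-spark-4.1</module>
    </modules>
    <properties>
      <scala.binary.version>2.13</scala.binary.version>
      <scala.version>${scala213.version}</scala.version>
    </properties>
  </profile>
</profiles>
```

Matching the CI change, a build such as `mvn clean install -Pspark4` would deactivate the default spark3 profile and compile the Spark 4.1 module against Scala 2.13.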
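The Scala 2.13 compatibility fix can be illustrated with a small sketch; the value below is hypothetical and not taken from the affected tests.

```scala
import scala.math.BigDecimal

// Hypothetical value for illustration only.
val d = BigDecimal("12.345")

// Scala 2.12 accepts both forms. In Scala 2.13, doubleValue is defined
// without a parameter list, so `d.doubleValue()` no longer compiles
// there (the empty argument list is applied to the resulting Double,
// as described in the change log above).
val v: Double = d.doubleValue  // compiles on both 2.12 and 2.13
```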

Tests

Existing tests in fluss-spark-ut run against both Spark 3.x (Scala 2.12) and Spark 4.1 (Scala 2.13) via the profile mechanism.
DataConverterTest: verifies type conversion, including decimal handling
FlussAsSparkArrayTest: verifies array data access, including decimal arrays
FlussAsSparkRowTest: verifies row data access, including a full row with all types

API and Format

No changes. This change does not affect any public API or storage format. It adds a new connector artifact (fluss-spark-4.1_2.13) alongside the existing ones.

Documentation

Updated website/docs/engine-spark/getting-started.md:
Added Spark 4.1 to the supported versions table
Added a note about the Scala 2.13 requirement for Spark 4.x
Added Spark 4.1 download and jar copy instructions



Development

Successfully merging this pull request may close these issues.

[spark] to support Spark4.x
