[VL] Deprecate and remove Spark 3.2 support #11351

QCLyu · 2026-01-04T01:50:30Z

What changes are proposed in this pull request?

This PR comprehensively removes Spark 3.2 support from the Gluten Velox backend. It cleans up the source code, build profiles, CI/CD pipelines, and documentation.

Key changes include:

Source Code: Removed shims/spark32 and gluten-ut/spark32 directories.
Build System: Deleted the spark-3.2 profile from the root and all sub-module pom.xml files.
CI/CD: Removed legacy Spark 3.2 jobs (spark-test-spark32, spark-test-spark32-slow, and TPC-H OOM tests) from GitHub Workflows to reduce CI overhead.
Test Migration: Refactored VeloxHashJoinSuite and other backend tests to remove Spark 3.2-specific conditional logic, ensuring these tests now run on Spark 3.3+.
Documentation: Updated the build guide and ClickHouse deployment docs to remove references to Spark 3.2.

How was this patch tested?

Manual Build: Verified successful compilation on aarch64 (ARM64) using -Pspark-3.5 -Pbackends-velox.
Unit Tests: Verified that migrated tests in VeloxHashJoinSuite pass successfully under Spark 3.5.
CI: Infrastructure changes have been validated to ensure remaining Spark versions (3.3, 3.4, 3.5) trigger correctly.

Closes #8960

github-actions · 2026-01-04T01:50:57Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-04T02:05:57Z

Run Gluten Clickhouse CI on x86

QCLyu · 2026-01-04T06:55:35Z

I have verified the changes for Spark 3.5 locally, although GitHub Actions is still showing failures:

Jenkins (ClickHouse CI): SUCCESS.

The Reactor Summary confirms all modules (including gluten-core, shims, and backends-clickhouse) built successfully and all 36 tests passed.

Jenkins Reactor Summary (Build Success):

12:33:20 Run completed in 2 minutes, 25 seconds.
12:33:20 Total number of tests run: 36
12:33:20 Suites: completed 2, aborted 0
12:33:20 Tests: succeeded 36, failed 0, canceled 0, ignored 12, pending 0
12:33:20 All tests passed.
12:33:21 [INFO] ------------------------------------------------------------------------
12:33:21 [INFO] Reactor Summary for Gluten Parent Pom 1.6.0-SNAPSHOT:
12:33:21 [INFO]
12:33:21 [INFO] Gluten Parent Pom .................................. SUCCESS [ 17.421 s]
12:33:21 [INFO] Gluten Ras ......................................... SUCCESS [ 23.776 s]
12:33:21 [INFO] Gluten Ras Common .................................. SUCCESS [ 51.164 s]
12:33:21 [INFO] Gluten Core ........................................ SUCCESS [ 37.484 s]
12:33:21 [INFO] Gluten Shims ....................................... SUCCESS [ 0.260 s]
12:33:21 [INFO] Gluten Shims Common ................................ SUCCESS [ 6.371 s]
12:33:21 [INFO] Gluten Shims for Spark 3.3 ......................... SUCCESS [ 13.796 s]
12:33:21 [INFO] Gluten UI .......................................... SUCCESS [ 4.449 s]
12:33:21 [INFO] Gluten Substrait ................................... SUCCESS [ 59.459 s]
12:33:21 [INFO] Gluten Celeborn .................................... SUCCESS [ 4.854 s]
12:33:21 [INFO] Gluten Iceberg ..................................... SUCCESS [ 11.367 s]
12:33:21 [INFO] Gluten DeltaLake ................................... SUCCESS [ 9.760 s]
12:33:21 [INFO] Gluten Package ..................................... SUCCESS [ 6.663 s]
12:33:21 [INFO] Gluten Ras Planner ................................. SUCCESS [ 1.137 s]
12:33:21 [INFO] Gluten Kafka ....................................... SUCCESS [ 9.403 s]
12:33:21 [INFO] Gluten Backends ClickHouse ......................... SUCCESS [59:48 min]
12:33:21 [INFO] Gluten Unit Test Parent ............................ SUCCESS [ 1.323 s]
12:33:21 [INFO] Gluten Unit Test Common ............................ SUCCESS [ 5.468 s]
12:33:21 [INFO] Gluten Unit Test ................................... SUCCESS [ 16.964 s]
12:33:21 [INFO] Gluten Unit Test Spark33 ........................... SUCCESS [02:45 min]
12:33:21 [INFO] ------------------------------------------------------------------------
12:33:21 [INFO] BUILD SUCCESS
12:33:21 [INFO] ------------------------------------------------------------------------
12:33:21 [INFO] Total time: 01:07 h
12:33:21 [INFO] Finished at: 2026-01-04T04:33:21Z
12:33:21 [INFO] ------------------------------------------------------------------------

GitHub Actions:

These jobs are failing with 403 Forbidden errors during dependency resolution (Log4j, ASM, etc.). The following commit finds and fixes these issues:

.github/workflows/velox_nightly.yml (2 occurrences, lines 103 and 226)
  Removed mvn clean install -Pspark-3.2 from both the x86 and arm64 build jobs.
.github/workflows/build_bundle_package.yml
  Updated the description from 'Spark version: spark-3.2, spark-3.3, spark-3.4 or spark-3.5' to 'Spark version: spark-3.3, spark-3.4, spark-3.5 or spark-4.0'.
.github/workflows/util/install-spark-resources.sh
  Removed the Spark 3.2 case (lines 92-96) from the script.

These changes aim to resolve the 403 errors. The workflows will no longer attempt to build with the removed Spark 3.2 profile.
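A minimal sketch of how leftover Spark 3.2 references in workflow files could be located before pushing. The function name, the file suffixes, and the regular expression are assumptions made for illustration; they are not part of the Gluten repository or of this PR.

```python
import re
from pathlib import Path

# Assumed spellings of a Spark 3.2 reference: "spark-3.2", "spark32",
# "Spark 3.2". The trailing \b keeps "spark-3.25" from matching.
SPARK32_PATTERN = re.compile(r"spark[- ]?3\.?2\b", re.IGNORECASE)


def find_spark32_references(root, suffixes=(".yml", ".yaml", ".sh", ".xml")):
    """Return (path, line number, line) for every line mentioning Spark 3.2."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SPARK32_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the workflow directory mentioned in the comment above.
    for path, lineno, line in find_spark32_references(".github/workflows"):
        print(f"{path}:{lineno}: {line}")
```

Running it from the repository root prints one line per remaining reference, which gives a quick checklist of workflow steps that still request the removed profile.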

github-actions · 2026-01-04T18:34:49Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-05T06:09:19Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-06T00:21:15Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-06T00:36:39Z

Run Gluten Clickhouse CI on x86

QCLyu · 2026-01-06T07:50:57Z

Hi @zhouyuan, the code changes here look good. The only ClickHouse CI failure occurs because the Jenkins job still runs -Pspark-3.2 on Java 8, but this repo no longer has a spark-3.2 profile, so the build pulls Iceberg artifacts built for Java 11 (class version 55). That mismatch causes the compile error in gluten-iceberg. This is a CI config issue, not a code regression.

Please proceed with merge if everything else looks good to you, and we can update/disable the Spark 3.2 Java 8 leg in the Jenkins pipeline separately.

This is the only failed step in ClickHouse CI: https://opencicd.kyligence.com/job/gluten/job/gluten-ci/18337/flowGraphTable/
This is the log: https://opencicd.kyligence.com/job/gluten/job/gluten-ci/18337/execution/node/235/log/

18:00:32  [ERROR] COMPILATION ERROR : 
18:00:32  [INFO] -------------------------------------------------------------
18:00:32  [ERROR] /home/jenkins/agent/workspace/gluten/gluten-ci/ut-stage-1/gluten-iceberg/src/main/java/org/apache/gluten/connector/write/MetricsWrapper.java:[21,25] error: cannot access Metrics
18:00:32    bad class file: /root/.m2/repository/org/apache/iceberg/iceberg-spark-runtime-3.5_2.12/1.10.0/iceberg-spark-runtime-3.5_2.12-1.10.0.jar(org/apache/iceberg/Metrics.class)
18:00:32      class file has wrong version 55.0, should be 52.0
18:00:32      Please remove or make sure it appears in the correct subdirectory of the classpath.
18:00:32  [INFO] 1 error
18:00:32  [INFO] -------------------------------------------------------------
18:00:32  [INFO] ------------------------------------------------------------------------
18:00:32  [INFO] Reactor Summary for Gluten Parent Pom 1.6.0-SNAPSHOT:
18:00:32  [INFO] 
18:00:32  [INFO] Gluten Parent Pom .................................. SUCCESS [ 21.426 s]
18:00:32  [INFO] Gluten Ras ......................................... SUCCESS [ 22.345 s]
18:00:32  [INFO] Gluten Ras Common .................................. SUCCESS [ 51.109 s]
18:00:32  [INFO] Gluten Core ........................................ SUCCESS [ 39.051 s]
18:00:32  [INFO] Gluten Shims ....................................... SUCCESS [  0.336 s]
18:00:32  [INFO] Gluten Shims Common ................................ SUCCESS [  6.173 s]
18:00:32  [INFO] Gluten Shims for Spark 3.5 ......................... SUCCESS [ 12.425 s]
18:00:32  [INFO] Gluten UI .......................................... SUCCESS [  5.105 s]
18:00:32  [INFO] Gluten Substrait ................................... SUCCESS [01:01 min]
18:00:32  [INFO] Gluten Celeborn .................................... SUCCESS [  4.345 s]
18:00:32  [INFO] Gluten Iceberg ..................................... FAILURE [  5.848 s]
18:00:32  [INFO] Gluten DeltaLake ................................... SKIPPED
18:00:32  [INFO] Gluten Package ..................................... SKIPPED
18:00:32  [INFO] Gluten Ras Planner ................................. SKIPPED
18:00:32  [INFO] Gluten Kafka ....................................... SKIPPED
18:00:32  [INFO] Gluten Backends ClickHouse ......................... SKIPPED
18:00:32  [INFO] Gluten Unit Test Parent ............................ SKIPPED
18:00:32  [INFO] Gluten Unit Test Common ............................ SKIPPED
18:00:32  [INFO] Gluten Unit Test ................................... SKIPPED
18:00:32  [INFO] ------------------------------------------------------------------------
18:00:32  [INFO] BUILD FAILURE
18:00:32  [INFO] ------------------------------------------------------------------------
18:00:32  [INFO] Total time:  03:49 min
18:00:32  [INFO] Finished at: 2026-01-06T02:00:32Z
18:00:32  [INFO] ------------------------------------------------------------------------
18:00:32  [WARNING] The requested profile "spark-3.2" could not be activated because it does not exist.
18:00:32  [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.14.1:compile (default-compile) on project gluten-iceberg: Compilation failure
18:00:32  [ERROR] /home/jenkins/agent/workspace/gluten/gluten-ci/ut-stage-1/gluten-iceberg/src/main/java/org/apache/gluten/connector/write/MetricsWrapper.java:[21,25] error: cannot access Metrics
18:00:32  [ERROR]   bad class file: /root/.m2/repository/org/apache/iceberg/iceberg-spark-runtime-3.5_2.12/1.10.0/iceberg-spark-runtime-3.5_2.12-1.10.0.jar(org/apache/iceberg/Metrics.class)
18:00:32  [ERROR]     class file has wrong version 55.0, should be 52.0
18:00:32  [ERROR]     Please remove or make sure it appears in the correct subdirectory of the classpath.
18:00:32  [ERROR] 
18:00:32  [ERROR] -> [Help 1]
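The "wrong version 55.0, should be 52.0" message in the log above refers to the class file major version stored in the first eight bytes of every .class file. The sketch below decodes that header and maps the major version to a Java release; the helper names are hypothetical and chosen only for this illustration.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def class_file_version(header: bytes):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of class file header")
    # Layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return major, minor


def java_release(major: int) -> int:
    """Map a class file major version to a Java release (52 -> 8, 55 -> 11)."""
    return major - 44


if __name__ == "__main__":
    # Synthetic header with major version 55, as reported for Metrics.class.
    header = struct.pack(">IHH", CLASS_MAGIC, 0, 55)
    major, _ = class_file_version(header)
    print(f"major {major} -> Java {java_release(major)}")
```

A Java 8 compiler accepts class files only up to major version 52, which is why a dependency compiled for Java 11 fails at the compile step rather than at runtime.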

zhouyuan · 2026-01-06T09:36:07Z

@zzcclp could you please help to take a look?

QCLyu · 2026-01-07T03:57:25Z

Hi @zhouyuan @zzcclp, just following up on my last comment here. I'd love to get your thoughts on the CI config issue. Does the analysis make sense to you? I want to make sure we're aligned before I move forward.

zzcclp · 2026-01-07T08:26:58Z

Sorry for the late reply. I modified the CI script, please give it another try.

github-actions · 2026-01-08T03:36:32Z

Run Gluten Clickhouse CI on x86

* [Scala 2.13][IntelliJ] Remove suppression for lint-multiarg-infix warnings in pom.xml; see apache/spark#43332
* [Scala 2.13][IntelliJ] Suppress warning for `ContentFile::path`
* [Scala 2.13][IntelliJ] Suppress warning for ContextAwareIterator initialization
* [Scala 2.13][IntelliJ] Refactor to use Symbol for column references to fix compilation error in Scala 2.13 with IntelliJ compiler: symbol literal is deprecated; use Symbol("i")
* [Fix] Replace deprecated fileToString with Files.readString for file reading in GlutenSQLQueryTestSuite; see apache/spark#51911, which removes Spark's fileToString method from the Spark code base.
* [Scala 2.13][IntelliJ] Update the Java compiler release version from 8 to `${java.version}` in the Scala 2.13 profile to align it with `maven.compiler.target`
* [Refactor] Replace usage of `Symbol` with `col` for column references to align with Spark API best practices
---------
Co-authored-by: Chang chen <[email protected]>

…/ut (apache#11317)
Bumps org.apache.kafka:kafka_2.12 from 3.4.0 to 3.9.1.
---
updated-dependencies:
- dependency-name: org.apache.kafka:kafka_2.12
  dependency-version: 3.9.1
  dependency-type: direct:development
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Refactor] Rename GlutenCastSuite to GlutenCastWithAnsiOffSuite and update test settings to use the new suite
* [Refactor] Add GlutenDataSourceV2SQLSuite classes for V1 and V2 filter testing. Remove GlutenDataSourceV2SQLSuiteV1Filter.scala and GlutenDataSourceV2SQLSuiteV2Filter.scala
* [Refactor] Rename FallbackStrategiesSuite to GlutenFallbackStrategiesSuite and move to gluten package
* [Refactor] Consolidate GlutenDeleteFromTableSuite into GlutenGroupBasedDeleteFromTableSuite for cleaner structure
* [Refactor] Remove ParquetReadBenchmark as it is no longer necessary
* [Refactor] Adjust import structure and package declaration for GlutenValidateRequirementsSuite

…l values (apache#11331) --------- Co-authored-by: jiangtian <[email protected]>

…#11349)
Upstream Velox's New Commits:
2fdcd253e by Xiaoxuan Meng, misc: Added index bound type unit test (15879)
74af4ef1b by Xiao Du, feat: Add string compaction for approx_most_frequent global aggregation (15852)
48e853131 by Pedro Eugenio Rocha Pedreira, refactor(simple-function): Make materialization of string-types explicit (15869)
80638a89e by Artem Selishchev, fix: [velox] Reuse context in ZSTD_decompress (15854)
Signed-off-by: glutenperfbot <[email protected]>
Co-authored-by: glutenperfbot <[email protected]>

github-actions · 2026-01-08T03:53:41Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-08T03:57:36Z

Run Gluten Clickhouse CI on x86

PHILO-HE

Thank you so much for your work. Some comments. Please check if they make sense.

I suggest rebasing with git pull --rebase <remote name> main. Otherwise, commits already merged to the main branch are included in this PR and mixed with your changes, which makes the PR harder to review.

Please also clean up the uses of Spark 3.2 in the build scripts under dev.

Some Spark shim APIs may have been introduced only to handle differences between Spark 3.2 and later Spark versions. If so, we should remove them as well (this can be done in separate PRs).

PHILO-HE · 2026-01-08T05:40:40Z

.github/workflows/velox_backend_x86.yml

I assume we should remove only the Spark 3.2 UT jobs from CI. Other jobs, such as the Celeborn test, should switch to Spark 3.3 or a higher supported version instead of being deleted.

Thanks @PHILO-HE Agreed.
Also, created a linked issue for Spark 3.2-specific compatibility code removal: #11379
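As a rough illustration of the "retarget instead of delete" suggestion in this review thread, the sketch below rewrites a Maven command so that it requests a supported Spark profile. The function name and the sample command are assumptions for illustration; the real workflow steps may differ.

```python
import re

# Matches the removed profile flag only; "-Pspark-3.2x" style names are left alone.
_SPARK32_PROFILE = re.compile(r"-Pspark-3\.2(?![\w.])")


def retarget_profile(command: str, new_profile: str = "spark-3.3") -> str:
    """Replace -Pspark-3.2 in a build command with another Spark profile."""
    return _SPARK32_PROFILE.sub(f"-P{new_profile}", command)


if __name__ == "__main__":
    # Illustrative command, not copied from the actual workflow file.
    old = "mvn clean install -Pspark-3.2 -Pbackends-velox -Pceleborn"
    print(retarget_profile(old))
```

The job keeps its non-Spark coverage (here, the Celeborn profile) and only the Spark version changes, which matches the intent of keeping those tests in CI.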

PHILO-HE · 2026-01-08T05:50:34Z

.idea/vcs.xml

Please confirm if these changes were intended.

Thanks @PHILO-HE I'm working on it and will commit again.

github-actions · 2026-01-08T08:12:59Z

Run Gluten Clickhouse CI on x86

github-actions · 2026-01-08T08:21:37Z

Run Gluten Clickhouse CI on x86

github-actions bot added CORE, VELOX, INFRA, TOOLS, DOCS labels Jan 4, 2026

github-actions bot added the CLICKHOUSE label Jan 5, 2026

github-actions bot added BUILD FLINK labels Jan 8, 2026

QCLyu and others added 10 commits January 7, 2026 19:41

Cleanup: Remove Spark 3.2 shims, CI jobs, and docs (apache#8960)

67698cf

[GLUTEN-6887][VL] Daily Update Velox Version (2025_12_29) (apache#11337)

ce94264

[GLUTEN-11330][VL] Make PartialProject support array and map with nul…

d32882f

…l values (apache#11331) --------- Co-authored-by: jiangtian <[email protected]>

Address 403 access errors

9c34435

replace spark3.2 check by 3.3-3.5 in check-env.sh

a4bb405

temporarily exlcude q72 for spark4.0 in TPC-DS tests

0d170bd

QCLyu force-pushed the qingchuanlyu branch from 362d0c2 to 2118983 Compare January 8, 2026 03:53

github-actions bot added the DATA_LAKE label Jan 8, 2026

github-actions bot removed BUILD DATA_LAKE FLINK labels Jan 8, 2026

PHILO-HE reviewed Jan 8, 2026

QCLyu added 3 commits January 7, 2026 23:59

final fixes

bc13f72

add velox backend

696c703

undo changes to .idea/vcs.xml

e210cb3

QCLyu force-pushed the qingchuanlyu branch from 0a4fbf6 to 6c73691 Compare January 8, 2026 08:12

github-actions bot added BUILD DATA_LAKE FLINK labels Jan 8, 2026

QCLyu force-pushed the qingchuanlyu branch from 6c73691 to e210cb3 Compare January 8, 2026 08:21

QCLyu marked this pull request as draft January 8, 2026 08:24
