@baibaichen baibaichen commented Jan 4, 2026

What changes are proposed in this pull request?

Note: Commits categorized as Test Exclusion will be addressed in future PRs.

| Cause | Type | Category | Description | Affected Files |
|-------|------|----------|-------------|----------------|
| N/A | Feat | Build | Update build configuration to support Spark 4.1 UT | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/pom.xml`, `gluten-ut/spark41/pom.xml`, `tools/gluten-it/pom.xml` |
| #52165 | Fix | Dependency | Update Parquet dependency version to 1.16.0 to avoid NoSuchMethodError issue | `gluten-ut/spark41/pom.xml` |
| #51477 | Fix | Compatibility | Update imports to reflect streaming runtime package refactoring in Apache Spark | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala`, `gluten-ut/spark41/.../GlutenStreamingQuerySuite.scala` |
| #50674 | Fix | Compatibility | Fix compatibility issue introduced by `TypedConfigBuilder` | `gluten-substrait/.../ExpressionConverter.scala`, `gluten-ut/spark41/.../GlutenCSVSuite.scala`, `gluten-ut/spark41/.../GlutenJsonSuite.scala` |
| #49766 | Fix | Compatibility | Disable V2 bucketing in GlutenDynamicPartitionPruningSuite since `spark.sql.sources.v2.bucketing.enabled` is now enabled by default | `gluten-ut/spark41/.../GlutenDynamicPartitionPruningSuite.scala` |
| #42414, #53038 | Fix | Bug Fix | Resolve an issue introduced by SPARK-42414, as identified in SPARK-53038 | `backends-velox/.../VeloxBloomFilterAggregate.scala` |
| N/A | Fix | Bug Fix | Enforce row fallback for unsupported cached batches - keep columnar execution only when schema validation succeeds | `backends-velox/.../ColumnarCachedBatchSerializer.scala` |
| SPARK-53132, SPARK-53142 | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 KeyGroupedPartitioningSuite tests. Excluded tests: `SPARK-53322*`, `SPARK-54439*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| SPARK-53535, SPARK-54220 | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenParquetIOSuite tests. Excluded tests: `SPARK-53535*`, `vectorized reader: missing all struct fields*`, `SPARK-54220*` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| #52645 | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenStreamingQuerySuite tests. Excluded tests: `SPARK-53942: changing the number of stateless shuffle partitions via config`, `SPARK-53942: stateful shuffle partitions are retained from old checkpoint` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| #47856 | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenDataFrameWindowFunctionsSuite and GlutenJoinSuite tests. Excluded tests: `SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold`, `SPARK-49386: test SortMergeJoin (with spill by size threshold)` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| #52157 | 4.1.0 | Test Exclusion | Exclude additional Spark 4.1 GlutenQueryExecutionSuite tests. Excluded test: `#53413: Cleanup shuffle dependencies for commands` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| #48470 | 4.1.0 | Test Exclusion | Exclude split test in GlutenRegexpExpressionsSuite. Excluded test: `GlutenRegexpExpressionsSuite.SPLIT` | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| #51623 | 4.1.0 | Test Exclusion | Add `spark.sql.unionOutputPartitioning=false` to Maven test args. Excluded tests: `GlutenBroadcastExchangeSuite.SPARK-52962`, `GlutenDataFrameSetOperationsSuite.SPARK-52921*` | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala` |
| N/A | 4.1.0 | Test Exclusion | Exclude failing SQL tests that need to be fixed for Spark 4.1 compatibility. Excluded tests: `decimalArithmeticOperations.sql`, `identifier-clause.sql`, `keywords.sql`, `literals.sql`, `operators.sql`, `exists-orderby-limit.sql`, `postgreSQL/date.sql`, `nonansi/keywords.sql`, `nonansi/literals.sql`, `datetime-legacy.sql`, `datetime-parsing-invalid.sql`, `misc-functions.sql` | `gluten-ut/spark41/.../VeloxSQLQueryTestSettings.scala` |
| #11252 | 4.1.0 | Test Exclusion | Exclude Gluten test for SPARK-47939: Explain should work with parameterized queries | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
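
For reviewers unfamiliar with the cached-batch fix listed above, the decision it describes ("keep columnar execution only when schema validation succeeds") can be pictured roughly as below. This is only an illustrative sketch, not the code in `ColumnarCachedBatchSerializer.scala`: every type and name here (`FieldType`, `Field`, `CacheFormat`, `CachedBatchPolicy`, and the set of supported types) is hypothetical and chosen purely for the example.

```scala
// Hypothetical, simplified model: none of these names come from Gluten or Spark.
sealed trait FieldType
case object IntField extends FieldType
case object StringField extends FieldType
case object VariantField extends FieldType // stand-in for a type the backend cannot cache

final case class Field(name: String, fieldType: FieldType)

sealed trait CacheFormat
case object ColumnarCache extends CacheFormat
case object RowCache extends CacheFormat

object CachedBatchPolicy {
  // Assumed validation rule: every field must have a supported type.
  private val supported: Set[FieldType] = Set(IntField, StringField)

  def validateSchema(schema: Seq[Field]): Boolean =
    schema.forall(f => supported.contains(f.fieldType))

  // Columnar only when validation succeeds; otherwise force the row path.
  def chooseFormat(schema: Seq[Field]): CacheFormat =
    if (validateSchema(schema)) ColumnarCache else RowCache
}

object CachedBatchPolicyDemo extends App {
  val ok = Seq(Field("id", IntField), Field("name", StringField))
  val bad = ok :+ Field("payload", VariantField)
  println(CachedBatchPolicy.chooseFormat(ok))  // prints "ColumnarCache"
  println(CachedBatchPolicy.chooseFormat(bad)) // prints "RowCache"
}
```

The point of the sketch is that the fallback is decided once from the schema, so a single unsupported field sends the whole batch down the row path rather than failing later.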

Fixes #11343

How was this patch tested?

Tested with Spark 4.1 unit tests.
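
Several of the Test Exclusion entries in the description end in `*` (for example `SPARK-53322*`). Assuming the trailing `*` denotes a prefix match on the test name, which this PR does not state explicitly, the matching could be sketched as follows. `ExclusionMatcher` and the sample test titles are made up for illustration and are not the helper used in `VeloxTestSettings.scala`.

```scala
// Hypothetical helper; assumes a trailing '*' means "any test name with this prefix".
object ExclusionMatcher {
  def isExcluded(testName: String, patterns: Seq[String]): Boolean =
    patterns.exists { p =>
      if (p.endsWith("*")) testName.startsWith(p.dropRight(1)) // prefix match
      else testName == p                                        // exact match
    }
}

object ExclusionMatcherDemo extends App {
  // Patterns copied from the PR description.
  val patterns = Seq("SPARK-53322*", "SPARK-54439*", "GlutenRegexpExpressionsSuite.SPLIT")

  // The test titles below are invented placeholders, not real Spark test names.
  println(ExclusionMatcher.isExcluded("SPARK-53322: placeholder title", patterns))
  println(ExclusionMatcher.isExcluded("GlutenRegexpExpressionsSuite.SPLIT", patterns))
  println(ExclusionMatcher.isExcluded("SPARK-11111: unrelated placeholder", patterns))
}
```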

@github-actions github-actions bot added the CORE (works for Gluten Core), VELOX, and INFRA labels Jan 4, 2026

github-actions bot commented Jan 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen changed the title from "Support Spark 4.1 UT" to "[GLUTEN-11343][CORE][VL] Support Spark 4.1 UT" Jan 4, 2026
@github-actions github-actions bot added the TOOLS label Jan 4, 2026
@baibaichen baibaichen marked this pull request as ready for review January 7, 2026 10:16
@zhouyuan zhouyuan left a comment

👍

<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>

We need to remove these properties in a subsequent PR.

@baibaichen baibaichen requested a review from Copilot January 7, 2026 16:01
Copilot AI left a comment

Pull request overview

This PR adds support for Spark 4.1 unit tests by updating the build configuration, resolving compatibility issues, and adding new test resources. The changes accommodate API changes introduced in Spark 4.1, including dependency updates, package refactorings, and configuration parameter modifications.

Key Changes

  • Updated build and dependency configurations to support Spark 4.1 testing
  • Fixed compatibility issues from Spark API changes (streaming package refactoring, TypedConfigBuilder, V2 bucketing defaults)
  • Added comprehensive SQL test input files for Spark 4.1 compatibility validation
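
As a rough illustration of the V2 bucketing item, a test configuration can pin `spark.sql.sources.v2.bucketing.enabled` back to `false` so a suite does not pick up the new default. The snippet below is a sketch under assumptions, not the change made in `GlutenDynamicPartitionPruningSuite.scala`: the object and method names are hypothetical, and it needs `spark-core` on the classpath.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper, not part of Gluten or Spark.
object DisableV2BucketingSketch {
  val V2BucketingKey = "spark.sql.sources.v2.bucketing.enabled"

  // Returns a copy of the given configuration with V2 bucketing switched off,
  // leaving the caller's SparkConf untouched.
  def withV2BucketingDisabled(base: SparkConf): SparkConf =
    base.clone.set(V2BucketingKey, "false")
}

object DisableV2BucketingDemo extends App {
  // loadDefaults = false keeps the example independent of system properties.
  val conf = DisableV2BucketingSketch.withV2BucketingDisabled(new SparkConf(false))
  println(conf.get(DisableV2BucketingSketch.V2BucketingKey))
}
```

Setting the value explicitly, rather than relying on the default, keeps the suite's behavior stable if the default changes again.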

@baibaichen baibaichen merged commit ed5b65e into apache:main Jan 8, 2026
117 of 119 checks passed
@baibaichen baibaichen deleted the feature/41_ut branch January 8, 2026 05:37
Labels

CORE (works for Gluten Core), INFRA, TOOLS, VELOX

Development

Successfully merging this pull request may close these issues.

Add Spark 4.1.x unit tests
