Releases: apache/incubator-gluten

v1.6.0-rc0

24 Feb 20:29

Pre-release

What's Changed

  • [TEST] Disable a gluten test temporarily: cast string to timestamp by @PHILO-HE in #10518
  • [CORE] Bump version to 1.6.0-SNAPSHOT by @PHILO-HE in #10517
  • [MINOR] Refactor a string concatenation by following scala style by @beliefer in #10520
  • [VL][INFRA] Fix docker build error on Centos-7 by @PHILO-HE in #10522
  • [GLUTEN-8953][VL] Support Iceberg overwrite table by @Zouxxyy in #10514
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_26) by @GlutenPerfBot in #10528
  • [GLUTEN-10521][VL] Fall back to_json function for uppercase struct field name by @zml1206 in #10523
  • [VL] Gluten-it: Simplify CollectionConverter.scala by @zhztheplayer in #10533
  • [VL] Fix missing path package/** from the Velox backend PR CI path trigger by @zhztheplayer in #10538
  • [GLUTEN-10529] Remove unnecessary create for Runtimes by @beliefer in #10530
  • [TEST][VL] Reinclude "cast string to timestamp" test by @PHILO-HE in #10532
  • [VL] Extend gluten-it to support more data source types by @zhztheplayer in #10554
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_27) by @GlutenPerfBot in #10549
  • [GLUTEN-10555] Remove unnecessary parameter leafTransformers for WholeStageTransformer by @beliefer in #10556
  • [VL] Gluten-it: Clean up Maven dependency relationships by @zhztheplayer in #10563
  • [GLUTEN-10552][VL] Fix openEuler compiling issue by @zhouyuan in #10564
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_28) by @GlutenPerfBot in #10571
  • [FLINK] Add java-17 profile for Flink build and update project version in flink doc by @zjuwangg in #10561
  • [GLUTEN-9671][VL] Fix broadcast exchange stackoverflow due to Kryo serialization by @felixloesing in #10541
  • [VL] Separate filesystem configuration initialization by @marin-ma in #10540
  • [MINOR] Remove unnecessary fields by @beliefer in #10560
  • [VL] Support independent Gluten CPP build by @kerwin-zk in #10575
  • [GLUTEN-10107][CH] Decouple Celeborn-related code from CH backend module by @zjuwangg in #10537
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_29) by @GlutenPerfBot in #10580
  • [GLUTEN-10578] Remove unnecessary numaBindingInfo by @beliefer in #10579
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_30) by @GlutenPerfBot in #10588
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_08_31) by @GlutenPerfBot in #10589
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_01) by @GlutenPerfBot in #10592
  • [GLUTEN-10107][INFRA] Deprecate isUseUniffleShuffleManager from glutenConfig by @zjuwangg in #10558
  • [VL] Gluten-it: Support using Delta tables in TPC-H and TPC-DS benchmarks by @zhztheplayer in #10562
  • [GLUTEN-10582][VL] Add Cudf memory resource mode and percent parameters by @jinchengchenghh in #10583
  • [GLUTEN-8852][VL] Update package script for spark-400 by @zhouyuan in #10584
  • [GLUTEN-8821][VL] Weekly Update Velox function support docs (2025_09_01) by @GlutenPerfBot in #10590
  • [GLUTEN-10387][VL] Set ANSI mode for Velox according to Spark's configuration by @PHILO-HE in #10385
  • [VL] Gluten-it: Update Delta versions, and other minors by @zhztheplayer in #10594
  • [VL] Update Velox branch by @rui-mo in #10597
  • [GLUTEN-10524] Remove unnecessary outputAttributes from BasicScanExecTransformer by @beliefer in #10525
  • [GLUTEN-10595][VL] Separate cpp test utils from the utils directory by @marin-ma in #10596
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_02) by @GlutenPerfBot in #10601
  • [DOC][FLINK] Update flink build command to skip gpg and spotless check by @zjuwangg in #10604
  • [Minor] Refactor test utility to let users compare the query result by @jinchengchenghh in #10565
  • [VL][Minor] Remove unused code for shuffle compression mode by @marin-ma in #10609
  • [GLUTEN-10599][VL] Fix Centos dev docker image build by @zhouyuan in #10600
  • [GLUTEN-10599][VL] Followup to enable git in CI scripts by @zhouyuan in #10610
  • [Minor] Add enhanced features runtime config by @jinchengchenghh in #10608
  • [VL] Update class duplication list in Maven enforcer by @zhztheplayer in #10536
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_03) by @GlutenPerfBot in #10612
  • [GLUTEN-9335][VL] Support iceberg partition write by @jinchengchenghh in #10497
  • [GLUTEN-10607][MINOR] Fix: Use setSafe in DateWriter to avoid overflow by @jiangjiangtian in #10581
  • [GLUTEN-10577][CELEBORN] Refactor CelebornShuffleManager to load factory in a better way by @zjuwangg in #10591
  • [CORE] Merge SubstraitUtil classes by @kevinwilfong in #10587
  • [GLUTEN-10546][FLINK] Support all flink operators for nexmark by @shuai-xu in #10548
  • [GLUTEN-10566][VL] Add Spark unix_timestamp support with timestamp and format arguments by @nimesh1601 in #10567
  • [Minor] Fix the velox target duplicated include VELOX_BUILD_PATH by @jinchengchenghh in #10615
  • [GLUTEN-10570][FLINK] Add --add-opens options to MAVEN_OPTS for Java 17 compatibility by @KevinyhZou in #10572
  • [GLUTEN-10605][VL] Rewrite unbounded window to an equivalent aggregate join by @zml1206 in #10606
  • [GLUTEN-10013][FLINK] Support function reinterpret by @KevinyhZou in #10022
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_04) by @GlutenPerfBot in #10626
  • [VL] Refactor gluten-it to pass structured query information to runner by @zhztheplayer in #10623
  • [GLUTEN-10210][VL] Enable tpcds tests for Spark-400 in CI by @zhouyuan in #10633
  • [VL] Fix arrow url typo by @liujiayi771 in #10641
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_06) by @GlutenPerfBot in #10643
  • [GLUTEN-10214][VL] Merge inputstream for shuffle reader by @marin-ma in #10499
  • [MINOR] Add .java-version to .gitignore by @Zouxxyy in #10642
  • [GLUTEN-10618][VL] Update input iterator metrics name to include more details by @marin-ma in #10619
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_08) by @GlutenPerfBot in #10653
  • [GLUTEN-10635][VL] bugfix: file INSTALL cannot set permissions by @beliefer in #10638
  • [GLUTEN-10361][FLINK] Fix UT failure between the conversion of BinaryRowData and StatefulRecord by @KevinyhZou in #10362
  • [GLUTEN-8889][VL] Fix Spark-355 download in GHA by @zhouyuan in #10655
  • [GLUTEN-10450][VL] Reclassify internal/public configs and remove internal configs from doc by @zjuwangg in #10603
  • [GLUTEN-9366][VL] Support Iceberg functions by @jinchengchenghh in #10285
  • [GLUTEN-10544] Remove unnecessary method separateScanRDD by @beliefer in #10545
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_09_09) by @GlutenPerfBot in #10656
    ...

v1.5.0

29 Oct 07:30
a269698

What's Changed

  • [GLUTEN-8846][CH] [Part 3] Add benchmark for Iceberg Delete by @baibaichen in #9192
  • [GLUTEN-9020][CH] Support delta DV BitmapAggregator by @loneylee in #9138
  • [GLUTEN-9197][CH] Simplify sum aggregate expression by @taiyang-li in #9198
  • [VL] Enable more ut in VeloxTestSettings by @WangGuangxin in #9080
  • [GLUTEN-9199][VL] Fix error when creating shuffle file: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments by @zhztheplayer in #9200
  • [CORE] Fix duplicate setting for config LEGACY_TIME_PARSER_POLICY by @jinchengchenghh in #9201
  • [GLUTEN-9176][CH] Rewrite aggregate if to aggregate with filter clause by @taiyang-li in #9185
  • [GLUTEN-8557][CH] Flatten nested And/Or for performance optimization by @KevinyhZou in #8558
  • Revert "[GLUTEN-9164][CH]Enable row group level bloom filter push down" by @taiyang-li in #9214
  • [GLUTEN-9182][VL] Support new s3 configuration in Gluten by @dcoliversun in #9183
  • [VL] Celeborn shuffle reader OOM with many empty input stream by @marin-ma in #9221
  • [GLUTEN-8821][VL] Update aggregate/generator/window support doc and script by @marin-ma in #8971
  • [VL] Change to use Velox's wget_and_untar in setup-centos7.sh by @yaooqinn in #9207
  • [GLUTEN-9196][CH] Use wide-table aggregation to eliminate multi-table joins by @lgbo-ustc in #9155
  • [GLUTEN-9149][CORE] Remove Spark-specific code from JniLibLoader & JniWorkspace by @shuai-xu in #9150
  • [VL][CI] Change to use JDK-17 for Spark 3.3/3.4/3.5 tests by @PHILO-HE in #9209
  • [CORE][VL] Hide child nodes from implementations of OffloadSingleNode by @zhztheplayer in #9220
  • [GLUTEN-9008][VL] Support json_object_keys function by @dcoliversun in #9009
  • [GLUTEN-9239][CH] Support JDK17 for the CH backend by @zzcclp in #9242
  • [GLUTEN-9152][CORE] Avoid unnecessary serialization of hadoop conf by @zml1206 in #9153
  • [GLUTEN-9240][VL] Write NULL value into relation in gluten unit tests by @dcoliversun in #9241
  • [VL][CI] bump to use ubuntu-22.04 runner by @zhouyuan in #9262
  • [GLUTEN-9177][CH]Fix diff on parse host of url and refactor SparkParseURL by @KevinyhZou in #9179
  • [CORE] Decrease offheap memory size in resource profile for whole stage fallback case by @PHILO-HE in #8911
  • [GLUTEN-9205][CH] Support deletion vector native write by @loneylee in #9248
  • [VL] Delete global reference to a class object in JNI unload by @PHILO-HE in #9268
  • [GLUTEN-9245][VL] Fix partial project expression contains subquery by @jinchengchenghh in #9259
  • [GLUTEN-9244][CORE] Change the way of passing default timezone to native config by @zml1206 in #9249
  • [GLUTEN-8497][VL] Fix columnar batch type mismatch in table cache by @zhztheplayer in #9230
  • [VL] Support Spark legacy statistical aggregation function behavior by @NEUpanning in #9181
  • [CORE] Remove library unloading API from JniLibLoader as unused by @zhztheplayer in #9277
  • [GLUTEN-9237][CH] Fix the nullability mismatch issue for the Nothing type by @lgbo-ustc in #9238
  • [VL] Disable FlushableHashAggreagte when aggregates contains sum/avg for floating type by @kecookier in #8986
  • [CORE] Refine the test with specified spark version by @yikf in #9274
  • [CH] Add a comment to explain why the endpoint uses a single thread by @dcoliversun in #9257
  • [GLUTEN-8891][VL] Refine local ssd cache feature by @zhouyuan in #9228
  • [GLUTEN-9267][CH] Fix a bug in EliminateDeduplicateAggregateWithAnyJoin by @lgbo-ustc in #9293
  • [VL] Remove param original of ColumnarPartialProjectExec by @zml1206 in #9290
  • [GLUTEN-9178][CH] Fix cse in aggregate operator not working by @loneylee in #9301
  • [CORE] Post events until both spark ui and gluten ui are enabled by @yikf in #9272
  • [CORE] Correctly handle driver configurations when spark.sql.extensions is explicitly set for GlutenSessionExtensions by @zhztheplayer in #9312
  • [GLUTEN-8851][VL] feat: Support cudf by @jinchengchenghh in #9229
  • [GLUTEN-9288][VL] Enable array_prepend function for spark 3.5+ by @dcoliversun in #9305
  • [GLUTEN-9317][CH]Fix: duplicated column names in shuffle read by @lgbo-ustc in #9318
  • [Gluten-9254][CH] Support RDDScanExec by @loneylee in #9270
  • [VL] Count total JVM memory as the on-heap portion for the off-heap sizing feature by @zhztheplayer in #9321
  • [GLUTEN-9300][DOC] Support replacement expression in gen-function-support-docs by @dcoliversun in #9331
  • [GLUTEN-9239][CH] [PART-1] Support Java-17 Remove JNI_OnUnload by @baibaichen in #9275
  • [GLUTEN-7652][VL] Support binary as string by @wForget in #9325
  • [Gluten-9334][CH] Support delta metadata column file_path and row_index for mergetree by @loneylee in #9340
  • [GLUTEN-6867][CH] Fix Bug that can't read file on minio by @baibaichen in #9332
  • [VL] Provide a configuration option to completely turn off off-heap memory tracking with Spark memory manager by @zhztheplayer in #9341
  • [GLUTEN-9313][VL] ColumnarPartialProject supports built-in but blacklisted function by @WangGuangxin in #9315
  • [GLUTEN-8772][CORE] refactor: Refactoring the usage of SubstraitContext#functionMap by @wypb in #8775
  • [VL] Move pre-configuration code of dynamic off-heap sizing to its own place by @zhztheplayer in #9336
  • [GLUTEN-9163][VL] Use stream de/compressor in sort-based shuffle by @marin-ma in #9278
  • [GLUTEN-9287][VL] Enable array_compact function for Spark 3.4+ by @dcoliversun in #9349
  • [GLUTEN-9095][UT] Remove Vanilla Spark InternalRow based checkEvaluation by @ArnavBalyan in #9096
  • [CORE] Make max broadcast table size configurable by @yaooqinn in #9359
  • [CH] Fix build error by @exmy in #9363
  • [GLUTEN-9243][VL] Fix cuda docker image by @zhouyuan in #9333
  • [GLUTEN-8912][VL] Add Offset support for CollectLimitExec by @ArnavBalyan in #8914
  • [GLUTEN-7589][VL] Support date_trunc function by @zml1206 in #7611
  • [GLUTEN-9279] Not pulling out expression from PartialMerge aggregate function to avoid invalid reference binding in ProjectExecTransformer by @Z1Wu in #9280
  • [Gluten-8792][CH] Support delta project incrementMetric expr by @loneylee in #9353
  • [GLUTEN-9034][VL] Add VeloxResizeBatchesExec for Shuffle by @WangGuangxin in #9035
  • Fix ColumnarToRowRemovalGuard not able to be copied by @yaooqinn in #9384
  • [GLUTEN-8846][CH] [Part 4] Add full-chain UT by @jlfsdtc in #9256
  • [VL] Follow up on #9384 to avoid swallowing exceptions in UT by @zhztheplayer in #9393
  • [GLUTEN-9163][VL] Separate compression buffer and disk write buffer configuration by @marin-ma in #9356
  • [VL][INFRA] Improve build bundle package workflow by @wForget in #9404
  • [VL] Refactor WholeStageTransformer to remove some duplicate code by @wyp...

v1.5.0-rc1

14 Oct 08:33
a269698

Pre-release

What's Changed

  • [GLUTEN-8846][CH] [Part 3] Add benchmark for Iceberg Delete by @baibaichen in #9192
  • [GLUTEN-9020][CH] Support delta DV BitmapAggregator by @loneylee in #9138
  • [GLUTEN-9197][CH] Simplify sum aggregate expression by @taiyang-li in #9198
  • [VL] Enable more ut in VeloxTestSettings by @WangGuangxin in #9080
  • [GLUTEN-9199][VL] Fix error when creating shuffle file: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments by @zhztheplayer in #9200
  • [CORE] Fix duplicate setting for config LEGACY_TIME_PARSER_POLICY by @jinchengchenghh in #9201
  • [GLUTEN-9176][CH] Rewrite aggregate if to aggregate with filter clause by @taiyang-li in #9185
  • [GLUTEN-8557][CH] Flatten nested And/Or for performance optimization by @KevinyhZou in #8558
  • Revert "[GLUTEN-9164][CH]Enable row group level bloom filter push down" by @taiyang-li in #9214
  • [GLUTEN-9182][VL] Support new s3 configuration in Gluten by @dcoliversun in #9183
  • [VL] Celeborn shuffle reader OOM with many empty input stream by @marin-ma in #9221
  • [GLUTEN-8821][VL] Update aggregate/generator/window support doc and script by @marin-ma in #8971
  • [VL] Change to use Velox's wget_and_untar in setup-centos7.sh by @yaooqinn in #9207
  • [GLUTEN-9196][CH] Use wide-table aggregation to eliminate multi-table joins by @lgbo-ustc in #9155
  • [GLUTEN-9149][CORE] Remove Spark-specific code from JniLibLoader & JniWorkspace by @shuai-xu in #9150
  • [VL][CI] Change to use JDK-17 for Spark 3.3/3.4/3.5 tests by @PHILO-HE in #9209
  • [CORE][VL] Hide child nodes from implementations of OffloadSingleNode by @zhztheplayer in #9220
  • [GLUTEN-9008][VL] Support json_object_keys function by @dcoliversun in #9009
  • [GLUTEN-9239][CH] Support JDK17 for the CH backend by @zzcclp in #9242
  • [GLUTEN-9152][CORE] Avoid unnecessary serialization of hadoop conf by @zml1206 in #9153
  • [GLUTEN-9240][VL] Write NULL value into relation in gluten unit tests by @dcoliversun in #9241
  • [VL][CI] bump to use ubuntu-22.04 runner by @zhouyuan in #9262
  • [GLUTEN-9177][CH]Fix diff on parse host of url and refactor SparkParseURL by @KevinyhZou in #9179
  • [CORE] Decrease offheap memory size in resource profile for whole stage fallback case by @PHILO-HE in #8911
  • [GLUTEN-9205][CH] Support deletion vector native write by @loneylee in #9248
  • [VL] Delete global reference to a class object in JNI unload by @PHILO-HE in #9268
  • [GLUTEN-9245][VL] Fix partial project expression contains subquery by @jinchengchenghh in #9259
  • [GLUTEN-9244][CORE] Change the way of passing default timezone to native config by @zml1206 in #9249
  • [GLUTEN-8497][VL] Fix columnar batch type mismatch in table cache by @zhztheplayer in #9230
  • [VL] Support Spark legacy statistical aggregation function behavior by @NEUpanning in #9181
  • [CORE] Remove library unloading API from JniLibLoader as unused by @zhztheplayer in #9277
  • [GLUTEN-9237][CH] Fix the nullability mismatch issue for the Nothing type by @lgbo-ustc in #9238
  • [VL] Disable FlushableHashAggreagte when aggregates contains sum/avg for floating type by @kecookier in #8986
  • [CORE] Refine the test with specified spark version by @yikf in #9274
  • [CH] Add a comment to explain why the endpoint uses a single thread by @dcoliversun in #9257
  • [GLUTEN-8891][VL] Refine local ssd cache feature by @zhouyuan in #9228
  • [GLUTEN-9267][CH] Fix a bug in EliminateDeduplicateAggregateWithAnyJoin by @lgbo-ustc in #9293
  • [VL] Remove param original of ColumnarPartialProjectExec by @zml1206 in #9290
  • [GLUTEN-9178][CH] Fix cse in aggregate operator not working by @loneylee in #9301
  • [CORE] Post events until both spark ui and gluten ui are enabled by @yikf in #9272
  • [CORE] Correctly handle driver configurations when spark.sql.extensions is explicitly set for GlutenSessionExtensions by @zhztheplayer in #9312
  • [GLUTEN-8851][VL] feat: Support cudf by @jinchengchenghh in #9229
  • [GLUTEN-9288][VL] Enable array_prepend function for spark 3.5+ by @dcoliversun in #9305
  • [GLUTEN-9317][CH]Fix: duplicated column names in shuffle read by @lgbo-ustc in #9318
  • [Gluten-9254][CH] Support RDDScanExec by @loneylee in #9270
  • [VL] Count total JVM memory as the on-heap portion for the off-heap sizing feature by @zhztheplayer in #9321
  • [GLUTEN-9300][DOC] Support replacement expression in gen-function-support-docs by @dcoliversun in #9331
  • [GLUTEN-9239][CH] [PART-1] Support Java-17 Remove JNI_OnUnload by @baibaichen in #9275
  • [GLUTEN-7652][VL] Support binary as string by @wForget in #9325
  • [Gluten-9334][CH] Support delta metadata column file_path and row_index for mergetree by @loneylee in #9340
  • [GLUTEN-6867][CH] Fix Bug that can't read file on minio by @baibaichen in #9332
  • [VL] Provide a configuration option to completely turn off off-heap memory tracking with Spark memory manager by @zhztheplayer in #9341
  • [GLUTEN-9313][VL] ColumnarPartialProject supports built-in but blacklisted function by @WangGuangxin in #9315
  • [GLUTEN-8772][CORE] refactor: Refactoring the usage of SubstraitContext#functionMap by @wypb in #8775
  • [VL] Move pre-configuration code of dynamic off-heap sizing to its own place by @zhztheplayer in #9336
  • [GLUTEN-9163][VL] Use stream de/compressor in sort-based shuffle by @marin-ma in #9278
  • [GLUTEN-9287][VL] Enable array_compact function for Spark 3.4+ by @dcoliversun in #9349
  • [GLUTEN-9095][UT] Remove Vanilla Spark InternalRow based checkEvaluation by @ArnavBalyan in #9096
  • [CORE] Make max broadcast table size configurable by @yaooqinn in #9359
  • [CH] Fix build error by @exmy in #9363
  • [GLUTEN-9243][VL] Fix cuda docker image by @zhouyuan in #9333
  • [GLUTEN-8912][VL] Add Offset support for CollectLimitExec by @ArnavBalyan in #8914
  • [GLUTEN-7589][VL] Support date_trunc function by @zml1206 in #7611
  • [GLUTEN-9279] Not pulling out expression from PartialMerge aggregate function to avoid invalid reference binding in ProjectExecTransformer by @Z1Wu in #9280
  • [Gluten-8792][CH] Support delta project incrementMetric expr by @loneylee in #9353
  • [GLUTEN-9034][VL] Add VeloxResizeBatchesExec for Shuffle by @WangGuangxin in #9035
  • Fix ColumnarToRowRemovalGuard not able to be copied by @yaooqinn in #9384
  • [GLUTEN-8846][CH] [Part 4] Add full-chain UT by @jlfsdtc in #9256
  • [VL] Follow up on #9384 to avoid swallowing exceptions in UT by @zhztheplayer in #9393
  • [GLUTEN-9163][VL] Separate compression buffer and disk write buffer configuration by @marin-ma in #9356
  • [VL][INFRA] Improve build bundle package workflow by @wForget in #9404
  • [VL] Refactor WholeStageTransformer to remove some duplicate code by @wyp...

v1.5.0-rc0

03 Oct 17:40
42e3f92

Pre-release

What's Changed

  • [GLUTEN-8846][CH] [Part 3] Add benchmark for Iceberg Delete by @baibaichen in #9192
  • [GLUTEN-9020][CH] Support delta DV BitmapAggregator by @loneylee in #9138
  • [GLUTEN-9197][CH] Simplify sum aggregate expression by @taiyang-li in #9198
  • [VL] Enable more ut in VeloxTestSettings by @WangGuangxin in #9080
  • [GLUTEN-9199][VL] Fix error when creating shuffle file: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments by @zhztheplayer in #9200
  • [CORE] Fix duplicate setting for config LEGACY_TIME_PARSER_POLICY by @jinchengchenghh in #9201
  • [GLUTEN-9176][CH] Rewrite aggregate if to aggregate with filter clause by @taiyang-li in #9185
  • [GLUTEN-8557][CH] Flatten nested And/Or for performance optimization by @KevinyhZou in #8558
  • Revert "[GLUTEN-9164][CH]Enable row group level bloom filter push down" by @taiyang-li in #9214
  • [GLUTEN-9182][VL] Support new s3 configuration in Gluten by @dcoliversun in #9183
  • [VL] Celeborn shuffle reader OOM with many empty input stream by @marin-ma in #9221
  • [GLUTEN-8821][VL] Update aggregate/generator/window support doc and script by @marin-ma in #8971
  • [VL] Change to use Velox's wget_and_untar in setup-centos7.sh by @yaooqinn in #9207
  • [GLUTEN-9196][CH] Use wide-table aggregation to eliminate multi-table joins by @lgbo-ustc in #9155
  • [GLUTEN-9149][CORE] Remove Spark-specific code from JniLibLoader & JniWorkspace by @shuai-xu in #9150
  • [VL][CI] Change to use JDK-17 for Spark 3.3/3.4/3.5 tests by @PHILO-HE in #9209
  • [CORE][VL] Hide child nodes from implementations of OffloadSingleNode by @zhztheplayer in #9220
  • [GLUTEN-9008][VL] Support json_object_keys function by @dcoliversun in #9009
  • [GLUTEN-9239][CH] Support JDK17 for the CH backend by @zzcclp in #9242
  • [GLUTEN-9152][CORE] Avoid unnecessary serialization of hadoop conf by @zml1206 in #9153
  • [GLUTEN-9240][VL] Write NULL value into relation in gluten unit tests by @dcoliversun in #9241
  • [VL][CI] bump to use ubuntu-22.04 runner by @zhouyuan in #9262
  • [GLUTEN-9177][CH]Fix diff on parse host of url and refactor SparkParseURL by @KevinyhZou in #9179
  • [CORE] Decrease offheap memory size in resource profile for whole stage fallback case by @PHILO-HE in #8911
  • [GLUTEN-9205][CH] Support deletion vector native write by @loneylee in #9248
  • [VL] Delete global reference to a class object in JNI unload by @PHILO-HE in #9268
  • [GLUTEN-9245][VL] Fix partial project expression contains subquery by @jinchengchenghh in #9259
  • [GLUTEN-9244][CORE] Change the way of passing default timezone to native config by @zml1206 in #9249
  • [GLUTEN-8497][VL] Fix columnar batch type mismatch in table cache by @zhztheplayer in #9230
  • [VL] Support Spark legacy statistical aggregation function behavior by @NEUpanning in #9181
  • [CORE] Remove library unloading API from JniLibLoader as unused by @zhztheplayer in #9277
  • [GLUTEN-9237][CH] Fix the nullability mismatch issue for the Nothing type by @lgbo-ustc in #9238
  • [VL] Disable FlushableHashAggreagte when aggregates contains sum/avg for floating type by @kecookier in #8986
  • [CORE] Refine the test with specified spark version by @yikf in #9274
  • [CH] Add a comment to explain why the endpoint uses a single thread by @dcoliversun in #9257
  • [GLUTEN-8891][VL] Refine local ssd cache feature by @zhouyuan in #9228
  • [GLUTEN-9267][CH] Fix a bug in EliminateDeduplicateAggregateWithAnyJoin by @lgbo-ustc in #9293
  • [VL] Remove param original of ColumnarPartialProjectExec by @zml1206 in #9290
  • [GLUTEN-9178][CH] Fix cse in aggregate operator not working by @loneylee in #9301
  • [CORE] Post events until both spark ui and gluten ui are enabled by @yikf in #9272
  • [CORE] Correctly handle driver configurations when spark.sql.extensions is explicitly set for GlutenSessionExtensions by @zhztheplayer in #9312
  • [GLUTEN-8851][VL] feat: Support cudf by @jinchengchenghh in #9229
  • [GLUTEN-9288][VL] Enable array_prepend function for spark 3.5+ by @dcoliversun in #9305
  • [GLUTEN-9317][CH]Fix: duplicated column names in shuffle read by @lgbo-ustc in #9318
  • [Gluten-9254][CH] Support RDDScanExec by @loneylee in #9270
  • [VL] Count total JVM memory as the on-heap portion for the off-heap sizing feature by @zhztheplayer in #9321
  • [GLUTEN-9300][DOC] Support replacement expression in gen-function-support-docs by @dcoliversun in #9331
  • [GLUTEN-9239][CH] [PART-1] Support Java-17 Remove JNI_OnUnload by @baibaichen in #9275
  • [GLUTEN-7652][VL] Support binary as string by @wForget in #9325
  • [Gluten-9334][CH] Support delta metadata column file_path and row_index for mergetree by @loneylee in #9340
  • [GLUTEN-6867][CH] Fix Bug that can't read file on minio by @baibaichen in #9332
  • [VL] Provide a configuration option to completely turn off off-heap memory tracking with Spark memory manager by @zhztheplayer in #9341
  • [GLUTEN-9313][VL] ColumnarPartialProject supports built-in but blacklisted function by @WangGuangxin in #9315
  • [GLUTEN-8772][CORE] refactor: Refactoring the usage of SubstraitContext#functionMap by @wypb in #8775
  • [VL] Move pre-configuration code of dynamic off-heap sizing to its own place by @zhztheplayer in #9336
  • [GLUTEN-9163][VL] Use stream de/compressor in sort-based shuffle by @marin-ma in #9278
  • [GLUTEN-9287][VL] Enable array_compact function for Spark 3.4+ by @dcoliversun in #9349
  • [GLUTEN-9095][UT] Remove Vanilla Spark InternalRow based checkEvaluation by @ArnavBalyan in #9096
  • [CORE] Make max broadcast table size configurable by @yaooqinn in #9359
  • [CH] Fix build error by @exmy in #9363
  • [GLUTEN-9243][VL] Fix cuda docker image by @zhouyuan in #9333
  • [GLUTEN-8912][VL] Add Offset support for CollectLimitExec by @ArnavBalyan in #8914
  • [GLUTEN-7589][VL] Support date_trunc function by @zml1206 in #7611
  • [GLUTEN-9279] Not pulling out expression from PartialMerge aggregate function to avoid invalid reference binding in ProjectExecTransformer by @Z1Wu in #9280
  • [Gluten-8792][CH] Support delta project incrementMetric expr by @loneylee in #9353
  • [GLUTEN-9034][VL] Add VeloxResizeBatchesExec for Shuffle by @WangGuangxin in #9035
  • Fix ColumnarToRowRemovalGuard not able to be copied by @yaooqinn in #9384
  • [GLUTEN-8846][CH] [Part 4] Add full-chain UT by @jlfsdtc in #9256
  • [VL] Follow up on #9384 to avoid swallowing exceptions in UT by @zhztheplayer in #9393
  • [GLUTEN-9163][VL] Separate compression buffer and disk write buffer configuration by @marin-ma in #9356
  • [VL][INFRA] Improve build bundle package workflow by @wForget in #9404
  • [VL] Refactor WholeStageTransformer to remove some duplicate code by @wyp...

v1.4.0

19 Jun 09:43
50dd117

Release Notes - Gluten version 1.4.0

Highlights

  • Spark 3.2.2/3.3.1/3.4.4 (upgraded)/3.5.2
  • Add support for more Spark functions, including date_format, make_date, map_filter, map_concat, from_json, btrim, and array_append
  • Add support for more Spark operators, including Range and CollectLimit
  • Update OAP's Velox codebase to 2025/05/12
  • Join optimizations: BNLJ full outer join
  • Shuffle optimizations: RSS ShuffleReader optimization and bug fixes
  • RSS: Celeborn 0.5.4 (upgraded)/Uniffle 0.9.2 (upgraded)
  • Query plan: RAS cost model optimizations and refactoring
  • Data lake: Add Iceberg/Hudi to tests
  • CI: Docker image and JDK version updates
  • Support dynamically adjusting the stage resource profile
  • Support Query Trace
  • Add Qualification Tool
  • Fix OOM issues caused by some untracked memory

What's Changed

  • [GLUTEN-8327][CORE][Part-3] Introduce the ConfigEntry to gluten config by @yikf in #8431
  • [VL] Fix wrong warning of "Memory overhead is set to ..." under default Spark config settings by @zhztheplayer in #8448
  • [GLUTEN-8385][VL] Support write compatible-hive bucket table for Spark3.4 and Spark3.5. by @yikf in #8386
  • Revert "[CH] Disable gluten arm ci" by @lwz9103 in #8460
  • [GLUTEN-8453] [VL] Allow Heavy Batch to be Processed by ColumnarCachedBatchSerializer by @ArnavBalyan in #8454
  • [CH] Add tools to dump ActionsDAG into tree graph by @taiyang-li in #8461
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_08) by @GlutenPerfBot in #8457
  • [VL] Update document of build gluten in Docker by @FelixYBW in #8459
  • [GLUTEN-8462][CORE] Raise a meaningful error when no component is found from classpath by @zhztheplayer in #8468
  • [GLUTEN-8453][VL] Follow-up to #8454 to add a ensureVeloxBatch API for limited use cases by @zhztheplayer in #8463
  • [VL] Refactor Velox.md by @FelixYBW in #8478
  • [GLUTEN-8465] [VL] Bump Celeborn to 0.5.3 by @SteNicholas in #8467
  • [GLUTEN-8455][VL] Fallback Scan for Encrypted Parquet Files by @ArnavBalyan in #8456
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_09) by @GlutenPerfBot in #8472
  • [CORE] Refactor columnar noop write rule by @jackylee-ch in #8422
  • [GLUTEN-8462][CH] Fixed the loading of Components and Backend by @gleonSun in #8464
  • [GLUTEN-8414][VL] Override doCanonicalize in ColumnarPartialProjectEx… by @lifulong in #8415
  • [GLUTEN-8397][CH][Part-2] Fix statica_cast failed on macos by @yxheartipp in #8485
  • [GLUTEN-8343][CH]Fix cast number to decimal and improve performance of it by @KevinyhZou in #8351
  • [GLUTEN-8481][VL] Clean up shuffle reader cpp code by @marin-ma in #8482
  • [Core] Bump version to 1.4.0-SNAPSHOT by @weiting-chen in #8452
  • [GLUTEN-8483][CORE] A stable and universal way to find component files by @zhztheplayer in #8486
  • [DOC][VL] Fix typo in microbenchmark.md by @marin-ma in #8495
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250110) by @kyligence-git in #8490
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_10) by @GlutenPerfBot in #8489
  • [GLUTEN-8476][VL] Fix allocate and free memory by @jkhaliqi in #8477
  • [GLUTEN-8503][VL] Fix macro parenthesis CVE by @jkhaliqi in #8504
  • [GLUTEN-8471][VL] Fix usage of uninitialized variables by @jkhaliqi in #8470
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_11) by @GlutenPerfBot in #8507
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_12) by @GlutenPerfBot in #8508
  • [GLUTEN-8497][VL] A bad test case that fails columnar table cache query by @zhztheplayer in #8498
  • [DOC] Update README.md by @PHILO-HE in #8444
  • [GLUTEN-8319][VL] Support date_format Spark function by @PHILO-HE in #8323
  • [GLUTEN-8487][VL] adding JDK11 based Centos8 image by @zhouyuan in #8513
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_14) by @GlutenPerfBot in #8522
  • [GLUTEN-8020][VL] Remove the libhdfs3 installation script required for static linking by @JkSelf in #8013
  • [GLUTEN-8532][VL] Fix parenthesis within macro by @jkhaliqi in #8533
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_15) by @GlutenPerfBot in #8536
  • [CORE] Use RAS's cost model for legacy transition planner to evaluate cost of transitions by @zhztheplayer in #8527
  • [GLUTEN-8487][VL] adding JDK17 based Centos8 image (#8513) by @zhouyuan in #8539
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250115) by @kyligence-git in #8537
  • [GLUTEN-8479][CORE][Part-1] Remove unnecessary config by @yikf in #8480
  • [GLUTEN-8520][VL] Fix bitwise operators by @jkhaliqi in #8521
  • [GLUTEN-8524][VL] Fix input output errors by @jkhaliqi in #8525
  • [GLUTEN-6876][VL] update spark 3.5.2 in doc by @FelixYBW in #8543
  • [GLUTEN-8455][VL] Port encrypted file checks to shim layer by @ArnavBalyan in #8501
  • [CORE][VL] Cost model code refactors by @zhztheplayer in #8541
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_16) by @GlutenPerfBot in #8546
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250116) by @kyligence-git in #8544
  • [GLUTEN-8432][CH]Remove duplicate output attributes of aggregate's child by @lgbo-ustc in #8450
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_17) by @GlutenPerfBot in #8553
  • [GLUTEN-8497][CORE] A unified CallInfo API to replace AdaptiveContext by @zhztheplayer in #8551
  • [GLUTEN-8529][CH]Fix get_json_object when path has asterisk by @KevinyhZou in #8540
  • [MINOR] Fix comment of function VeloxAggregateFunctionsBuilder.create by @zml1206 in #8549
  • [CORE] Optimize duplicated code for create rel node by @zml1206 in #8548
  • [GLUTEN-7706][CORE] Support Spark-344 + JDK17 by @zhouyuan in #7789
  • [GLUTEN-8475][VL] Fix C-style casts to C++-style by @jkhaliqi in #8474
  • [GLUTEN-8534][VL] Fix allowing loops to iterate beyond end of array by @jkhaliqi in #8535
  • [GLUTEN-8538][VL] Fix incorrect calculation of buffer size by @jkhaliqi in #8542
  • [CORE][CH] Support MicroBatchScanExec with KafkaScan in batch mode by @loneylee in #8321
  • [CORE][MIRROR] Change config.defaultValue.get.toString to config.defaultValueString by @jackylee-ch in #8572
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_18) by @GlutenPerfBot in #8561
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_19) by @GlutenPerfBot in #8563
  • [GLUTEN-8406][CH] Replace from_json(s, 'Map<String, String>')[k] with get_json_object(s, '$.k') by @lgbo-ustc in #8409
  • [GLUTEN-8479][CORE][Part-2] All configurations should be defined through ConfigEntry by @yikf in #8559
  • [VL] CMake configuration cleanup to remove variable VELOX_COMPONENTS_PATH by @zhztheplayer in #8579
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250121) by @kyligence-git in #8577
  • [DOC] Fix outdated operators in documentation by @ArnavBalyan in #8582
  • [GLUTEN-8379][VL] Support query trace by @jinchengchenghh in https://gi...

v1.4.0-rc2

06 Jun 09:52
50dd117

v1.4.0-rc2 Pre-release

What's Changed

  • [GLUTEN-8327][CORE][Part-3] Introduce the ConfigEntry to gluten config by @yikf in #8431
  • [VL] Fix wrong warning of "Memory overhead is set to ..." under default Spark config settings by @zhztheplayer in #8448
  • [GLUTEN-8385][VL] Support write compatible-hive bucket table for Spark3.4 and Spark3.5. by @yikf in #8386
  • Revert "[CH] Disable gluten arm ci" by @lwz9103 in #8460
  • [GLUTEN-8453] [VL] Allow Heavy Batch to be Processed by ColumnarCachedBatchSerializer by @ArnavBalyan in #8454
  • [CH] Add tools to dump ActionsDAG into tree graph by @taiyang-li in #8461
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_08) by @GlutenPerfBot in #8457
  • [VL] Update document of build gluten in Docker by @FelixYBW in #8459
  • [GLUTEN-8462][CORE] Raise a meaningful error when no component is found from classpath by @zhztheplayer in #8468
  • [GLUTEN-8453][VL] Follow-up to #8454 to add a ensureVeloxBatch API for limited use cases by @zhztheplayer in #8463
  • [VL] Refactor Velox.md by @FelixYBW in #8478
  • [GLUTEN-8465] [VL] Bump Celeborn to 0.5.3 by @SteNicholas in #8467
  • [GLUTEN-8455][VL] Fallback Scan for Encrypted Parquet Files by @ArnavBalyan in #8456
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_09) by @GlutenPerfBot in #8472
  • [CORE] Refactor columnar noop write rule by @jackylee-ch in #8422
  • [GLUTEN-8462][CH] Fixed the loading of Components and Backend by @gleonSun in #8464
  • [GLUTEN-8414][VL] Override doCanonicalize in ColumnarPartialProjectEx… by @lifulong in #8415
  • [GLUTEN-8397][CH][Part-2] Fix statica_cast failed on macos by @yxheartipp in #8485
  • [GLUTEN-8343][CH]Fix cast number to decimal and improve performance of it by @KevinyhZou in #8351
  • [GLUTEN-8481][VL] Clean up shuffle reader cpp code by @marin-ma in #8482
  • [Core] Bump version to 1.4.0-SNAPSHOT by @weiting-chen in #8452
  • [GLUTEN-8483][CORE] A stable and universal way to find component files by @zhztheplayer in #8486
  • [DOC][VL] Fix typo in microbenchmark.md by @marin-ma in #8495
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250110) by @kyligence-git in #8490
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_10) by @GlutenPerfBot in #8489
  • [GLUTEN-8476][VL] Fix allocate and free memory by @jkhaliqi in #8477
  • [GLUTEN-8503][VL] Fix macro parenthesis CVE by @jkhaliqi in #8504
  • [GLUTEN-8471][VL] Fix usage of uninitialized variables by @jkhaliqi in #8470
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_11) by @GlutenPerfBot in #8507
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_12) by @GlutenPerfBot in #8508
  • [GLUTEN-8497][VL] A bad test case that fails columnar table cache query by @zhztheplayer in #8498
  • [DOC] Update README.md by @PHILO-HE in #8444
  • [GLUTEN-8319][VL] Support date_format Spark function by @PHILO-HE in #8323
  • [GLUTEN-8487][VL] adding JDK11 based Centos8 image by @zhouyuan in #8513
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_14) by @GlutenPerfBot in #8522
  • [GLUTEN-8020][VL] Remove the libhdfs3 installation script required for static linking by @JkSelf in #8013
  • [GLUTEN-8532][VL] Fix parenthesis within macro by @jkhaliqi in #8533
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_15) by @GlutenPerfBot in #8536
  • [CORE] Use RAS's cost model for legacy transition planner to evaluate cost of transitions by @zhztheplayer in #8527
  • [GLUTEN-8487][VL] adding JDK17 based Centos8 image (#8513) by @zhouyuan in #8539
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250115) by @kyligence-git in #8537
  • [GLUTEN-8479][CORE][Part-1] Remove unnecessary config by @yikf in #8480
  • [GLUTEN-8520][VL] Fix bitwise operators by @jkhaliqi in #8521
  • [GLUTEN-8524][VL] Fix input output errors by @jkhaliqi in #8525
  • [GLUTEN-6876][VL] update spark 3.5.2 in doc by @FelixYBW in #8543
  • [GLUTEN-8455][VL] Port encrypted file checks to shim layer by @ArnavBalyan in #8501
  • [CORE][VL] Cost model code refactors by @zhztheplayer in #8541
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_16) by @GlutenPerfBot in #8546
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250116) by @kyligence-git in #8544
  • [GLUTEN-8432][CH]Remove duplicate output attributes of aggregate's child by @lgbo-ustc in #8450
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_17) by @GlutenPerfBot in #8553
  • [GLUTEN-8497][CORE] A unified CallInfo API to replace AdaptiveContext by @zhztheplayer in #8551
  • [GLUTEN-8529][CH]Fix get_json_object when path has asterisk by @KevinyhZou in #8540
  • [MINOR] Fix comment of function VeloxAggregateFunctionsBuilder.create by @zml1206 in #8549
  • [CORE] Optimize duplicated code for create rel node by @zml1206 in #8548
  • [GLUTEN-7706][CORE] Support Spark-344 + JDK17 by @zhouyuan in #7789
  • [GLUTEN-8475][VL] Fix C-style casts to C++-style by @jkhaliqi in #8474
  • [GLUTEN-8534][VL] Fix allowing loops to iterate beyond end of array by @jkhaliqi in #8535
  • [GLUTEN-8538][VL] Fix incorrect calculation of buffer size by @jkhaliqi in #8542
  • [CORE][CH] Support MicroBatchScanExec with KafkaScan in batch mode by @loneylee in #8321
  • [CORE][MIRROR] Change config.defaultValue.get.toString to config.defaultValueString by @jackylee-ch in #8572
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_18) by @GlutenPerfBot in #8561
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_19) by @GlutenPerfBot in #8563
  • [GLUTEN-8406][CH] Replace from_json(s, 'Map<String, String>')[k] with get_json_object(s, '$.k') by @lgbo-ustc in #8409
  • [GLUTEN-8479][CORE][Part-2] All configurations should be defined through ConfigEntry by @yikf in #8559
  • [VL] CMake configuration cleanup to remove variable VELOX_COMPONENTS_PATH by @zhztheplayer in #8579
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250121) by @kyligence-git in #8577
  • [DOC] Fix outdated operators in documentation by @ArnavBalyan in #8582
  • [GLUTEN-8379][VL] Support query trace by @jinchengchenghh in #8380
  • [GLUTEN-8266][VL][CI] Pre-install uniffle in docker image by @zhouyuan in #8578
  • [VL] Update the Scaladoc of Component API by @zhztheplayer in #8589
  • [GLUTEN-8455][VL] Support encrypted parquet fallback for 3.5 by @ArnavBalyan in #8560
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_22) by @GlutenPerfBot in #8587
  • [GLUTEN-8580][CORE][Part-1] Clean up unnecessary code related to input file expression by @zml1206 in #8584
  • [GLUTEN-8379][VL] Fix typo in query trace document by @jinchengchenghh in https://github.com...

v1.4.0-rc1

21 May 12:42
bb28bb7

v1.4.0-rc1 Pre-release

What's Changed

  • [GLUTEN-8327][CORE][Part-3] Introduce the ConfigEntry to gluten config by @yikf in #8431
  • [VL] Fix wrong warning of "Memory overhead is set to ..." under default Spark config settings by @zhztheplayer in #8448
  • [GLUTEN-8385][VL] Support write compatible-hive bucket table for Spark3.4 and Spark3.5. by @yikf in #8386
  • Revert "[CH] Disable gluten arm ci" by @lwz9103 in #8460
  • [GLUTEN-8453] [VL] Allow Heavy Batch to be Processed by ColumnarCachedBatchSerializer by @ArnavBalyan in #8454
  • [CH] Add tools to dump ActionsDAG into tree graph by @taiyang-li in #8461
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_08) by @GlutenPerfBot in #8457
  • [VL] Update document of build gluten in Docker by @FelixYBW in #8459
  • [GLUTEN-8462][CORE] Raise a meaningful error when no component is found from classpath by @zhztheplayer in #8468
  • [GLUTEN-8453][VL] Follow-up to #8454 to add a ensureVeloxBatch API for limited use cases by @zhztheplayer in #8463
  • [VL] Refactor Velox.md by @FelixYBW in #8478
  • [GLUTEN-8465] [VL] Bump Celeborn to 0.5.3 by @SteNicholas in #8467
  • [GLUTEN-8455][VL] Fallback Scan for Encrypted Parquet Files by @ArnavBalyan in #8456
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_09) by @GlutenPerfBot in #8472
  • [CORE] Refactor columnar noop write rule by @jackylee-ch in #8422
  • [GLUTEN-8462][CH] Fixed the loading of Components and Backend by @gleonSun in #8464
  • [GLUTEN-8414][VL] Override doCanonicalize in ColumnarPartialProjectEx… by @lifulong in #8415
  • [GLUTEN-8397][CH][Part-2] Fix statica_cast failed on macos by @yxheartipp in #8485
  • [GLUTEN-8343][CH]Fix cast number to decimal and improve performance of it by @KevinyhZou in #8351
  • [GLUTEN-8481][VL] Clean up shuffle reader cpp code by @marin-ma in #8482
  • [Core] Bump version to 1.4.0-SNAPSHOT by @weiting-chen in #8452
  • [GLUTEN-8483][CORE] A stable and universal way to find component files by @zhztheplayer in #8486
  • [DOC][VL] Fix typo in microbenchmark.md by @marin-ma in #8495
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250110) by @kyligence-git in #8490
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_10) by @GlutenPerfBot in #8489
  • [GLUTEN-8476][VL] Fix allocate and free memory by @jkhaliqi in #8477
  • [GLUTEN-8503][VL] Fix macro parenthesis CVE by @jkhaliqi in #8504
  • [GLUTEN-8471][VL] Fix usage of uninitialized variables by @jkhaliqi in #8470
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_11) by @GlutenPerfBot in #8507
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_12) by @GlutenPerfBot in #8508
  • [GLUTEN-8497][VL] A bad test case that fails columnar table cache query by @zhztheplayer in #8498
  • [DOC] Update README.md by @PHILO-HE in #8444
  • [GLUTEN-8319][VL] Support date_format Spark function by @PHILO-HE in #8323
  • [GLUTEN-8487][VL] adding JDK11 based Centos8 image by @zhouyuan in #8513
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_14) by @GlutenPerfBot in #8522
  • [GLUTEN-8020][VL] Remove the libhdfs3 installation script required for static linking by @JkSelf in #8013
  • [GLUTEN-8532][VL] Fix parenthesis within macro by @jkhaliqi in #8533
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_15) by @GlutenPerfBot in #8536
  • [CORE] Use RAS's cost model for legacy transition planner to evaluate cost of transitions by @zhztheplayer in #8527
  • [GLUTEN-8487][VL] adding JDK17 based Centos8 image (#8513) by @zhouyuan in #8539
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250115) by @kyligence-git in #8537
  • [GLUTEN-8479][CORE][Part-1] Remove unnecessary config by @yikf in #8480
  • [GLUTEN-8520][VL] Fix bitwise operators by @jkhaliqi in #8521
  • [GLUTEN-8524][VL] Fix input output errors by @jkhaliqi in #8525
  • [GLUTEN-6876][VL] update spark 3.5.2 in doc by @FelixYBW in #8543
  • [GLUTEN-8455][VL] Port encrypted file checks to shim layer by @ArnavBalyan in #8501
  • [CORE][VL] Cost model code refactors by @zhztheplayer in #8541
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_16) by @GlutenPerfBot in #8546
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250116) by @kyligence-git in #8544
  • [GLUTEN-8432][CH]Remove duplicate output attributes of aggregate's child by @lgbo-ustc in #8450
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_17) by @GlutenPerfBot in #8553
  • [GLUTEN-8497][CORE] A unified CallInfo API to replace AdaptiveContext by @zhztheplayer in #8551
  • [GLUTEN-8529][CH]Fix get_json_object when path has asterisk by @KevinyhZou in #8540
  • [MINOR] Fix comment of function VeloxAggregateFunctionsBuilder.create by @zml1206 in #8549
  • [CORE] Optimize duplicated code for create rel node by @zml1206 in #8548
  • [GLUTEN-7706][CORE] Support Spark-344 + JDK17 by @zhouyuan in #7789
  • [GLUTEN-8475][VL] Fix C-style casts to C++-style by @jkhaliqi in #8474
  • [GLUTEN-8534][VL] Fix allowing loops to iterate beyond end of array by @jkhaliqi in #8535
  • [GLUTEN-8538][VL] Fix incorrect calculation of buffer size by @jkhaliqi in #8542
  • [CORE][CH] Support MicroBatchScanExec with KafkaScan in batch mode by @loneylee in #8321
  • [CORE][MIRROR] Change config.defaultValue.get.toString to config.defaultValueString by @jackylee-ch in #8572
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_18) by @GlutenPerfBot in #8561
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_19) by @GlutenPerfBot in #8563
  • [GLUTEN-8406][CH] Replace from_json(s, 'Map<String, String>')[k] with get_json_object(s, '$.k') by @lgbo-ustc in #8409
  • [GLUTEN-8479][CORE][Part-2] All configurations should be defined through ConfigEntry by @yikf in #8559
  • [VL] CMake configuration cleanup to remove variable VELOX_COMPONENTS_PATH by @zhztheplayer in #8579
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250121) by @kyligence-git in #8577
  • [DOC] Fix outdated operators in documentation by @ArnavBalyan in #8582
  • [GLUTEN-8379][VL] Support query trace by @jinchengchenghh in #8380
  • [GLUTEN-8266][VL][CI] Pre-install uniffle in docker image by @zhouyuan in #8578
  • [VL] Update the Scaladoc of Component API by @zhztheplayer in #8589
  • [GLUTEN-8455][VL] Support encrypted parquet fallback for 3.5 by @ArnavBalyan in #8560
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_22) by @GlutenPerfBot in #8587
  • [GLUTEN-8580][CORE][Part-1] Clean up unnecessary code related to input file expression by @zml1206 in #8584
  • [GLUTEN-8379][VL] Fix typo in query trace document by @jinchengchenghh in https://github.com...

v1.4.0-rc0

08 Apr 14:12
88899db

v1.4.0-rc0 Pre-release

What's Changed

  • [GLUTEN-8327][CORE][Part-3] Introduce the ConfigEntry to gluten config by @yikf in #8431
  • [VL] Fix wrong warning of "Memory overhead is set to ..." under default Spark config settings by @zhztheplayer in #8448
  • [GLUTEN-8385][VL] Support write compatible-hive bucket table for Spark3.4 and Spark3.5. by @yikf in #8386
  • Revert "[CH] Disable gluten arm ci" by @lwz9103 in #8460
  • [GLUTEN-8453] [VL] Allow Heavy Batch to be Processed by ColumnarCachedBatchSerializer by @ArnavBalyan in #8454
  • [CH] Add tools to dump ActionsDAG into tree graph by @taiyang-li in #8461
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_08) by @GlutenPerfBot in #8457
  • [VL] Update document of build gluten in Docker by @FelixYBW in #8459
  • [GLUTEN-8462][CORE] Raise a meaningful error when no component is found from classpath by @zhztheplayer in #8468
  • [GLUTEN-8453][VL] Follow-up to #8454 to add a ensureVeloxBatch API for limited use cases by @zhztheplayer in #8463
  • [VL] Refactor Velox.md by @FelixYBW in #8478
  • [GLUTEN-8465] [VL] Bump Celeborn to 0.5.3 by @SteNicholas in #8467
  • [GLUTEN-8455][VL] Fallback Scan for Encrypted Parquet Files by @ArnavBalyan in #8456
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_09) by @GlutenPerfBot in #8472
  • [CORE] Refactor columnar noop write rule by @jackylee-ch in #8422
  • [GLUTEN-8462][CH] Fixed the loading of Components and Backend by @gleonSun in #8464
  • [GLUTEN-8414][VL] Override doCanonicalize in ColumnarPartialProjectEx… by @lifulong in #8415
  • [GLUTEN-8397][CH][Part-2] Fix statica_cast failed on macos by @yxheartipp in #8485
  • [GLUTEN-8343][CH]Fix cast number to decimal and improve performance of it by @KevinyhZou in #8351
  • [GLUTEN-8481][VL] Clean up shuffle reader cpp code by @marin-ma in #8482
  • [Core] Bump version to 1.4.0-SNAPSHOT by @weiting-chen in #8452
  • [GLUTEN-8483][CORE] A stable and universal way to find component files by @zhztheplayer in #8486
  • [DOC][VL] Fix typo in microbenchmark.md by @marin-ma in #8495
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250110) by @kyligence-git in #8490
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_10) by @GlutenPerfBot in #8489
  • [GLUTEN-8476][VL] Fix allocate and free memory by @jkhaliqi in #8477
  • [GLUTEN-8503][VL] Fix macro parenthesis CVE by @jkhaliqi in #8504
  • [GLUTEN-8471][VL] Fix usage of uninitialized variables by @jkhaliqi in #8470
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_11) by @GlutenPerfBot in #8507
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_12) by @GlutenPerfBot in #8508
  • [GLUTEN-8497][VL] A bad test case that fails columnar table cache query by @zhztheplayer in #8498
  • [DOC] Update README.md by @PHILO-HE in #8444
  • [GLUTEN-8319][VL] Support date_format Spark function by @PHILO-HE in #8323
  • [GLUTEN-8487][VL] adding JDK11 based Centos8 image by @zhouyuan in #8513
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_14) by @GlutenPerfBot in #8522
  • [GLUTEN-8020][VL] Remove the libhdfs3 installation script required for static linking by @JkSelf in #8013
  • [GLUTEN-8532][VL] Fix parenthesis within macro by @jkhaliqi in #8533
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_15) by @GlutenPerfBot in #8536
  • [CORE] Use RAS's cost model for legacy transition planner to evaluate cost of transitions by @zhztheplayer in #8527
  • [GLUTEN-8487][VL] adding JDK17 based Centos8 image (#8513) by @zhouyuan in #8539
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250115) by @kyligence-git in #8537
  • [GLUTEN-8479][CORE][Part-1] Remove unnecessary config by @yikf in #8480
  • [GLUTEN-8520][VL] Fix bitwise operators by @jkhaliqi in #8521
  • [GLUTEN-8524][VL] Fix input output errors by @jkhaliqi in #8525
  • [GLUTEN-6876][VL] update spark 3.5.2 in doc by @FelixYBW in #8543
  • [GLUTEN-8455][VL] Port encrypted file checks to shim layer by @ArnavBalyan in #8501
  • [CORE][VL] Cost model code refactors by @zhztheplayer in #8541
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_16) by @GlutenPerfBot in #8546
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250116) by @kyligence-git in #8544
  • [GLUTEN-8432][CH]Remove duplicate output attributes of aggregate's child by @lgbo-ustc in #8450
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_17) by @GlutenPerfBot in #8553
  • [GLUTEN-8497][CORE] A unified CallInfo API to replace AdaptiveContext by @zhztheplayer in #8551
  • [GLUTEN-8529][CH]Fix get_json_object when path has asterisk by @KevinyhZou in #8540
  • [MINOR] Fix comment of function VeloxAggregateFunctionsBuilder.create by @zml1206 in #8549
  • [CORE] Optimize duplicated code for create rel node by @zml1206 in #8548
  • [GLUTEN-7706][CORE] Support Spark-344 + JDK17 by @zhouyuan in #7789
  • [GLUTEN-8475][VL] Fix C-style casts to C++-style by @jkhaliqi in #8474
  • [GLUTEN-8534][VL] Fix allowing loops to iterate beyond end of array by @jkhaliqi in #8535
  • [GLUTEN-8538][VL] Fix incorrect calculation of buffer size by @jkhaliqi in #8542
  • [CORE][CH] Support MicroBatchScanExec with KafkaScan in batch mode by @loneylee in #8321
  • [CORE][MIRROR] Change config.defaultValue.get.toString to config.defaultValueString by @jackylee-ch in #8572
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_18) by @GlutenPerfBot in #8561
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_19) by @GlutenPerfBot in #8563
  • [GLUTEN-8406][CH] Replace from_json(s, 'Map<String, String>')[k] with get_json_object(s, '$.k') by @lgbo-ustc in #8409
  • [GLUTEN-8479][CORE][Part-2] All configurations should be defined through ConfigEntry by @yikf in #8559
  • [VL] CMake configuration cleanup to remove variable VELOX_COMPONENTS_PATH by @zhztheplayer in #8579
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250121) by @kyligence-git in #8577
  • [DOC] Fix outdated operators in documentation by @ArnavBalyan in #8582
  • [GLUTEN-8379][VL] Support query trace by @jinchengchenghh in #8380
  • [GLUTEN-8266][VL][CI] Pre-install uniffle in docker image by @zhouyuan in #8578
  • [VL] Update the Scaladoc of Component API by @zhztheplayer in #8589
  • [GLUTEN-8455][VL] Support encrypted parquet fallback for 3.5 by @ArnavBalyan in #8560
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_22) by @GlutenPerfBot in #8587
  • [GLUTEN-8580][CORE][Part-1] Clean up unnecessary code related to input file expression by @zml1206 in #8584
  • [GLUTEN-8379][VL] Fix typo in query trace document by @jinchengchenghh in https://github.com...

v1.3.0

24 Jan 02:26
646329d

Release Notes - Gluten version 1.3.0

Highlights

  • Spark 3.2.2/3.3.1/3.4.3 (upgraded)/3.5.2 (upgraded)
  • 268+ Spark functions, including JSON
  • Update OAP's Velox codebase to 2025/01/07
  • Join: Sort Merge Join support
  • Shuffle: Sort based Shuffle(Row)
  • Query Plan: RAS Optimization
  • Datalake: Hudi 0.15.0 support/Iceberg 1.5.0/Delta 3.2.0
  • RSS: Celeborn 0.5.2/Uniffle 0.9.1
  • File Format: CSV support via Arrow
  • JVM libhdfs with viewfs/kerberos support
  • Partial Project(UDF) support
  • Mix backend refactor
  • Bucket write in partitioned Hive table
  • CI/Nightly Package Tools Update
  • Build & Compile Tools Update (vcpkg with static build recommended)
  • Fix several result mismatch issues
  • Fix OOM/YARN kill stability issues

What's Changed


v1.3.0-rc0

16 Jan 12:22
646329d

v1.3.0-rc0 Pre-release

Release Notes - Gluten version 1.3.0-rc0

Highlights

  • Spark 3.2.2/3.3.1/3.4.3 (upgraded)/3.5.2 (upgraded)
  • 268+ Spark functions, including JSON
  • Update OAP's Velox codebase to 2025/01/07
  • Join: Sort Merge Join support
  • Shuffle: Sort based Shuffle(Row)
  • Query Plan: RAS Optimization
  • Datalake: Hudi 0.15.0 support/Iceberg 1.5.0/Delta 3.2.0
  • RSS: Celeborn 0.5.2/Uniffle 0.9.1
  • File Format: CSV support via Arrow
  • JVM libhdfs with viewfs/kerberos support
  • Partial Project(UDF) support
  • Mix backend refactor
  • Bucket write in partitioned Hive table
  • CI/Nightly Package Tools Update
  • Build & Compile Tools Update (vcpkg with static build recommended)
  • Fix several result mismatch issues
  • Fix OOM/YARN kill stability issues

What's Changed
