0.12.0
Pre-release
DataFusion Comet 0.12.0 Changelog
This release consists of 105 commits from 13 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
- fix: Fix `None.get` in `stringDecode` when `bin` child cannot be converted #2606 (cfmcgrady)
- fix: Update FuzzDataGenerator to produce dictionary-encoded string arrays & fix bugs that this exposes #2635 (andygrove)
- fix: Fallback to Spark for lpad/rpad for unsupported arguments & fix negative length handling #2630 (andygrove)
- fix: Mark SortOrder with floating-point as incompatible #2650 (andygrove)
- fix: Fall back to Spark for `trunc`/`date_trunc` functions when format string is unsupported, or is not a literal value #2634 (andygrove)
- fix: [native_datafusion] only pass single partition of PartitionedFiles into DataSourceExec #2675 (mbutrovich)
- fix: Fix subcommands options in fuzz-testing #2684 (manuzhang)
- fix: Do not replace SMJ with HJ for `LeftSemi` #2687 (comphead)
- fix: Apply spotless on Iceberg 1.8.1 diff [iceberg] #2700 (hsiang-c)
- fix: Fix generate-user-guide-reference-docs failure when mvn command is not executed at root #2691 (manuzhang)
- fix: Fix missing SortOrder fallback reason in range partitioning #2716 (andygrove)
- fix: CometLiteral class cast exception with arrays #2718 (andygrove)
- fix: NormalizeNaNAndZero::children() returns child's child #2732 (mbutrovich)
- fix: checkSparkMaybeThrows should compare Spark and Comet results in success case #2728 (andygrove)
- fix: Mark `WindowsExec` as incompatible #2748 (andygrove)
- fix: Add strict floating point mode and fallback to Spark for min/max/sort on floating point inputs when enabled #2747 (andygrove)
- fix: Implement producedAttributes for CometWindowExec #2789 (rahulbabarwal89)
- fix: Pass all Comet configs to native plan #2801 (andygrove)
Implemented enhancements:
- feat: Add option to write benchmark results to file #2640 (andygrove)
- feat: Implement metrics for iceberg compat #2615 (EmilyMatt)
- feat: Define function signatures in CometFuzz #2614 (andygrove)
- feat: cherry-pick UUID conversion logic from #2528 #2648 (mbutrovich)
- feat: support `concat` for strings #2604 (comphead)
- feat: Add support for `abs` #2689 (andygrove)
- feat: Support variadic function in CometFuzz #2682 (manuzhang)
- feat: CometExecRule refactor: Unify CometNativeExec creation with Serde in CometOperatorSerde trait #2768 (andygrove)
- feat: support cot #2755 (psvri)
- feat: Add bash script to build and run fuzz testing #2686 (manuzhang)
- feat: Add getSupportLevel to CometAggregateExpressionSerde trait #2777 (andygrove)
- feat: Add CI check to ensure generated docs are in sync with code #2779 (andygrove)
- feat: Add prettier enforcement #2783 (andygrove)
- feat: hyperbolic trig functions #2784 (psvri)
- feat: [iceberg] Native scan by serializing FileScanTasks to iceberg-rust #2528 (mbutrovich)
Documentation updates:
- docs: Add changelog for 0.11.0 release #2585 (mbutrovich)
- docs: Improve documentation layout #2587 (andygrove)
- docs: Publish 0.11.0 user guide #2589 (andygrove)
- docs: Put Comet logo in top nav bar, respect light/dark mode #2591 (andygrove)
- docs: Improve main landing page #2593 (andygrove)
- docs: Improve site navigation #2597 (andygrove)
- docs: Update benchmark results #2596 (andygrove)
- docs: Upgrade pydata-sphinx-theme to 0.16.1 #2602 (andygrove)
- docs: Fix redirect #2603 (andygrove)
- docs: Fix broken image link #2613 (andygrove)
- docs: Add FFI docs to contributor guide #2668 (andygrove)
- docs: Various documentation updates #2674 (andygrove)
- docs: Add supported SortOrder expressions and fix a typo #2694 (andygrove)
- docs: Minor docs update for running Spark SQL tests #2712 (andygrove)
- docs: Update contributor guide for adding a new expression #2704 (andygrove)
- docs: Documentation updates for `LocalTableScan` and `WindowExec` #2742 (andygrove)
- docs: Typo fix #2752 (wForget)
- docs: Categorize some configs as `testing` and add notes about known time zone issues #2740 (andygrove)
- docs: Run prettier on all markdown files #2782 (andygrove)
- docs: Ignore prettier formatting for generated tables #2790 (andygrove)
- docs: Add new section to contributor guide, explaining how to add a new operator #2758 (andygrove)
Other:
- chore: Start 0.12.0 development #2584 (mbutrovich)
- chore: Bump Spark from 3.5.6 to 3.5.7 #2574 (cfmcgrady)
- chore(deps): bump parquet from 56.0.0 to 56.2.0 in /native #2608 (dependabot[bot])
- chore(deps): bump tikv-jemallocator from 0.6.0 to 0.6.1 in /native #2609 (dependabot[bot])
- chore(deps): bump tikv-jemalloc-ctl from 0.6.0 to 0.6.1 in /native #2610 (dependabot[bot])
- tests: FuzzDataGenerator instead of Parquet-specific generator #2616 (mbutrovich)
- chore: Simplify on-heap memory configuration #2599 (andygrove)
- Feat: Add sha1 function impl #2471 (kazantsev-maksim)
- chore: Refactor Parquet/DataFrame fuzz data generators #2629 (andygrove)
- chore: Remove needless from_raw calls #2638 (EmilyMatt)
- chore: support DataFusion 50.3.0 #2605 (comphead)
- chore(deps): bump actions/upload-artifact from 4 to 5 #2654 (dependabot[bot])
- chore(deps): bump cc from 1.2.42 to 1.2.43 in /native #2653 (dependabot[bot])
- chore(deps): bump actions/download-artifact from 5 to 6 #2652 (dependabot[bot])
- chore: extract comparison into separate tool #2632 (comphead)
- chore: Various improvements to `checkSparkAnswer*` methods in `CometTestBase` #2656 (andygrove)
- chore: Remove code for unpacking dictionaries prior to FilterExec #2659 (andygrove)
- chore: display schema for datasets being compared #2665 (comphead)
- chore: Remove `CopyExec` #2663 (andygrove)
- chore: Add extended explain plans to stability suite #2669 (andygrove)
- chore(deps): bump aws-config from 1.8.8 to 1.8.10 in /native #2677 (dependabot[bot])
- chore(deps): bump cc from 1.2.43 to 1.2.44 in /native #2678 (dependabot[bot])
- chore: `tpcbench` output `explain` just once and formatted #2679 (comphead)
- chore: Add tolerance for `ComparisonTool` #2699 (comphead)
- chore: Expand test coverage for `CometWindowsExec` #2711 (comphead)
- chore: generate Float/Double NaN #2695 (hsiang-c)
- minor: Combine two CI workflows for Spark SQL tests #2727 (andygrove)
- chore: Improve framework for specifying that configs can be set with env vars #2722 (andygrove)
- chore: Rename `COMET_EXPLAIN_VERBOSE_ENABLED` to `COMET_EXTENDED_EXPLAIN_FORMAT` and change default #2644 (andygrove)
- chore: Fallback to Spark for windows functions #2726 (comphead)
- chore: Refactor operator serde - part 1 #2738 (andygrove)
- Feat: Add CometLocalTableScanExec operator #2735 (kazantsev-maksim)
- chore(deps): bump cc from 1.2.44 to 1.2.45 in /native #2750 (dependabot[bot])
- chore(deps): bump aws-credential-types from 1.2.8 to 1.2.9 in /native #2751 (dependabot[bot])
- chore: Operator serde refactor part 2 #2741 (andygrove)
- chore: Fallback to Spark for `array_reverse` for `array<binary>` #2759 (comphead)
- chore: [iceberg] test iceberg 1.10.0 #2709 (manuzhang)
- chore: Add `docs/comet-*` to rat exclude list #2762 (manuzhang)
- Chore: Refactor static invoke exprs #2671 (kazantsev-maksim)
- minor: Small refactor for consistent serde for hash aggregate #2764 (andygrove)
- minor: Move `operator2Proto` to `CometExecRule` #2767 (andygrove)
- chore: various refactoring changes for iceberg [iceberg] #2680 (parthchandra)
- chore: Refactor CometExecRule handling of sink operators #2771 (andygrove)
- minor: Refactor to move window-specific code from `QueryPlanSerde` to `CometWindowExec` #2780 (andygrove)
- chore: Remove many references to `COMET_EXPR_ALLOW_INCOMPATIBLE` #2775 (andygrove)
- chore: Remove COMET_EXPR_ALLOW_INCOMPATIBLE config #2786 (andygrove)
- chore: check `missingInput` for Comet plan nodes #2795 (comphead)
- chore: Finish refactoring expression serde out of `QueryPlanSerde` #2791 (andygrove)
- chore: Update docs to fix CI after #2784 #2799 (mbutrovich)
- chore: Update q79 golden plan for Spark 4.0 after #2795 #2800 (mbutrovich)
- Fix: Fix null handling in CometVector implementations #2643 (cfmcgrady)
Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
54 Andy Grove
11 Oleks V
10 dependabot[bot]
9 Matt Butrovich
6 Manu Zhang
3 Fu Chen
3 Kazantsev Maksim
2 Emily Matheys
2 Vrishabh
2 hsiang-c
1 Parth Chandra
1 Zhen Wang
1 rahulbabarwal89
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.