0.12.0

Pre-release

@andygrove andygrove released this 01 Dec 16:23
· 101 commits to main since this release
6086438

DataFusion Comet 0.12.0 Changelog

This release consists of 105 commits from 13 contributors. See credits at the end of this changelog for more information.

Fixed bugs:

  • fix: Fix None.get in stringDecode when bin child cannot be converted #2606 (cfmcgrady)
  • fix: Update FuzzDataGenerator to produce dictionary-encoded string arrays & fix bugs that this exposes #2635 (andygrove)
  • fix: Fallback to Spark for lpad/rpad for unsupported arguments & fix negative length handling #2630 (andygrove)
  • fix: Mark SortOrder with floating-point as incompatible #2650 (andygrove)
  • fix: Fall back to Spark for trunc / date_trunc functions when format string is unsupported, or is not a literal value #2634 (andygrove)
  • fix: [native_datafusion] only pass single partition of PartitionedFiles into DataSourceExec #2675 (mbutrovich)
  • fix: Fix subcommands options in fuzz-testing #2684 (manuzhang)
  • fix: Do not replace SMJ with HJ for LeftSemi #2687 (comphead)
  • fix: Apply spotless on Iceberg 1.8.1 diff [iceberg] #2700 (hsiang-c)
  • fix: Fix generate-user-guide-reference-docs failure when mvn command is not executed at root #2691 (manuzhang)
  • fix: Fix missing SortOrder fallback reason in range partitioning #2716 (andygrove)
  • fix: CometLiteral class cast exception with arrays #2718 (andygrove)
  • fix: NormalizeNaNAndZero::children() returns child's child #2732 (mbutrovich)
  • fix: checkSparkMaybeThrows should compare Spark and Comet results in success case #2728 (andygrove)
  • fix: Mark WindowsExec as incompatible #2748 (andygrove)
  • fix: Add strict floating point mode and fallback to Spark for min/max/sort on floating point inputs when enabled #2747 (andygrove)
  • fix: Implement producedAttributes for CometWindowExec #2789 (rahulbabarwal89)
  • fix: Pass all Comet configs to native plan #2801 (andygrove)

Implemented enhancements:

  • feat: Add option to write benchmark results to file #2640 (andygrove)
  • feat: Implement metrics for iceberg compat #2615 (EmilyMatt)
  • feat: Define function signatures in CometFuzz #2614 (andygrove)
  • feat: cherry-pick UUID conversion logic from #2528 #2648 (mbutrovich)
  • feat: support concat for strings #2604 (comphead)
  • feat: Add support for abs #2689 (andygrove)
  • feat: Support variadic function in CometFuzz #2682 (manuzhang)
  • feat: CometExecRule refactor: Unify CometNativeExec creation with Serde in CometOperatorSerde trait #2768 (andygrove)
  • feat: support cot #2755 (psvri)
  • feat: Add bash script to build and run fuzz testing #2686 (manuzhang)
  • feat: Add getSupportLevel to CometAggregateExpressionSerde trait #2777 (andygrove)
  • feat: Add CI check to ensure generated docs are in sync with code #2779 (andygrove)
  • feat: Add prettier enforcement #2783 (andygrove)
  • feat: hyperbolic trig functions #2784 (psvri)
  • feat: [iceberg] Native scan by serializing FileScanTasks to iceberg-rust #2528 (mbutrovich)

Documentation updates:

  • docs: Add changelog for 0.11.0 release #2585 (mbutrovich)
  • docs: Improve documentation layout #2587 (andygrove)
  • docs: Publish 0.11.0 user guide #2589 (andygrove)
  • docs: Put Comet logo in top nav bar, respect light/dark mode #2591 (andygrove)
  • docs: Improve main landing page #2593 (andygrove)
  • docs: Improve site navigation #2597 (andygrove)
  • docs: Update benchmark results #2596 (andygrove)
  • docs: Upgrade pydata-sphinx-theme to 0.16.1 #2602 (andygrove)
  • docs: Fix redirect #2603 (andygrove)
  • docs: Fix broken image link #2613 (andygrove)
  • docs: Add FFI docs to contributor guide #2668 (andygrove)
  • docs: Various documentation updates #2674 (andygrove)
  • docs: Add supported SortOrder expressions and fix a typo #2694 (andygrove)
  • docs: Minor docs update for running Spark SQL tests #2712 (andygrove)
  • docs: Update contributor guide for adding a new expression #2704 (andygrove)
  • docs: Documentation updates for LocalTableScan and WindowExec #2742 (andygrove)
  • docs: Typo fix #2752 (wForget)
  • docs: Categorize some configs as testing and add notes about known time zone issues #2740 (andygrove)
  • docs: Run prettier on all markdown files #2782 (andygrove)
  • docs: Ignore prettier formatting for generated tables #2790 (andygrove)
  • docs: Add new section to contributor guide, explaining how to add a new operator #2758 (andygrove)

Other:

  • chore: Start 0.12.0 development #2584 (mbutrovich)
  • chore: Bump Spark from 3.5.6 to 3.5.7 #2574 (cfmcgrady)
  • chore(deps): bump parquet from 56.0.0 to 56.2.0 in /native #2608 (dependabot[bot])
  • chore(deps): bump tikv-jemallocator from 0.6.0 to 0.6.1 in /native #2609 (dependabot[bot])
  • chore(deps): bump tikv-jemalloc-ctl from 0.6.0 to 0.6.1 in /native #2610 (dependabot[bot])
  • tests: FuzzDataGenerator instead of Parquet-specific generator #2616 (mbutrovich)
  • chore: Simplify on-heap memory configuration #2599 (andygrove)
  • Feat: Add sha1 function impl #2471 (kazantsev-maksim)
  • chore: Refactor Parquet/DataFrame fuzz data generators #2629 (andygrove)
  • chore: Remove needless from_raw calls #2638 (EmilyMatt)
  • chore: support DataFusion 50.3.0 #2605 (comphead)
  • chore(deps): bump actions/upload-artifact from 4 to 5 #2654 (dependabot[bot])
  • chore(deps): bump cc from 1.2.42 to 1.2.43 in /native #2653 (dependabot[bot])
  • chore(deps): bump actions/download-artifact from 5 to 6 #2652 (dependabot[bot])
  • chore: extract comparison into separate tool #2632 (comphead)
  • chore: Various improvements to checkSparkAnswer* methods in CometTestBase #2656 (andygrove)
  • chore: Remove code for unpacking dictionaries prior to FilterExec #2659 (andygrove)
  • chore: display schema for datasets being compared #2665 (comphead)
  • chore: Remove CopyExec #2663 (andygrove)
  • chore: Add extended explain plans to stability suite #2669 (andygrove)
  • chore(deps): bump aws-config from 1.8.8 to 1.8.10 in /native #2677 (dependabot[bot])
  • chore(deps): bump cc from 1.2.43 to 1.2.44 in /native #2678 (dependabot[bot])
  • chore: tpcbench output explain just once and formatted #2679 (comphead)
  • chore: Add tolerance for ComparisonTool #2699 (comphead)
  • chore: Expand test coverage for CometWindowsExec #2711 (comphead)
  • chore: generate Float/Double NaN #2695 (hsiang-c)
  • minor: Combine two CI workflows for Spark SQL tests #2727 (andygrove)
  • chore: Improve framework for specifying that configs can be set with env vars #2722 (andygrove)
  • chore: Rename COMET_EXPLAIN_VERBOSE_ENABLED to COMET_EXTENDED_EXPLAIN_FORMAT and change default #2644 (andygrove)
  • chore: Fallback to Spark for windows functions #2726 (comphead)
  • chore: Refactor operator serde - part 1 #2738 (andygrove)
  • Feat: Add CometLocalTableScanExec operator #2735 (kazantsev-maksim)
  • chore(deps): bump cc from 1.2.44 to 1.2.45 in /native #2750 (dependabot[bot])
  • chore(deps): bump aws-credential-types from 1.2.8 to 1.2.9 in /native #2751 (dependabot[bot])
  • chore: Operator serde refactor part 2 #2741 (andygrove)
  • chore: Fallback to Spark for array_reverse for array<binary> #2759 (comphead)
  • chore: [iceberg] test iceberg 1.10.0 #2709 (manuzhang)
  • chore: Add docs/comet-* to rat exclude list #2762 (manuzhang)
  • Chore: Refactor static invoke exprs #2671 (kazantsev-maksim)
  • minor: Small refactor for consistent serde for hash aggregate #2764 (andygrove)
  • minor: Move operator2Proto to CometExecRule #2767 (andygrove)
  • chore: various refactoring changes for iceberg [iceberg] #2680 (parthchandra)
  • chore: Refactor CometExecRule handling of sink operators #2771 (andygrove)
  • minor: Refactor to move window-specific code from QueryPlanSerde to CometWindowExec #2780 (andygrove)
  • chore: Remove many references to COMET_EXPR_ALLOW_INCOMPATIBLE #2775 (andygrove)
  • chore: Remove COMET_EXPR_ALLOW_INCOMPATIBLE config #2786 (andygrove)
  • chore: check missingInput for Comet plan nodes #2795 (comphead)
  • chore: Finish refactoring expression serde out of QueryPlanSerde #2791 (andygrove)
  • chore: Update docs to fix CI after #2784 #2799 (mbutrovich)
  • chore: Update q79 golden plan for Spark 4.0 after #2795 #2800 (mbutrovich)
  • Fix: Fix null handling in CometVector implementations #2643 (cfmcgrady)

Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

    54	Andy Grove
    11	Oleks V
    10	dependabot[bot]
     9	Matt Butrovich
     6	Manu Zhang
     3	Fu Chen
     3	Kazantsev Maksim
     2	Emily Matheys
     2	Vrishabh
     2	hsiang-c
     1	Parth Chandra
     1	Zhen Wang
     1	rahulbabarwal89

Thank you also to everyone who contributed in other ways, such as filing issues, reviewing PRs, and providing feedback on this release.