Commit 07c7ef5

Add changelog for 0.8.0 (#1675)
1 parent ea125f5 commit 07c7ef5

1 file changed: +139 -0 lines changed

dev/changelog/0.8.0.md

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# DataFusion Comet 0.8.0 Changelog

This release consists of 81 commits from 11 contributors. See credits at the end of this changelog for more information.

**Fixed bugs:**

- fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [#1443](https://github.com/apache/datafusion-comet/pull/1443) (parthchandra)
- fix: Refactor CometScanRule and fix bugs [#1483](https://github.com/apache/datafusion-comet/pull/1483) (andygrove)
- fix: check if handle has been initialized before closing [#1554](https://github.com/apache/datafusion-comet/pull/1554) (wForget)
- fix: Taking slicing into account when writing BooleanBuffers as fast-encoding format [#1522](https://github.com/apache/datafusion-comet/pull/1522) (Kontinuation)
- fix: isCometEnabled name conflict [#1569](https://github.com/apache/datafusion-comet/pull/1569) (kazuyukitanimura)
- fix: make register_object_store use same session_env as file scan [#1555](https://github.com/apache/datafusion-comet/pull/1555) (wForget)
- fix: adjust CometNativeScan's doCanonicalize and hashCode for AQE, use DataSourceScanExec trait [#1578](https://github.com/apache/datafusion-comet/pull/1578) (mbutrovich)
- fix: corrected the logic of eliminating CometSparkToColumnarExec [#1597](https://github.com/apache/datafusion-comet/pull/1597) (wForget)
- fix: avoid panic caused by close null handle of parquet reader [#1604](https://github.com/apache/datafusion-comet/pull/1604) (wForget)
- fix: Make AQE capable of converting Comet shuffled joins to Comet broadcast hash joins [#1605](https://github.com/apache/datafusion-comet/pull/1605) (Kontinuation)
- fix: Making shuffle files generated in native shuffle mode reclaimable [#1568](https://github.com/apache/datafusion-comet/pull/1568) (Kontinuation)
- fix: Support per-task shuffle write rows and shuffle write time metrics [#1617](https://github.com/apache/datafusion-comet/pull/1617) (Kontinuation)
- fix: Modify Spark SQL core 2 tests for `native_datafusion` reader, change 3.5.5 diff hash length to 11 [#1641](https://github.com/apache/datafusion-comet/pull/1641) (mbutrovich)
- fix: fix spark/sql test failures in native_iceberg_compat [#1593](https://github.com/apache/datafusion-comet/pull/1593) (parthchandra)
- fix: handle missing field correctly in native_iceberg_compat [#1656](https://github.com/apache/datafusion-comet/pull/1656) (parthchandra)
- fix: better int96 support for experimental native scans [#1652](https://github.com/apache/datafusion-comet/pull/1652) (mbutrovich)
- fix: respect `ignoreNulls` flag in `first_value` and `last_value` [#1626](https://github.com/apache/datafusion-comet/pull/1626) (andygrove)
- fix: update row groups count in internal metrics accumulator [#1658](https://github.com/apache/datafusion-comet/pull/1658) (parthchandra)
- fix: Shuffle should maintain insertion order [#1660](https://github.com/apache/datafusion-comet/pull/1660) (EmilyMatt)

**Performance related:**

- perf: Use a global tokio runtime [#1614](https://github.com/apache/datafusion-comet/pull/1614) (andygrove)
- perf: Respect Spark's PARQUET_FILTER_PUSHDOWN_ENABLED config [#1619](https://github.com/apache/datafusion-comet/pull/1619) (andygrove)
- perf: Experimental fix to avoid join strategy regression [#1674](https://github.com/apache/datafusion-comet/pull/1674) (andygrove)

**Implemented enhancements:**

- feat: add read array support [#1456](https://github.com/apache/datafusion-comet/pull/1456) (comphead)
- feat: introduce hadoop mini cluster to test native scan on hdfs [#1556](https://github.com/apache/datafusion-comet/pull/1556) (wForget)
- feat: make parquet native scan schema case insensitive [#1575](https://github.com/apache/datafusion-comet/pull/1575) (wForget)
- feat: enable iceberg compat tests, more tests for complex types [#1550](https://github.com/apache/datafusion-comet/pull/1550) (comphead)
- feat: pushdown filter for native_iceberg_compat [#1566](https://github.com/apache/datafusion-comet/pull/1566) (wForget)
- feat: Fix struct of arrays schema issue [#1592](https://github.com/apache/datafusion-comet/pull/1592) (comphead)
- feat: adding more struct/arrays tests [#1594](https://github.com/apache/datafusion-comet/pull/1594) (comphead)
- feat: respect `batchSize/workerThreads/blockingThreads` configurations for native_iceberg_compat scan [#1587](https://github.com/apache/datafusion-comet/pull/1587) (wForget)
- feat: add MAP type support for first level [#1603](https://github.com/apache/datafusion-comet/pull/1603) (comphead)
- feat: Add more tests for nested types combinations for `native_datafusion` [#1632](https://github.com/apache/datafusion-comet/pull/1632) (comphead)
- feat: Override MapBuilder values field with expected schema [#1643](https://github.com/apache/datafusion-comet/pull/1643) (comphead)
- feat: track unified memory pool [#1651](https://github.com/apache/datafusion-comet/pull/1651) (wForget)
- feat: Add support for complex types in native shuffle [#1655](https://github.com/apache/datafusion-comet/pull/1655) (andygrove)

**Documentation updates:**

- docs: Update configuration guide to show optional configs [#1524](https://github.com/apache/datafusion-comet/pull/1524) (andygrove)
- docs: Add changelog for 0.7.0 release [#1527](https://github.com/apache/datafusion-comet/pull/1527) (andygrove)
- docs: Use a shallow clone for Spark SQL test instructions [#1547](https://github.com/apache/datafusion-comet/pull/1547) (mbutrovich)
- docs: Update benchmark results for 0.7.0 release [#1548](https://github.com/apache/datafusion-comet/pull/1548) (andygrove)
- doc: Renew `kubernetes.md` [#1549](https://github.com/apache/datafusion-comet/pull/1549) (comphead)
- docs: various improvements to tuning guide [#1525](https://github.com/apache/datafusion-comet/pull/1525) (andygrove)
- docs: Update supported Spark versions [#1580](https://github.com/apache/datafusion-comet/pull/1580) (andygrove)
- docs: change OSX/OS X to macOS [#1584](https://github.com/apache/datafusion-comet/pull/1584) (mbutrovich)
- docs: docs for benchmarking in aws ec2 [#1601](https://github.com/apache/datafusion-comet/pull/1601) (andygrove)
- docs: Update compatibility docs for new native scans [#1657](https://github.com/apache/datafusion-comet/pull/1657) (andygrove)
- doc: Document local HDFS setup [#1673](https://github.com/apache/datafusion-comet/pull/1673) (comphead)

**Other:**

- chore: fix issue in release process [#1528](https://github.com/apache/datafusion-comet/pull/1528) (andygrove)
- chore: Remove all subdependencies [#1514](https://github.com/apache/datafusion-comet/pull/1514) (EmilyMatt)
- chore: Drop support for Spark 3.3 (EOL) [#1529](https://github.com/apache/datafusion-comet/pull/1529) (andygrove)
- chore: Prepare for 0.8.0 development [#1530](https://github.com/apache/datafusion-comet/pull/1530) (andygrove)
- chore: Re-enable GitHub discussions [#1535](https://github.com/apache/datafusion-comet/pull/1535) (andygrove)
- chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [#1534](https://github.com/apache/datafusion-comet/pull/1534) (kazuyukitanimura)
- build: Use unique name for surefire artifacts [#1544](https://github.com/apache/datafusion-comet/pull/1544) (andygrove)
- chore: Update links for released version [#1540](https://github.com/apache/datafusion-comet/pull/1540) (andygrove)
- chore: Enable Comet explicitly in `CometTPCDSQueryTestSuite` [#1559](https://github.com/apache/datafusion-comet/pull/1559) (andygrove)
- chore: Fix some inconsistencies in memory pool configuration [#1561](https://github.com/apache/datafusion-comet/pull/1561) (andygrove)
- upgraded spark 3.5.4 to 3.5.5 [#1565](https://github.com/apache/datafusion-comet/pull/1565) (YanivKunda)
- minor: fix typo [#1570](https://github.com/apache/datafusion-comet/pull/1570) (wForget)
- Chore: simplify array related functions impl [#1490](https://github.com/apache/datafusion-comet/pull/1490) (kazantsev-maksim)
- added fallback using reflection for backward-compatibility [#1573](https://github.com/apache/datafusion-comet/pull/1573) (YanivKunda)
- chore: Override node name for CometSparkToColumnar [#1577](https://github.com/apache/datafusion-comet/pull/1577) (l0kr)
- chore: Reimplement ShuffleWriterExec using interleave_record_batch [#1511](https://github.com/apache/datafusion-comet/pull/1511) (Kontinuation)
- chore: Run Comet tests for more Spark versions [#1582](https://github.com/apache/datafusion-comet/pull/1582) (andygrove)
- Feat: support array_except function [#1343](https://github.com/apache/datafusion-comet/pull/1343) (kazantsev-maksim)
- minor: Fix clippy warnings [#1606](https://github.com/apache/datafusion-comet/pull/1606) (Kontinuation)
- chore: Remove some unwraps in hashing code [#1600](https://github.com/apache/datafusion-comet/pull/1600) (andygrove)
- chore: Remove redundant shims for getFailOnError [#1608](https://github.com/apache/datafusion-comet/pull/1608) (andygrove)
- chore: Making comet native operators write spill files to spark local dir [#1581](https://github.com/apache/datafusion-comet/pull/1581) (Kontinuation)
- chore: Refactor QueryPlanSerde to use idiomatic Scala and reduce verbosity [#1609](https://github.com/apache/datafusion-comet/pull/1609) (andygrove)
- chore: Create simple fuzz test as part of test suite [#1610](https://github.com/apache/datafusion-comet/pull/1610) (andygrove)
- chore: Document `testSingleLineQuery` test method [#1628](https://github.com/apache/datafusion-comet/pull/1628) (comphead)
- chore: Parquet fuzz testing [#1623](https://github.com/apache/datafusion-comet/pull/1623) (andygrove)
- chore: Change default Spark version to 3.5 [#1620](https://github.com/apache/datafusion-comet/pull/1620) (andygrove)
- chore: Add manually-triggered CI jobs for testing Spark SQL with native scans [#1624](https://github.com/apache/datafusion-comet/pull/1624) (andygrove)
- chore: refactor v2 scan conversion [#1621](https://github.com/apache/datafusion-comet/pull/1621) (andygrove)
- chore: clean up `planner.rs` [#1650](https://github.com/apache/datafusion-comet/pull/1650) (comphead)
- chore: correct name of pipelines for native_datafusion ci workflow [#1653](https://github.com/apache/datafusion-comet/pull/1653) (parthchandra)
- chore: Upgrade to datafusion 47.0.0-rc1 and arrow-rs 55.0.0 [#1563](https://github.com/apache/datafusion-comet/pull/1563) (andygrove)
- chore: Upgrade to datafusion 47.0.0 [#1663](https://github.com/apache/datafusion-comet/pull/1663) (YanivKunda)
- chore: Enable CometFuzzTestSuite int96 test for experimental native scans (without complex types) [#1664](https://github.com/apache/datafusion-comet/pull/1664) (mbutrovich)
- chore: Refactor Memory Pools [#1662](https://github.com/apache/datafusion-comet/pull/1662) (EmilyMatt)

## Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

```
31 Andy Grove
11 Oleks V
10 Zhen Wang
7 Kristin Cowalcijk
6 Matt Butrovich
5 Parth Chandra
3 Emily Matheys
3 Yaniv Kunda
2 KAZUYUKI TANIMURA
2 Kazantsev Maksim
1 Łukasz
```

Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.
