You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[SPARK-24934][SQL] Explicitly whitelist supported types in upper/lower bounds for in-memory partition pruning
## What changes were proposed in this pull request?
Looks we intentionally set `null` for upper/lower bounds for complex types and don't use it. However, these look used in in-memory partition pruning, which ends up with incorrect results.
This PR proposes to explicitly whitelist the supported types.
```scala
val df = Seq(Array("a", "b"), Array("c", "d")).toDF("arrayCol")
df.cache().filter("arrayCol > array('a', 'b')").show()
```
```scala
val df = sql("select cast('a' as binary) as a")
df.cache().filter("a == cast('a' as binary)").show()
```
**Before:**
```
+--------+
|arrayCol|
+--------+
+--------+
```
```
+---+
| a|
+---+
+---+
```
**After:**
```
+--------+
|arrayCol|
+--------+
| [c, d]|
+--------+
```
```
+----+
| a|
+----+
|[61]|
+----+
```
## How was this patch tested?
Unit tests were added and manually tested.
Author: hyukjinkwon <[email protected]>
Closesapache#21882 from HyukjinKwon/stats-filter.
0 commit comments