Skip to content

SQL: Incorrect result from OR+BETWEEN combo with NULL value in vector runtime #6431

@philrz

Description

@philrz

The following query returns a result in vector runtime but should return nothing.

$ super -f parquet -o data.parquet -c "values {d:null,e:15}" &&
SUPER_VAM=1 super -c "
SELECT d,e
FROM data.parquet
WHERE d>e OR d NOT BETWEEN 10 AND 20;"

{d:null,e:15}

Details

Repro is with super commit 30f64bb. The query above is a simplification of one from this sqllogictest (aka select2/q394).

This query has characteristics in common with the one from #6277 in that it seems to be unique to NULL values being subject to comparisons, but for some reason was not fixed by the changes in #6426.

Interestingly, if I do an a la carte WHERE of what's on each end of the OR, they each return the expected "nothing".

$ super -f parquet -o data.parquet -c "values {d:null,e:15}" &&
SUPER_VAM=1 super -c "
SELECT d,e
FROM data.parquet
WHERE d>e;"

[no output]
$ super -f parquet -o data.parquet -c "values {d:null,e:15}" &&
SUPER_VAM=1 super -c "
SELECT d,e
FROM data.parquet
WHERE d NOT BETWEEN 10 AND 20;"

[no output]

And here it is in sequential runtime and DuckDB with the full OR'd together WHERE clause giving the expected "nothing" response.

$ super -f parquet -o data.parquet -c "values {d:null,e:15}" &&
super -c "
SELECT d,e
FROM data.parquet
WHERE d>e OR d NOT BETWEEN 10 AND 20;"

[no output]
$ duckdb
DuckDB v1.4.2 (Andium) 68d7555f68
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D SELECT d,e
  FROM data.parquet
  WHERE d>e OR d NOT BETWEEN 10 AND 20;
┌───────┬───────┐
│   d   │   e   │
│ int32 │ int64 │
├───────┴───────┤
│    0 rows     │
└───────────────┘

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions