Skip to content

ESQL: Do not discard disjunction conditions when is null/is not null might invalidate them#145941

Open
astefan wants to merge 13 commits intoelastic:mainfrom
astefan:141579_fix
Open

ESQL: Do not discard disjunction conditions when is null/is not null might invalidate them#145941
astefan wants to merge 13 commits intoelastic:mainfrom
astefan:141579_fix

Conversation

@astefan
Copy link
Copy Markdown
Contributor

@astefan astefan commented Apr 8, 2026

This is closing (not fixing, because there is no reproduceable use case, nor there is a possible java execution path for that) #141579.
AI-assisted PR.

@astefan astefan requested review from alex-spies and bpintea April 9, 2026 12:29
@astefan astefan added >bug :Analytics/ES|QL AKA ESQL v9.4.0 v9.3.4 auto-backport Automatically create backport pull requests when merged labels Apr 9, 2026
@astefan astefan marked this pull request as ready for review April 9, 2026 12:29
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 9, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @astefan, I've created a changelog YAML for you.

required_capability: fix_propagate_nullable_or_disjunction

FROM employees
| WHERE (gender IS NOT NULL OR emp_no > 10000) AND gender IS NULL
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the tests that is the actual bug this PR is fixing.
Before this PR, this query returned 0 because emp_no > 10000 was lost.

Copy link
Copy Markdown
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG.
Not sure if worth it, but if going through another CI anyways, I'd maybe only a test with a composite expressions, not just fields. Seeing that semantic equality (a + b IS NOT NULL ... OR b + a > 0).

@astefan
Copy link
Copy Markdown
Contributor Author

astefan commented Apr 9, 2026

LG. Not sure if worth it, but if going through another CI anyways, I'd maybe only a test with a composite expressions, not just fields. Seeing that semantic equality (a + b IS NOT NULL ... OR b + a > 0).

Yep, doing exactly that as I write this. Thank you for reviewing.

Copy link
Copy Markdown
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @astefan , the productive changes look right to me. Mostly have a question :)

Comment on lines +125 to +139
* Substitutes {@code IS NULL} and {@code IS NOT NULL} predicates — and, when the field is known to be {@code NULL},
* direct field references — throughout {@code exp}.
* <p>
* When {@code isNullResult} is {@code true} (field constrained to {@code IS NULL}):
* direct field occurrences become {@code null}, {@code IS NULL(field)} becomes {@code true},
* {@code IS NOT NULL(field)} becomes {@code false}.
* <p>
* When {@code isNullResult} is {@code false} (field constrained to {@code IS NOT NULL}):
* only the predicate forms are substituted ({@code IS NULL} → {@code false},
* {@code IS NOT NULL} → {@code true}); the field value itself is left intact because its
* concrete value is unknown.
* <p>
* This preserves surviving OR branches instead of discarding them. For example:
* {@code (a IS NOT NULL OR p) AND a IS NULL} → {@code OR(false, p) AND a IS NULL},
* which {@code BooleanSimplification} then reduces to {@code p AND a IS NULL}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This comment is longer than the method impl and doesn't say much except for the last paragraph with the example.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, usually I don't like what method comments AI adds, but in this particular case I went ahead with these because:

  • this edge case that this PR covers is tricky enough to understand to warrant a big fat method comment
  • optimization rules are notoriously hard to understand for someone who didn't either developed that thing or didn't spend quite some time going over edge cases to understand it

With all its ugliness, I will keep this method comment for the reasons above.

Comment on lines 112 to 120
if (exp instanceof Coalesce) {
// Remove direct COALESCE arguments that are exactly the null field; the remaining
// arguments may still reference the field and are handled by substituteNullability below.
List<Expression> newChildren = new ArrayList<>(exp.children());
newChildren.removeIf(e -> e.semanticEquals(nullExp));
if (newChildren.size() != exp.children().size() && newChildren.size() > 0) { // coalesce needs at least one input
return exp.replaceChildren(newChildren);
base = exp.replaceChildren(newChildren);
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's a bit weird that we only look at top-level COALESCEs, but we descend fully into the expression to look for more IS NULL/IS NOT NULLs. Maybe this can be simplified?

Copy link
Copy Markdown
Contributor Author

@astefan astefan Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there are things to simplify further, no argument against that. I kept this code as is because:

  • this PR is really targeted and surgical, if there is room for more, that "more" deserves a new PR and specific edge cases to cover. Those edge cases must be called out somehow (new issue, bug we discover, bug an user discovers, CI failure from generative testing etc).
  • this code existed before and I didn't want to touch it, because it's not this PR's concern tbh.

I believe I dug deep enough with the many tests I added to be happy with the PR as is now :-).

Comment on lines -118 to +121
return Literal.of(exp, null);
return substituteNullability(base, nullExp, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thank you. It's a bit hard to spot with the convoluted replacer logic, but we used to return literal nulls for the whole expression, rather than descending into it and replacing occurences of the null field. That was very wrong.

* {@code (a IS NOT NULL OR p) AND a IS NULL} → {@code OR(false, p) AND a IS NULL},
* which {@code BooleanSimplification} then reduces to {@code p AND a IS NULL}.
*/
private static Expression substituteNullability(Expression exp, Expression field, boolean isNullResult) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: why do we have to special case IsNull and IsNotNull expressions? Is it because our constant folding only folds fully foldable expressions, but doesn't attempt to partially fold subexpressions? Maybe a more complete solution would be to make ConstantFolding attempt partial folding?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConstantFolding, FoldNull and PropagateNullable all share some conceptual stuff: dealing with a sort of constant. When that constant can be null/is null is a variant of constant folding but is special enough (see is null/is not null and the concept of Nullability) that it needs one-two separate rules. It is tricky (and I'd be afraid to attempt this) to change these core optimization rules just to try doing a generic partial folding without any good/strong reason. For example, if aggregation by constants would be attempted by someone as a generic fix, that would be a good time to consider this here specific partial folding on ConstantFolding.

@astefan astefan added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.3.4 v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants