-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Fix FieldAttribute name usage in InferNonNullAggConstraint #128910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 7 commits
674c4b6
5926191
83fb557
85858f8
4921bc7
be574fd
be97463
5e696aa
6da3847
8a57abc
7edd899
f8a5d1b
f8435eb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| pr: 128910 | ||
| summary: Fix `FieldAttribute` name usage in `InferNonNullAggConstraint` | ||
| area: ES|QL | ||
| type: bug | ||
| issues: [] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,7 +25,7 @@ | |
|
|
||
| /** | ||
| * The vast majority of aggs ignore null entries - this rule adds a pushable filter, as it is cheap | ||
| * to execute, to filter this entries out to begin with. | ||
| * to execute, to filter these entries out to begin with. | ||
| * STATS x = min(a), y = sum(b) | ||
| * becomes | ||
| * | WHERE a IS NOT NULL OR b IS NOT NULL | ||
|
|
@@ -55,7 +55,7 @@ protected LogicalPlan rule(Aggregate aggregate, LocalLogicalOptimizerContext con | |
| Expression field = af.field(); | ||
| // ignore literals (e.g. COUNT(1)) | ||
| // make sure the field exists at the source and is indexed (not runtime) | ||
| if (field.foldable() == false && field instanceof FieldAttribute fa && stats.isIndexed(fa.name())) { | ||
| if (field.foldable() == false && field instanceof FieldAttribute fa && stats.isIndexed(fa.fieldName())) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yup, this is the actual fix. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This looks so familiar. I'm sure I've done that in a few places before, probably during the union-types work. Pity I was not thorough! What I did was replace There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Caution!
I hope this gets easier with the updated javadoc. :/ |
||
| nonNullAggFields.add(field); | ||
| } else { | ||
| // otherwise bail out since unless disjunction needs to cover _all_ fields, things get filtered out | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -126,32 +126,36 @@ public boolean canUseEqualityOnSyntheticSourceDelegate(FieldAttribute attr, Stri | |
| }; | ||
|
|
||
| /** | ||
| * If we have access to SearchStats over a collection of shards, we can make more fine-grained decisions about what can be pushed down. | ||
| * This should open up more opportunities for lucene pushdown. | ||
| * If we have access to {@link SearchStats} over a collection of shards, we can make more fine-grained decisions about what can be | ||
| * pushed down. This should open up more opportunities for lucene pushdown. | ||
| */ | ||
| static LucenePushdownPredicates from(SearchStats stats) { | ||
| // TODO: use FieldAttribute#fieldName, otherwise this doesn't apply to field attributes used for union types. | ||
| // C.f. https://github.com/elastic/elasticsearch/issues/128905 | ||
| return new LucenePushdownPredicates() { | ||
| @Override | ||
| public boolean hasExactSubfield(FieldAttribute attr) { | ||
| return stats.hasExactSubfield(attr.name()); | ||
| return stats.hasExactSubfield(new FieldAttribute.FieldName(attr.name())); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why are we not calling attr.fieldName()? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This fails the union type csv tests - I didn't want to fix it in this PR but opened #128905 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Applies also to the instances below. |
||
| } | ||
|
|
||
| @Override | ||
| public boolean isIndexedAndHasDocValues(FieldAttribute attr) { | ||
| // We still consider the value of isAggregatable here, because some fields like ScriptFieldTypes are always aggregatable | ||
| // But this could hide issues with fields that are not indexed but are aggregatable | ||
| // This is the original behaviour for ES|QL, but is it correct? | ||
| return attr.field().isAggregatable() || stats.isIndexed(attr.name()) && stats.hasDocValues(attr.name()); | ||
| return attr.field().isAggregatable() | ||
| || stats.isIndexed(new FieldAttribute.FieldName(attr.name())) | ||
alex-spies marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| && stats.hasDocValues(new FieldAttribute.FieldName(attr.name())); | ||
alex-spies marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| } | ||
|
|
||
| @Override | ||
| public boolean isIndexed(FieldAttribute attr) { | ||
| return stats.isIndexed(attr.name()); | ||
| return stats.isIndexed(new FieldAttribute.FieldName(attr.name())); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean canUseEqualityOnSyntheticSourceDelegate(FieldAttribute attr, String value) { | ||
| return stats.canUseEqualityOnSyntheticSourceDelegate(attr.field().getName(), value); | ||
| return stats.canUseEqualityOnSyntheticSourceDelegate(attr.fieldName(), value); | ||
| } | ||
| }; | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In union-types we do not have multiple names like this, since the name specifies the type we're converting to, not the type we're converting from. So if
fieldis IP and KEYWORD in different mappings, the converted field might be$$field$coverted_to$ipif the query usedfield::ipto resolve the mapping conflict. A different query might usefield::stringto resolve the conflict, and in that query the attribute gets name$$field$converted_to$string. So we only ever generate 1 new attribute, not 2, in the same query.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do generate more than 1; I just confirmed this locally with the example given in the javadoc: