ESQL: Type conflict resolution in unmapped-fields load#143693
ESQL: Type conflict resolution in unmapped-fields load#143693GalLalouche wants to merge 164 commits intoelastic:mainfrom
Conversation
PotentiallyUnmappedKeywordEsField (used for fields loaded from _source when unmapped_fields="load") had isAggregatable=true, which caused isPushableFieldAttribute to short-circuit past the SearchStats check and push filters down to Lucene. On shards where the field is not indexed, the Lucene query returns no results instead of letting the compute engine evaluate the filter on _source-loaded values. Guard isPushableFieldAttribute against PotentiallyUnmappedKeywordEsField so these fields are always filtered in the compute engine. Co-authored-by: Cursor <cursoragent@cursor.com>
The isPushableFieldAttribute fix (rejecting PotentiallyUnmappedKeywordEsField) already covers sort pushdown since PushTopNToSource uses the same method. This test verifies correct sort order when a field is mapped in one index but unmapped in another under unmapped_fields="load". Co-authored-by: Cursor <cursoragent@cursor.com>
|
Hi @GalLalouche, I've updated the changelog YAML for you. |
1 similar comment
|
Hi @GalLalouche, I've updated the changelog YAML for you. |
x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
| ; | ||
|
|
||
| partiallyUnmappedNonKeyword | ||
| required_capability: optional_fields_v2 |
| required_capability: index_metadata_field | ||
|
|
||
| SET unmapped_fields="load"\; | ||
| FROM sample_data, no_mapping_sample_data METADATA _index |
| ; | ||
|
|
||
| ####################### | ||
| # Type conflict tests # |
| # Type conflict tests # | ||
| ####################### | ||
|
|
||
| typeConflictKeywordUnmappedCastToKeyword |
| var convert = new ToDateNanos(f.source(), f, context.configuration()); | ||
| imf.types().forEach(type -> typeResolutions(f, convert, type, imf, typeResolutions)); | ||
| var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions); | ||
| var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions, null); |
There was a problem hiding this comment.
So while analyzing this, I uncovered an underlying pre-existing union type bug! #144870.
It looks like this is actually fine to be null (I added an assertion), but the other one isn't.
| indexToConversionExpressions, | ||
| fa.field().getTimeSeriesFieldType() | ||
| fa.field().getTimeSeriesFieldType(), | ||
| null // FIXME(gal, NOCOMMIT) This basically a copy constructor, shouldn't be null; write a test to flush it out. |
There was a problem hiding this comment.
This is actually fine, due to the behavior mentioned in #144870. I've added an assertion and comment instead.
| @@ -2633,7 +2653,7 @@ public LogicalPlan apply(LogicalPlan plan, AnalyzerContext context) { | |||
There was a problem hiding this comment.
This part though, is a bug! If some fields are unmapped and load is on, then not all types are actually dates! Fixed and added tests.
x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec
Show resolved
Hide resolved
There was a problem hiding this comment.
Done, except TS. Now it's a race to see who finishes first :)
| indexToConversionExpressions, | ||
| fa.field().getTimeSeriesFieldType() | ||
| fa.field().getTimeSeriesFieldType(), | ||
| null |
There was a problem hiding this comment.
Added an assert and a comment why this should be null.
| if (indexMappingHashToDuplicateCount.compute(response.getIndexMappingHash(), (k, v) -> v == null ? 1 : v + 1) > 1) { | ||
| continue; | ||
| } | ||
| boolean isNew = indexMappingHashToDuplicateCount.compute( |
There was a problem hiding this comment.
This was a bug found by existing INSIST_🐔 tests! (The one I stupidly deleted because I didn't want to start fixing INSIST code 🤦.) If two indices share the same mapping, we still need to update fieldToMappedIndices. I've transferred the tests to use unmapped_fields=load instead.
Resolves #141912.
Resolves #142004.