ESQL: loading unmapped fields on synthetic _source#144112
ESQL: loading unmapped fields on synthetic _source#144112GalLalouche merged 15 commits intoelastic:mainfrom
Conversation
33bcf97 to
231f767
Compare
…3916) Add forUnmappedLoad flag to KeywordFieldType so only the unmapped-keyword path (created by ESQL for PotentiallyUnmappedKeywordEsField) converts decoded Object values to string in the fallback synthetic source reader; mapped keyword fields keep assuming BytesRef. Add synthetic-source test cases to unmapped-load and unmapped-nullify: - partial_mapping_synthetic_sample_data (unmapped_event_duration as long) - unmappedNumericFromK8sSyntheticTsIndex (TS k8s_unmapped, network.cost) Made-with: Cursor
0ee8e70 to
ce8e992
Compare
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @GalLalouche, I've created a changelog YAML for you. |
...lugin/esql/qa/testFixtures/src/main/resources/data/partial_mapping_synthetic_sample_data.csv
Outdated
Show resolved
Hide resolved
| public static final TypeParser PARSER = createTypeParserWithLegacySupport(Builder::new); | ||
|
|
||
| public static final class KeywordFieldType extends TextFamilyFieldType { | ||
| public static class KeywordFieldType extends TextFamilyFieldType { |
There was a problem hiding this comment.
I wouldn't change this class unless someone from the core ES team approves it. My advice is to seek approval for this from the relevant team.
...in/esql/src/main/java/org/elasticsearch/xpack/esql/planner/EsPhysicalOperationProviders.java
Outdated
Show resolved
Hide resolved
| return new FallbackSyntheticSourceBlockLoader.SingleValueReader<BytesRef>(nullValueBytes) { | ||
| @Override | ||
| public void convertValue(Object value, List<BytesRef> accumulator) { | ||
| String stringValue = ((BytesRef) value).utf8ToString(); |
There was a problem hiding this comment.
I think you're fine to change this line rather than making KeywordFieldType non-final. ie:
String stringValue = value instanceof BytesRef br ? br.utf8ToString() : value.toString();
There was a problem hiding this comment.
I wanted to be extra defensive and avoid changing behavior if the field is not potentially unmapped, but if you're fine with it then I'm fine with it :)
| 172.21.2.113 |null |null |null | ||
| 172.21.2.162 |null |null |null | ||
| 172.21.3.15 |null |null |null | ||
| client_ip:ip | foo:null | bar:keyword | baz:null |
There was a problem hiding this comment.
These changes were unnecessary.
There was a problem hiding this comment.
Of course not, but I used a script to fix the alignments for the tests I did add, and it just makes it easier to align the entire file rather than align a subset of it. I can revert it if you think it's a blocker.
Fix loading of unmapped numeric values when the
_sourceis synthetic. Also add a few more timeseries tests, since now we can actually run them!Resolves #143916.