Skip to content
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
69 commits
Select commit Hold shift + click to select a range
fa3dd20
TEMP
GalLalouche Jan 13, 2025
c2b0d9c
WTF checkstyle?!
GalLalouche Jan 9, 2025
7b33f15
Trying to fix failures
GalLalouche Jan 9, 2025
a105f0a
Add capability, Fix parser test
GalLalouche Jan 12, 2025
0ca5e79
Update docs/changelog/119886.yaml
GalLalouche Jan 12, 2025
bf309fb
Update changelog
GalLalouche Jan 14, 2025
0b72c89
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 14, 2025
c046112
Really merge main this time
GalLalouche Jan 14, 2025
aca3796
Merge branch 'main' into feature/unmapped_fields_squashed
craigtaverner Jan 14, 2025
dacc8f6
Craig's PR notes
GalLalouche Jan 14, 2025
f71a586
Test for no sources
GalLalouche Jan 16, 2025
2ba5935
More code review fixes
GalLalouche Jan 16, 2025
c22b9d4
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 16, 2025
f5c77fa
Update docs/changelog/119886.yaml
GalLalouche Jan 19, 2025
d6afb24
More code review fixes
GalLalouche Jan 19, 2025
25f0c53
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 19, 2025
165ad16
Merge branch 'feature/unmapped_fields_squashed' of github.com:GalLalo…
GalLalouche Jan 19, 2025
9d85915
Some of Costin's notes
GalLalouche Jan 22, 2025
d0959ea
Pre huge refactor
GalLalouche Jan 23, 2025
d68a0ca
Post huge refactor
GalLalouche Jan 23, 2025
a12aa82
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 26, 2025
f02fd43
Temp fixes
GalLalouche Jan 26, 2025
a8d9372
Slighly more delicate refactor
GalLalouche Jan 26, 2025
fcd4663
Fix borken test
GalLalouche Jan 26, 2025
be4bbfd
Handle failing tests
GalLalouche Jan 26, 2025
b84ff2c
Fix test names
GalLalouche Jan 26, 2025
a0e1840
Replace flag with inheritance
GalLalouche Jan 26, 2025
0b4132e
Add comment parsing
GalLalouche Jan 27, 2025
b05b327
Add a few more unmapped fields CSV tests
GalLalouche Jan 29, 2025
3f9faef
Added IndexResolverFieldNames test
GalLalouche Jan 29, 2025
903f6fa
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 29, 2025
85b84f5
Move tests from LogicalPlanOptimizerTests to AnalyzerTests
GalLalouche Jan 29, 2025
400f473
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 29, 2025
fb09120
Properly apply Andrei's suggestion after it somehow got removed o_O
GalLalouche Jan 29, 2025
0d9b280
Add snapshot check to StatementParserTests
GalLalouche Jan 29, 2025
d7ce623
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 29, 2025
0f01ebe
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Jan 30, 2025
5c2c7d7
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 2, 2025
af26d5f
More code review fixes
GalLalouche Feb 2, 2025
faef67a
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 2, 2025
1204791
TEMP
GalLalouche Feb 2, 2025
22569b8
Things pass! Excitement!
GalLalouche Feb 4, 2025
a9f4a0b
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 5, 2025
8f3c235
Remove println
GalLalouche Feb 5, 2025
7d07d1a
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 5, 2025
697989c
Fix borken test
GalLalouche Feb 5, 2025
c0fe85e
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 5, 2025
ab9bfdf
More review fixes
GalLalouche Feb 6, 2025
5c1c2be
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 6, 2025
9805f12
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 9, 2025
d5dc662
Fix borken test
GalLalouche Feb 9, 2025
ef56e8a
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 10, 2025
13b3dff
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 10, 2025
088a622
Try to fix BWC test
GalLalouche Feb 11, 2025
ea8567b
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 11, 2025
9c466e4
Try to fix BWC test (2)
GalLalouche Feb 11, 2025
1fb5cb4
Try to fix BWC test (3)
GalLalouche Feb 11, 2025
8017c78
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 11, 2025
4f6f6a1
Try to fix BWC test (4)
GalLalouche Feb 11, 2025
971b869
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 11, 2025
ef94c52
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 12, 2025
c9fdc92
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 12, 2025
b46cb9d
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 12, 2025
b5da964
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 12, 2025
7a48277
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 12, 2025
dcc518c
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 13, 2025
770ea6f
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 13, 2025
97399fc
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 13, 2025
e32a446
Merge branch 'main' into feature/unmapped_fields_squashed
GalLalouche Feb 13, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/changelog/119886.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 119886
summary: Initial support for unmapped fields
area: ES|QL
type: feature
issues: []
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
* Loads values from {@code _source}. This whole process is very slow and cast-tastic,
Expand Down Expand Up @@ -230,7 +231,7 @@ private static class BytesRefs extends BlockSourceReader {

@Override
protected void append(BlockLoader.Builder builder, Object v) {
((BlockLoader.BytesRefBuilder) builder).appendBytesRef(toBytesRef(scratch, (String) v));
((BlockLoader.BytesRefBuilder) builder).appendBytesRef(toBytesRef(scratch, Objects.toString(v)));
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,12 @@
public class EsField implements Writeable {

private static Map<String, Writeable.Reader<? extends EsField>> readers = Map.ofEntries(
Map.entry("EsField", EsField::new),
Map.entry("DateEsField", DateEsField::new),
Map.entry("EsField", EsField::new),
Map.entry("InvalidMappedField", InvalidMappedField::new),
Map.entry("KeywordEsField", KeywordEsField::new),
Map.entry("MultiTypeEsField", MultiTypeEsField::new),
Map.entry("PotentiallyUnmappedKeywordEsField", PotentiallyUnmappedKeywordEsField::new),
Map.entry("TextEsField", TextEsField::new),
Map.entry("UnsupportedEsField", UnsupportedEsField::new)
);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
package org.elasticsearch.xpack.esql.core.type;

import org.elasticsearch.common.io.stream.StreamInput;

import java.io.IOException;

/** Information about a field in an ES index that may not mapped. If it is unmapped, it should be loaded from source directly. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, format this javadoc properly. And add a bit more details. For example, what does "unmapped" mean? There are many situations (ES mapping particularities) where a field is unmapped, but it can be loaded from "somewhere".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make it sound as if a one-line javadoc isn't a valid javadoc 😉. Done.

public class PotentiallyUnmappedKeywordEsField extends KeywordEsField {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field will be unmapped in at least one index hence its potential has been realized :) and the class can be simply called UnmappedKeywordEsField

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it is "potentially" unmapped is that it might be mapped in some fields. I've added a comment to try to better explain this.

public PotentiallyUnmappedKeywordEsField(String name) {
super(name);
}

public PotentiallyUnmappedKeywordEsField(StreamInput in) throws IOException {
super(in);
}

public String getWriteableName() {
return "PotentiallyUnmappedKeywordEsField";
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.OptionalInt;
import java.util.Set;
import java.util.function.Predicate;

import static java.util.Collections.emptyList;

Expand Down Expand Up @@ -83,4 +85,13 @@ public static <T> List<T> nullSafeList(T... entries) {
}
return list;
}

public static <T> OptionalInt findFirstIndex(List<T> list, Predicate<T> predicate) {
for (int i = 0; i < list.size(); i++) {
if (predicate.test(list.get(i))) {
return OptionalInt.of(i);
}
}
return OptionalInt.empty();
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.JOIN_LOOKUP_V12;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.JOIN_PLANNING_V1;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.METADATA_FIELDS_REMOTE_TEST;
import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.UNMAPPED_FIELDS;
import static org.elasticsearch.xpack.esql.qa.rest.EsqlSpecTestCase.Mode.SYNC;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doAnswer;
Expand Down Expand Up @@ -125,6 +126,8 @@ protected void shouldSkipTest(String testName) throws IOException {
assumeFalse("INLINESTATS not yet supported in CCS", testCase.requiredCapabilities.contains(INLINESTATS_V2.capabilityName()));
assumeFalse("INLINESTATS not yet supported in CCS", testCase.requiredCapabilities.contains(JOIN_PLANNING_V1.capabilityName()));
assumeFalse("LOOKUP JOIN not yet supported in CCS", testCase.requiredCapabilities.contains(JOIN_LOOKUP_V12.capabilityName()));
// Unmapped fields require a coorect capability response from every cluster, which isn't currently implemented.
assumeFalse("UNMAPPED FIELDS not yet supported in CCS", testCase.requiredCapabilities.contains(UNMAPPED_FIELDS.capabilityName()));
}

private TestFeatureService remoteFeaturesService() throws IOException {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,9 @@ public static Tuple<Version, Version> skipVersionRange(String testName, String i
return null;
}

public static Tuple<Page, List<String>> loadPageFromCsv(URL source, Map<String, String> typeMapping) throws Exception {
public record PageColumn(String name, DataType dataType) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the dataType of a PageColumn is used. Probably was used in the initial version of this PR; you might want to revert this change (adding the PageColumn), at this point it looks unnecessary, unless I'm missing where the dataType is actually used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. It was used when we needed to check for type conflicts during casting.


public static Tuple<Page, List<PageColumn>> loadPageFromCsv(URL source, Map<String, String> typeMapping) throws Exception {

record CsvColumn(String name, Type type, BuilderWrapper builderWrapper) implements Releasable {
void append(String stringValue) {
Expand Down Expand Up @@ -230,13 +232,13 @@ public void close() {
lineNumber++;
}
}
var columnNames = new ArrayList<String>(columns.length);
var pageColumns = new ArrayList<PageColumn>(columns.length);
try {
var blocks = Arrays.stream(columns)
.peek(b -> columnNames.add(b.name))
.peek(b -> pageColumns.add(new PageColumn(b.name, DataType.fromTypeName(b.type.name()))))
.map(b -> b.builderWrapper.builder().build())
.toArray(Block[]::new);
return new Tuple<>(new Page(blocks), columnNames);
return new Tuple<>(new Page(blocks), pageColumns);
} finally {
Releasables.closeExpectNoException(columns);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,8 @@
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.elasticsearch.common.logging.LoggerMessageFormat.format;
import static org.elasticsearch.xpack.esql.CsvTestUtils.COMMA_ESCAPING_REGEX;
Expand Down Expand Up @@ -83,6 +85,22 @@ public class CsvTestsDataLoader {
.withData("sample_data_ts_nanos.csv")
.withTypeMapping(Map.of("@timestamp", "date_nanos"));
private static final TestDataset MISSING_IP_SAMPLE_DATA = new TestDataset("missing_ip_sample_data");
private static final TestDataset SAMPLE_DATA_PARTIAL_MAPPING = new TestDataset("partial_mapping_sample_data");
private static final TestDataset SAMPLE_DATA_NO_MAPPING = new TestDataset(
"no_mapping_sample_data",
"mapping-no_mapping_sample_data.json",
"partial_mapping_sample_data.csv"
).withTypeMapping(Stream.of("timestamp", "client_ip", "event_duration").collect(Collectors.toMap(k -> k, k -> "keyword")));
private static final TestDataset SAMPLE_DATA_PARTIAL_MAPPING_NO_SOURCE = new TestDataset(
"partial_mapping_no_source_sample_data",
"mapping-partial_mapping_no_source_sample_data.json",
"partial_mapping_sample_data.csv"
);
private static final TestDataset SAMPLE_DATA_PARTIAL_MAPPING_EXCLUDED_SOURCE = new TestDataset(
"partial_mapping_excluded_source_sample_data",
"mapping-partial_mapping_excluded_source_sample_data.json",
"partial_mapping_sample_data.csv"
);
private static final TestDataset CLIENT_IPS = new TestDataset("clientips");
private static final TestDataset CLIENT_IPS_LOOKUP = CLIENT_IPS.withIndex("clientips_lookup")
.withSetting("clientips_lookup-settings.json");
Expand Down Expand Up @@ -128,6 +146,10 @@ public class CsvTestsDataLoader {
Map.entry(LANGUAGES_NESTED_FIELDS.indexName, LANGUAGES_NESTED_FIELDS),
Map.entry(UL_LOGS.indexName, UL_LOGS),
Map.entry(SAMPLE_DATA.indexName, SAMPLE_DATA),
Map.entry(SAMPLE_DATA_PARTIAL_MAPPING.indexName, SAMPLE_DATA_PARTIAL_MAPPING),
Map.entry(SAMPLE_DATA_NO_MAPPING.indexName, SAMPLE_DATA_NO_MAPPING),
Map.entry(SAMPLE_DATA_PARTIAL_MAPPING_NO_SOURCE.indexName, SAMPLE_DATA_PARTIAL_MAPPING_NO_SOURCE),
Map.entry(SAMPLE_DATA_PARTIAL_MAPPING_EXCLUDED_SOURCE.indexName, SAMPLE_DATA_PARTIAL_MAPPING_EXCLUDED_SOURCE),
Map.entry(MV_SAMPLE_DATA.indexName, MV_SAMPLE_DATA),
Map.entry(ALERTS.indexName, ALERTS),
Map.entry(SAMPLE_DATA_STR.indexName, SAMPLE_DATA_STR),
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
@timestamp:date,client_ip:ip,event_duration:long,message:keyword,unmapped_message:keyword,unmapped_event_duration:keyword
2024-10-23T13:55:01.543Z,173.21.3.15,1756466,Connected to 10.1.0.1!,Disconnected from 10.1.0.1,1756468
2024-10-23T13:53:55.832Z,173.21.3.15,5033754,Connection error?,Disconnection error,5033756
2024-10-23T13:52:55.015Z,173.21.3.15,8268152,Connection error?,Disconnection error,8268154
2024-10-23T13:51:54.732Z,173.21.3.15,725447,Connection error?,Disconnection error,725449
2024-10-23T13:33:34.937Z,173.21.0.5,1232381,42,43,1232383
2024-10-23T12:27:28.948Z,173.21.2.113,2764888,Connected to 10.1.0.2!,Disconnected from 10.1.0.2,2764890
2024-10-23T12:15:03.360Z,173.21.2.162,3450232,Connected to 10.1.0.3!,Disconnected from 10.1.0.3,3450234
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
{
"dynamic": "false",
"properties": {}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
{
"dynamic": "false",
"properties": {
"@timestamp": {
"type": "date"
}
},
"_source": {
"excludes": [
"message"
]
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
{
"dynamic": "false",
"properties": {
"@timestamp": {
"type": "date"
}
},
"_source": {
"enabled": false
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"dynamic": "false",
"properties": {
"@timestamp": {
"type": "date"
},
"client_ip": {
"type": "ip"
},
"event_duration": {
"type": "long"
},
"message": {
"type": "keyword"
}
}
}
Loading