
Commit bd5268f

Merge branch 'main' into query_visit_percentage
2 parents cd47e8c + 190a3f1 commit bd5268f

12 files changed: +142 -58 lines changed

docs/changelog/133369.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 133369
+summary: Enable `date` `date_nanos` implicit casting
+area: ES|QL
+type: enhancement
+issues: []

docs/reference/query-languages/esql/esql-multi-index.md

Lines changed: 56 additions & 0 deletions
@@ -135,6 +135,62 @@ FROM events_*
 | 2023-10-23T12:27:28.948Z | 172.21.2.113 | 2764889 | Connected to 10.1.0.2 |
 | 2023-10-23T12:15:03.360Z | 172.21.2.162 | 3450233 | Connected to 10.1.0.3 |
 
+### Date and date_nanos union type [esql-multi-index-date-date-nanos-union]
+```{applies_to}
+stack: ga 9.2.0
+```
+When the type of an {{esql}} field is a *union* of `date` and `date_nanos` across different indices, {{esql}} automatically casts all values to the `date_nanos` type during query execution. This implicit casting ensures that all values are handled with nanosecond precision, regardless of their original type. As a result, users can write queries against such fields without needing to perform explicit type conversions, and the query engine will seamlessly align the types for consistent and precise results.
+
+`date_nanos` fields offer higher precision but have a narrower range of valid values compared to `date` fields. This limits their representable dates roughly from 1970 to 2262. This is because dates are stored as a `long` representing nanoseconds since the epoch. When a field is mapped as both `date` and `date_nanos` across different indices, {{esql}} defaults to the more precise `date_nanos` type. This behavior ensures that no precision is lost when querying multiple indices with differing date field types. For dates that fall outside the valid range of `date_nanos` in fields that are mapped to both `date` and `date_nanos` across different indices, {{esql}} returns null by default. However, users can explicitly cast these fields to the `date` type to obtain a valid value, with precision limited to milliseconds.
+
+For example, if the `@timestamp` field is mapped as `date` in one index and `date_nanos` in another, {{esql}} will automatically treat all `@timestamp` values as `date_nanos` during query execution. This allows users to write queries that utilize the `@timestamp` field without encountering type mismatch errors, ensuring accurate time-based operations and comparisons across the combined dataset.
+
+**index: events_date**
+
+```
+{
+  "mappings": {
+    "properties": {
+      "@timestamp": { "type": "date" },
+      "client_ip": { "type": "ip" },
+      "event_duration": { "type": "long" },
+      "message": { "type": "keyword" }
+    }
+  }
+}
+```
+
+**index: events_date_nanos**
+
+```
+{
+  "mappings": {
+    "properties": {
+      "@timestamp": { "type": "date_nanos" },
+      "client_ip": { "type": "ip" },
+      "event_duration": { "type": "long" },
+      "message": { "type": "keyword" }
+    }
+  }
+}
+```
+
+```esql
+FROM events_date*
+| EVAL date = @timestamp::date
+| KEEP @timestamp, date, client_ip, event_duration, message
+| SORT date
+```
+
+| @timestamp:date_nanos | date:date | client_ip:ip | event_duration:long | message:keyword |
+|--------------------------| --- |--------------|---------| --- |
+| null |1969-10-23T13:33:34.937Z| 172.21.0.5 | 1232382 |Disconnected|
+| 2023-10-23T12:15:03.360Z |2023-10-23T12:15:03.360Z| 172.21.2.162 | 3450233 |Connected to 10.1.0.3|
+| 2023-10-23T12:15:03.360103847Z|2023-10-23T12:15:03.360Z| 172.22.2.162 | 3450233 |Connected to 10.1.0.3|
+| 2023-10-23T12:27:28.948Z |2023-10-23T12:27:28.948Z| 172.22.2.113 | 2764889 |Connected to 10.1.0.2|
+| 2023-10-23T12:27:28.948Z |2023-10-23T12:27:28.948Z| 172.21.2.113 | 2764889 |Connected to 10.1.0.2|
+| 2023-10-23T13:33:34.937193Z |2023-10-23T13:33:34.937Z| 172.22.0.5 | 1232382 |Disconnected|
+| null |2263-10-23T13:51:54.732Z| 172.21.3.15 | 725448 |Connection error|
 
 ## Index metadata [esql-multi-index-index-metadata]
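
Note on the `date_nanos` range described in the added docs: the text says `date_nanos` values are stored as a `long` counting nanoseconds since the epoch, which is what bounds the range at roughly 1970 to 2262. The Java sketch below only illustrates that arithmetic; it is not the ES|QL conversion code. The names `DateNanosRange`, `maxDateNanos`, and `millisToNanosOrNull` are hypothetical, and treating pre-epoch values as out of range is an assumption inferred from the 1969 row returning `null` in the example table.

```java
import java.time.Instant;

public class DateNanosRange {

    // Largest instant representable as a signed 64-bit count of nanoseconds since the epoch.
    static Instant maxDateNanos() {
        return Instant.ofEpochSecond(0L, Long.MAX_VALUE);
    }

    // Hypothetical helper: widen epoch milliseconds to epoch nanoseconds,
    // returning null when the value cannot be represented.
    static Long millisToNanosOrNull(long epochMillis) {
        if (epochMillis < 0) {
            // Assumption: dates before 1970 are treated as out of range.
            return null;
        }
        try {
            return Math.multiplyExact(epochMillis, 1_000_000L);
        } catch (ArithmeticException overflow) {
            // The nanosecond count does not fit in a long (dates after April 2262).
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(maxDateNanos()); // 2262-04-11T23:47:16.854775807Z
        long inRange = Instant.parse("2023-10-23T12:15:03.360Z").toEpochMilli();
        long tooEarly = Instant.parse("1969-10-23T13:33:34.937Z").toEpochMilli();
        long tooLate = Instant.parse("2263-10-23T13:51:54.732Z").toEpochMilli();
        System.out.println(millisToNanosOrNull(inRange));
        System.out.println(millisToNanosOrNull(tooEarly)); // null
        System.out.println(millisToNanosOrNull(tooLate));  // null
    }
}
```

The two `null` cases correspond to the first and last rows of the example table, where only the explicit `@timestamp::date` cast yields a value.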

muted-tests.yml

Lines changed: 0 additions & 6 deletions
@@ -513,12 +513,6 @@ tests:
 - class: org.elasticsearch.xpack.ml.integration.InferenceIT
   method: testInferClassificationModel
   issue: https://github.com/elastic/elasticsearch/issues/133448
-- class: org.elasticsearch.xpack.logsdb.LogsIndexingIT
-  method: testRouteOnSortFields
-  issue: https://github.com/elastic/elasticsearch/issues/133993
-- class: org.elasticsearch.xpack.logsdb.LogsIndexingIT
-  method: testShrink
-  issue: https://github.com/elastic/elasticsearch/issues/133875
 - class: org.elasticsearch.xpack.esql.action.CrossClusterQueryWithPartialResultsIT
   method: testPartialResults
   issue: https://github.com/elastic/elasticsearch/issues/131481

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

Lines changed: 24 additions & 8 deletions
@@ -27,15 +27,37 @@
 import org.elasticsearch.index.mapper.Mapper;
 import org.elasticsearch.index.mapper.MapperService;
 import org.elasticsearch.index.mapper.SeqNoFieldMapper;
+import org.elasticsearch.index.mapper.TimeSeriesIdFieldMapper;
+import org.elasticsearch.index.mapper.TimeSeriesRoutingHashFieldMapper;
 import org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper;
 
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+
 /**
  * Class that encapsulates the logic of figuring out the most appropriate file format for a given field, across postings, doc values and
  * vectors.
  */
 public class PerFieldFormatSupplier {
 
-    private static final FeatureFlag SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT = new FeatureFlag("seqno_field_use_tsdb_doc_values_format");
+    static final FeatureFlag SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT = new FeatureFlag("seqno_field_use_tsdb_doc_values_format");
+    private static final Set<String> INCLUDE_META_FIELDS;
+
+    static {
+        // TODO: should we just allow all fields to use tsdb doc values codec?
+        // Avoid using tsdb codec for fields like _seq_no, _primary_term.
+        // But _tsid and _ts_routing_hash should always use the tsdb codec.
+        Set<String> includeMetaField = new HashSet<>(3);
+        includeMetaField.add(TimeSeriesIdFieldMapper.NAME);
+        includeMetaField.add(TimeSeriesRoutingHashFieldMapper.NAME);
+        if (SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT.isEnabled()) {
+            includeMetaField.add(SeqNoFieldMapper.NAME);
+        }
+        // Don't the include _recovery_source_size and _recovery_source fields, since their values can be trimmed away in
+        // RecoverySourcePruneMergePolicy, which leads to inconsistencies between merge stats and actual values.
+        INCLUDE_META_FIELDS = Collections.unmodifiableSet(includeMetaField);
+    }
 
     private static final DocValuesFormat docValuesFormat = new Lucene90DocValuesFormat();
     private static final KnnVectorsFormat knnVectorsFormat = new Lucene99HnswVectorsFormat();
@@ -126,13 +148,7 @@ boolean useTSDBDocValuesFormat(final String field) {
     }
 
     private boolean excludeFields(String fieldName) {
-        // TODO: should we just allow all fields to use tsdb doc values codec?
-        // Avoid using tsdb codec for fields like _seq_no, _primary_term.
-        // But _tsid and _ts_routing_hash should always use the tsdb codec.
-        return fieldName.startsWith("_")
-            && fieldName.equals("_tsid") == false
-            && fieldName.equals("_ts_routing_hash") == false
-            && (SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT.isEnabled() && fieldName.equals(SeqNoFieldMapper.NAME) == false);
+        return fieldName.startsWith("_") && INCLUDE_META_FIELDS.contains(fieldName) == false;
     }
 
     private boolean isTimeSeriesModeIndex() {
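
Note on the `excludeFields` change: one way to read this hunk is that the removed predicate excluded nothing when the feature flag was off, because its last conjunct was `flag && ...`. The standalone Java sketch below reproduces both predicates with the flag passed in as a plain boolean so the difference can be checked directly. It is an illustration under that reading, not the author's stated rationale; the class `MetaFieldExclusion` and its method names are hypothetical, and the field names are written as string literals in place of the mapper `NAME` constants.

```java
import java.util.HashSet;
import java.util.Set;

public class MetaFieldExclusion {

    // The removed predicate, with the feature flag as a parameter.
    static boolean excludeOld(String fieldName, boolean seqNoFlagEnabled) {
        return fieldName.startsWith("_")
            && fieldName.equals("_tsid") == false
            && fieldName.equals("_ts_routing_hash") == false
            && (seqNoFlagEnabled && fieldName.equals("_seq_no") == false);
    }

    // The added logic: exclude every "_"-prefixed field that is not on the include list.
    static boolean excludeNew(String fieldName, boolean seqNoFlagEnabled) {
        Set<String> include = new HashSet<>();
        include.add("_tsid");
        include.add("_ts_routing_hash");
        if (seqNoFlagEnabled) {
            include.add("_seq_no");
        }
        return fieldName.startsWith("_") && include.contains(fieldName) == false;
    }

    public static void main(String[] args) {
        // Flag off: the old predicate short-circuits to false, so nothing is excluded.
        System.out.println(excludeOld("_recovery_source", false)); // false
        System.out.println(excludeNew("_recovery_source", false)); // true
        // Flag on: both versions let _seq_no through and exclude _recovery_source.
        System.out.println(excludeOld("_seq_no", true)); // false
        System.out.println(excludeNew("_seq_no", true)); // false
        System.out.println(excludeNew("_recovery_source", true)); // true
    }
}
```

The `testMetaFields`, `testSeqnoField`, and `testSeqnoFieldFeatureFlagDisabled` tests added in this commit assert the same outcomes against the real `PerFieldFormatSupplier`.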

server/src/test/java/org/elasticsearch/index/codec/PerFieldMapperCodecTests.java

Lines changed: 31 additions & 0 deletions
@@ -21,6 +21,10 @@
 import org.elasticsearch.index.codec.bloomfilter.ES87BloomFilterPostingsFormat;
 import org.elasticsearch.index.codec.postings.ES812PostingsFormat;
 import org.elasticsearch.index.mapper.MapperService;
+import org.elasticsearch.index.mapper.SeqNoFieldMapper;
+import org.elasticsearch.index.mapper.SourceFieldMapper;
+import org.elasticsearch.index.mapper.TimeSeriesIdFieldMapper;
+import org.elasticsearch.index.mapper.TimeSeriesRoutingHashFieldMapper;
 import org.elasticsearch.test.ESTestCase;
 
 import java.io.IOException;
@@ -201,6 +205,33 @@ public void testLogsIndexMode() throws IOException {
         assertThat((perFieldMapperCodec.useTSDBDocValuesFormat("response_size")), is(true));
     }
 
+    public void testMetaFields() throws IOException {
+        PerFieldFormatSupplier perFieldMapperCodec = createFormatSupplier(true, IndexMode.LOGSDB, MAPPING_3);
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(TimeSeriesIdFieldMapper.NAME)), is(true));
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(TimeSeriesRoutingHashFieldMapper.NAME)), is(true));
+        // See: PerFieldFormatSupplier why these fields shouldn't use tsdb codec
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(SourceFieldMapper.RECOVERY_SOURCE_NAME)), is(false));
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(SourceFieldMapper.RECOVERY_SOURCE_SIZE_NAME)), is(false));
+    }
+
+    public void testSeqnoField() throws IOException {
+        assumeTrue(
+            "seqno_field_use_tsdb_doc_values_format should be enabled",
+            PerFieldFormatSupplier.SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT.isEnabled()
+        );
+        PerFieldFormatSupplier perFieldMapperCodec = createFormatSupplier(true, IndexMode.LOGSDB, MAPPING_3);
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(SeqNoFieldMapper.NAME)), is(true));
+    }
+
+    public void testSeqnoFieldFeatureFlagDisabled() throws IOException {
+        assumeTrue(
+            "seqno_field_use_tsdb_doc_values_format should be disabled",
+            PerFieldFormatSupplier.SEQNO_FIELD_USE_TSDB_DOC_VALUES_FORMAT.isEnabled() == false
+        );
+        PerFieldFormatSupplier perFieldMapperCodec = createFormatSupplier(true, IndexMode.LOGSDB, MAPPING_3);
+        assertThat((perFieldMapperCodec.useTSDBDocValuesFormat(SeqNoFieldMapper.NAME)), is(false));
+    }
+
     private PerFieldFormatSupplier createFormatSupplier(boolean enableES87TSDBCodec, IndexMode mode, String mapping) throws IOException {
         Settings.Builder settings = Settings.builder();
         settings.put(IndexSettings.MODE.getKey(), mode);

x-pack/plugin/esql/qa/server/single-node/src/javaRestTest/java/org/elasticsearch/xpack/esql/qa/single_node/RestEsqlIT.java

Lines changed: 1 addition & 4 deletions
@@ -58,7 +58,6 @@
 import static org.elasticsearch.test.ListMatcher.matchesList;
 import static org.elasticsearch.test.MapMatcher.assertMap;
 import static org.elasticsearch.test.MapMatcher.matchesMap;
-import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.IMPLICIT_CASTING_DATE_AND_DATE_NANOS;
 import static org.elasticsearch.xpack.esql.core.type.DataType.isMillisOrNanos;
 import static org.elasticsearch.xpack.esql.qa.rest.RestEsqlTestCase.Mode.SYNC;
 import static org.elasticsearch.xpack.esql.tools.ProfileParser.parseProfile;
@@ -715,9 +714,7 @@ public void testSuggestedCast() throws IOException {
                 Map<String, Object> results = entityAsMap(resp);
                 List<?> columns = (List<?>) results.get("columns");
                 DataType suggestedCast = DataType.suggestedCast(Set.of(listOfTypes.get(i), listOfTypes.get(j)));
-                if (IMPLICIT_CASTING_DATE_AND_DATE_NANOS.isEnabled()
-                    && isMillisOrNanos(listOfTypes.get(i))
-                    && isMillisOrNanos(listOfTypes.get(j))) {
+                if (isMillisOrNanos(listOfTypes.get(i)) && isMillisOrNanos(listOfTypes.get(j))) {
                     // datetime and date_nanos are casted to date_nanos implicitly
                     assertThat(columns, equalTo(List.of(Map.ofEntries(Map.entry("name", "my_field"), Map.entry("type", "date_nanos")))));
                 } else {

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

Lines changed: 1 addition & 1 deletion
@@ -353,7 +353,7 @@ public enum Cap {
         /**
          * Support implicit casting for union typed fields that are mixed with date and date_nanos type.
          */
-        IMPLICIT_CASTING_DATE_AND_DATE_NANOS(Build.current().isSnapshot()),
+        IMPLICIT_CASTING_DATE_AND_DATE_NANOS,
 
         /**
          * Support for named or positional parameters in EsqlQueryRequest.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

Lines changed: 23 additions & 34 deletions
@@ -149,7 +149,6 @@
 import static java.util.Collections.emptyList;
 import static java.util.Collections.singletonList;
 import static org.elasticsearch.xpack.core.enrich.EnrichPolicy.GEO_MATCH_TYPE;
-import static org.elasticsearch.xpack.esql.action.EsqlCapabilities.Cap.IMPLICIT_CASTING_DATE_AND_DATE_NANOS;
 import static org.elasticsearch.xpack.esql.core.type.DataType.AGGREGATE_METRIC_DOUBLE;
 import static org.elasticsearch.xpack.esql.core.type.DataType.BOOLEAN;
 import static org.elasticsearch.xpack.esql.core.type.DataType.DATETIME;
@@ -192,7 +191,7 @@ public class Analyzer extends ParameterizedRuleExecutor<LogicalPlan, AnalyzerCon
             new ResolveLookupTables(),
             new ResolveFunctions(),
             new ResolveInference(),
-            new DateMillisToNanosInEsRelation(IMPLICIT_CASTING_DATE_AND_DATE_NANOS.isEnabled())
+            new DateMillisToNanosInEsRelation()
         ),
         new Batch<>(
             "Resolution",
@@ -1975,42 +1974,32 @@ private static LogicalPlan planWithoutSyntheticAttributes(LogicalPlan plan) {
      */
     private static class DateMillisToNanosInEsRelation extends Rule<LogicalPlan, LogicalPlan> {
 
-        private final boolean isSnapshot;
-
-        DateMillisToNanosInEsRelation(boolean isSnapshot) {
-            this.isSnapshot = isSnapshot;
-        }
-
         @Override
         public LogicalPlan apply(LogicalPlan plan) {
-            if (isSnapshot) {
-                return plan.transformUp(EsRelation.class, relation -> {
-                    if (relation.indexMode() == IndexMode.LOOKUP) {
-                        return relation;
+            return plan.transformUp(EsRelation.class, relation -> {
+                if (relation.indexMode() == IndexMode.LOOKUP) {
+                    return relation;
+                }
+                return relation.transformExpressionsUp(FieldAttribute.class, f -> {
+                    if (f.field() instanceof InvalidMappedField imf && imf.types().stream().allMatch(DataType::isDate)) {
+                        HashMap<ResolveUnionTypes.TypeResolutionKey, Expression> typeResolutions = new HashMap<>();
+                        var convert = new ToDateNanos(f.source(), f);
+                        imf.types().forEach(type -> typeResolutions(f, convert, type, imf, typeResolutions));
+                        var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions);
+                        return new FieldAttribute(
+                            f.source(),
+                            f.parentName(),
+                            f.qualifier(),
+                            f.name(),
+                            resolvedField,
+                            f.nullable(),
+                            f.id(),
+                            f.synthetic()
+                        );
                     }
-                    return relation.transformExpressionsUp(FieldAttribute.class, f -> {
-                        if (f.field() instanceof InvalidMappedField imf && imf.types().stream().allMatch(DataType::isDate)) {
-                            HashMap<ResolveUnionTypes.TypeResolutionKey, Expression> typeResolutions = new HashMap<>();
-                            var convert = new ToDateNanos(f.source(), f);
-                            imf.types().forEach(type -> typeResolutions(f, convert, type, imf, typeResolutions));
-                            var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions);
-                            return new FieldAttribute(
-                                f.source(),
-                                f.parentName(),
-                                f.qualifier(),
-                                f.name(),
-                                resolvedField,
-                                f.nullable(),
-                                f.id(),
-                                f.synthetic()
-                            );
-                        }
-                        return f;
-                    });
+                    return f;
                 });
-            } else {
-                return plan;
-            }
+            });
         }
     }

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java

Lines changed: 0 additions & 1 deletion
@@ -4141,7 +4141,6 @@ public void testBucketWithIntervalInStringInGroupingReferencedInAggregation() {
     }
 
     public void testImplicitCastingForDateAndDateNanosFields() {
-        assumeTrue("requires snapshot", EsqlCapabilities.Cap.IMPLICIT_CASTING_DATE_AND_DATE_NANOS.isEnabled());
         IndexResolution indexWithUnionTypedFields = indexWithDateDateNanosUnionType();
         Analyzer analyzer = AnalyzerTestUtils.analyzer(indexWithUnionTypedFields);
41474146

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java

Lines changed: 0 additions & 1 deletion
@@ -2422,7 +2422,6 @@ public void testMatchFunctionStatisWithNonPushableCondition() {
     }
 
     public void testToDateNanosPushDown() {
-        assumeTrue("requires snapshot", EsqlCapabilities.Cap.IMPLICIT_CASTING_DATE_AND_DATE_NANOS.isEnabled());
         IndexResolution indexWithUnionTypedFields = indexWithDateDateNanosUnionType();
         plannerOptimizerDateDateNanosUnionTypes = new TestPlannerOptimizer(EsqlTestUtils.TEST_CFG, makeAnalyzer(indexWithUnionTypedFields));
         var stats = EsqlTestUtils.statsForExistingField("date_and_date_nanos", "date_and_date_nanos_and_long");
