Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
59b2d57
Add REST test with nested field in time-series index
jordan-powers Feb 11, 2025
412f12e
Enable nested fields in time-series mode indices
jordan-powers Feb 11, 2025
c6097c7
Update docs/changelog/122224.yaml
jordan-powers Feb 11, 2025
3c669c0
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 11, 2025
ed6f965
Use nested query in nested field test
jordan-powers Feb 11, 2025
40334db
Iter
jordan-powers Feb 11, 2025
42a20f9
Add test for tsdb with multi-level nested fields
jordan-powers Feb 11, 2025
ab7935d
Update tests to support nested fields
jordan-powers Feb 11, 2025
22c9ea3
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 11, 2025
d7acbe0
Add cluster feature
jordan-powers Feb 11, 2025
117217c
Add cluster feature to 20_mapping tests
jordan-powers Feb 11, 2025
e181096
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 11, 2025
ff1a4ea
Fix yaml=tsdb/20_mapping/nested dimensions
jordan-powers Feb 11, 2025
6907664
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 11, 2025
07b7a57
Avoid use of LuceneDocument::getField
jordan-powers Feb 12, 2025
84bd7dd
Remove redundant test {tsdb/20_mapping/nested fields}
jordan-powers Feb 12, 2025
b5f8642
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 12, 2025
138759a
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 13, 2025
905534f
Mute tsdb/20_mapping/nested fields
jordan-powers Feb 13, 2025
7b4a860
Merge remote-tracking branch 'upstream/main' into fix_120874
jordan-powers Feb 13, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions docs/changelog/122224.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 122224
summary: Enable the use of nested field type with index.mode=time_series
area: Mapping
type: enhancement
issues:
- 120874
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
---
"Create TSDB index with field of nested type":
- do:
indices.create:
index: test
body:
settings:
index:
mode: time_series
number_of_replicas: 1
number_of_shards: 1
routing_path: [department]
time_series:
start_time: 2021-04-28T00:00:00Z
end_time: 2021-04-29T00:00:00Z
mappings:
properties:
"@timestamp":
type: date
department:
type: keyword
time_series_dimension: true
staff:
type: integer
courses:
type: nested
properties:
name:
type: keyword
credits:
type: integer

- do:
index:
index: test
body: { "@timestamp": "2021-04-28T01:00:00Z", "department": "compsci", "staff": 12, "courses": [ { "name": "Object Oriented Programming", "credits": 3 }, { "name": "Theory of Computation", "credits": 4, } ] }

- do:
index:
index: test
body: { "@timestamp": "2021-04-28T02:00:00Z", "department": "math", "staff": 20, "courses": [ { "name": "Precalculus", "credits": 1 }, { "name": "Linear Algebra", "credits": 3 } ] }

- do:
indices.refresh:
index: [ test ]

- do:
search:
index: test
body:
size: 0
query:
bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think nested query is required if you intend to query at the courses level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're totally right

must:
- term:
courses.name: Precalculus
- term:
courses.credits: 3

- match:
hits.total.value: 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also do search that returns a hit?

4 changes: 0 additions & 4 deletions server/src/main/java/org/elasticsearch/index/IndexMode.java
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@
import org.elasticsearch.index.mapper.MapperService;
import org.elasticsearch.index.mapper.MappingLookup;
import org.elasticsearch.index.mapper.MetadataFieldMapper;
import org.elasticsearch.index.mapper.NestedLookup;
import org.elasticsearch.index.mapper.ProvidedIdFieldMapper;
import org.elasticsearch.index.mapper.RoutingFieldMapper;
import org.elasticsearch.index.mapper.RoutingFields;
Expand Down Expand Up @@ -156,9 +155,6 @@ private static String error(Setting<?> unsupported) {

@Override
public void validateMapping(MappingLookup lookup) {
if (lookup.nestedLookup() != NestedLookup.EMPTY) {
throw new IllegalArgumentException("cannot have nested fields when index is in " + tsdbMode());
}
if (((RoutingFieldMapper) lookup.getMapper(RoutingFieldMapper.NAME)).required()) {
throw new IllegalArgumentException(routingRequiredBad());
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -665,8 +665,13 @@ public final DocumentParserContext createNestedContext(NestedObjectMapper nested
if (idField != null) {
// We just need to store the id as indexed field, so that IndexWriter#deleteDocuments(term) can then
// delete it when the root document is deleted too.
// NOTE: we don't support nested fields in tsdb so it's safe to assume the standard id mapper.
doc.add(new StringField(IdFieldMapper.NAME, idField.binaryValue(), Field.Store.NO));
} else if (indexSettings().getMode() == IndexMode.TIME_SERIES) {
// For time series indices, the _id is generated from the _tsid, which in turn is generated from the values of the configured
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add an assert that getRoutingFields() doesn't return a reference to RoutingFields.Noop#INSTANCE? Just to make sure we are able to collect dimension values in order to generate _tsid / _id at a later stage?

// routing fields. At this point in document parsing, we can't guarantee that we've parsed all the routing fields yet, so the
// parent document's _id is not yet available.
// So we just add the child document without the parent _id, then in TimeSeriesIdFieldMapper#postParse we set the _id on all
// child documents once we've calculated it.
} else {
throw new IllegalStateException("The root document of a nested document should have an _id field");
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@

package org.elasticsearch.index.mapper;

import org.apache.lucene.document.Field;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.common.Strings;
Expand Down Expand Up @@ -142,6 +144,14 @@ public void postParse(DocumentParserContext context) throws IOException {
: null,
timeSeriesId
);

// We need to add the uid or id to nested Lucene documents so that when a document gets deleted, the nested documents are
// also deleted. Usually this happens when the nested document is created (in DocumentParser#createNestedContext), but
// for time-series indices the _id isn't available at that point.
assert context.id() != null;
for (LuceneDocument doc : context.nonRootDocuments()) {
doc.add(new StringField(IdFieldMapper.NAME, Uid.encodeId(context.id()), Field.Store.NO));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also assert that _id field hasn't been added yet to non root documents?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have nested within nested? If it's problematic for TSDB, we can throw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, i wonder what happens with multi-nested documents. I believe you may want to check the parent of the document here because they can differ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at DocumentParserContext#createNestedContext, it seems that the child document's _id always inherits the parent's _id, which eventually inherits from the root document's _id. So even in multi-level nested documents, the _id is the same root-level _id.

IndexableField idField = doc.getParent().getField(IdFieldMapper.NAME);
if (idField != null) {
    doc.add(new StringField(IdFieldMapper.NAME, idField.binaryValue(), Field.Store.NO));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test it :)

}
}

private IndexVersion getIndexVersionCreated(final DocumentParserContext context) {
Expand Down