Enable the use of nested field type with index.mode=time_series #122224

jordan-powers · 2025-02-11T01:57:31Z

This patch removes the check that fails requests that attempt to use fields of type: nested within indices with mode time_series.

This patch also updates TimeSeriesIdFieldMapper#postParse to set the _id field on child documents once it's calculated.

Closes #120874
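
A minimal sketch of the second change, assuming a much simplified model of the parse flow. `Doc`, `ParseContext`, and `createNested` here are hypothetical stand-ins, not the real Elasticsearch classes. It only illustrates why the `_id` has to be copied to child documents at the end of parsing when it is derived late:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the parse flow; none of these types are the real Elasticsearch classes. */
public class DeferredIdSketch {
    /** Hypothetical minimal document: it only tracks the _id value it has been given. */
    static final class Doc {
        String id; // null until assigned
    }

    /** Hypothetical parse context holding the root document and its nested children. */
    static final class ParseContext {
        final Doc root = new Doc();
        final List<Doc> nonRootDocs = new ArrayList<>();

        Doc createNested() {
            Doc child = new Doc();
            // In this model the root _id is not known yet, so this copies null.
            child.id = root.id;
            nonRootDocs.add(child);
            return child;
        }
    }

    /** Mirrors the idea of postParse: compute the _id last, then copy it to every child. */
    static void postParse(ParseContext context, String computedId) {
        context.root.id = computedId;
        for (Doc doc : context.nonRootDocs) {
            doc.id = computedId;
        }
    }

    public static void main(String[] args) {
        ParseContext context = new ParseContext();
        Doc child = context.createNested();
        System.out.println("before postParse: " + child.id);
        postParse(context, "id-derived-from-tsid");
        System.out.println("after postParse: " + child.id);
    }
}
```

In this model the child has no `_id` until `postParse` runs, which is the gap the patch description refers to.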

elasticsearchmachine · 2025-02-11T01:57:56Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine · 2025-02-11T01:57:57Z

Hi @jordan-powers, I've created a changelog YAML for you.

martijnvg

Looks good @jordan-powers. I didn't expect that a change in DocumentParserContext is needed, but I understand why it is needed.

The responsibility of adding the _id field to nested documents is now in another place, which isn't ideal. This is why I added two comments about asserts.

martijnvg · 2025-02-11T10:23:21Z

server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java

-            // NOTE: we don't support nested fields in tsdb so it's safe to assume the standard id mapper.
            doc.add(new StringField(IdFieldMapper.NAME, idField.binaryValue(), Field.Store.NO));
+        } else if (indexSettings().getMode() == IndexMode.TIME_SERIES) {
+            // For time series indices, the _id is generated from the _tsid, which in turn is generated from the values of the configured


Maybe we should add an assert that getRoutingFields() doesn't return a reference to RoutingFields.Noop#INSTANCE? Just to make sure we are able to collect dimension values in order to generate _tsid / _id at a later stage?

martijnvg · 2025-02-11T10:24:38Z

server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java

+        // for time-series indices the _id isn't available at that point.
+        assert context.id() != null;
+        for (LuceneDocument doc : context.nonRootDocuments()) {
+            doc.add(new StringField(IdFieldMapper.NAME, Uid.encodeId(context.id()), Field.Store.NO));


Maybe also assert that _id field hasn't been added yet to non root documents?

Is it possible to have nested within nested? If it's problematic for TSDB, we can throw.

++, i wonder what happens with multi-nested documents. I believe you may want to check the parent of the document here because they can differ.

Looking at DocumentParserContext#createNestedContext, it seems that the child document's _id always inherits the parent's _id, which eventually inherits from the root document's _id. So even in multi-level nested documents, the _id is the same root-level _id.

    IndexableField idField = doc.getParent().getField(IdFieldMapper.NAME);
    if (idField != null) {
        doc.add(new StringField(IdFieldMapper.NAME, idField.binaryValue(), Field.Store.NO));
    }

Let's test it :)
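
The inheritance argument above can be sketched as follows. This is an illustration under assumptions, not the real `DocumentParserContext#createNestedContext`: `Doc` and `createNested` are hypothetical, and the only behavior modeled is "a child copies its parent's `_id` if the parent has one".

```java
/** Toy model of _id inheritance across nesting levels; the types are hypothetical. */
public class NestedIdInheritanceSketch {
    static final class Doc {
        final Doc parent; // null for the root document
        final String id;  // may be null if the parent had no _id yet

        Doc(Doc parent, String id) {
            this.parent = parent;
            this.id = id;
        }
    }

    /** A child takes whatever _id its direct parent carries. */
    static Doc createNested(Doc parent) {
        return new Doc(parent, parent.id);
    }

    public static void main(String[] args) {
        Doc root = new Doc(null, "root-id");
        Doc level1 = createNested(root);
        Doc level2 = createNested(level1);
        // Each level copies from its direct parent, so the value traces back to the root.
        System.out.println(level1.id.equals(root.id) && level2.id.equals(root.id));
    }
}
```

Because each level copies from its direct parent, every document in the chain ends up with the root's `_id`, which is why assigning the root `_id` to all non-root documents gives the same result for multi-level nesting in this model.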

martijnvg · 2025-02-11T10:41:46Z

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/tsdb/160_nested_fields.yml

+        body:
+          size: 0
+          query:
+            bool:


I think nested query is required if you intend to query at the courses level.

Oops, you're totally right

martijnvg · 2025-02-11T10:42:00Z

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/tsdb/160_nested_fields.yml

+                    courses.credits: 3
+
+  - match:
+     hits.total.value: 0


Maybe also do a search that returns a hit?

lkts

This is likely documented somewhere and that documentation needs to be adjusted. I think we are in the middle of documentation migration though so let's create a task for that.

lkts · 2025-02-11T17:35:24Z

server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java

+        // for time-series indices the _id isn't available at that point.
+        assert context.id() != null;
+        for (LuceneDocument doc : context.nonRootDocuments()) {
+            doc.add(new StringField(IdFieldMapper.NAME, Uid.encodeId(context.id()), Field.Store.NO));


++, i wonder what happens with multi-nested documents. I believe you may want to check the parent of the document here because they can differ.

Also, add a test that actually returns a document.

lkts · 2025-02-12T21:55:43Z

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/tsdb/20_mapping.yml

                        time_series_dimension: true

 ---
 nested fields:


do we need this test? i think it repeats tests above

I don't think it repeats any tests above since this is the only test in this file with a nested non-time_series_dimension field. But it is definitely redundant with the tests I added in 160_nested_fields.yml, so I'll take it out.

I meant above in the PR, sorry.

lkts · 2025-02-12T22:01:24Z

server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java

+        // We need to add the uid or id to nested Lucene documents so that when a document gets deleted, the nested documents are
+        // also deleted. Usually this happens when the nested document is created (in DocumentParserContext#createNestedContext), but
+        // for time-series indices the _id isn't available at that point.
+        var binaryId = context.doc().getField(IdFieldMapper.NAME).binaryValue();


getField is kind of expensive since it iterates over all fields. Let's do this only when there are non root documents. Or maybe we can return the id from TsidExtractingIdFieldMapper above.
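
A rough sketch of the first suggestion (only look the field up when there are non-root documents). `Doc`, `Field`, and `copyIdToChildren` are hypothetical names for illustration, and the `lookups` counter exists only to make the saved scan visible. The second suggestion, returning the id from the id mapper directly, is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates guarding a linear field lookup; these are not the real Lucene/Elasticsearch types. */
public class GuardedLookupSketch {
    static final class Field {
        final String name;
        final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    static final class Doc {
        final List<Field> fields = new ArrayList<>();
        int lookups = 0; // counts linear scans, for demonstration only

        /** Linear scan over all fields, which is what makes the call relatively costly. */
        Field getField(String name) {
            lookups++;
            for (Field field : fields) {
                if (field.name.equals(name)) {
                    return field;
                }
            }
            return null;
        }
    }

    static void copyIdToChildren(Doc root, List<Doc> nonRootDocs) {
        if (nonRootDocs.isEmpty()) {
            return; // common case: no nested documents, so skip the scan entirely
        }
        Field id = root.getField("_id");
        if (id == null) {
            throw new IllegalStateException("root document has no _id");
        }
        for (Doc child : nonRootDocs) {
            child.fields.add(new Field("_id", id.value));
        }
    }

    public static void main(String[] args) {
        Doc root = new Doc();
        root.fields.add(new Field("_id", "abc"));
        copyIdToChildren(root, new ArrayList<>());
        System.out.println("lookups without children: " + root.lookups);
    }
}
```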

elasticsearchmachine · 2025-02-13T17:34:27Z

💔 Backport failed

Status	Branch	Result
❌	8.x	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 122224

jordan-powers · 2025-02-13T17:42:36Z

💚 All backports created successfully

Status	Branch	Result
✅	8.x

Questions ?

Please refer to the Backport tool documentation

) (#122520)

This patch removes the check that fails requests that attempt to use fields of type: nested within indices with mode time_series.

This patch also updates TimeSeriesIdFieldMapper#postParse to set the _id field on child documents once it's calculated.

Closes #120874

(cherry picked from commit 5315088)

# Conflicts:
# rest-api-spec/build.gradle

NatElkins · 2025-03-31T06:22:21Z

Any idea when this enhancement will be usable for Elasticsearch Cloud customers?

jordan-powers added 2 commits February 10, 2025 17:37

Add REST test with nested field in time-series index

59b2d57

Enable nested fields in time-series mode indices

412f12e

jordan-powers added the >enhancement, auto-backport (Automatically create backport pull requests when merged), :StorageEngine/Mapping (The storage related side of mappings), v8.19.0, and v9.1.0 labels Feb 11, 2025

jordan-powers self-assigned this Feb 11, 2025

elasticsearchmachine added the Team:StorageEngine label Feb 11, 2025

Update docs/changelog/122224.yaml

c6097c7

Merge remote-tracking branch 'upstream/main' into fix_120874

3c669c0

martijnvg reviewed Feb 11, 2025

lkts reviewed Feb 11, 2025

jordan-powers added 10 commits February 11, 2025 10:42

Use nested query in nested field test

ed6f965

Also, add a test that actually returns a document.

Iter

40334db

Add test for tsdb with multi-level nested fields

42a20f9

Update tests to support nested fields

ab7935d

Merge remote-tracking branch 'upstream/main' into fix_120874

22c9ea3

Add cluster feature

d7acbe0

Add cluster feature to 20_mapping tests

117217c

Merge remote-tracking branch 'upstream/main' into fix_120874

e181096

Fix yaml=tsdb/20_mapping/nested dimensions

ff1a4ea

Merge remote-tracking branch 'upstream/main' into fix_120874

6907664

lkts approved these changes Feb 12, 2025

jordan-powers added 4 commits February 12, 2025 15:13

Avoid use of LuceneDocument::getField

07b7a57

Remove redundant test {tsdb/20_mapping/nested fields}

84bd7dd

Merge remote-tracking branch 'upstream/main' into fix_120874

b5f8642

Merge remote-tracking branch 'upstream/main' into fix_120874

138759a

jordan-powers added 2 commits February 13, 2025 08:18

Mute tsdb/20_mapping/nested fields

905534f

Merge remote-tracking branch 'upstream/main' into fix_120874

7b4a860

jordan-powers merged commit 5315088 into elastic:main Feb 13, 2025
17 checks passed

elasticsearchmachine added the backport pending label Feb 13, 2025

jordan-powers mentioned this pull request Feb 13, 2025

[8.x] Enable the use of nested field type with index.mode=time_series (#122224) #122520

Merged

jordan-powers deleted the fix_120874 branch February 13, 2025 17:44

6 participants