
Add sub-field support to flattened field type #144451

Merged
parkertimmins merged 44 commits into elastic:main from parkertimmins:parker/flattened-sub-fields
Mar 26, 2026

Contributor

@parkertimmins parkertimmins commented Mar 17, 2026

The flattened field type indexes all leaf values as untyped keywords, preventing type-aware operations (range queries, date math, numeric aggregations) on individual keys. Users needing typed behavior must switch the entire object to the object type.

A new optional properties parameter lets users declare specific paths with real leaf field types while leaving the rest on the default untyped flattened path.

  {
    "labels": {
      "type": "flattened",
      "properties": {
        "host.name": { "type": "keyword" },
        "status_code": { "type": "long" }
      }
    }
  }

At index time, values matching a mapped property are delegated to that sub-field's mapper and excluded from the root/keyed flattened fields. Unmapped keys continue to behave as normal flattened keywords.

Allowed sub-field types: keyword, constant_keyword, wildcard, text, long, integer, short, byte, double, float, half_float, scaled_float, unsigned_long, date, date_nanos, boolean, ip.

Supported operations: typed search, sort (including index sort), aggregations, ESQL block loading, and synthetic _source.

Restrictions: copy_to and fields (multi-fields) are disallowed on mapped properties. Only leaf types from the allow-list are permitted.
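The index-time split described above can be pictured with a small, self-contained model. This is only a sketch under assumptions: `FlattenedRoutingSketch`, `Routed`, and `route` are hypothetical names invented for illustration and are not part of Elasticsearch, and the `key=value` string stands in for whatever encoding the keyed flattened field really uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of index-time routing; not the Elasticsearch implementation. */
public class FlattenedRoutingSketch {

    /** Holds the two destinations a leaf value can be sent to. */
    public static final class Routed {
        /** Values handed to a typed sub-field mapper, keyed by path. */
        public final Map<String, Object> typed = new LinkedHashMap<>();
        /** Entries kept on the untyped keyed flattened field (illustrative encoding). */
        public final List<String> keyed = new ArrayList<>();
    }

    /** Splits leaf values by whether their path was declared under "properties". */
    public static Routed route(Map<String, Object> leaves, Set<String> mappedPaths) {
        Routed out = new Routed();
        for (Map.Entry<String, Object> leaf : leaves.entrySet()) {
            if (mappedPaths.contains(leaf.getKey())) {
                // Mapped path: goes to the sub-field only, never to the flattened fields.
                out.typed.put(leaf.getKey(), leaf.getValue());
            } else {
                // Unmapped path: stays an untyped keyword on the flattened field.
                out.keyed.add(leaf.getKey() + "=" + leaf.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> leaves = new LinkedHashMap<>();
        leaves.put("host.name", "web-1");
        leaves.put("status_code", 503L);
        leaves.put("region", "eu-west");
        Routed r = route(leaves, Set.of("host.name", "status_code"));
        System.out.println("typed=" + r.typed + " keyed=" + r.keyed);
    }
}
```

In this model a value lands in exactly one place, which mirrors the statement that mapped keys are excluded from the root/keyed representation.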

Made-with: Cursor

Allow specific keys within a flattened field to be mapped as
typed sub-fields (keyword, ip, etc.) via a new "properties"
mapping attribute. Mapped keys are indexed exclusively through
their sub-field mapper and excluded from the flattened field's
root/keyed representation.

Made-with: Cursor
Contributor

github-actions bot commented Mar 17, 2026

🔍 Preview links for changed docs

@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.


When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level


Contributor

github-actions bot commented Mar 17, 2026

Vale Linting Results

Summary: 1 suggestion found

💡 Suggestions (1)
File: docs/reference/elasticsearch/mapping-reference/flattened.md
Line: 255
Rule: Elastic.WordChoice
Message: Consider using 'can, might' instead of 'may', unless the term is in the UI.

The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

Tighten field count assertions to exact values, add test for
mapped properties matched via nested object notation, simplify
serialization roundtrip test, and fix minor doc formatting.

Made-with: Cursor
Made-with: Cursor

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldMapper.java
#	server/src/main/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldParser.java
#	server/src/test/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldParserTests.java
Add search tests covering term queries, sorting, aggregations,
doc value field loading, and numeric range queries on mapped
sub-fields within flattened fields.

Made-with: Cursor
Extend FlattenedDocValuesSyntheticFieldLoader to compose mapped
property loaders alongside the flattened field's own keyed doc
values, so mapped properties are included in synthetic source.

Made-with: Cursor
Register mapper.flattened.mapped_properties cluster feature and
add REST tests covering queries, sorting, aggregations, synthetic
source, and mapping serialization for mapped properties.

Made-with: Cursor
Reset docValues to NO_VALUES in the binary doc values branch of
FlattenedDocValuesSyntheticFieldLoader when no binary DVs exist
for a segment, preventing stale state from a prior segment.

Forward null tokens to mapped property mappers in
FlattenedFieldParser so sub-field null_value handling works
independently of the parent flattened field's null_value.

Made-with: Cursor
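The stale-state fix mentioned in the commit above follows a common pattern in per-segment readers. The sketch below is a generic illustration, assuming a loader that is re-pointed at each segment in turn. `SegmentStateSketch`, `moveToSegment`, and the `resetWhenMissing` flag are hypothetical and exist only to contrast the two behaviors; they are not taken from FlattenedDocValuesSyntheticFieldLoader.

```java
import java.util.List;

/** Hypothetical per-segment loader showing why a reset is needed; not Elasticsearch code. */
public class SegmentStateSketch {

    /** Shared empty marker, playing the role of NO_VALUES in the commit message. */
    public static final List<String> NO_VALUES = List.of();

    private List<String> docValues = NO_VALUES;
    private final boolean resetWhenMissing;

    public SegmentStateSketch(boolean resetWhenMissing) {
        this.resetWhenMissing = resetWhenMissing;
    }

    /** Called once per segment; segmentValues is null when the segment has no doc values. */
    public void moveToSegment(List<String> segmentValues) {
        if (segmentValues != null) {
            docValues = segmentValues;
        } else if (resetWhenMissing) {
            // The fix: forget what the previous segment provided.
            docValues = NO_VALUES;
        }
        // Without the reset, docValues still references the previous segment's data.
    }

    public List<String> current() {
        return docValues;
    }

    public static void main(String[] args) {
        SegmentStateSketch buggy = new SegmentStateSketch(false);
        buggy.moveToSegment(List.of("from-segment-0"));
        buggy.moveToSegment(null);
        System.out.println("without reset: " + buggy.current());

        SegmentStateSketch fixed = new SegmentStateSketch(true);
        fixed.moveToSegment(List.of("from-segment-0"));
        fixed.moveToSegment(null);
        System.out.println("with reset: " + fixed.current());
    }
}
```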
Reject copy_to and multi_fields on flattened mapped properties
since multi-fields are silently un-queryable and copy_to is
inconsistent with the parent flattened field's restrictions.

Add tests for multi-value arrays, null value forwarding, exists
queries, property preservation on merge, additional types, empty
strings, ignore_above/depth_limit interaction, cross-index
queries, synthetic source ordering, and all disallowed types.

Made-with: Cursor
The RootFlattenedDocValuesBlockLoader was not passing mapped property
loaders through to FlattenedDocValuesSyntheticFieldLoader, causing
ES|QL block loading to omit all mapped property values from the
resulting JSON. Also fixed writeToBlock to use hasValue() so it does
not emit null when only mapped properties have data.

Made-with: Cursor
@parkertimmins parkertimmins added the >enhancement and :StorageEngine/Mapping labels Mar 18, 2026
@elasticsearchmachine
Collaborator

Hi @parkertimmins, I've created a changelog YAML for you.

parkertimmins and others added 4 commits March 18, 2026 17:57
Replace LinkedHashMap with TreeMap for propertyBuilders and
mappedProperties, and use unmodifiableSortedMap to preserve
sort order through Map.copyOf. This removes the need to wrap
in new TreeMap<>() at each usage site.

Made-with: Cursor
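The ordering concern in the commit above can be shown with standard JDK collections. `Map.copyOf` returns a map whose iteration order is unspecified, whereas `Collections.unmodifiableSortedMap` wraps a `TreeMap` and keeps key order. The class and method names below (`SortedPropertiesSketch`, `freeze`) are hypothetical, and the property names are borrowed from the PR description purely as sample data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative only: an immutable view that keeps property paths in sorted order. */
public class SortedPropertiesSketch {

    /** Copies the input into a TreeMap and wraps it in a read-only sorted view. */
    public static SortedMap<String, String> freeze(Map<String, String> declared) {
        return Collections.unmodifiableSortedMap(new TreeMap<>(declared));
    }

    public static void main(String[] args) {
        Map<String, String> declared = new HashMap<>();
        declared.put("status_code", "long");
        declared.put("host.name", "keyword");

        // Sorted view: iteration follows natural key order on every run.
        SortedMap<String, String> frozen = freeze(declared);
        List<String> sortedKeys = new ArrayList<>(frozen.keySet());
        System.out.println("sorted view: " + sortedKeys);

        // Map.copyOf is also unmodifiable, but its iteration order is unspecified,
        // so callers that need sorted output would have to re-sort at each use.
        Map<String, String> copied = Map.copyOf(declared);
        System.out.println("Map.copyOf size: " + copied.size());
    }
}
```

Keeping the sorted type in the field declaration means serialization and tests can iterate directly without wrapping in a new `TreeMap` each time, which is the simplification the commit describes.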
@parkertimmins parkertimmins marked this pull request as ready for review March 18, 2026 23:47
@parkertimmins parkertimmins requested a review from a team as a code owner March 18, 2026 23:47
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

parkertimmins and others added 6 commits March 20, 2026 12:10
The test relied on unsorted search returning results in
index sort order, which fails when CCS searches a replica
that is still being peer-recovered from the primary.

Made-with: Cursor
Flattened type cannot be used as a multi-field since the
feature branch changed its TypeParser from FieldMapper.TypeParser
to Mapper.TypeParser. Use object properties instead.

Made-with: Cursor
# [foo] is flattened in index5
# [bar] is keyword in index5
# [bar].[baz] is flattened in index5
# [bar.baz] is flattened in index5
Contributor

Not super necessary but I think it'd be nice to tell immediately, just by looking at comments like these, if foo.bar was itself the field name or bar is a field under foo ([foo].[bar]). Seems like it's still the latter case in index5.

Contributor Author

@mouhc1ne Thanks for the review. Unfortunately, we ended up needing to keep the NOOP behavior that currently exists, because changing it would be a breaking change. Martijn pointed out that, because the flattened mapping is serialized, not being backwards compatible here would cause failures during a shard upgrade.

@mouhc1ne
Contributor

Just fyi that you might need to pull in #144741 in case you run into CI errors in 260_flattened_subfield.yml related to load not allowed.

The feature branch changed FlattenedFieldMapper.PARSER from a
FieldMapper.TypeParser to a Mapper.TypeParser to handle the new
properties key. This broke the instanceof check in
TypeParsers.parseMultiField, rejecting flattened as a multi-field.

Fix by subclassing FieldMapper.TypeParser instead: extract properties
from the node before delegating to super.parse(). This preserves
backwards compatibility while supporting the new properties parsing.

Made-with: Cursor
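The shape of the fix in the commit above can be sketched with stand-in types. Everything here is hypothetical: `MapperParser`, `FieldParser`, `FlattenedParser`, and `allowedAsMultiField` only imitate the relationship between Mapper.TypeParser, FieldMapper.TypeParser, and the instanceof check in TypeParsers.parseMultiField, and their signatures are not the real Elasticsearch ones.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins for the parser hierarchy; signatures are invented for illustration. */
public class TypeParserSketch {

    /** Plays the role of the general mapper parser type. */
    public interface MapperParser {
        String parse(String name, Map<String, Object> node);
    }

    /** Plays the role of the field-mapper parser that the multi-field check looks for. */
    public static class FieldParser implements MapperParser {
        @Override
        public String parse(String name, Map<String, Object> node) {
            // A strict base parser: any key it does not know is an error.
            for (String key : node.keySet()) {
                if (!key.equals("type")) {
                    throw new IllegalArgumentException("unknown parameter [" + key + "]");
                }
            }
            return name + ":" + node.get("type");
        }
    }

    /** Subclass that pulls out "properties" first, then delegates to the base parser. */
    public static class FlattenedParser extends FieldParser {
        @Override
        public String parse(String name, Map<String, Object> node) {
            Map<String, Object> remaining = new HashMap<>(node);
            Object properties = remaining.remove("properties");
            String base = super.parse(name, remaining);
            return properties == null ? base : base + " properties=" + properties;
        }
    }

    /** Imitates the instanceof gate that decides whether a type may be a multi-field. */
    public static boolean allowedAsMultiField(MapperParser parser) {
        return parser instanceof FieldParser;
    }

    public static void main(String[] args) {
        MapperParser parser = new FlattenedParser();
        Map<String, Object> node = new HashMap<>();
        node.put("type", "flattened");
        node.put("properties", Map.of("status_code", "long"));
        System.out.println(parser.parse("labels", node));
        System.out.println("multi-field allowed: " + allowedAsMultiField(parser));
    }
}
```

Because the subclass is still an instance of the base parser type, the gate keeps accepting it, while the extra key never reaches the strict base parser.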
});
}

public void testBlockLoaderWithMappedPropertiesOnly() throws IOException {
Contributor

From what I can tell, these two blockloader tests are testing the root blockloader. We should probably add a test for the KeyedFlattenedDocValuesBlockLoader too.

Contributor Author

Good point, I've added a test here: https://github.com/elastic/elasticsearch/pull/144451/changes#diff-7113916e741652f4a9e73b8cf5573b3af612664e2c54e22cbc697b49468a8886R1760

This brings up an important question. Should the mapped values be included in the KeyedFlattenedDocValuesBlockLoader? Currently they are not. They have to be obtained through a separate, correctly typed blockloader. I can also see the argument that they should be included, though this would involve casting everything to strings. For this reason, I'm inclined to say they should be left out.

@jordan-powers How will this choice affect query-time behavior in ES|QL? @martijnvg Any thoughts?

Member

Good question. Not really related, but synthetic source does include the values of the mapped sub fields. The root block loader's purpose is to include all the content of a flattened field. So I think the root block loader should also include the content of mapped sub fields.

Maybe we can quickly implement this by falling back to the source based block loader if there are mapped sub fields (in RootFlattenedFieldType#blockLoader(...))? Which should do the right thing, given that synthetic source already has this behaviour.

Then in a follow up we can improve RootFlattenedDocValuesBlockLoader?

Member

I misunderstood the question :) Never mind my previous comment.

Should the mapped values be included in the KeyedFlattenedDocValuesBlockLoader?

No, these fields have their own block loader via their mapped sub field.

Contributor

How will this choice affect query time behavior in ES|QL?

So long as the sub-field mappers can be resolved by FieldTypeLookup#get, then the fields will be loaded with the correct blockloaders.

Switch PARSER from a manual Mapper.TypeParser to
FieldMapper.TypeParser via createTypeParserWithLegacySupport,
allowing flattened fields as multi-fields for bwc. The
"properties" field is now a Parameter<Map<String, Builder>>
with parsing handled by the standard parameter framework.

Made-with: Cursor
Member

@martijnvg martijnvg left a comment

LGTM2

Use explicit sort: _doc to ensure deterministic result
order after force merge, avoiding reliance on undefined
tiebreaker behavior with equal _score values.

Made-with: Cursor
@elasticsearchmachine elasticsearchmachine added the serverless-linked label Mar 26, 2026
@parkertimmins parkertimmins merged commit fee6b18 into elastic:main Mar 26, 2026
36 checks passed
@parkertimmins parkertimmins deleted the parker/flattened-sub-fields branch March 26, 2026 22:30

Labels

>enhancement, serverless-linked, :StorageEngine/Mapping, Team:StorageEngine, v9.4.0

5 participants