Skip to content

Conversation

jordan-powers
Copy link
Contributor

This PR configures default values for the index settings index.sort.field, index.sort.order, index.sort.mode, and index.sort.missing.

Fixes #129062

@jordan-powers jordan-powers self-assigned this Oct 3, 2025
@jordan-powers jordan-powers added >bug auto-backport Automatically create backport pull requests when merged Team:StorageEngine :StorageEngine/Mapping The storage related side of mappings v8.19.6 v9.1.6 v9.0.9 v9.2.1 v9.3.0 labels Oct 3, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @jordan-powers, I've created a changelog YAML for you.

return new FieldSortSpec[0];
}

String indexMode = settings.get(IndexSettings.MODE.getKey());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't use IndexSettings.MODE.get(settings) here because the validation logic for IndexSettings.MODE uses the default value of index.sort.*, which causes infinite recursion (since we're already in the default value provider for those settings).
So we need to get the mode while bypassing the validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this also as a comment?

List<String> asList = defaultSettings.getAsList(getKey(), null);
if (asList == null) {
builder.putList(getKey(), defaultStringValue.apply(defaultSettings));
builder.putList(getKey(), defaultStringValue.apply(source));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug, but I'm not sure why we never saw it before. Maybe we never had string list settings whose default value depended on other settings before?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find, definitely looks like a bug to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find, I agree that the current behaviour isn't correct and actual settings need to be pushed down instead of default settings..

@jordan-powers jordan-powers marked this pull request as ready for review October 3, 2025 15:31
@jordan-powers jordan-powers requested a review from a team as a code owner October 3, 2025 15:31
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

return Arrays.stream(getDefaultSortSpecs(settings)).map(sortSpec -> sortSpec.field).toList();
}

public static List<String> getDefaultSortOrder(Settings settings) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are situations where non-default and default settings can be mixed incorrectly. For example, if mode=logsdb, and only a sort field is set, the order, mode, missing settings will use the defaults for timestamp sort.

this test (added to 80_index_sort_defaults.yml) shows the issue:

---
create logsdb data stream with non-default sort field:
  - do:
      cluster.put_component_template:
        name: "logsdb-mappings"
        body:
          template:
            settings:
              mode: "logsdb"
              index.sort.field: ["some_field"]
            mappings:
              properties:
                some_field:
                  type: "keyword"
                "@timestamp":
                  type: "date"

  - do:
      indices.put_index_template:
        name: "logsdb-index-template"
        body:
          index_patterns: ["logsdb"]
          data_stream: {}
          composed_of: ["logsdb-mappings"]
      allowed_warnings:
        - "index template [logsdb-index-template] has index patterns [logsdb] matching patterns from existing older templates [global] with patterns (global => [*]); this template [logsdb-index-template] will take precedence during new index creation"

  - do:
      indices.create_data_stream:
        name: "logsdb"

  - is_true: acknowledged
  - do:
      indices.get_data_stream:
        name: "logsdb"
        expand_wildcards: hidden
  - length: { data_streams: 1 }
  - set: { data_streams.0.indices.0.index_name: backing_index }

  - do:
      indices.get_settings:
        index: $backing_index
        include_defaults: true
  - match: { .$backing_index.settings.index.mode: logsdb }
  - match: { .$backing_index.settings.index.sort.field: [ "some_field" ] }
  - match: { .$backing_index.defaults.index.sort.order: [ "asc" ] }
  - match: { .$backing_index.defaults.index.sort.mode: [ "min" ] }
  - match: { .$backing_index.defaults.index.sort.missing: [ "_last" ] }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. The current logic does ensure the defaults aren't actually applied, but they're still reported via the settings api. I'll update the logic so that if a custom sort is defined, the defaults match index.sort.fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, should be fixed as of b8ff892

@parkertimmins parkertimmins self-requested a review October 6, 2025 16:59
Copy link
Contributor

@parkertimmins parkertimmins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good Jordan! I left a few questions.

List<String> asList = defaultSettings.getAsList(getKey(), null);
if (asList == null) {
builder.putList(getKey(), defaultStringValue.apply(defaultSettings));
builder.putList(getKey(), defaultStringValue.apply(source));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find, I agree that the current behaviour isn't correct and actual settings need to be pushed down instead of default settings..

return new FieldSortSpec[0];
}

String indexMode = settings.get(IndexSettings.MODE.getKey());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this also as a comment?

- match: { [email protected]: date }
- match: { .$backing_index.mappings.properties.host.properties.name.type: keyword }
- match: { .$backing_index.mappings.properties.host.properties.name.ignore_above: 1024 }
- match: { .$backing_index.mappings.properties.host: null }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be changed? iirc these fields get injected by default if index mode is logsdb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The update I made to LogsdbIndexModeSettingsProvider was causing this test to fail.

We don't inject the host.name mapper when there is a custom sort configured. In this test, we configure index.sort.field to be an empty array. Previously, we checked for a custom sort by checking if index.sort.field is non-empty. We can't do that anymore because index.sort.field now has a proper default and will be sometimes non-empty even when there is no custom sort configured. Now, we check for a custom sort by actually checking if index.sort.field is configured.

- match: { .$backing_index.defaults.index.sort.mode: [ "max" ] }
- match: { .$backing_index.defaults.index.sort.missing: [ "_last" ] }
---
create logsdb data stream with host as text and name as double:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a strange setup, but I see how this can work and using host.name as primary sort field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's definitely a strange setup, but I found it tested in 30_logsdb_default_mapping, and I figured it couldn't hurt to test the edge case.

"mode": "time_series",
"sort.field": [],
"sort.order": []
"sort.field": ["_tsid", "@timestamp"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the sort.field / sort.order index settings be derived from the defaults? Or maybe the settings shouldn't be defined in this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Omitting the sort.field and sort.order results in the error updating component template [logs@custom] results in invalid composable template [logs-apm.error@template] after templates are merged.

From what I can tell, the logs-apm.error@template template is composed of logs@custom which we define here, and apm@settings which contains a non-default index sort configuration. Since this this template sets the index mode to time_series, we need to override the index sort from apm@settings with another index sort here.

@jordan-powers jordan-powers merged commit 9fe91ff into elastic:main Oct 8, 2025
34 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1 Commit could not be cherrypicked due to conflicts
9.0 Commit could not be cherrypicked due to conflicts
9.2 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 135886

@jordan-powers
Copy link
Contributor Author

After some discussion, we decided not to backport this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include index.sort.* index settings defaults for time_series and logsdb index modes.
4 participants