Commit 3cbbcc5

Default LogsDB value for ignore_dynamic_beyond_limit (#115265)
When ingesting logs, it is important that documents are not dropped because of mapping issues, including when fields are mapped dynamically. Elasticsearch provides two settings that control the total number of field mappings and what happens when that limit is exceeded:

1. **`index.mapping.total_fields.limit`**: The maximum number of fields allowed in an index. Once this limit is reached, mapping any additional field causes indexing to fail.
2. **`index.mapping.total_fields.ignore_dynamic_beyond_limit`**: Whether Elasticsearch ignores dynamically mapped fields that would exceed the limit defined by `index.mapping.total_fields.limit`. If set to `false`, indexing fails once the limit is exceeded. If set to `true`, Elasticsearch still indexes the document but silently ignores any dynamically mapped fields beyond the limit.

To prevent indexing failures caused by dynamic mappings, especially for logs whose schema may change frequently, we change the default value of **`index.mapping.total_fields.ignore_dynamic_beyond_limit` from `false` to `true` in LogsDB**. With this change, documents are still indexed when the number of dynamically mapped fields exceeds the limit, and the additional fields are ignored instead of causing an indexing failure.

This adjustment matters for LogsDB, where dynamically mapped fields are common and we want to prevent documents from being dropped.
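The interaction between the two settings can be sketched with a small, simplified model. This is not Elasticsearch's implementation: the class `FieldLimitSketch` and the method `ignoredFields` are hypothetical names, fields are treated as flat names (no object fields), and the only behavior modeled is the one described above, where statically mapped fields always count against the limit and dynamically mapped fields beyond it are either ignored or rejected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the total-fields limit; not Elasticsearch code.
public final class FieldLimitSketch {

    // Returns the sorted names of document fields that are ignored because mapping them
    // dynamically would exceed the limit.
    public static List<String> ignoredFields(
        Set<String> staticFields,
        int limit,
        List<String> documentFields,
        boolean ignoreDynamicBeyondLimit
    ) {
        // Statically mapped fields are always counted, so they alone can exceed the limit.
        if (staticFields.size() > limit) {
            throw new IllegalArgumentException("Limit of total fields [" + limit + "] has been exceeded");
        }
        Set<String> mapped = new LinkedHashSet<>(staticFields);
        List<String> ignored = new ArrayList<>();
        for (String field : documentFields) {
            if (mapped.contains(field)) {
                continue; // already mapped, nothing to add
            }
            if (mapped.size() < limit) {
                mapped.add(field); // room left: map the field dynamically
            } else if (ignoreDynamicBeyondLimit) {
                ignored.add(field); // limit reached: keep the document, skip the field
            } else {
                throw new IllegalArgumentException("Limit of total fields [" + limit + "] has been exceeded");
            }
        }
        Collections.sort(ignored);
        return ignored;
    }

    public static void main(String[] args) {
        // Mirrors the `subobjects: false` test: 3 static fields and a limit of 3.
        List<String> ignored = ignoredFields(
            Set.of("@timestamp", "host.name", "name"),
            3,
            List.of("@timestamp", "name", "host.name", "value", "message", "region", "pid"),
            true
        );
        System.out.println(ignored); // prints [message, pid, region, value]
    }
}
```

With `ignoreDynamicBeyondLimit` set to `false`, the same call would throw on the first unmapped field, which corresponds to the indexing failure that the new LogsDB default is meant to avoid.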
1 parent aaf7a3e commit 3cbbcc5

5 files changed: +302 -3 lines changed
rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.create/20_synthetic_source.yml

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+---
 object with unmapped fields:
   - requires:
       cluster_features: ["mapper.track_ignored_source", "mapper.bwc_workaround_9_0"]

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/logsdb/10_settings.yml

Lines changed: 281 additions & 0 deletions
@@ -599,3 +599,284 @@ end time not allowed in logs mode:
   - match: { error.root_cause.0.type: "illegal_argument_exception" }
   - match: { error.type: "illegal_argument_exception" }
   - match: { error.reason: "[index.time_series.end_time] requires [index.mode=time_series]" }
+
+---
+ignore dynamic beyond limit logsdb default value:
+  - requires:
+      cluster_features: [ "mapper.logsdb_default_ignore_dynamic_beyond_limit" ]
+      reason: requires logsdb default value for `index.mapping.total_fields.ignore_dynamic_beyond_limit`
+
+  - do:
+      indices.create:
+        index: test-ignore-dynamic-default
+        body:
+          settings:
+            index:
+              mode: logsdb
+
+  - do:
+      indices.get_settings:
+        index: test-ignore-dynamic-default
+        include_defaults: true
+
+  - match: { test-ignore-dynamic-default.settings.index.mode: "logsdb" }
+  - match: { test-ignore-dynamic-default.defaults.index.mapping.total_fields.limit: "1000" }
+  - match: { test-ignore-dynamic-default.defaults.index.mapping.total_fields.ignore_dynamic_beyond_limit: "true" }
+
+---
+ignore dynamic beyond limit logsdb override value:
+  - requires:
+      cluster_features: [ "mapper.logsdb_default_ignore_dynamic_beyond_limit" ]
+      reason: requires logsdb default value for `index.mapping.total_fields.ignore_dynamic_beyond_limit`
+
+  - do:
+      indices.create:
+        index: test-ignore-dynamic-override
+        body:
+          settings:
+            index:
+              mode: logsdb
+              mapping:
+                total_fields:
+                  ignore_dynamic_beyond_limit: false
+
+  - do:
+      indices.get_settings:
+        index: test-ignore-dynamic-override
+
+  - match: { test-ignore-dynamic-override.settings.index.mode: "logsdb" }
+  - match: { test-ignore-dynamic-override.settings.index.mapping.total_fields.ignore_dynamic_beyond_limit: "false" }
+
+---
+logsdb with default ignore dynamic beyond limit and default sorting:
+  - requires:
+      cluster_features: ["mapper.logsdb_default_ignore_dynamic_beyond_limit"]
+      reason: requires default value for ignore_dynamic_beyond_limit
+
+  - do:
+      indices.create:
+        index: test-logsdb-default-sort
+        body:
+          settings:
+            index:
+              mode: logsdb
+              mapping:
+                # NOTE: When the index mode is set to `logsdb`, the `host.name` field is automatically injected if
+                # sort settings are not overridden.
+                # With `subobjects` set to `true` (default), this creates a `host` object field and a nested `name`
+                # keyword field (`host.name`).
+                #
+                # As a result, there are always at least 4 statically mapped fields (`@timestamp`, `host`, `host.name`
+                # and `name`). We cannot use a field limit lower than 4 because these fields are always present.
+                #
+                # Indeed, if `index.mapping.total_fields.ignore_dynamic_beyond_limit` is `true`, any dynamically
+                # mapped fields beyond the limit `index.mapping.total_fields.limit` are ignored, but the statically
+                # mapped fields are always counted.
+                total_fields:
+                  limit: 4
+          mappings:
+            properties:
+              "@timestamp":
+                type: date
+              name:
+                type: keyword
+
+  - do:
+      indices.get_settings:
+        index: test-logsdb-default-sort
+
+  - match: { test-logsdb-default-sort.settings.index.mode: "logsdb" }
+
+  - do:
+      bulk:
+        index: test-logsdb-default-sort
+        refresh: true
+        body:
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:30:00Z", "name": "foo", "host.name": "92f4a67c", "value": 10, "message": "the quick brown fox", "region": "us-west", "pid": 153462 }'
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:01:00Z", "name": "bar", "host.name": "24eea278", "value": 20, "message": "jumps over the lazy dog", "region": "us-central", "pid": 674972 }'
+  - match: { errors: false }
+
+  - do:
+      search:
+        index: test-logsdb-default-sort
+        body:
+          query:
+            match_all: {}
+
+  - match: { hits.total.value: 2 }
+  - match: { hits.hits.0._source.name: "bar" }
+  - match: { hits.hits.0._source.value: 20 }
+  - match: { hits.hits.0._source.message: "jumps over the lazy dog" }
+  - match: { hits.hits.0._ignored: [ "message", "pid", "region", "value" ] }
+  - match: { hits.hits.1._source.name: "foo" }
+  - match: { hits.hits.1._source.value: 10 }
+  - match: { hits.hits.1._source.message: "the quick brown fox" }
+  - match: { hits.hits.1._ignored: [ "message", "pid", "region", "value" ] }
+
+---
+logsdb with default ignore dynamic beyond limit and non-default sorting:
+  - requires:
+      cluster_features: ["mapper.logsdb_default_ignore_dynamic_beyond_limit"]
+      reason: requires default value for ignore_dynamic_beyond_limit
+
+  - do:
+      indices.create:
+        index: test-logsdb-non-default-sort
+        body:
+          settings:
+            index:
+              sort.field: [ "name" ]
+              sort.order: [ "desc" ]
+              mode: logsdb
+              mapping:
+                # NOTE: Here sort settings are overridden and we do not have any additional statically mapped field other
+                # than `name` and `timestamp`. As a result, there are only 2 statically mapped fields.
+                total_fields:
+                  limit: 2
+          mappings:
+            properties:
+              "@timestamp":
+                type: date
+              name:
+                type: keyword
+
+  - do:
+      indices.get_settings:
+        index: test-logsdb-non-default-sort
+
+  - match: { test-logsdb-non-default-sort.settings.index.mode: "logsdb" }
+
+  - do:
+      bulk:
+        index: test-logsdb-non-default-sort
+        refresh: true
+        body:
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:30:00Z", "name": "foo", "host.name": "92f4a67c", "value": 10, "message": "the quick brown fox", "region": "us-west", "pid": 153462 }'
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:01:00Z", "name": "bar", "host.name": "24eea278", "value": 20, "message": "jumps over the lazy dog", "region": "us-central", "pid": 674972 }'
+  - match: { errors: false }
+
+  - do:
+      search:
+        index: test-logsdb-non-default-sort
+        body:
+          query:
+            match_all: {}
+
+  - match: { hits.total.value: 2 }
+  - match: { hits.hits.0._source.name: "foo" }
+  - match: { hits.hits.0._source.value: 10 }
+  - match: { hits.hits.0._source.message: "the quick brown fox" }
+  - match: { hits.hits.0._ignored: [ "host", "message", "pid", "region", "value" ] }
+  - match: { hits.hits.1._source.name: "bar" }
+  - match: { hits.hits.1._source.value: 20 }
+  - match: { hits.hits.1._source.message: "jumps over the lazy dog" }
+  - match: { hits.hits.1._ignored: [ "host", "message", "pid", "region", "value" ] }
+
+---
+logsdb with default ignore dynamic beyond limit and too low limit:
+  - requires:
+      cluster_features: ["mapper.logsdb_default_ignore_dynamic_beyond_limit"]
+      reason: requires default value for ignore_dynamic_beyond_limit
+
+  - do:
+      catch: bad_request
+      indices.create:
+        index: test-logsdb-low-limit
+        body:
+          settings:
+            index:
+              mode: logsdb
+              mapping:
+                # NOTE: When the index mode is set to `logsdb`, the `host.name` field is automatically injected if
+                # sort settings are not overridden.
+                # With `subobjects` set to `true` (default), this creates a `host` object field and a nested `name`
+                # keyword field (`host.name`).
+                #
+                # As a result, there are always at least 4 statically mapped fields (`@timestamp`, `host`, `host.name`
+                # and `name`). We cannot use a field limit lower than 4 because these fields are always present.
+                #
+                # Indeed, if `index.mapping.total_fields.ignore_dynamic_beyond_limit` is `true`, any dynamically
+                # mapped fields beyond the limit `index.mapping.total_fields.limit` are ignored, but the statically
+                # mapped fields are always counted.
+                total_fields:
+                  limit: 3
+          mappings:
+            properties:
+              "@timestamp":
+                type: date
+              name:
+                type: keyword
+  - match: { error.type: "illegal_argument_exception" }
+  - match: { error.reason: "Limit of total fields [3] has been exceeded" }
+
+---
+logsdb with default ignore dynamic beyond limit and subobjects false:
+  - requires:
+      cluster_features: ["mapper.logsdb_default_ignore_dynamic_beyond_limit"]
+      reason: requires default value for ignore_dynamic_beyond_limit
+
+  - do:
+      indices.create:
+        index: test-logsdb-subobjects-false
+        body:
+          settings:
+            index:
+              mode: logsdb
+              mapping:
+                # NOTE: When the index mode is set to `logsdb`, the `host.name` field is automatically injected if
+                # sort settings are not overridden.
+                # With `subobjects` set to `false` anyway, a single `host.name` keyword field is automatically mapped.
+                #
+                # As a result, there are just 3 statically mapped fields (`@timestamp`, `host.name` and `name`).
+                # We cannot use a field limit lower than 3 because these fields are always present.
+                #
+                # Indeed, if `index.mapping.total_fields.ignore_dynamic_beyond_limit` is `true`, any dynamically
+                # mapped fields beyond the limit `index.mapping.total_fields.limit` are ignored, but the statically
+                # mapped fields are always counted.
+                total_fields:
+                  limit: 3
+          mappings:
+            subobjects: false
+            properties:
+              "@timestamp":
+                type: date
+              name:
+                type: keyword
+
+  - do:
+      indices.get_settings:
+        index: test-logsdb-subobjects-false
+
+  - match: { test-logsdb-subobjects-false.settings.index.mode: "logsdb" }
+
+  - do:
+      bulk:
+        index: test-logsdb-subobjects-false
+        refresh: true
+        body:
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:30:00Z", "name": "foo", "host.name": "92f4a67c", "value": 10, "message": "the quick brown fox", "region": "us-west", "pid": 153462 }'
+          - '{ "index": { } }'
+          - '{ "@timestamp": "2024-08-13T12:01:00Z", "name": "bar", "host.name": "24eea278", "value": 20, "message": "jumps over the lazy dog", "region": "us-central", "pid": 674972 }'
+  - match: { errors: false }
+
+  - do:
+      search:
+        index: test-logsdb-subobjects-false
+        body:
+          query:
+            match_all: {}
+
+  - match: { hits.total.value: 2 }
+  - match: { hits.hits.0._source.name: "bar" }
+  - match: { hits.hits.0._source.value: 20 }
+  - match: { hits.hits.0._source.message: "jumps over the lazy dog" }
+  - match: { hits.hits.0._ignored: [ "message", "pid", "region", "value" ] }
+  - match: { hits.hits.1._source.name: "foo" }
+  - match: { hits.hits.1._source.value: 10 }
+  - match: { hits.hits.1._source.message: "the quick brown fox" }
+  - match: { hits.hits.1._ignored: [ "message", "pid", "region", "value" ] }

server/src/main/java/org/elasticsearch/index/IndexVersions.java

Lines changed: 2 additions & 1 deletion
@@ -129,8 +129,9 @@ private static Version parseUnchecked(String version) {
     public static final IndexVersion UPGRADE_TO_LUCENE_9_12 = def(8_516_00_0, Version.LUCENE_9_12_0);
     public static final IndexVersion ENABLE_IGNORE_ABOVE_LOGSDB = def(8_517_00_0, Version.LUCENE_9_12_0);
     public static final IndexVersion ADD_ROLE_MAPPING_CLEANUP_MIGRATION = def(8_518_00_0, Version.LUCENE_9_12_0);
+    public static final IndexVersion LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT_BACKPORT = def(8_519_00_0, Version.LUCENE_9_12_0);
     public static final IndexVersion UPGRADE_TO_LUCENE_10_0_0 = def(9_000_00_0, Version.LUCENE_10_0_0);
-
+    public static final IndexVersion LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT = def(9_001_00_0, Version.LUCENE_10_0_0);
     /*
      * STOP! READ THIS FIRST! No, really,
      ____ _____ ___ ____ _ ____ _____ _ ____ _____ _ _ ___ ____ _____ ___ ____ ____ _____ _

server/src/main/java/org/elasticsearch/index/mapper/MapperFeatures.java

Lines changed: 2 additions & 1 deletion
@@ -64,7 +64,8 @@ public Set<NodeFeature> getTestFeatures() {
             IgnoredSourceFieldMapper.DONT_EXPAND_DOTS_IN_IGNORED_SOURCE,
             SourceFieldMapper.REMOVE_SYNTHETIC_SOURCE_ONLY_VALIDATION,
             IgnoredSourceFieldMapper.IGNORED_SOURCE_AS_TOP_LEVEL_METADATA_ARRAY_FIELD,
-            IgnoredSourceFieldMapper.ALWAYS_STORE_OBJECT_ARRAYS_IN_NESTED_OBJECTS
+            IgnoredSourceFieldMapper.ALWAYS_STORE_OBJECT_ARRAYS_IN_NESTED_OBJECTS,
+            MapperService.LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT
         );
     }
 }

server/src/main/java/org/elasticsearch/index/mapper/MapperService.java

Lines changed: 16 additions & 1 deletion
@@ -22,9 +22,12 @@
 import org.elasticsearch.common.xcontent.LoggingDeprecationHandler;
 import org.elasticsearch.common.xcontent.XContentHelper;
 import org.elasticsearch.core.Nullable;
+import org.elasticsearch.features.NodeFeature;
 import org.elasticsearch.index.AbstractIndexComponent;
+import org.elasticsearch.index.IndexMode;
 import org.elasticsearch.index.IndexSettings;
 import org.elasticsearch.index.IndexVersion;
+import org.elasticsearch.index.IndexVersions;
 import org.elasticsearch.index.analysis.AnalysisRegistry;
 import org.elasticsearch.index.analysis.IndexAnalyzers;
 import org.elasticsearch.index.analysis.NamedAnalyzer;
@@ -121,9 +124,21 @@ public boolean isAutoUpdate() {
         Property.IndexScope,
         Property.ServerlessPublic
     );
+
+    public static final NodeFeature LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT = new NodeFeature(
+        "mapper.logsdb_default_ignore_dynamic_beyond_limit"
+    );
     public static final Setting<Boolean> INDEX_MAPPING_IGNORE_DYNAMIC_BEYOND_LIMIT_SETTING = Setting.boolSetting(
         "index.mapping.total_fields.ignore_dynamic_beyond_limit",
-        false,
+        settings -> {
+            boolean isLogsDBIndexMode = IndexSettings.MODE.get(settings) == IndexMode.LOGSDB;
+            final IndexVersion indexVersionCreated = IndexMetadata.SETTING_INDEX_VERSION_CREATED.get(settings);
+            boolean isNewIndexVersion = indexVersionCreated.between(
+                IndexVersions.LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT_BACKPORT,
+                IndexVersions.UPGRADE_TO_LUCENE_10_0_0
+            ) || indexVersionCreated.onOrAfter(IndexVersions.LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT);
+            return String.valueOf(isLogsDBIndexMode && isNewIndexVersion);
+        },
         Property.Dynamic,
         Property.IndexScope,
         Property.ServerlessPublic
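
The version gating in the new default can be illustrated in isolation. The sketch below is a standalone approximation, not the Elasticsearch code: `LogsDbDefaultSketch` and its method are hypothetical, index versions are modeled as plain `int` ids copied from the `IndexVersions` diff, and `between` is assumed to be inclusive of its lower bound and exclusive of its upper bound.

```java
// Hypothetical, standalone approximation of the default-value logic in the diff above.
public final class LogsDbDefaultSketch {

    // Version ids copied from the IndexVersions diff; plain ints stand in for IndexVersion.
    static final int BACKPORT = 8_519_00_0;          // LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT_BACKPORT
    static final int UPGRADE_TO_LUCENE_10 = 9_000_00_0; // UPGRADE_TO_LUCENE_10_0_0
    static final int MAIN = 9_001_00_0;              // LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT

    // Returns the default of ignore_dynamic_beyond_limit for an index mode and creation version.
    public static boolean defaultIgnoreDynamicBeyondLimit(String indexMode, int versionCreated) {
        boolean isLogsDb = "logsdb".equals(indexMode);
        // Assumption: between(lower, upper) means lower <= version < upper.
        boolean inBackportRange = versionCreated >= BACKPORT && versionCreated < UPGRADE_TO_LUCENE_10;
        boolean isNewIndexVersion = inBackportRange || versionCreated >= MAIN;
        return isLogsDb && isNewIndexVersion;
    }

    public static void main(String[] args) {
        System.out.println(defaultIgnoreDynamicBeyondLimit("logsdb", 8_519_00_0));   // prints true
        System.out.println(defaultIgnoreDynamicBeyondLimit("logsdb", 8_518_00_0));   // prints false
        System.out.println(defaultIgnoreDynamicBeyondLimit("logsdb", 9_000_00_0));   // prints false
        System.out.println(defaultIgnoreDynamicBeyondLimit("standard", 9_001_00_0)); // prints false
    }
}
```

Under these assumptions, an index created exactly at `UPGRADE_TO_LUCENE_10_0_0` falls in the gap between the two ranges and keeps the old default, while existing indices created before the backport version are unaffected.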
