Going over the many-shards benchmark bootstrapping, I noticed it has slowed down quite a bit recently.
It turns out a big contributor to this is org.elasticsearch.cluster.metadata.Metadata#isIndexManagedByILM, which is called from
org.elasticsearch.xpack.ilm.IndexLifecycleService#triggerPolicies on every cluster state update and costs O(N) in the number of indices.
This could be made more efficient in various ways. At the very least we should:
- remove the setting read from the hot loop
- stop using Metadata.getIndicesLookup, which is extremely expensive on the applier thread

A first quick fix would be to check whether any data streams even use DLM; if none do, the whole logic can be skipped. This currently adds about 5% overhead to every cluster state update (relative to things like create index and shard allocation in the many-shards benchmark) at 25k indices in a cluster, and the overhead grows as O(number_of_indices).
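
A minimal sketch of what the quick fix could look like, using hypothetical stand-in types rather than the real Elasticsearch classes. `DataStreamInfo`, `IndexInfo`, `anyDataStreamUsesDlm` and `indicesManagedByIlm` are made-up names for illustration, and the sketch assumes the per-index check only needs the data stream lookup to rule out DLM-managed indices:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IlmTriggerSketch {

    /** Hypothetical stand-in for a data stream; not the Elasticsearch class. */
    record DataStreamInfo(String name, boolean usesDlm, List<String> backingIndices) {}

    /** Hypothetical stand-in for index metadata; ilmPolicy is null when no policy is set. */
    record IndexInfo(String name, String ilmPolicy) {}

    /** Cheap pre-check: scales with the number of data streams, not the number of indices. */
    static boolean anyDataStreamUsesDlm(Collection<DataStreamInfo> dataStreams) {
        return dataStreams.stream().anyMatch(DataStreamInfo::usesDlm);
    }

    /** Returns the names of indices that have an ILM policy and are not claimed by DLM. */
    static List<String> indicesManagedByIlm(Collection<IndexInfo> indices,
                                            Collection<DataStreamInfo> dataStreams) {
        List<String> result = new ArrayList<>();

        // Fast path: no data stream uses DLM, so no per-index lookup is needed at all.
        if (anyDataStreamUsesDlm(dataStreams) == false) {
            for (IndexInfo index : indices) {
                if (index.ilmPolicy() != null) {
                    result.add(index.name());
                }
            }
            return result;
        }

        // Slow path: collect DLM-managed backing indices once, outside the per-index loop.
        Set<String> dlmManaged = new HashSet<>();
        for (DataStreamInfo ds : dataStreams) {
            if (ds.usesDlm()) {
                dlmManaged.addAll(ds.backingIndices());
            }
        }
        for (IndexInfo index : indices) {
            if (index.ilmPolicy() != null && dlmManaged.contains(index.name()) == false) {
                result.add(index.name());
            }
        }
        return result;
    }
}
```

The point is only the shape of the change: the DLM check is done once per cluster state update, so the common case of no DLM usage never touches the indices lookup.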