Skip to content

Commit 04cb4e7

Browse files
authored
Fix PolicyStepsRegistry cache concurrency issue (#126840) (#126913)
The following order of events was possible: - An ILM policy update cleared `cachedSteps` - ILM retrieves the step definition for an index, this populates `cachedSteps` with the outdated policy - The updated policy is put in `lifecyclePolicyMap` Any subsequent cache retrievals will see the old step definition. By clearing `cachedSteps` _after_ we update `lifecyclePolicyMap`, we ensure eventual consistency between the policy and the cache. Fixes #118406 (cherry picked from commit 5383f0f) # Conflicts: # muted-tests.yml
1 parent df4176b commit 04cb4e7

File tree

2 files changed

+11
-3
lines changed

2 files changed

+11
-3
lines changed

docs/changelog/126840.yaml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
pr: 126840
2+
summary: Fix `PolicyStepsRegistry` cache concurrency issue
3+
area: ILM+SLM
4+
type: bug
5+
issues:
6+
- 118406

x-pack/plugin/ilm/src/main/java/org/elasticsearch/xpack/ilm/PolicyStepsRegistry.java

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -105,9 +105,6 @@ Map<String, Map<Step.StepKey, Step>> getStepMap() {
105105
public void update(IndexLifecycleMetadata meta) {
106106
assert meta != null : "IndexLifecycleMetadata cannot be null when updating the policy steps registry";
107107

108-
// since the policies (may have) changed, the whole steps cache needs to be thrown out
109-
cachedSteps.clear();
110-
111108
DiffableUtils.MapDiff<String, LifecyclePolicyMetadata, Map<String, LifecyclePolicyMetadata>> mapDiff = DiffableUtils.diff(
112109
lifecyclePolicyMap,
113110
meta.getPolicyMetadatas(),
@@ -163,6 +160,11 @@ public LifecyclePolicyMetadata read(StreamInput in, String key) {
163160
}
164161
}
165162
}
163+
164+
// Since the policies (may have) changed, the whole steps cache needs to be thrown out.
165+
// We do this after we update `lifecyclePolicyMap` to ensure `cachedSteps` does not contain outdated data.
166+
// This means we may clear up-to-date data, but that's a lot better than the cache containing outdated entries indefinitely.
167+
cachedSteps.clear();
166168
}
167169

168170
/**

0 commit comments

Comments
 (0)