Fix enrich caches outdated value after policy run #133680

joegallo · 2025-08-27T19:22:58Z

There's a (likely very rare) race condition in the way we update the enrich alias that can result in outdated entries being stored in the cache. As a workaround, if you execute the policy a second time (without having changed the data in the source index), the cache will certainly contain the correct information.

We update the cache key using the concrete index name that the enrich alias points to in the cluster state that we see in a cluster state applier. However, we execute searches (and cache the found values) using the concrete index name that the enrich alias points to in the current cluster state (after listeners and appliers have run -- I'm a little squishy on exactly this detail). Suffice it to say, there's a race: if a cluster state applier is slow enough, we can run the search against the old enrich index but cache the value as if it were for the new enrich index. That is, an outdated entry ends up stored in the cache (😱).
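To make the interleaving concrete, here is a minimal sketch in plain Java. It is a simplified model and not the actual enrich code: the class `StaleCacheRaceSketch`, its methods, the index names, and the `"index/value"` key format are all hypothetical. It assumes only what the description says, namely that the cache key and the search resolve the alias from two different views of the cluster state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the race; these are not the real Elasticsearch classes. */
final class StaleCacheRaceSketch {
    // Alias target as last observed by the cache's cluster state applier.
    private String indexSeenByCache;
    // Alias target that a search through the alias resolves to right now.
    private String indexSeenBySearch;
    // Enrich indices: concrete index name -> (lookup value -> enrich data).
    private final Map<String, Map<String, String>> indices = new HashMap<>();
    // Cache keyed by "concreteIndex/lookupValue" (an assumed key format).
    private final Map<String, String> cache = new HashMap<>();

    void putIndex(String name, Map<String, String> docs) { indices.put(name, docs); }
    void cacheViewSees(String index) { indexSeenByCache = index; }
    void searchViewSees(String index) { indexSeenBySearch = index; }

    /** Buggy lookup: the key and the search resolve the alias independently. */
    String lookup(String value) {
        final String key = indexSeenByCache + "/" + value;
        return cache.computeIfAbsent(key, k -> indices.get(indexSeenBySearch).get(value));
    }

    /** Replays the interleaving and returns what a later lookup sees. */
    static String demo() {
        StaleCacheRaceSketch s = new StaleCacheRaceSketch();
        s.putIndex("enrich-v1", Map.of("k", "old"));
        s.putIndex("enrich-v2", Map.of("k", "new"));
        s.cacheViewSees("enrich-v2");   // the cache key already names the new index
        s.searchViewSees("enrich-v1");  // the alias search still hits the old index
        s.lookup("k");                  // stores "old" under the key "enrich-v2/k"
        s.searchViewSees("enrich-v2");  // the two views converge
        return s.lookup("k");           // cache hit on the stale entry
    }
}
```

In this model `StaleCacheRaceSketch.demo()` returns `"old"` even though both views agree on `enrich-v2` by the time of the second lookup, which is the outdated-entry symptom described above.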

The fix is to stop searching through the alias. Instead, we use the same dereferencing logic to ensure that the index that we record the cache entry for is also the concrete index that we searched against.
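The idea can be sketched as follows, again as a simplified model and not the code in this PR. `ResolveOnceSketch`, `resolveAlias`, and `searchIndex` are hypothetical names; in the real change the dereferencing is done by `getEnrichIndexKey`, whose implementation is not shown here. The only point of the sketch is that the alias is resolved once per lookup and that single result is used both as the search target and as part of the cache key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.UnaryOperator;

/** Hypothetical model of the fix: resolve once, then search and cache under that name. */
final class ResolveOnceSketch {
    private final Map<String, String> cache = new HashMap<>();
    // alias -> concrete index name (stand-in for the real dereferencing logic)
    private final UnaryOperator<String> resolveAlias;
    // (concrete index name, lookup value) -> enrich data
    private final BiFunction<String, String, String> searchIndex;

    ResolveOnceSketch(UnaryOperator<String> resolveAlias, BiFunction<String, String, String> searchIndex) {
        this.resolveAlias = resolveAlias;
        this.searchIndex = searchIndex;
    }

    String lookup(String alias, String value) {
        // Resolve exactly once; the same name drives both the search and the key.
        final String concreteIndex = resolveAlias.apply(alias);
        return cache.computeIfAbsent(concreteIndex + "/" + value, k -> searchIndex.apply(concreteIndex, value));
    }

    /** Looks up the same value before and after the alias moves. */
    static String demo() {
        Map<String, String> aliases = new HashMap<>(Map.of("my-alias", "enrich-v1"));
        Map<String, Map<String, String>> indices = Map.of(
            "enrich-v1", Map.of("k", "old"),
            "enrich-v2", Map.of("k", "new")
        );
        ResolveOnceSketch s = new ResolveOnceSketch(aliases::get, (index, value) -> indices.get(index).get(value));
        String before = s.lookup("my-alias", "k");
        aliases.put("my-alias", "enrich-v2"); // policy run repoints the alias
        String after = s.lookup("my-alias", "k");
        return before + "," + after;
    }
}
```

Because the key and the search can no longer disagree, an entry cached under `enrich-v1` can only ever hold data read from `enrich-v1`; a lookup that resolves to `enrich-v2` misses the cache and searches the new index.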

elasticsearchmachine · 2025-08-27T19:23:23Z

Pinging @elastic/es-data-management (Team:Data Management)

elasticsearchmachine · 2025-08-27T19:23:24Z

Hi @joegallo, I've created a changelog YAML for you.

jbaiera

Mostly a question on how the search runner updates after a policy execution

jbaiera · 2025-08-28T03:38:50Z

x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/EnrichProcessorFactory.java

+    private SearchRunner createSearchRunner(final ProjectMetadata project, final String indexAlias) {
+        final Client originClient = new OriginSettingClient(client, ENRICH_ORIGIN);
        return (value, maxMatches, reqSupplier, handler) -> {
+            final String concreteEnrichIndex = getEnrichIndexKey(project, indexAlias);


We capture this enrich index name here when we create the processor, but how will this get updated if we execute an enrich policy and a new index is created?

It's all lunatic higher order functions (I say with the highest compliments) -- we capture the client at createSearchRunner invocation time, that is, when a processor is created. But the search runner itself doesn't run until it's invoked for some document in AbstractEnrichProcessor#execute and that's when the concreteEnrichIndex is finally set (and of course the handler is another functional argument so this is all clear as mud when one reads the code).
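The creation-time versus invocation-time split described above can be shown with a small stand-alone Java sketch. Everything in it is hypothetical (`DeferredResolutionSketch`, `createRunner`, the `AtomicReference` standing in for "wherever the alias target is read from"); it only demonstrates the language behaviour: statements outside the lambda run when the runner is created, statements inside it run each time the runner is invoked.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical illustration of capture-at-creation vs. evaluate-at-invocation. */
final class DeferredResolutionSketch {
    static Supplier<String> createRunner(String client, AtomicReference<String> aliasTarget) {
        // Evaluated once, when the runner is created.
        final String originClient = "origin(" + client + ")";
        return () -> {
            // Evaluated on every invocation of the runner.
            final String concreteIndex = aliasTarget.get();
            return originClient + " searches " + concreteIndex;
        };
    }

    /** Creates the runner, moves the alias target, then invokes the runner. */
    static String demo() {
        AtomicReference<String> aliasTarget = new AtomicReference<>("enrich-v1");
        Supplier<String> runner = createRunner("client", aliasTarget);
        aliasTarget.set("enrich-v2"); // changes after creation, before invocation
        return runner.get();
    }
}
```

Here the runner reports `enrich-v2` because the lookup inside the lambda body is deferred. Note that this only holds when the thing being read inside the lambda reflects current state, which is not guaranteed by the lambda itself.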

This search runner captures the project metadata to extract that concrete enrich index name, but it only does this at processor creation time. Is it possible that when we execute an enrich policy, thus updating the concrete index to use, this processor isn't recreated? It looks like we only create a processor if its configuration changes after a cluster state update.
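The concern can be illustrated with a hedged sketch, using hypothetical names throughout (`CapturedSnapshotSketch`, `runnerOverSnapshot`, `runnerOverLiveState`) and a plain `Map` as a stand-in for project metadata. It does not claim to describe how the real processor behaves; it only shows why deferring the lookup into a lambda does not help if the lambda resolves against an immutable snapshot captured at creation time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical contrast between a captured snapshot and live state. */
final class CapturedSnapshotSketch {
    /** Resolves the alias against a snapshot frozen when the runner is created. */
    static Supplier<String> runnerOverSnapshot(Map<String, String> aliases, String alias) {
        final Map<String, String> snapshot = Map.copyOf(aliases); // frozen here
        return () -> snapshot.get(alias);
    }

    /** Resolves the alias against whatever state is current at invocation time. */
    static Supplier<String> runnerOverLiveState(Supplier<Map<String, String>> currentState, String alias) {
        return () -> currentState.get().get(alias);
    }

    /** Creates both runners, repoints the alias, then invokes both. */
    static String demo() {
        Map<String, String> live = new HashMap<>();
        live.put("my-alias", "enrich-v1");
        Supplier<String> overSnapshot = runnerOverSnapshot(live, "my-alias");
        Supplier<String> overLive = runnerOverLiveState(() -> live, "my-alias");
        live.put("my-alias", "enrich-v2"); // policy run repoints the alias
        return overSnapshot.get() + "," + overLive.get();
    }
}
```

In this model the snapshot-based runner keeps answering `enrich-v1` until it is recreated, while the live-state runner follows the alias to `enrich-v2`.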

Yeahhhhhhhhhhhh... to be clear, that problem existed before this, though, right? I mean, it seems like it does work... but I must admit I'm not sure I see how it does work just yet.

Oh my! #133752 (edit: merged!)

joegallo · 2025-08-28T21:17:33Z

...enrich/src/internalClusterTest/java/org/elasticsearch/xpack/enrich/EnrichPolicyChangeIT.java

+    protected Settings nodeSettings() {
+        return Settings.builder()
+            // TODO Change this to run with security enabled
+            // https://github.com/elastic/elasticsearch/issues/75940


I've already added this new test to #75940.

jbaiera

LGTM!

elasticsearchmachine · 2025-08-30T15:53:29Z

💔 Backport failed

Status	Branch	Result
✅	9.1
❌	9.0	Commit could not be cherrypicked due to conflicts
❌	8.18	Commit could not be cherrypicked due to conflicts
❌	8.19	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 133680

joegallo · 2025-08-30T16:37:05Z

Backports are up, so I'm dropping the backport pending label.

joegallo added 5 commits August 27, 2025 15:09

Add a simple (passing) test loop

c3156f3

Make the test fail

fe70ef2

Make these variables final

6bbf461

Extract a variable for the concrete enrich index

2686739

Do not search through the alias

2e71031

joegallo added the >bug, :Data Management/Ingest Node, Team:Data Management, auto-backport, v9.2.0, v9.1.4, v9.0.7, v8.18.7, and v8.19.4 labels Aug 27, 2025

Update docs/changelog/133680.yaml

9674e4c

jbaiera self-requested a review August 28, 2025 03:12

jbaiera reviewed Aug 28, 2025

joegallo changed the title ~~Fix enrich cache containing outdated value after policy execution~~ Fix enrich caches outdated value after policy run Aug 28, 2025

joegallo added 2 commits August 28, 2025 08:48

Update changelog

4cf61e2

Merge branch 'main' into enrich-policy-cache-race-condition

3dcb993

joegallo mentioned this pull request Aug 28, 2025

Avoid stale enrich results after policy execution #133752

Merged

Merge branch 'main' into enrich-policy-cache-race-condition

e6b59dc

joegallo requested a review from jbaiera August 28, 2025 21:14

joegallo commented Aug 28, 2025

jbaiera approved these changes Aug 30, 2025

joegallo merged commit dc57b3e into elastic:main Aug 30, 2025
33 checks passed

joegallo deleted the enrich-policy-cache-race-condition branch August 30, 2025 15:52

joegallo mentioned this pull request Aug 30, 2025

[9.1] Fix enrich caches outdated value after policy run (#133680) #133878

Merged

elasticsearchmachine added the backport pending label Aug 30, 2025

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (elastic#133680)

0e53c3b

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (elastic#133680)

ccf9174

This was referenced Aug 30, 2025

[9.0] Fix enrich caches outdated value after policy run (#133680) #133879

Merged

[8.19] Fix enrich caches outdated value after policy run (#133680) #133880

Merged

[8.18] Fix enrich caches outdated value after policy run (#133680) #133881

Merged

joegallo added a commit to joegallo/elasticsearch that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (elastic#133680)

103ddf2

joegallo removed the backport pending label Aug 30, 2025

elasticsearchmachine pushed a commit that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (#133680) (#133878)

7bddc49

elasticsearchmachine pushed a commit that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (#133680) (#133881)

41b1be5

elasticsearchmachine pushed a commit that referenced this pull request Aug 30, 2025

Fix enrich caches outdated value after policy run (#133680) (#133880)

01c83eb

elasticsearchmachine pushed a commit that referenced this pull request Aug 30, 2025

[9.0] Fix enrich caches outdated value after policy run (#133680) (#133879)

56ce6b9

* Fix enrich caches outdated value after policy run (#133680)
* Trying to get a new build

sarog pushed a commit to portsbuild/elasticsearch that referenced this pull request Sep 11, 2025

Fix enrich caches outdated value after policy run (elastic#133680) (elastic#133880)

675928f

sarog pushed a commit to portsbuild/elasticsearch that referenced this pull request Sep 19, 2025

Fix enrich caches outdated value after policy run (elastic#133680) (elastic#133880)

dd5b985
