Introducing a file group watcher #134564

ankit--sethi · 2025-09-11T16:07:10Z

Introducing the notion of a FileGroupWatcher, an orchestrating watcher built atop FileWatcher that invokes a callback only once per group of file changes.

The motivation is to better handle cases where multiple files may have changed within a single monitoring interval, yet the action we need to take should execute only once per set of changes. Triggering callbacks over a collection of FileWatchers results in the same action being executed more times than necessary.
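To make the coalescing idea concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `CoalescingNotifier`, `onFileChanged` and `endOfCycle` are hypothetical names, and the sketch assumes that some outer watcher calls `endOfCycle()` exactly once after all per-file checks in an interval have run.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: buffers per-file change events and fires one callback per check cycle. */
final class CoalescingNotifier {
    private final List<Path> changed = new ArrayList<>();
    private final Consumer<List<Path>> groupCallback;

    CoalescingNotifier(Consumer<List<Path>> groupCallback) {
        this.groupCallback = groupCallback;
    }

    /** Called by each per-file watcher whose file changed during the current cycle. */
    void onFileChanged(Path file) {
        changed.add(file);
    }

    /** Assumed to be called once per monitoring cycle, after every per-file watcher has run. */
    void endOfCycle() {
        if (changed.isEmpty()) {
            return; // nothing changed in this cycle, so no callback
        }
        List<Path> snapshot = List.copyOf(changed);
        changed.clear();
        groupCallback.accept(snapshot); // invoked once, with every changed file
    }
}
```

With this shape, two changed files in one interval produce a single callback that receives both paths, and a cycle with no changes produces no callback at all.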

elasticsearchmachine · 2025-09-11T16:07:58Z

Pinging @elastic/es-security (Team:Security)

slobodanadamovic · 2025-09-15T10:42:07Z

server/src/main/java/org/elasticsearch/watcher/FileGroupWatcher.java

+    @Override
+    protected void doCheckAndNotify() throws IOException {
+        if (coalescingListener.filesChanged()) {
+            // fallback in case state wasn't cleared correctly in previous iteration


This sounds concerning. How can this happen?

No, it shouldn't ever happen due to the reset happening in the finally block. I was thinking of very rare situations where the JDK spec says finally will not execute. I had initially found this reference:

Likewise, if the thread executing the try or catch code is interrupted or killed, the finally block may not execute even though the application as a whole continues

https://stackoverflow.com/a/2417986

After digging a little deeper to write this reply, it turns out this is a JDK documentation mistake that they have now updated. Apparently, the section I quoted above is not actually true!
https://bugs.openjdk.org/browse/JDK-8276156?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel

With things where they are now, I'm fine with removing this if condition if it scares more than helps!

I've removed this if-condition
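For readers following the `finally` discussion, the sketch below shows the general pattern being relied on: state is cleared in a `finally` block, so it is reset whether the callback returns normally or throws. `ResettingBuffer` and its methods are hypothetical names used for illustration, not the PR's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: pending changes are always cleared after a flush attempt. */
final class ResettingBuffer {
    private final List<String> pending = new ArrayList<>();

    void record(String file) {
        pending.add(file);
    }

    boolean hasPending() {
        return !pending.isEmpty();
    }

    /** Hands the pending changes to the callback, then resets the buffer. */
    void flush(Consumer<List<String>> callback) {
        try {
            if (!pending.isEmpty()) {
                callback.accept(List.copyOf(pending));
            }
        } finally {
            // Runs on normal return and when the callback throws.
            pending.clear();
        }
    }
}
```

Because the reset sits in `finally`, a failing callback cannot leave stale entries behind for the next iteration, which is why a defensive check at the start of the next iteration adds little.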

slobodanadamovic · 2025-09-15T13:16:18Z

server/src/main/java/org/elasticsearch/watcher/FileGroupWatcher.java

+
+    static class CoalescingFileChangesListener implements FileChangesListener {
+
+        private final ThreadLocal<List<Path>> filesChanged = ThreadLocal.withInitial(() -> Collections.synchronizedList(new ArrayList<>()));


I'm failing to understand the need for ThreadLocal and synchronized list. Could you explain?

Hmm, I think synchronized list was the result of an earlier solution that got superseded by ThreadLocal but I forgot to remove it. I'll explain the general scenario either way though:

A file change detected at the end of a particular monitoring interval triggers this listener, which then executes in thread A.

If the listener execution is time-consuming, it is possible that thread A is still executing by the time a second monitoring interval completes. If there is yet another change to the files, the resource watcher will assign a second thread B to execute the same listener. (I'm making some assumptions about how the resource watcher schedules tasks here.)

Now threads A and B are modifying the same list concurrently.

This edge case requires CoalescingFileChangesListener to be doing something very time-consuming, or the watcher interval to be very short. It's very unlikely, but wrapping the list in a ThreadLocal is a low-cost way to get some thread isolation in case this ever happens.

If the listener execution is time-consuming, it is possible that thread A is still executing by the time a second monitoring interval completes. If there is yet another change to the files, the resource watcher will assign a second thread B to execute the same listener. (I'm making some assumptions about how the resource watcher schedules tasks here.)

I think we should verify these assumptions with @elastic/es-core-infra (assuming you are the owners of the resource watcher). My understanding is that the file watcher does not schedule a new run until it has finished processing the current one. In other words, it appears to be single-threaded.

Sounds right, I tracked it down to these lines here and the flow is such that only after a task completes is the next runnable scheduled.

I feel, though, that depending on this information adds risk because it tightly couples the design to the internals of a separate component. With the ThreadLocal in place, we don't need to depend on the scheduler guaranteeing thread safety, and we don't risk things breaking if that design is ever modified.

I've removed the synchronizedList usage, it's not required.
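The thread-isolation argument above can be illustrated with a small sketch. It is an assumption-laden illustration rather than the PR's implementation: `PerThreadChanges`, `record` and `drain` are hypothetical names. Each thread gets its own plain `ArrayList` through `ThreadLocal.withInitial`, so no synchronized wrapper is needed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: every thread accumulates changes in its own private list. */
final class PerThreadChanges {
    // Each thread lazily receives a separate ArrayList; the lists are never shared across threads.
    private final ThreadLocal<List<String>> filesChanged = ThreadLocal.withInitial(ArrayList::new);

    /** Records a change in the calling thread's list only. */
    void record(String file) {
        filesChanged.get().add(file);
    }

    /** Returns and clears the changes recorded by the calling thread. */
    List<String> drain() {
        List<String> own = filesChanged.get();
        List<String> copy = List.copyOf(own);
        own.clear();
        return copy;
    }
}
```

One consequence of this design is that changes recorded on one thread are invisible to any other thread, so it only helps when a single check cycle records and drains on the same thread.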

elasticsearchmachine · 2025-09-16T12:14:50Z

Pinging @elastic/es-core-infra (Team:Core/Infra)

rjernst · 2025-09-18T19:43:32Z

I'm trying to understand the use case. Can you describe a concrete use case that requires grouping in this way?

slobodanadamovic · 2025-09-19T12:24:10Z

I'm trying to understand the use case. Can you describe a concrete use case that requires grouping in this way?

We need to be able to watch multiple SSL configuration files and be notified once if any of the watched files changes, instead of being notified separately for every file we are watching. We use the notification as a trigger to build a new SSLContext and re-create an HTTP client that uses this configuration. This is the main motivation for the change.

Some background: Currently, FileWatcher only accepts a single file or a directory. Due to entitlements issues, we went from watching a directory to watching individual files (#129738). This means we now get notified, and re-create the configs and client, on each notification. This is not ideal.

We do not necessarily have to take the approach proposed in this PR. We could certainly take other approaches or implement security-specific workarounds, but it would be preferable to be able to tell the watcher: here are the files to monitor, notify me once with grouped information about all changed files.

rjernst · 2025-09-22T18:15:11Z

Due to entitlements issues, we went from watching a directory to watching individual files

We are working towards watching a directory being ok. The code in question would think the secure files do not exist, but would not receive any errors. I think this would be a more robust long-term approach. The proposal here seems like a workaround that will add maintenance/complexity in the long term.

But aside from the above, I wonder if SSL service really needs to be in xpack core. It seems the only reason it is defined/exposed there is for xpack plugins to be able to create ssl capable clients. I wonder if this could be hidden away so that security implements this "create an ssl capable client", and then it can truly own these secure files.

cc @mosche

tvernum · 2025-09-23T01:45:15Z

I wonder if SSL service really needs to be in xpack core

At the moment it does because monitoring, watcher, ML, etc all use it.

We're working on reducing the need for other plugins to directly depend on the SSL service. The end point might get us to where it doesn't need to be in x-pack core anymore, but we're a number of steps away from that.

Add 'SslProfileExtension' SPI interface #134609

ankit--sethi · 2025-09-24T20:18:35Z

In light of the discussion above, I'm closing this PR, as we will eventually have a better solution native to FileWatcher.

ankit--sethi added 2 commits September 11, 2025 11:00

introduce the notion of a FileGroupWatcher, an orchestrating watcher …

3624c8d

…built atop FileWatcher that invokes a callback once per group of file changes.

don't need this buffer

8f28855

ankit--sethi added >non-issue :Security/Security Security issues without another label labels Sep 11, 2025

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

bd47039

…hange

elasticsearchmachine added serverless-linked Added by automation, don't add manually Team:Security Meta label for security team labels Sep 11, 2025

elasticsearchmachine added the v9.2.0 label Sep 11, 2025

[CI] Auto commit changes from spotless

fa0a26b

ankit--sethi requested review from slobodanadamovic and tvernum September 11, 2025 16:22

ankit--sethi added 3 commits September 11, 2025 12:58

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

676b11d

…hange

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

6fd33fc

…hange

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

b106143

…hange

slobodanadamovic reviewed Sep 15, 2025

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

9ac32e8

…hange

slobodanadamovic added :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team labels Sep 16, 2025

ankit--sethi added 4 commits September 16, 2025 16:37

clean up

c6685d2

Merge remote-tracking branch 'origin/feature/recreate-http-client-whe…

fac653a

…n-settings-change' into feature/recreate-http-client-when-settings-change

Merge branch 'main' of github.com:ankit--sethi/elasticsearch into fea…

2ab2619

…ture/recreate-http-client-when-settings-change

Merge branch 'main' into feature/recreate-http-client-when-settings-c…

0199010

…hange

ankit--sethi closed this Sep 24, 2025


5 participants

ankit--sethi commented Sep 11, 2025 •

edited

Loading

ankit--sethi Sep 15, 2025 •

edited

Loading

ankit--sethi Sep 15, 2025 •

edited

Loading

slobodanadamovic commented Sep 19, 2025 •

edited

Loading