Skip to content

Conversation

@mosche
Copy link
Contributor

@mosche mosche commented May 7, 2025

Fix MasterService to execute tasks in the context in which the task was originally created.

MasterService used to execute tasks in a completely empty context rather than the original context which was captured when the task was created.

This fixes the settings API to properly return deprecation warning headers when updating deprecated settings, while properly associating them with the correct correlation id.

Previously, the correlation id of the original context wasn't propagated to the cluster state update task.
However, using newRestorableContext(preserveResponseHeaders=true) isn't enough to also - in case - propagate warning headers back to the task's thread context and hence can't be captured by the task executor in MasterService to return the response headers back to the caller.

This PR adds a new variant of newRestorableContext that supports propagation of response headers if required.

Fixes #108628 (DeprecationHttpIT.testDeprecatedSettingsReturnWarnings)

This fixes the settings API to properly return deprecation warning
headers when updating deprecated settings, while properly associating
them with the correct correlation id.

Previously, the correlation id of the original context wasn't propagated
to the cluster state update task.
However, using `newRestorableContext(preserveResponseHeaders=true)`
isn't enough to also - in case - propagate warning headers back to
the task's thread context and hence can't be captured by the task
executor in MasterService to return the response headers back to the
caller.

This PR adds a new variant of `newRestorableContext` that supports
propagation of response headers if required.

Fixes elastic#108628 (DeprecationHttpIT.testDeprecatedSettingsReturnWarnings)
@mosche mosche requested a review from DaveCTurner May 7, 2025 07:07
@mosche mosche requested a review from a team as a code owner May 7, 2025 07:07
@mosche mosche added >non-issue :Core/Infra/Core Core issues without another label labels May 7, 2025
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team v9.1.0 labels May 7, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Contributor

@ldematte ldematte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we had already newStoredContextPreservingResponseHeaders, but this PR exposes it via newRestorableContext (and that fixes the issue with deprecation warnings, but also with other response headers).
Nice :)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm confused is this not what org.elasticsearch.cluster.service.MasterService.ExecutionResult#captureResponseHeaders is supposed to do already? If there's a problem with propagating response headers back then it needs fixing in MasterService doesn't it?

@mosche
Copy link
Contributor Author

mosche commented May 7, 2025

@DaveCTurner this is necessary for MasterService.ExecutionResult#captureResponseHeaders to work. Otherwise captureResponseHeaders won't see the headers added within the ClusterStateUpdateTask#execute child context. By default those won't be propagated back to the parent context - and I was worried about making that the default as these might leak into unintended places that way.

An option I considered was to automatically create such a thread context supplier for TransportMasterNodeAction to make sure this is always the case.

@mosche
Copy link
Contributor Author

mosche commented May 7, 2025

@DaveCTurner To explain further, if I don't restore the previously stored context (which includes the correlation id) as child context in ClusterStateUpdateTask#execute, capturing of headers via MasterService.ExecutionResult#captureResponseHeaders works perfectly fine.

However, if using try (var ignored = storedContext.get()) { ... } to be able to see the correlation id, the response headers added within that child context aren't visible in the context accessible to captureResponseHeaders.

Definitely agree, a more general fix would be better here. This is really extremely hard to get right :(

@DaveCTurner
Copy link
Contributor

Otherwise captureResponseHeaders won't see the headers added within the ClusterStateUpdateTask#execute child context. By default those won't be propagated back to the parent context - and I was worried about making that the default as these might leak into unintended places that way.

Yep that's my confusion, the whole point of ExecutionResult#captureResponseHeaders is to capture response headers added during execute and pass them back to the parent context. See org.elasticsearch.cluster.service.MasterServiceTests#testThreadContext for example, particularly the use of expectedResponseHeaders.

@DaveCTurner
Copy link
Contributor

Oh wait I think I see, the MasterService executes the task in a completely empty context, but really we want to execute it in the context in which the task was originally created right?

Does this work?

diff --git a/server/src/main/java/org/elasticsearch/cluster/service/MasterService.java b/server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
index c46b3b7d6618..05fa03283d06 100644
--- a/server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
+++ b/server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
@@ -913,7 +913,7 @@ public class MasterService extends AbstractLifecycleComponent {

         @Override
         public Releasable captureResponseHeaders() {
-            final var storedContext = threadContext.newStoredContext();
+            final var storedContext = threadContextSupplier.get();
             return Releasables.wrap(() -> {
                 final var newResponseHeaders = threadContext.getResponseHeaders();
                 if (newResponseHeaders.isEmpty()) {

@mosche
Copy link
Contributor Author

mosche commented May 7, 2025

Does this work?

I'll give it a try, but I'm not very optimistic ... the issue as I understand is really about bubbling up response headers from a restored child to a parent context ohh, i see, this could work in deed!

@mosche
Copy link
Contributor Author

mosche commented May 7, 2025

🎉 you're a wizard @DaveCTurner 🧙 I wasn't aware the task already captures the context it was originally created in, that's what I had in mind as alternative above. If I don't have to manually restore that original context myself in execute, there's no need to propagate headers back to the parent context 💯

mosche added 2 commits May 7, 2025 12:20
…as originally created.

MasterService used to execute tasks in a completely empty context rather than the original context which was captured when the task was created.

Fixes elastic#108628 (DeprecationHttpIT.testDeprecatedSettingsReturnWarnings)
@mosche
Copy link
Contributor Author

mosche commented May 7, 2025

@DaveCTurner any concerns that this is having an undesirable / negative impact in some places when changed globally?

@DaveCTurner
Copy link
Contributor

IMO it makes more sense than what we do today everywhere I can think of: TaskContext#captureResponseHeaders is really saying to run a part of the cluster state update computation in the thread context of a specific task. It probably deserves a better name with this change. I suspect you'll need to update MasterServiceTests to reflect the change in behaviour. Let's see if CI indicates anywhere else that this causes problems and then we can think about them more deeply.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming CI is happy

@mosche
Copy link
Contributor Author

mosche commented May 8, 2025

@DaveCTurner currently looking into profiling yaml specs, these do fail with this change due to deprecation warnings bubbling up which got swallowed before: Index [.profiling-symbols-global-v001-000001] name begins with a dot (.), which is deprecated, and will not be allowed in a future Elasticsearch version."

./gradlew ":x-pack:plugin:yamlRestCompatTest" --tests "org.elasticsearch.xpack.test.rest.XPackRestIT.test {p0=profiling/10_basic/*}"

@mosche
Copy link
Contributor Author

mosche commented May 8, 2025

I thought we had fixed these towards 9, wondering if this was missed because the warnings never surfaced ...
Having issues ignoring the warnings in the specs, but that's my plan forward (+ creating an issue to fix the indices) unless you prefer otherwise

@DaveCTurner
Copy link
Contributor

Yes, that's good, shows the change is working :). Makes sense to me to just mark those warnings as acceptable in the tests.

@mosche mosche merged commit 89efe2c into elastic:main May 9, 2025
17 checks passed
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request May 12, 2025
…as originally created (elastic#127799)

Previously MasterService executed tasks in a completely empty context rather than the original context which was captured when the task was created. This PR fixes MasterService to restore and use the original context.

Fixes elastic#108628
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Core Core issues without another label >non-issue Team:Core/Infra Meta label for core/infra team v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] DeprecationHttpIT testDeprecatedSettingsReturnWarnings failing

4 participants