Convert BytesTransportResponse when proxying response from/to local node #135873
Hi @javanna, I've created a changelog YAML for you.
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
This deserves some tests in TransportActionProxyTests showing that we do/don't re-serialize the response in these cases.
server/src/main/java/org/elasticsearch/transport/TransportActionProxy.java
…onProxy.java Co-authored-by: David Turner <[email protected]>
try {
    channel.sendResponse(convertedResponse);
} finally {
    convertedResponse.decRef();
I'd initially forgotten this, can you double check that this is what I need to be doing please?
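On the decRef question, here is a minimal sketch of the ownership pattern the snippet follows. It assumes (this thread does not confirm it) that the code creating the converted response holds its initial reference, and that the channel takes its own reference if it needs to retain the response. `CountedResponse`, `Channel` and `RefCountSketch` are hypothetical stand-ins, not Elasticsearch classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a ref-counted response; not the Elasticsearch class.
final class CountedResponse {
    // The creator starts out holding the single initial reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    boolean decRef() {
        int remaining = refs.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
        return remaining == 0; // true once the last reference is gone
    }

    int refCount() {
        return refs.get();
    }
}

final class RefCountSketch {
    // Hypothetical channel, assumed NOT to release the caller's reference.
    interface Channel {
        void sendResponse(CountedResponse response);
    }

    // Same shape as the reviewed snippet: release our reference whether or not the send throws.
    static void sendConverted(Channel channel, CountedResponse converted) {
        try {
            channel.sendResponse(converted);
        } finally {
            converted.decRef();
        }
    }

    // Returns the reference count left after one send attempt.
    static int refsAfterSend(boolean sendFails) {
        CountedResponse converted = new CountedResponse();
        Channel channel = response -> {
            if (sendFails) {
                throw new IllegalStateException("simulated send failure");
            }
        };
        try {
            sendConverted(channel, converted);
        } catch (IllegalStateException e) {
            // Swallowed only so the sketch can report the remaining count.
        }
        return converted.refCount();
    }
}
```

Under those assumptions the count drops to zero on both the success and the failure path, which is what the `finally` block is there to guarantee.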
        convertedResponse.decRef();
    }
} catch (IOException e) {
    throw new UncheckedIOException(e);
Should I throw, or send the exception as a response? I am assuming that thrown exceptions are already caught and sent as responses, but it is good to double check.
public void writeTo(StreamOutput out) throws IOException {
    out.writeString(targetNode);
    if (out.getTransportVersion().supports(transportVersion1)) {
        out.writeBoolean(true);
This reproduces the problem in the tests I added, which is fixed with the conversion this change introduces.
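To make the failure mode concrete, here is a self-contained sketch of a version-gated write read back at a different version. It uses plain `java.io` streams rather than the Elasticsearch stream classes; the integer versions, `GATED_VERSION` and `VersionGateSketch` are hypothetical and only mimic the shape of the test's `writeTo`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class VersionGateSketch {
    // Hypothetical version from which the extra boolean is on the wire.
    static final int GATED_VERSION = 2;

    // Writes the node name, plus a boolean only if the stream version supports it.
    static byte[] write(String targetNode, int streamVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(targetNode);
            if (streamVersion >= GATED_VERSION) {
                out.writeBoolean(true);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads with the reader's version; true only if the payload is consumed exactly.
    static boolean readsCleanly(byte[] payload, int streamVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            in.readUTF();
            if (streamVersion >= GATED_VERSION) {
                in.readBoolean();
            }
            return in.available() == 0; // leftover bytes mean the formats disagree
        } catch (IOException e) {
            return false; // e.g. EOFException when the expected boolean is missing
        }
    }
}
```

Bytes written at version 2 leave one unread byte when read at version 1, and bytes written at version 1 hit end-of-stream when read at version 2. Either way the reader and writer disagree unless something re-serializes the payload in between.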
"internal:test",
cancellable,
// For a proxy node proxying to itself, the response is sent directly, without it being read by the proxy layer
r -> { throw new AssertionError(); },
This diff is misleading! I only changed this line in testSendLocalRequest.
Thanks for the suggestion, I added tests @DaveCTurner.
LGTM
#127112 introduced BytesTransportResponse to be used in search batched execution, so that NodeQueryResponse could be written as BytesTransportResponse as opposed to materializing the response object on heap.
When a proxy node acts as a proxy to query its local data, and the coordinating node is on a different version than the proxy node, the response will fail to deserialize in the coord node because it was written with the version of the proxy node as opposed to that of the coord (target) node. This is because DirectResponseChannel skips the step of reading and writing back such response, which would lead to it being converted to the right format.
This commit attempts to fix this problem by tracking the version used to write the response, and conditionally converting it in the ProxyRequestHandler.
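The idea of tracking the write version and converting only when needed can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `VersionedBytes`, `convertIfNeeded`, the integer versions and the two-field wire format are all hypothetical, and plain `java.io` streams stand in for the transport stream classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ConvertSketch {
    // Hypothetical version from which the boolean field is on the wire.
    static final int GATED_VERSION = 2;

    // Serialized bytes plus the version they were written with.
    static final class VersionedBytes {
        final byte[] bytes;
        final int writtenWith;

        VersionedBytes(byte[] bytes, int writtenWith) {
            this.bytes = bytes;
            this.writtenWith = writtenWith;
        }
    }

    static VersionedBytes write(String node, boolean flag, int version) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(node);
            if (version >= GATED_VERSION) {
                out.writeBoolean(flag);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new VersionedBytes(buffer.toByteArray(), version);
    }

    // Pass the bytes through untouched when versions match; otherwise read them
    // with the writer's version and re-serialize with the target's version.
    static VersionedBytes convertIfNeeded(VersionedBytes response, int targetVersion) {
        if (response.writtenWith == targetVersion) {
            return response;
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(response.bytes))) {
            String node = in.readUTF();
            boolean flag = response.writtenWith >= GATED_VERSION && in.readBoolean();
            return write(node, flag, targetVersion);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if a reader at readerVersion consumes the payload exactly.
    static boolean readableAt(VersionedBytes response, int readerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(response.bytes))) {
            in.readUTF();
            if (readerVersion >= GATED_VERSION) {
                in.readBoolean();
            }
            return in.available() == 0;
        } catch (IOException e) {
            return false;
        }
    }
}
```

The matching-version path returns the same instance, so the common case keeps the benefit of not materializing the response; only the mixed-version case pays for a read and a rewrite.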
There may be better ways to fix this problem; let's use this draft to align on the problem and agree on possible solutions. I am happy to iterate on it, as this currently blocks merging #135549 due to the serialization change it introduces.
Note that search batched execution is still under a feature flag, and disabled by default, so this bug does not currently affect our users.