-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Fix SearchErrorTraceIT and friends to work with batched query execution #132227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix SearchErrorTraceIT and friends to work with batched query execution #132227
Conversation
Making this work with batched execution and fixing a memory leak: * Fix memory leak by removing listener on first message. There really only is a single message here per node anyway with batched execution in the mix. Either it's a single shard on the data node and we get a single query message or it's multiple shards and we get a single batched message, so fine to remove listener after the first message since all tests do a single request only anyway. * Add a new hook that allows inspection of the actual response. This is needed for batched since batched sends a non-error response even if the data node failed all searches. We had this before in the `onResponseSent` hook but checking the instance after it's been sent over the wire causes needless overhead in the production code so moving to a "before-style" hook here.
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
* @param action the request action | ||
* @param response response instance | ||
*/ | ||
default void onBeforeResponseSent(long requestId, String action, TransportResponse response) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I follow the issue in the first place, as well as the proposed fix. The proposal is to add a new default method to TransportMessageListener, to then only ever use it in a test? Could you help me better understand what other options we have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Async/SearchErrorTraceIT forces an error on all shards during a search. Before batched execution, this would trigger onResponseSent(long requestId, String action, Exception error)
- called for every failed action response. We could then implement onResponseSent and inspect the exception to assert things (like whether the stack trace was present).
Batched execution sends a non-error response even if all searches fail. We still want to inspect this response to assert the same things that were once captured in the exception. But TransportMessageListener has no hook to inspect non-error payloads, so this PR introduces one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey Ben, sorry for the delay. I understand the problem better. I think this is a bit of an intrusive fix though. Isn't there another way to intercept responses in this specific test as opposed to using transport message listener? You may want to reach out to the distrib team and ask for advice on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna I've reworked the tests using a nice suggestion from @DaveCTurner - thanks for pushing for a better solution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! good work!
CI is borked in #127150 so creating a duplicate PR here.