-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ES|QL: do not use skip_unavailable in CPS #135419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ES|QL: do not use skip_unavailable in CPS #135419
Conversation
…lable' into esql/cps_no_skip_unavailable
| EsqlExecutionInfo executionInfo, | ||
| ActionListener<LookupResponse> lookupListener | ||
| ) { | ||
| if (ExceptionsHelper.isRemoteUnavailableException(e) && executionInfo.shouldSkipOnFailure(cluster)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think org.elasticsearch.xpack.esql.action.EsqlExecutionInfo#skipOnFailurePredicate is initialized with essentially the same:
Line 379 in cbd9f05
| return new EsqlExecutionInfo(clusterAlias -> remoteClusterService.isSkipUnavailable(clusterAlias).orElse(true), includeCcsMetadata); |
But that is definitely confusing that we use different things in different palces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, now that predicate is the place where we centralize this behavior.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
| listener | ||
| ); | ||
| protected void getRemoteConnection(String cluster, EsqlExecutionInfo executionInfo, ActionListener<Transport.Connection> listener) { | ||
| remoteClusterService.maybeEnsureConnectedAndGetConnection(cluster, executionInfo.shouldSkipOnFailure(cluster) == false, listener); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to ensure - the desired behavior on CPS is that we always want to try to reconnect immediately if the connection is bad, when allow_partial = true, and don't try to reconnect when allow_partial = false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit of a stretch, as allow_partial is not really about connection retries, but on the other hand the previous behavior relied on skip_unavailable, that is a bit closer to the concept of connection but not really about reconnection either.
In conclusion, IMHO this is fine in terms of consistency (also compared to stateful).
| remoteClusterService.isSkipUnavailable(cluster).orElse(true) == false, | ||
| listener | ||
| ); | ||
| protected void getRemoteConnection(String cluster, EsqlExecutionInfo executionInfo, ActionListener<Transport.Connection> listener) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not saying this is wrong, but here you don't actually need EsqlExecutionInfo, only one flag that describes executionInfo.shouldSkipOnFailure(cluster) == false which is ensureConnected value. So maybe this should be boolean ensureConnected too?
| String cluster, | ||
| EsqlExecutionInfo executionInfo, | ||
| ActionListener<LookupResponse> lookupListener | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here again (as below) we don't really need neither cluster nor EsqlExecutionInfo but only boolean executionInfo.shouldSkipOnFailure(cluster). Maybe we could simplify it by calculating this boolean above and pass it to the methods that need it? Just a suggestion.
In CPS we don't have
skip_unavailable, we'll useallow_partial_resultsinstead