-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Handle failures that occur during field-caps #130840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a74566a
to
8fede66
Compare
Pinging @elastic/es-analytical-engine (Team:Analytics) |
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlExecutionInfo.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlCCSUtils.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlCCSUtils.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlCCSUtils.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
Show resolved
Hide resolved
} | ||
} | ||
|
||
public void testResolutionFailures() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check out also #129957 - I've made some tests for things that weren't handled properly.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/index/IndexResolution.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
Outdated
Show resolved
Hide resolved
...sterTest/java/org/elasticsearch/xpack/esql/action/CrossClusterQueryWithPartialResultsIT.java
Show resolved
Hide resolved
@smalyshev @idegtiarenko Thanks for review. I think it's ready again. |
...sterTest/java/org/elasticsearch/xpack/esql/action/CrossClusterQueryWithPartialResultsIT.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeService.java
Show resolved
Hide resolved
) { | ||
Set<String> clustersWithResolvedIndices = new HashSet<>(); | ||
// determine missing clusters | ||
final Set<String> clustersWithNoMatchingIndices = new HashSet<>(executionInfo.clusterAliases()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should use something like executionInfo.getClusterStates(RUNNING)
here. That should automatically exclude unavailable clusters if we call updateExecutionInfoWithUnavailableClusters
before this. I think this would also let us eliminate unavailableClusters
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 77aacf2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to revert this refactoring because the tests in EsqlCCSUtilsTests assume these conditions together. Unfortunately, I don't have the bandwidth to work on this now. Could you work on this after this PR? Thanks!
// when queries use a remote cluster wildcard, e.g., `*:my-logs*`. | ||
Exception nonIndexNotFound = failures.stream() | ||
.map(FieldCapabilitiesFailure::getException) | ||
.filter(ex -> ExceptionsHelper.unwrap(ex, IndexNotFoundException.class) == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question here: are all situations where we can't use index for some reason (security, closed, hidden, whatever it is) would be covered by IndexNotFoundException
? If not, we still could get unhelpful failures here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should report all errors, including IndexNotFoundException. This special handling exists because index_options are not available in ES|QL; otherwise, we shouldn't have it.
EsqlCCSUtils.updateExecutionInfoWithClustersWithNoMatchingIndices(executionInfo, result.indices, null); | ||
var unavailableClusters = EsqlCCSUtils.determineUnavailableRemoteClusters(result.indices.failures()); | ||
EsqlCCSUtils.updateExecutionInfoWithUnavailableClusters(executionInfo, unavailableClusters); | ||
EsqlCCSUtils.updateExecutionInfoWithClustersWithNoMatchingIndices( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it may be better to move updateExecutionInfoWithUnavailableClusters
into updateExecutionInfoWithUnavailableClusters
and have updateExecutionInfoWithClustersWithNoMatchingIndices
use cluster statuses (since unavailables would get marked as skipped). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a good idea, but I don’t have bandwidth for it right now and would prefer to focus on getting this in for 8.19/9.1 as soon as possible. Would you be able to work on it later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I could make a followup on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thank you!
EsqlCCSUtils.updateExecutionInfoWithClustersWithNoMatchingIndices(executionInfo, result.indices, null); | ||
var unavailableClusters = EsqlCCSUtils.determineUnavailableRemoteClusters(result.indices.failures()); | ||
EsqlCCSUtils.updateExecutionInfoWithUnavailableClusters(executionInfo, unavailableClusters); | ||
EsqlCCSUtils.updateExecutionInfoWithClustersWithNoMatchingIndices( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I could make a followup on this.
@smalyshev @idegtiarenko Thank you for reviews! |
💔 Backport failed
You can use sqren/backport to manually backport by running |
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19.
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19. (cherry picked from commit a699655)
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19. (cherry picked from commit a699655)
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19. (cherry picked from commit a699655)
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19. (cherry picked from commit a699655)
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19.
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example: FROM ok*,unavailable_index* should return a partial result, as unavailable_index* is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases with allow_partial_results=true, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19.
Currently, errors from the field-caps phase are not always handled properly, leading to cases where the final response is not marked as partial correctly. For example:
FROM ok*,unavailable_index*
should return a partial result, asunavailable_index*
is skipped after the resolution phase. This change tracks failures that occur during field-caps and reports them in the final response. Since this only affects cases withallow_partial_results=true
, I am labeling this as a non-issue and will backport the change to 9.1 and 8.19.