-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Change FailedToCommitClusterStateException to NotMasterException #135008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change FailedToCommitClusterStateException to NotMasterException #135008
Conversation
Modifies the `Batch.onRejection` method to accept a `NotMasterException` rather than a `FailedToCommitClusterStateException`. The `NotMasterException` is thrown because the master node has closed. However, at this point we haven't even tried to commit the cluster state so we know for a fact that it's failed, so a `FailedToCommitClusterStateException`, which implies ambiguity, is wrong here.
I have assumed that these exceptions are eventually rolled into a 4XX error and therefore not exposed to the user, so it shouldn't matter that I'm changing them and no changelog update is required. Are there any disagreements to this? |
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look like they are in line with the discussions you and David have had in slack. The exact exception used here seems like an internal implementation detail, so >non-issue
seems ok, or even >refactoring
. It might be good to have @DiannaHohensee or @henningandersen weigh in, or wait for David to comment when he is back if you are not blocked by this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple questions
// single-threaded: started when totalQueueSize transitions from 0 to 1 and keeps calling itself until the queue is drained. | ||
if (lifecycle.started() == false) { | ||
drainQueueOnRejection(new FailedToCommitClusterStateException("node closed", getRejectionException())); | ||
drainQueueOnRejection(new NotMasterException("node closed", getRejectionException())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forkQueueProcessor
can be called on success or failure of a Runnable. Is that a problem in terms of knowing whether or not the cluster update was successful, or is it not a problem for some reason? I haven't worked with the MasterService before, so I might be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, your question is: "During a cluster state update, where is this code running, since if this exception is thrown after the cluster state update is pushed to the wire, then according to the FailedToCommitClusterStateException
exception definition here, this is the correct exception to be throwing".
I'm still trying to understand the code myself so excuse me if I make a mistake, but as I understand it, this code isn't even running during a cluster state update. forkQueueProcessor
is called twice in the code, here in the Runnable and here in a PerPriorityQueue
.
PerPriorityQueue
In this case we're draining the queue because the threadpool shut down. The workflow is:
- We batch tasks on the master node to be executed together. This is handled by a
BatchingTaskQueue
, here - When submitting a task to this queue, we invoke a
PerPriorityQueue
, here - The
execute()
function called invokesforkQueueProcessor
here - A
FailedToCommitClusterStateException
is passed intodrainQueueOnRejection
here drainQueueOnRejection
here uses the exception to reject a batch of tasks.- In this case, we are not attempting to publish a cluster state update, and so a
FailedToCommitClusterStateException
is wrong. The cluster state update publication workflow is here inside theCoordinator
. The reason aFailedToCommitClusterStateException
is thrown is explained by a comment inside theBatch
interface, here, but I shall copy below:
/**
* Called when the batch is rejected due to the master service shutting down.
*
* @param e is a {@link FailedToCommitClusterStateException} to cause things like {@link TransportMasterNodeAction} to retry after
* submitting a task to a master which shut down. {@code e.getCause()} is the rejection exception, which should be a
* {@link EsRejectedExecutionException} with {@link EsRejectedExecutionException#isExecutorShutdown()} true.
*/
// Should really be a NodeClosedException instead, but this exception type doesn't trigger retries today.
**/
^^ As shown, the FailedToCommitClusterStateException
was always acknowledged to be wrong from the
beginning.
Runnable
The runnable is invoked in only one place, inside forkQueueProcessor
, here. The success or failure you mention above is not referring to cluster state update success or failure, but rather success or failure executing a batch of tasks from the queue.
Therefore, in this case, I believe the exception to be incorrect, and used as a quick way to guarantee retries. As explained in another comment, both FailedToCommitClusterStateException
and NotMasterException
retry inside TransportMasterNodeAction
, so replacing one with the other should have no adverse effect, while semantically improving our exception handling
* submitting a task to a master which shut down. {@code e.getCause()} is the rejection exception, which should be a | ||
* {@link EsRejectedExecutionException} with {@link EsRejectedExecutionException#isExecutorShutdown()} true. | ||
*/ | ||
// Should really be a NodeClosedException instead, but this exception type doesn't trigger retries today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you look into this, whether NotMasterException will trigger some sort of retry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A NotMasterException
is thrown when a non-master node is attempting to perform an action reserved for a master node. In this instance, it could occur when a master node loses an election midway through a cluster state update and is then not the master anymore. The action is therefore not to be retried on the same node (since it would hit the same exception), but is expected to be retried on the next master.
In the code, this can be seen here inside TransportMasterNodeAction
:
// TransportMasterNodeAction
ActionListener<Response> delegate = listener.delegateResponse((delegatedListener, t) -> {
if (MasterService.isPublishFailureException(t)) {
logger.debug(
() -> format(
"master could not publish cluster state or "
+ "stepped down before publishing action [%s], scheduling a retry",
actionName
),
t
);
retryOnNextState(currentStateVersion, t);
} else {
logger.debug("unexpected exception during publication", t);
delegatedListener.onFailure(t);
}
});
// MasterService
public static boolean isPublishFailureException(Exception e) {
return e instanceof NotMasterException || e instanceof FailedToCommitClusterStateException;
}
where if an NotMasterException
is thrown, we retry on the next cluster state update, which should be sent from the new master. Since the exception currently thrown is a FailedToCommitClusterStateException
which is retried in the same way, this change should preserve functionality.
... and reformat Javadoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM2 apart from one nit
* submitting a task to a master which shut down. {@code e.getCause()} is the rejection exception, which should be a | ||
* {@link EsRejectedExecutionException} with {@link EsRejectedExecutionException#isExecutorShutdown()} true. | ||
*/ | ||
// Should really be a NodeClosedException instead, but this exception type doesn't trigger retries today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have kept this comment - I opened #135902 to reinstate it.
This is the first of a series of PRs fixing how the
FailedToCommitClusterStateException
is used in ElasticSearch. As per #135017,FailedToCommitClusterStateException
is defined as:Currently,
FailedToCommitClusterStateException
is used as a 'catch-all' exception thrown at multiple places throughout theCoordinator
andMasterService
during the publication process. Semantically however, it doesn't make sense to throw this exception before the cluster state update is actually sent over the wire, since at this point, we know for certain that the cluster state update failed.FailedToCommitClusterStateException
is intended to display ambiguity.This work is a pre-requisite to #134213.
Changes
This PR modifies the
Batch.onRejection
method to accept aNotMasterException
rather than aFailedToCommitClusterStateException
. We should throw aNotMasterException
here because:FailedToCommitClusterStateException
is retryable withinTransportMasterNodeAction
, and it was a quick way to guarantee retriesNotMasterException
is semantically clearer, while also being retryable to preserve functionalityTesting
MasterServiceTests
100 timesNext Steps
The goal of this work is to fix up all erroneously used
FailedToCommitClusterStateException
.Done:
Todo:
FailedToCommitClusterStateException
toNotMasterException
during the pre-publication process: Changes FailedToCommitClusterStateException to NotMasterException #135548FailedToCommitClusterStateException
s withFailedToPublishClusterStateException
, a new exception designed to capture any failures before the cluster state update is sent over the wire. This exception will capture guaranteed cluster state update failure, withFailedToCommitClusterStateException
used once the update has been sent over the wire, representing ambiguityFailedToCommitClusterStateException
that need to be updatedRelates to ES-13061