@kjnilsson kjnilsson commented Nov 1, 2024

A slow-running follower with local consumers can crash after a snapshot installation when it tries to read an entry from its log that is no longer there (the entry has been consumed and completed by another node, but it is still referred to by prior consumers on the current node).

This commit makes the log effect callback function more defensive: it checks that the number of commands returned by the log effect matches the number that was requested. If the counts differ, we consider this a stale read request and return no further effects.
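
For readers unfamiliar with the pattern, here is a minimal, language-agnostic sketch of the defensive check described above, written in Python for illustration. It is not the code in this PR: `make_log_callback`, `on_log_read`, and the `("deliver", ...)` effect shape are hypothetical names chosen for the example, and the real change lives in the quorum queue state machine.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical effect shape, used only for this illustration.
Effect = Tuple[str, str, list]


def make_log_callback(
    requested_indexes: Sequence[int], consumer: str
) -> Callable[[List[object]], List[Effect]]:
    """Build a callback that is invoked with the commands read from the log."""

    def on_log_read(commands: List[object]) -> List[Effect]:
        # Defensive check: if the log returned a different number of commands
        # than was requested (for example because a snapshot truncated the
        # log), treat the read as stale and emit no further effects.
        if len(commands) != len(requested_indexes):
            return []
        # Otherwise pair each requested index with its command and deliver.
        return [("deliver", consumer, list(zip(requested_indexes, commands)))]

    return on_log_read


callback = make_log_callback([10, 11, 12], "consumer-1")
fresh = callback(["a", "b", "c"])  # all three entries were still in the log
stale = callback(["a"])            # two entries were gone: stale read
```

The point of returning an empty effect list rather than raising is that a stale read is an expected outcome on a lagging member, so it should be ignored rather than crash the member.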

Fixes #12635

@kjnilsson kjnilsson added this to the 4.0.4 milestone Nov 1, 2024
@michaelklishin michaelklishin merged commit ef2c8df into main Nov 1, 2024
273 checks passed
@michaelklishin michaelklishin deleted the gh-12635 branch November 1, 2024 17:23
michaelklishin added a commit that referenced this pull request Nov 1, 2024
QQ: handle case where a stale read request results in member crash. (backport #12636)

Development

Successfully merging this pull request may close these issues.

QQ: crash after snapshot installation
