beacon/light: keep retrying checkpoint init if failed#33966
Open
zsfelfoldi wants to merge 2 commits into ethereum:master
This PR changes the blsync checkpoint init logic so that when initialization fails with a given server and an error message is logged, the server returns to its initial state and may retry initialization after the failure delay period. The previous logic had an `ssDone` server state that left the server permanently unusable once checkpoint init failed for an apparently permanent reason. This was not the correct behavior, because servers behave differently under overload, and the response to a permanently missing item is sometimes not clearly distinguishable from an overload response. It is safer never to assume that a failure is permanent and always to allow a retry.

The failure delay formula is also fixed: the delay is now properly capped at `maxFailureDelay`. The previous formula allowed the delay to grow without bound if a retry was attempted immediately after each delay period.