DOC-975 crash_loop_sleep_sec broker config #966
pgellert
left a comment
Thanks, the content looks good, I just left some minor suggestions.
To prevent infinite crash loops, the Redpanda Helm chart sets the `crash_loop_limit` node property to 5. The crash loop limit is the number of consecutive crashes that can happen within one hour of each other. After Redpanda reaches this limit, it will not start until its internal consecutive crash counter is reset to zero. In Kubernetes, the Pod running Redpanda remains in a `CrashLoopBackoff` state until its internal consecutive crash counter is reset to zero.
To prevent infinite crash loops, the Redpanda Helm chart sets the xref:reference:properties/broker-properties.adoc#crash_loop_limit[`crash_loop_limit`] broker configuration property to `5`. The crash loop limit is the number of consecutive crashes that can happen within one hour of each other. By default, the broker terminates immediately after hitting the `crash_loop_limit`. The Pod running Redpanda remains in a `CrashLoopBackoff` state until its internal consecutive crash counter is reset to zero.
To facilitate debugging in environments where a broker is stuck in a crash loop, you can also set the xref:reference:properties/broker-properties.adoc#crash_loop_sleep_sec[`crash_loop_sleep_sec`] broker configuration property. This setting determines how long the broker sleeps before terminating the process after reaching the crash loop limit. This gives you a window during which the Pod remains in a paused state, so you can SSH into the Pod and troubleshoot the issue.
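To make the interaction between the two properties concrete, here is a minimal Python sketch of the behavior the paragraphs above describe. It is a toy model, not Redpanda code: the `CrashTracker` class, its method names, and the exact boundary conditions (refusing to start once the count equals the limit, starting a new streak after more than one hour between crashes) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# "Within one hour of each other", per the text above.
RESET_WINDOW_SEC = 3600


@dataclass
class CrashTracker:
    """Hypothetical model of the consecutive crash counter (not Redpanda code)."""

    crash_loop_limit: int = 5       # value the Helm chart sets, per the text
    crash_loop_sleep_sec: int = 0   # 0 models "terminate immediately"
    consecutive_crashes: int = 0
    last_crash_at: Optional[float] = None

    def record_crash(self, now: float) -> None:
        # Assumption: a crash more than an hour after the previous one
        # starts a new streak instead of extending the old one.
        if (
            self.last_crash_at is not None
            and now - self.last_crash_at > RESET_WINDOW_SEC
        ):
            self.consecutive_crashes = 0
        self.consecutive_crashes += 1
        self.last_crash_at = now

    def on_startup(self) -> Tuple[bool, int]:
        """Return (may_start, seconds_to_sleep_before_terminating)."""
        if self.consecutive_crashes < self.crash_loop_limit:
            return True, 0
        # Limit reached: refuse to start, sleeping first if configured.
        return False, self.crash_loop_sleep_sec

    def reset(self) -> None:
        # Models the counter being reset to zero.
        self.consecutive_crashes = 0
        self.last_crash_at = None
```

In this model, a tracker with `crash_loop_sleep_sec=60` that has recorded five crashes in quick succession returns `(False, 60)` from `on_startup()`: the broker does not start, but it stays alive for 60 seconds before terminating, which is the debugging window the paragraph describes.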
@chrisseto fyi when you implement K8S-491 to set crash_loop_sleep_sec to be on by default in the Redpanda Helm chart, then this paragraph will probably need updating.
modules/manage/pages/cluster-maintenance/configure-availability.adoc
Co-authored-by: Gellért Peresztegi-Nagy <[email protected]>
Thanks @pgellert
pgellert
left a comment
lgtm
Feediver1
left a comment
lgtm
Co-authored-by: Joyce Fee <[email protected]>
Description
Review deadline: 29 Jan
Page previews
Checks