Commit e6599b2

ricardoamaro, shannonxtreme, Tim Bannister, colossus06 authored
Adding documentation explaining what is a CrashLoopBackOff (#45928)
* Documentation on CrashLoopBackOff
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Shannon Kularathna <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Shannon Kularathna <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Shannon Kularathna <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Tim Bannister <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Shannon Kularathna <[email protected]>
* Address some feedback
* exponential backoff delay
* Address some feedback
* Start by explaing handle
* break lines
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Gulcan Topcu <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Gulcan Topcu <[email protected]>
* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md
  Co-authored-by: Tim Bannister <[email protected]>
* address feedback

---------

Co-authored-by: Shannon Kularathna <[email protected]>
Co-authored-by: Tim Bannister <[email protected]>
Co-authored-by: Gulcan Topcu <[email protected]>
1 parent 65ffa36 commit e6599b2

1 file changed, +61 -4 lines changed
content/en/docs/concepts/workloads/pods/pod-lifecycle.md

Lines changed: 61 additions & 4 deletions
@@ -145,6 +145,58 @@ finish time for that container's period of execution.
 If a container has a `preStop` hook configured, this hook runs before the container enters
 the `Terminated` state.
 
+## How Pods handle problems with containers {#container-restarts}
+
+Kubernetes manages container failures within Pods using a [`restartPolicy`](#restart-policy) defined in the Pod `spec`. This policy determines how Kubernetes reacts to containers exiting due to errors or other reasons, which falls in the following sequence:
+
+1. **Initial crash**: Kubernetes attempts an immediate restart based on the Pod `restartPolicy`.
+1. **Repeated crashes**: After the initial crash, Kubernetes applies an exponential
+backoff delay for subsequent restarts, described in [`restartPolicy`](#restart-policy).
+This prevents rapid, repeated restart attempts from overloading the system.
+1. **CrashLoopBackOff state**: This indicates that the backoff delay mechanism is currently
+in effect for a given container that is in a crash loop, failing and restarting repeatedly.
+1. **Backoff reset**: If a container runs successfully for a certain duration
+(e.g., 10 minutes), Kubernetes resets the backoff delay, treating any new crash
+as the first one.
+
+In practice, a `CrashLoopBackOff` is a condition or event that might be seen as output
+from the `kubectl` command, while describing or listing Pods, when a container in the Pod
+fails to start properly and then continually tries and fails in a loop.
+
+In other words, when a container enters the crash loop, Kubernetes applies the
+exponential backoff delay mentioned in the [Container restart policy](#restart-policy).
+This mechanism prevents a faulty container from overwhelming the system with continuous
+failed start attempts.
+
+The `CrashLoopBackOff` can be caused by issues like the following:
+
+* Application errors that cause the container to exit.
+* Configuration errors, such as incorrect environment variables or missing
+configuration files.
+* Resource constraints, where the container might not have enough memory or CPU
+to start properly.
+* Health checks failing if the application doesn't start serving within the
+expected time.
+* Container liveness probes or startup probes returning a `Failure` result
+as mentioned in the [probes section](#container-probes).
+
+To investigate the root cause of a `CrashLoopBackOff` issue, a user can:
+
+1. **Check logs**: Use `kubectl logs <name-of-pod>` to check the logs of the container.
+This is often the most direct way to diagnose the issue causing the crashes.
+1. **Inspect events**: Use `kubectl describe pod <name-of-pod>` to see events
+for the Pod, which can provide hints about configuration or resource issues.
+1. **Review configuration**: Ensure that the Pod configuration, including
+environment variables and mounted volumes, is correct and that all required
+external resources are available.
+1. **Check resource limits**: Make sure that the container has enough CPU
+and memory allocated. Sometimes, increasing the resources in the Pod definition
+can resolve the issue.
+1. **Debug application**: There might exist bugs or misconfigurations in the
+application code. Running this container image locally or in a development
+environment can help diagnose application specific issues.
+
+
 ## Container restart policy {#restart-policy}
 
 The `spec` of a Pod has a `restartPolicy` field with possible values Always, OnFailure,
@@ -156,17 +208,22 @@ in the Pod and to regular [init containers](/docs/concepts/workloads/pods/init-c
 ignore the Pod-level `restartPolicy` field: in Kubernetes, a sidecar is defined as an
 entry inside `initContainers` that has its container-level `restartPolicy` set to `Always`.
 For init containers that exit with an error, the kubelet restarts the init container if
-the Pod level `restartPolicy` is either `OnFailure` or `Always`.
+the Pod level `restartPolicy` is either `OnFailure` or `Always`:
+
+* `Always`: Automatically restarts the container after any termination.
+* `OnFailure`: Only restarts the container if it exits with an error (non-zero exit status).
+* `Never`: Does not automatically restart the terminated container.
 
 When the kubelet is handling container restarts according to the configured restart
 policy, that only applies to restarts that make replacement containers inside the
 same Pod and running on the same node. After containers in a Pod exit, the kubelet
-restarts them with an exponential back-off delay (10s, 20s, 40s, …), that is capped at
-five minutes. Once a container has executed for 10 minutes without any problems, the
-kubelet resets the restart backoff timer for that container.
+restarts them with an exponential backoff delay (10s, 20s, 40s, …), that is capped at
+300 seconds (5 minutes). Once a container has executed for 10 minutes without any
+problems, the kubelet resets the restart backoff timer for that container.
 [Sidecar containers and Pod lifecycle](/docs/concepts/workloads/pods/sidecar-containers/#sidecar-containers-and-pod-lifecycle)
 explains the behaviour of init containers when you specify the `restartPolicy` field on them.
 
+
 ## Pod conditions
 
 A Pod has a PodStatus, which has an array of

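The restart behaviour this change documents (the three `restartPolicy` values, a delay of 10s, 20s, 40s and so on capped at 300 seconds, and a reset after 10 minutes of clean running) can be sketched in a few lines of Python. This is only an illustrative model built from the numbers quoted in the diff, not the kubelet's actual implementation; the function and constant names (`should_restart`, `backoff_delay`, `next_backoff_count`, `INITIAL_DELAY_S`, `MAX_DELAY_S`, `RESET_AFTER_S`) are hypothetical and chosen for this example.

```python
# Illustrative model only: names are hypothetical, values come from the diff text.
INITIAL_DELAY_S = 10   # first backoff delay quoted in the docs (10s)
MAX_DELAY_S = 300      # cap quoted in the docs (300 seconds, 5 minutes)
RESET_AFTER_S = 600    # 10 minutes of problem-free running resets the backoff


def should_restart(restart_policy: str, exit_code: int) -> bool:
    """Decide whether a terminated container is restarted under a given policy."""
    if restart_policy == "Always":
        return True                # restart after any termination
    if restart_policy == "OnFailure":
        return exit_code != 0      # restart only on a non-zero exit status
    if restart_policy == "Never":
        return False               # never restart automatically
    raise ValueError(f"unknown restartPolicy: {restart_policy!r}")


def backoff_delay(backoff_count: int) -> int:
    """Delay in seconds for the given number of earlier backed-off restarts."""
    if backoff_count < 0:
        raise ValueError("backoff_count must be non-negative")
    # Double the delay for each earlier backed-off restart, up to the cap.
    return min(INITIAL_DELAY_S * 2 ** backoff_count, MAX_DELAY_S)


def next_backoff_count(backoff_count: int, ran_for_s: float) -> int:
    """Reset the counter after a long clean run, otherwise advance it."""
    if ran_for_s >= RESET_AFTER_S:
        return 0
    return backoff_count + 1


if __name__ == "__main__":
    # Print the delay schedule for the first few backed-off restarts.
    print([backoff_delay(n) for n in range(7)])
```

In this model a container that keeps crashing quickly reaches the 300 second cap after a handful of restarts, which is the point at which `kubectl` would typically keep reporting `CrashLoopBackOff` until the underlying problem is fixed.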