keps/sig-node/4603-tune-crashloopbackoff/README.md
Lines changed: 39 additions & 14 deletions
@@ -434,7 +434,7 @@ What are some important details that didn't come across above?
434
434
Go in to as much detail as necessary here.
435
435
This might be a good place to talk about core concepts and how they relate.
436
436
-->
437
-
#### On Success
437
+
#### On Success and the 10 minute recovery threshold
438
438
439
439
The original version of this proposal included a change specific to Pods
440
440
transitioning through the "Succeeded" phase to have flat rate restarts. On
@@ -530,7 +530,7 @@ proposal will be implemented, this is the place to discuss them.
530
530
-->
531
531
532
532
### Front loaded decay curve methodology
533
-
As mentioned above, today the standard backoff curve is an exponential decay
533
+
As mentioned above, today the standard backoff curve is a 2x exponential decay
534
534
starting at 10s and capping at 5 minutes, resulting in a composite of the
535
535
standard hockey-stick exponential decay graph followed by a linear rise until
536
536
the heat death of the universe as depicted below:
@@ -601,12 +601,15 @@ to allow legitimately crashing workloads to have a backoff of 0, but it is in
601
601
scope for the first alpha to provide users a way to opt workloads in to an even
602
602
faster restart behavior.
603
603
604
-
The finalization of the initial and max cap after benchmarking. As a
605
-
conservative first estimate in line with maximums discussed on
606
-
[Kubernetes#57291](https://github.com/kubernetes/kubernetes/issues/57291), the
607
-
initial curve is selected at initial=250ms / cap=1 minute, but during
604
+
The finalization of the initial and max cap can only be done after benchmarking.
605
+
As a conservative first estimate for alpha, in line with maximums discussed
606
+
on [Kubernetes#57291](https://github.com/kubernetes/kubernetes/issues/57291),
607
+
the initial curve is selected at initial=250ms / cap=1 minute; during
608
608
benchmarking this will be modelled against kubelet capacity, potentially
609
-
targeting something closer to an initial value near 0s, and a cap of 10-30s.
609
+
targeting something closer to an initial value near 0s, and a cap of 10-30s. To
610
+
further restrict the blast radius of this change before full and complete
611
+
benchmarking is worked up, this is gated by a separate alpha feature gate and is
612
+
opted in to per Pod using a new `restartPolicy: Rapid` value, described below.
610
613
611
614
612
615
#### New OneOf for `restartPolicy` -- `Rapid`
@@ -619,8 +622,8 @@ continuously alongside the regular containers in the Pod.
619
622
620
623
This KEP will support a new value for this field, `Rapid`, which on feature flag
621
624
disablement will be interpreted as `Always`. If `restartPolicy: Rapid` is set or
622
-
inherited for a container, that container will follow the new Rapid backoff
623
-
curve.
625
+
inherited for a container and the feature flag for this feature is turned on for
626
+
a given cluster, that container will follow the new Rapid backoff curve.
624
627
625
628
Due to configuring this as another option to this field, this would make Rapid
626
629
backoff possible for restartable init (aka sidecar) containers, Pods,
@@ -670,7 +673,13 @@ Running state would also be useful.
670
673
671
674
### Relationship with Job API podFailurePolicy and backoffLimit
672
675
673
-
Job API provides its own API surface for describing alterntive restart behaviors, from [KEP-3329: Retriable and non-retriable Pod failures for Jobs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures), in beta as of Kubernetes 1.30. The following example from that KEP shows the new configuration options: `backoffLimit`, which controls for number of retries on failure, and `podFailurePolicy`, which controls for types of workload exit codes or kube system events to ignore against that `backoffLimit`.
676
+
Job API provides its own API surface for describing alternative restart
677
+
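behaviors, from [KEP-3329: Retriable and non-retriable Pod failures for
678
+
Jobs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures),
679
+
in beta as of Kubernetes 1.30. The following example from that KEP shows the new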
680
+
configuration options: `backoffLimit`, which controls the number of retries on
681
+
failure, and `podFailurePolicy`, which controls which types of workload exit codes
682
+
or kube system events to ignore against that `backoffLimit`.
674
683
675
684
```yaml
676
685
apiVersion: v1
@@ -690,13 +699,29 @@ spec:
690
699
- type: DisruptionTarget
691
700
```
692
701
693
-
The implementation of KEP-3329 is entirely in the Job controller, and the restarts are not handled by kubelet at all; in fact, use of this API is only available if the `restartPolicy` is set to `Never`. As a result, to expose the new backoff curve Jobs using this feature, the updated backoff curve must also be implemented in the Job controller.
702
+
The implementation of KEP-3329 is entirely in the Job controller, and the
703
+
restarts are not handled by kubelet at all; in fact, use of this API is only
704
+
available if the `restartPolicy` is set to `Never`. As a result, to expose the
705
+
new backoff curve to Jobs using this feature, the updated backoff curve must also
706
+
be implemented in the Job controller.
694
707
695
708
### Relationship with ImagePullBackOff
696
709
697
-
ImagePullBackoff is used, as the name suggests, only when a container needs to pull a new image. If the iamge pull fails, a backoff decay is used to make later retries on the image download wait longer and longer. This is configured internally independently ([here](https://github.com/kubernetes/kubernetes/blob/release-1.30/pkg/kubelet/kubelet.go#L606)) from the backoff for container restarts ([here](https://github.com/kubernetes/kubernetes/blob/release-1.30/pkg/kubelet/kubelet.go#L855)).
698
-
699
-
This KEP considers changes to ImagePullBackoff as out of scope, so during implementation this will keep the same backoff. This is both to reduce the number of variables during the benchmarking period for the restart counter, and because the problem space of ImagePullBackoff could likely be handled by a compeltely different pattern, as unlike with CrashLoopBackoff the types of errors with ImagePullBackoff are less variable and better interpretable by the infrastructure as recovereable or non-recoverable (i.e. 404s).
710
+
ImagePullBackoff is used, as the name suggests, only when a container needs to
711
+
pull a new image. If the image pull fails, a backoff decay is used to make later
712
+
retries on the image download wait longer and longer. This is configured