keps/sig-node/4603-tune-crashloopbackoff/README.md
Lines changed: 75 additions & 51 deletions
@@ -6,23 +6,23 @@ To get started with this template:
6
6
- [x] **Pick a hosting SIG.**
7
7
Make sure that the problem space is something the SIG is interested in taking
8
8
up. KEPs should not be checked in without a sponsoring SIG.
9
-
- [] **Create an issue in kubernetes/enhancements**
9
+
- [x] **Create an issue in kubernetes/enhancements**
10
10
When filing an enhancement tracking issue, please make sure to complete all
11
11
fields in that template. One of the fields asks for a link to the KEP. You
12
12
can leave that blank until this KEP is filed, and then go back to the
13
13
enhancement and add the link.
14
-
- [] **Make a copy of this template directory.**
14
+
- [x] **Make a copy of this template directory.**
15
15
Copy this template into the owning SIG's directory and name it
16
16
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
17
17
leading-zero padding) assigned to your enhancement above.
18
-
- [] **Fill out as much of the kep.yaml file as you can.**
18
+
- [x] **Fill out as much of the kep.yaml file as you can.**
19
19
At minimum, you should fill in the "Title", "Authors", "Owning-sig",
20
20
"Status", and date-related fields.
21
-
- [] **Fill out this file as best you can.**
21
+
- [x] **Fill out this file as best you can.**
22
22
At minimum, you should fill in the "Summary" and "Motivation" sections.
23
23
These should be easy if you've preflighted the idea of the KEP with the
24
24
appropriate SIG(s).
25
-
- [] **Create a PR for this KEP.**
25
+
- [x] **Create a PR for this KEP.**
26
26
Assign it to people in the SIG who are sponsoring this process.
27
27
- [ ] **Merge early and iterate.**
28
28
Avoid getting hung up on specific details and instead aim to get the goals of
@@ -151,14 +151,14 @@ checklist items _must_ be updated for the enhancement to be released.
151
151
152
152
Items marked with (R) are required *prior to targeting to a milestone / release*.
153
153
154
-
-[] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
154
+
-[x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
155
155
-[ ] (R) KEP approvers have approved the KEP status as `implementable`
156
156
-[ ] (R) Design details are appropriately documented
157
157
-[ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
158
158
-[ ] e2e Tests for all Beta API Operations (endpoints)
159
159
-[ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
160
160
-[ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
161
-
-[] (R) Graduation criteria is in place
161
+
-[x] (R) Graduation criteria is in place
162
162
-[ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
163
163
-[ ] (R) Production readiness review completed
164
164
-[ ] (R) Production readiness review approved
@@ -348,55 +348,22 @@ these changes.
348
348
349
349
#### Existing backoff curve change: front loaded decay
350
350
351
-
As mentioned above, today the standard backoff curve is an exponential decay
352
-
starting at 10s and capping at 5 minutes, resulting in a composite of the
353
-
standard hockey-stick exponential decay graph followed by a linear rise until
354
-
the heat death of the universe as depicted below:
355
-
356
-

503
+
504
+
For today's decay rate, the first restart is within the
517
505
first 10s, the second within the first 30s, the third within the first 70s.
518
506
Using those same time windows to compare alternate initial values, for example
519
507
changing the initial rate to 1s, we would instead have 3 restarts in the first
@@ -523,19 +511,55 @@ earlier, but even at 250ms or 25ms initial values, each approach a similar rate
523
511
of restarts after the third time window.
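
As a rough illustration of the time-window comparison above, the following Python sketch models the curve as a plain doubling backoff with a cap. The function names and the doubling factor are assumptions made for illustration only; this is not kubelet code.

```python
from itertools import accumulate


def backoff_delays(initial_s, cap_s, count, factor=2):
    """Return the first `count` backoff delays in seconds (assumed doubling)."""
    delays = []
    delay = initial_s
    for _ in range(count):
        delays.append(min(delay, cap_s))
        delay = min(delay * factor, cap_s)
    return delays


def restart_times(initial_s, cap_s, count, factor=2):
    """Cumulative seconds since the first crash at which each restart happens."""
    return list(accumulate(backoff_delays(initial_s, cap_s, count, factor)))


def restarts_within(initial_s, cap_s, window_s, factor=2):
    """Count the restarts that happen within the first `window_s` seconds."""
    elapsed, delay, restarts = 0, initial_s, 0
    while elapsed + delay <= window_s:
        elapsed += delay
        restarts += 1
        delay = min(delay * factor, cap_s)
    return restarts


# Today's curve: initial=10s, cap=5 minutes.
print(restart_times(10, 300, 3))
# Same three time windows (10s, 30s, 70s) with a 1s initial value.
print([restarts_within(1, 300, w) for w in (10, 30, 70)])
```

Under these assumptions, today's curve restarts at 10s, 30s, and 70s, while a 1s initial value yields 3, 4, and 6 restarts inside those same windows.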
524
512
525
513

528
516
529
517
Among these modeled initial values, we would get between 3-7 excess restarts per
530
518
backoff lifetime, mostly within the first three time windows matching today's
531
519
restart behavior.
532
520
533
-
#### New OneOf for `restartPolicy` -- `Rapid`
534
-
`restartPolicy` is an immutable field in podSpec and containerSpec. If set in podSpec, each container in the Pod inherits the Pod's restart policy of either `Never` (default), `OnFailure`, or `Always`; for a Job, the only valid options are `Never` and `OnFailure`. In containerSpec, it is valid ONLY on init containers and ONLY as `Always`, to configure a sidecar container that runs continuously alongside the regular containers in the Pod.
521
+
#### Rapid curve methodology
522
+
523
+
For some users in
524
+
[Kubernetes#57291](https://github.com/kubernetes/kubernetes/issues/57291), any
525
+
delay over 1 minute at any point is just too slow, even if the container is legitimately
526
+
crashing. A common refrain is that for independently recoverable errors,
527
+
especially system infrastructure events or recovered external dependencies, or
528
+
for absolutely nonnegotiably critical sidecar pods, users would rather poll more
529
+
often or more intelligently to reduce the amount of time a workload has to wait
530
+
to try again after a failure. In the extreme cases, users want to be able to
531
+
configure (by container, node, or exit code) the backoff to close to 0 seconds.
532
+
This KEP considers it out of scope to implement fully user-customizable
533
+
behavior, and too risky without full and complete benchmarking of node stability
534
+
to allow legitimately crashing workloads to have a backoff of 0, but it is in
535
+
scope for the first alpha to provide users a way to opt workloads in to an even
536
+
faster restart behavior.
535
537
536
-
This KEP will support a new value for this field, `Rapid`, which on feature flag disablement will be interpreted as `Always`. If `restartPolicy: Rapid` is set or inherited for a container, that container will follow the new Rapid backoff curve.
538
+
The initial value and max cap will be finalized after benchmarking. As a
539
+
conservative first estimate in line with maximums discussed on
540
+
[Kubernetes#57291](https://github.com/kubernetes/kubernetes/issues/57291), the
541
+
initial curve is selected at initial=250ms / cap=1 minute, but during
542
+
benchmarking this will be modelled against kubelet capacity, potentially
543
+
targeting something closer to an initial value near 0s, and a cap of 10-30s.
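
To make the proposed starting point concrete, the sketch below compares how long each curve takes to reach its cap and how often a container restarts once capped. It assumes the Rapid curve doubles on every restart, which this KEP does not confirm, and the helper names are hypothetical.

```python
def ramp_up(initial_s, cap_s, factor=2):
    """Return (restarts, elapsed seconds) before the delay first reaches the cap."""
    restarts, elapsed, delay = 0, 0.0, initial_s
    while delay < cap_s:
        elapsed += delay
        restarts += 1
        delay *= factor
    return restarts, elapsed


def capped_restarts_per_hour(cap_s):
    """Restart rate once every delay equals the cap."""
    return 3600 / cap_s


# Conservative first estimate for Rapid: initial=250ms, cap=1 minute.
print(ramp_up(0.25, 60), capped_restarts_per_hour(60))
# Today's curve for comparison: initial=10s, cap=5 minutes.
print(ramp_up(10, 300), capped_restarts_per_hour(300))
```

With these inputs the Rapid curve spends 8 restarts (63.75s) below its cap and then restarts 60 times per hour, versus 5 restarts (310s) and 12 per hour for today's curve.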
537
544
538
-
Due to configuring this as another option to this field, this would make Rapid backoff possible for restartable init (aka sidecar) containers, Pods, Deployments, StatefulSets, ReplicaSets, DaemonSets, but NOT pure init containers, Jobs or CronJobs.
545
+
546
+
#### New OneOf for `restartPolicy` -- `Rapid`
547
+
`restartPolicy` is an immutable field in podSpec and containerSpec. If set in
548
+
podSpec, each container in the Pod inherits the Pod's restart policy of either
549
+
`Always` (default), `OnFailure`, or `Never`; for a Job, the only valid options
550
+
are `Never` and `OnFailure`. In containerSpec, it is valid ONLY on init
551
+
containers and ONLY as `Always`, to configure a sidecar container that runs
552
+
continuously alongside the regular containers in the Pod.
553
+
554
+
This KEP will support a new value for this field, `Rapid`, which on feature flag
555
+
disablement will be interpreted as `Always`. If `restartPolicy: Rapid` is set or
556
+
inherited for a container, that container will follow the new Rapid backoff
557
+
curve.
558
+
559
+
Because `Rapid` is another option for this field, this would make Rapid
560
+
backoff possible for restartable init (aka sidecar) containers, Pods,
561
+
Deployments, StatefulSets, ReplicaSets, and DaemonSets, but NOT pure init
562
+
containers, Jobs or CronJobs.
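
The fallback and eligibility rules described above can be sketched as follows. Every name here (`effective_restart_policy`, `rapid_allowed`, and the kind labels) is hypothetical and chosen for illustration; the real validation would live in the Kubernetes API machinery and may look quite different.

```python
# Hypothetical labels for the owners named in the text; not real API kinds.
RAPID_ELIGIBLE = {
    "SidecarContainer",  # restartable init container
    "Pod",
    "Deployment",
    "StatefulSet",
    "ReplicaSet",
    "DaemonSet",
}


def rapid_allowed(kind):
    """True if the text above lists `kind` as able to use Rapid backoff."""
    return kind in RAPID_ELIGIBLE


def effective_restart_policy(policy, rapid_feature_enabled):
    """Interpret `Rapid` as `Always` when the feature flag is disabled."""
    if policy == "Rapid" and not rapid_feature_enabled:
        return "Always"
    return policy


print(effective_restart_policy("Rapid", rapid_feature_enabled=False))
print(rapid_allowed("Job"))
```

Jobs and CronJobs fall outside the eligible set because a Job only accepts `Never` and `OnFailure`, and `Rapid` is positioned as a variant of `Always`.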
539
563
540
564
### Kubelet overhead analysis
541
565
@@ -555,7 +579,7 @@ does during pod restarts.
555
579
* Logs information about all those container operations (utilizing disk IO and
556
580
“spamming” logs)
557
581
558
-
####Observability
582
+
### Observability
559
583
560
584
Again, let it be known that by definition this KEP will cause pods to restart
561
585
faster and more often than the current status quo and such a change is desired.