Commit eebd38b
Parent: 830b013

Simplify general backoff curve update and move catch up mechanic to Alternatives

Signed-off-by: Laura Lorenz <[email protected]>

File tree: 3 files changed, +36 −20 lines changed

keps/sig-node/4603-tune-crashloopbackoff/README.md

Lines changed: 36 additions & 20 deletions
@@ -299,10 +299,9 @@ nitty-gritty.
 
 This design seeks to incorporate a three-pronged approach:
 
-1. Change the existing default backoff curve to stack more retries earlier, but
-   target the same amount of overall retries as the current behavior. This means
-   more restarts in the first 1 min or so, but later retries would be spaced out
-   further than they are today.
+1. Change the existing initial value for the backoff curve to stack more retries
+   earlier for all restarts (`restartPolicy: OnFailure` and `restartPolicy:
+   Always`)
 2. Allow fast, flat-rate (0-10s + jitter) restarts when the exit code is 0, if
    `restartPolicy: Always`.
 3. Provide a `restartPolicy: Rapid` option to configure even faster restarts for
@@ -315,7 +314,7 @@ better analyze and anticipate the change in load and node stability as a result
 of these changes.
 
 
-#### Existing backoff curve change: front loaded decay with interval
+#### Existing backoff curve change: front loaded decay
 
 As mentioned above, today the standard backoff curve is an exponential decay
 starting at 10s and capping at 5 minutes, resulting in a composite of the
@@ -334,22 +333,13 @@ in the first 30 minutes the container will restart about 10 times, with the
 first four restarts in the first 5 minutes.
 
 This KEP proposes changing the existing backoff curve to load more restarts
-earlier by changing the initial value of the exponential backoff. In an effort
-to anticipate API server stability ahead of the experiential data we can collect
-during alpha, the proposed changes are to both reduce the initial value, and
-step function to a higher delay cap once the decay curve triggers the same
-number of total restarts as experienced today in a 10 minute time horizon, in
-order to approximate load (though not rate) of pod restart API server requests.
-
-The detailed methodology for determining the implementable starting value and
-step function cap, and benchmarking it during and after alpha, is enclosed in
-Design Details. In short, the current proposal is to implement a new initial
-value of 1s, and a catch-up delay of 569 seconds (almost 9.5 minutes) on the 6th
-retry.
+earlier by changing the initial value of the exponential backoff. A number of
+alternate initial values are modelled below, until the 5 minute cap would be
+reached. This proposal suggests we start with a new initial value of 1s, and
+analyze its impact on infrastructure during alpha.
 
-!["A graph showing the delay interval function needed to maintain restart
-number"](controlfornumberofrestarts.png
-"Alternate CrashLoopBackoff decay")
+!["A graph showing the decay curves for different initial values"](differentinitialvalues.png
+"Alternate CrashLoopBackoff initial values")
 
 
 #### Flat-rate restarts for `Success` (exit code 0)
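
To make the curves being compared in this hunk concrete, here is a minimal Go sketch of the backoff math described above. It is not kubelet source: it assumes a pure doubling curve capped at 5 minutes, ignores jitter and the kubelet's backoff reset behavior, and the helper name `restartsWithin` is ours; the KEP's own modeling may count restarts differently.

```go
// A minimal model (not kubelet source) of the exponential backoff described
// above: each restart delay doubles, capped at 5 minutes. Jitter and the
// kubelet's backoff-reset behavior are ignored for simplicity.
package main

import (
	"fmt"
	"time"
)

// restartsWithin counts how many restarts a curve starting at initial and
// doubling per retry (capped at maxDelay) produces within horizon.
func restartsWithin(initial, maxDelay, horizon time.Duration) int {
	delay, elapsed, restarts := initial, time.Duration(0), 0
	for elapsed+delay <= horizon {
		elapsed += delay
		restarts++
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
	return restarts
}

func main() {
	maxDelay := 5 * time.Minute
	horizon := 10 * time.Minute
	for _, initial := range []time.Duration{
		10 * time.Second,      // today's default initial value
		1 * time.Second,       // initial value proposed by this KEP
		25 * time.Millisecond, // fastest initial value modeled in Alternatives
	} {
		fmt.Printf("initial value %v: %d restarts in the first %v\n",
			initial, restartsWithin(initial, maxDelay, horizon), horizon)
	}
}
```

Run as-is, this prints 5 restarts in the first 10 minutes for today's 10s initial value versus 9 for the proposed 1s value, which is the front loading effect the hunk describes.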
@@ -1193,6 +1183,32 @@ These overrides will exist for the following reasons:
 These had been selected because there are known use cases where changed restart
 behavior would benefit workloads experiencing these categories of failures.
 
+### Front loaded decay with interval
+
+In an effort to anticipate API server stability ahead of the experiential data
+we can collect during alpha, the proposed changes are to both reduce the
+initial value and include a step function to a higher delay cap once the decay
+curve triggers the same number of total restarts as experienced today in a 10
+minute time horizon, in order to approximate the load (though not the rate) of
+pod restart API server requests.
+
+In short, the current proposal is to implement a new initial value of 1s, and a
+catch-up delay of 569 seconds (almost 9.5 minutes) on the 6th retry.
+
+!["A graph showing the delay interval function needed to maintain restart
+number"](controlfornumberofrestarts.png
+"Alternate CrashLoopBackoff decay")
+
+**Why not?**: If we keep the same decay rate as today (2x), no matter what the
+initial value is, the majority of the added restarts come at the beginning.
+Even if we "catch up" the delay to match the total number of restarts, we
+expect problems with the kubelet to stem from the faster restarts at the
+beginning, not from spacing the later ones out further. In addition, we are
+only talking about 3-7 more restarts per backoff, even in the fastest modeled
+case (25ms initial value), which is not anticipated to be a big enough hit to
+the infrastructure to warrant implementing such a contrived backoff curve.
+
+!["A graph showing the changes to restarts depending on some initial values"](initialvaluesandnumberofrestarts.png "Different CrashLoopBackoff initial values")
 
 ### More complex heuristics
 
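As a companion sketch (same caveats: not kubelet source, no jitter), this models the rejected catch-up variant's first six retries: a 1s initial value doubling per retry, with the one-time 569s catch-up delay stepped in on the 6th retry. How the curve behaves after the catch-up is not fully specified above, so the sketch stops there.

```go
// A minimal model (not kubelet source) of the rejected "front loaded decay
// with interval" alternative: a 1s initial value that doubles per retry,
// plus a one-time 569s catch-up delay stepped in on the 6th retry.
package main

import (
	"fmt"
	"time"
)

const (
	initialDelay = 1 * time.Second   // proposed new initial value
	maxDelay     = 5 * time.Minute   // existing backoff cap
	catchUpRetry = 6                 // retry that receives the catch-up delay
	catchUpDelay = 569 * time.Second // ~9.5 minute step, exceeding the cap
)

func main() {
	delay := initialDelay
	var elapsed time.Duration
	for retry := 1; retry <= catchUpRetry; retry++ {
		d := delay
		if d > maxDelay {
			d = maxDelay // normal exponential delays are capped at 5 minutes
		}
		if retry == catchUpRetry {
			d = catchUpDelay // the step function overrides the cap here
		}
		elapsed += d
		fmt.Printf("retry %d: delay %v, restart at t=%v\n", retry, d, elapsed)
		delay *= 2
	}
}
```

The first five restarts land within 31 seconds, and the catch-up places the 6th restart at exactly the 10 minute mark, close to today's curve, whose 6th restart occurs at t=610s: this is the "same number of total restarts in a 10 minute time horizon" property described above.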
[Two binary image files changed (16.6 KB and 15.5 KB); previews not shown]
