keps/sig-node/4603-tune-crashloopbackoff/README.md (36 additions, 20 deletions)
@@ -299,10 +299,9 @@ nitty-gritty.
 
 This design seeks to incorporate a three-pronged approach:
 
-1. Change the existing default backoff curve to stack more retries earlier, but
-   target the same amount of overall retries as the current behavior. This means
-   more restarts in the first 1 min or so, but later retries would be spaced out
-   further than they are today.
+1. Change the existing initial value for the backoff curve to stack more retries
+   earlier for all restarts (`restartPolicy: OnFailure` and `restartPolicy: Always`)
 2. Allow fast, flat-rate (0-10s + jitter) restarts when the exit code is 0, if
    `restartPolicy: Always`.
 3. Provide a `restartPolicy: Rapid` option to configure even faster restarts for
@@ -315,7 +314,7 @@ better analyze and anticipate the change in load and node stability as a result
 of these changes.
 
 
-#### Existing backoff curve change: front loaded decay with interval
+#### Existing backoff curve change: front loaded decay
 
 As mentioned above, today the standard backoff curve is an exponential decay
 starting at 10s and capping at 5 minutes, resulting in a composite of the
@@ -334,22 +333,13 @@ in the first 30 minutes the container will restart about 10 times, with the
 first four restarts in the first 5 minutes.
 
 This KEP proposes changing the existing backoff curve to load more restarts
-earlier by changing the initial value of the exponential backoff. In an effort
-to anticipate API server stability ahead of the experiential data we can collect
-during alpha, the proposed changes are to both reduce the initial value, and
-step function to a higher delay cap once the decay curve triggers the same
-number of total restarts as experienced today in a 10 minute time horizon, in
-order to approximate load (though not rate) of pod restart API server requests.
-
-The detailed methodology for determining the implementable starting value and
-step function cap, and benchmarking it during and after alpha, is enclosed in
-Design Details. In short, the current proposal is to implement a new initial
-value of 1s, and a catch-up delay of 569 seconds (almost 9.5 minutes) on the 6th
-retry.
+earlier by changing the initial value of the exponential backoff. A number of
+alternate initial values are modelled below, up to the point where the 5 minute
+cap is reached. This proposal suggests we start with a new initial value of 1s
+and analyze its impact on infrastructure during alpha.
 
-[image]
+[image]
 
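To make the modelled initial values concrete, here is a minimal Go sketch (illustrative only, not kubelet code) that tabulates per-restart delays and counts restarts inside a 10 minute horizon, assuming the 2x growth factor and 5 minute cap discussed above stay unchanged and ignoring jitter:

```go
package main

import (
	"fmt"
	"time"
)

// delays returns successive backoff delays for an exponential curve that
// starts at initial, doubles on each retry, and is clamped at maxDelay.
func delays(initial, maxDelay time.Duration, retries int) []time.Duration {
	out := make([]time.Duration, 0, retries)
	d := initial
	for i := 0; i < retries; i++ {
		if d > maxDelay {
			d = maxDelay
		}
		out = append(out, d)
		d *= 2
	}
	return out
}

// restartsWithin counts how many restarts fit inside the horizon.
func restartsWithin(ds []time.Duration, horizon time.Duration) int {
	var elapsed time.Duration
	n := 0
	for _, d := range ds {
		elapsed += d
		if elapsed > horizon {
			break
		}
		n++
	}
	return n
}

func main() {
	maxDelay := 5 * time.Minute
	horizon := 10 * time.Minute
	for _, initial := range []time.Duration{10 * time.Second, time.Second} {
		ds := delays(initial, maxDelay, 12)
		fmt.Printf("initial=%v\n  delays: %v\n  restarts in %v: %d\n",
			initial, ds, horizon, restartsWithin(ds, horizon))
	}
}
```

In this toy model the 1s initial value yields roughly nine restarts in the first 10 minutes versus five with today's 10s initial value, with the extra restarts concentrated in the first couple of minutes.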
#### Flat-rate restarts for `Success` (exit code 0)
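As a rough illustration of the flat-rate behaviour named in item 2 of the approach above, here is a sketch under assumed values rather than the kubelet's actual implementation: a container that exits 0 under `restartPolicy: Always` gets a short flat-rate delay plus jitter, while any other exit keeps the exponential backoff. The fixed 10s base and 10% jitter fraction are placeholders invented for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	flatBase       = 10 * time.Second // assumed flat-rate base delay for exit code 0
	jitterFraction = 0.1              // assumed jitter of +/-10% of the base
)

// nextDelay sketches the proposed selection: a successful exit (code 0) under
// restartPolicy: Always gets a flat-rate delay plus jitter, while any other
// exit keeps whatever exponential backoff delay the caller computed.
func nextDelay(exitCode int, restartPolicy string, exponential time.Duration) time.Duration {
	if exitCode == 0 && restartPolicy == "Always" {
		jitter := time.Duration((rand.Float64()*2 - 1) * jitterFraction * float64(flatBase))
		return flatBase + jitter
	}
	return exponential
}

func main() {
	fmt.Println("exit 0, Always:   ", nextDelay(0, "Always", 40*time.Second))
	fmt.Println("exit 1, Always:   ", nextDelay(1, "Always", 40*time.Second))
	fmt.Println("exit 0, OnFailure:", nextDelay(0, "OnFailure", 40*time.Second))
}
```

A real implementation would draw the delay from the 0-10s + jitter band described above; the fixed base here only keeps the example short.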
@@ -1193,6 +1183,32 @@ These overrides will exist for the following reasons:
 These had been selected because there are known use cases where changed restart
 behavior would benefit workloads experiencing these categories of failures.
 
+### Front loaded decay with interval
+
+In an effort to anticipate API server stability ahead of the experiential data
+we can collect during alpha, this alternative would both reduce the initial
+value and include a step function to a higher delay cap once the decay curve
+triggers the same number of total restarts as experienced today in a 10 minute
+time horizon, in order to approximate the load (though not the rate) of pod
+restart API server requests.
+
+In short, this alternative would implement a new initial value of 1s, and a
+catch-up delay of 569 seconds (almost 9.5 minutes) on the 6th retry.
+
+[image]
+
+**Why not?**: If we keep the same decay rate as today (2x), no matter what the
+initial value is, the majority of the added restarts are in the beginning. Even
+if we "catch up" the delay to the total number of restarts, we expect problems
+with the kubelet to happen more as a result of the faster restarts in the
+beginning, not because we spaced out later ones longer. In addition, we are
+only talking about 3-7 more restarts per backoff, even in the fastest modeled
+case (25ms initial value), which is not anticipated to be a significant enough
+hit to the infrastructure to warrant implementing such a contrived backoff
+curve.
+
+[image]
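For comparison with the rejected alternative described above, here is a small illustrative Go sketch (not proposed code) of its curve: a 1s initial value with 2x growth and a one-time 569 second catch-up delay on the 6th retry. Delays after the catch-up are assumed here to sit at the existing 5 minute cap, a detail the text above leaves open.

```go
package main

import (
	"fmt"
	"time"
)

// restartsWithin counts how many restarts fit inside the horizon, given the
// sequence of backoff delays between restarts.
func restartsWithin(delays []time.Duration, horizon time.Duration) int {
	var elapsed time.Duration
	n := 0
	for _, d := range delays {
		elapsed += d
		if elapsed > horizon {
			break
		}
		n++
	}
	return n
}

func main() {
	s := time.Second

	// Today's curve: 10s initial value, 2x growth, capped at 5 minutes.
	current := []time.Duration{10 * s, 20 * s, 40 * s, 80 * s, 160 * s, 300 * s, 300 * s}

	// The rejected alternative: 1s initial value, 2x growth, and a one-time
	// 569s catch-up delay on the 6th retry. Later delays are assumed here to
	// sit at the existing 5 minute cap; the text above leaves that open.
	alternative := []time.Duration{1 * s, 2 * s, 4 * s, 8 * s, 16 * s, 569 * s, 300 * s}

	horizon := 10 * time.Minute
	fmt.Println("current restarts in 10m:    ", restartsWithin(current, horizon))
	fmt.Println("alternative restarts in 10m:", restartsWithin(alternative, horizon))
}
```

In this toy model both curves produce five to six restarts in the first 10 minutes, which is the catch-up property the interval was designed to preserve.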