Commit adf8d04

Add upgrade/downgrade strategy section
Signed-off-by: Laura Lorenz <[email protected]>
1 parent 6e9517a commit adf8d04

File tree

1 file changed: +37 -0 lines changed
  • keps/sig-node/4603-tune-crashloopbackoff

keps/sig-node/4603-tune-crashloopbackoff/README.md

Lines changed: 37 additions & 0 deletions
@@ -937,6 +937,43 @@ enhancement:
cluster required to make on upgrade, in order to make use of the enhancement?
-->

For `ReduceDefaultCrashLoopBackoffDecay`:

For an existing cluster, no changes are required to configuration, invocations
or API objects to make an upgrade.

To use the enhancement, turn on the alpha feature gate. If and when the feature
gate is removed in the future, no configuration changes will be required, and
the default behavior of the baseline backoff curve will change by design.

For `EnableRapidCrashLoopBackoffDecay`:

For an existing cluster, no changes are required to configuration, invocations
or API objects to make an upgrade.

To make use of this enhancement after an upgrade, the feature gate must first
be turned on. Then, any Pods that should opt into the `Rapid` backoff decay
curve must be completely redeployed with `restartPolicy: Rapid`, since that
field cannot be patched.

To stop use of this enhancement, there are two options.

On a per-Pod basis, Pods can be completely redeployed with `restartPolicy` set
to something other than `Rapid`. They will no longer use the `Rapid` backoff
curve. Because the Pods have been completely redeployed, they lose their prior
backoff counter anyway and, if restarted, start from the beginning of their
backoff curve: either the original one with an initial value of 10s, or the new
baseline with an initial value of 1s, depending on whether the
`ReduceDefaultCrashLoopBackoffDecay` feature gate is turned on.

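The starting point described above can be sketched in Python. This is an illustrative model only, not kubelet code: the function and constant names are hypothetical, and only the two initial values (10s and 1s) come from the text.

```python
# Illustrative model only; names are hypothetical, not kubelet identifiers.
ORIGINAL_INITIAL_BACKOFF_SECONDS = 10  # original curve, per the text above
REDUCED_INITIAL_BACKOFF_SECONDS = 1    # new baseline, per the text above


def initial_backoff_seconds(reduce_default_gate_enabled: bool) -> int:
    """Return the first backoff delay for a freshly redeployed Pod.

    A redeployed Pod has no prior backoff counter, so it starts at the
    beginning of whichever baseline curve is in effect.
    """
    if reduce_default_gate_enabled:
        return REDUCED_INITIAL_BACKOFF_SECONDS
    return ORIGINAL_INITIAL_BACKOFF_SECONDS


print(initial_backoff_seconds(False))  # gate off: original curve
print(initial_backoff_seconds(True))   # gate on: new baseline curve
```

The sketch deliberately models only the initial value; the shape of the rest of each curve is not covered here.
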
Alternatively, the entire cluster can be restarted with the
`EnableRapidCrashLoopBackoffDecay` feature gate turned off. In this case, any
Pod configured with `restartPolicy: Rapid` will instead be treated as
`restartPolicy: Always` and use the default backoff curve. Again, since the
cluster was restarted and Pods were redeployed, they will not maintain prior
state and will start at the beginning of their backoff curve.

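The fallback described above can be sketched as a small Python function. This is a hedged illustration under the assumption that the fallback is a simple substitution; the function name is hypothetical and does not correspond to any real Kubernetes API.

```python
# Illustrative model only; the function name is hypothetical.
def effective_restart_policy(requested: str, rapid_gate_enabled: bool) -> str:
    """Return the restart policy a Pod is actually treated as having.

    With the EnableRapidCrashLoopBackoffDecay gate off, a Pod that asks
    for "Rapid" is handled as "Always"; other policies are unchanged.
    """
    if requested == "Rapid" and not rapid_gate_enabled:
        return "Always"
    return requested


print(effective_restart_policy("Rapid", rapid_gate_enabled=False))
print(effective_restart_policy("Rapid", rapid_gate_enabled=True))
```

Under this model, turning the gate off never requires editing existing Pod specs, which matches the statement that no API object changes are needed.
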
### Version Skew Strategy

<!--
