Commit 19917d6

Update DaemonSet KEP for Alpha in 1.21
APIs and gate went into 1.20, impl will go into 1.21. Add some guidance for new surge daemonset authors.
1 parent 6c2e020 commit 19917d6

3 files changed: +24 -11 lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+kep-number: 1591
+alpha:
+  approver: "@deads2k"

keps/sig-apps/1591-daemonset-surge/README.md

Lines changed: 15 additions & 7 deletions
@@ -96,6 +96,11 @@ DaemonSet pods are slightly more constrained than Deployments when it comes to s
 
 In order to reduce confusion for new users, we will start by rejecting HostPort use in daemonset when MaxSurge is non-zero. A user will not be able to update a daemonset to MaxSurge != 0 if HostPort is set, or update a HostPort if MaxSurge is set, without receiving a validation error. If the MaxSurge feature gate is off, the validation rule is bypassed, and a user who turns off the gate, sets both fields, and then enables the gate will have failing pods but will be able to update their daemonset to either remove surge or remove the host port safely.
 
+A user who uses HostNetwork but does not declare HostPorts and attempts to use MaxSurge with processes that listen on the host network should see errors from the network stack when their process attempts to bind a port (such as `cannot bind to address: port in use`) and the new pod will crash and go into a crashloop. Users should expect to see these failures as they would any other "my application does not start on Kubernetes" error via pod status, daemonset status conditions, and pod logs.
+
+Building a daemonset that hands off between two host level processes with any degree of coordination is an advanced topic and is up to the workload author. The simplest daemonsets may use pod network without any host level sharing and will benefit significantly from maxSurge during updates by reducing downtime at the cost of extra resources. As more complex sharing (host network, disk resources, unix domain sockets, configuration) is needed, the author is expected to leverage custom readiness probes, process start conditions, and process coordination mechanisms (like disks, networking, or shared memory) across pods. Debugging those interactions will be in the domain of the workload author.
+
+
 ### Workload Implications
 
 There are three main workload types that seek to minimize disruption:
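Reading aid for the hunk above: the HostPort/MaxSurge validation rule quoted in its context paragraph can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the validation code that ships in Kubernetes. The `daemonSetSpec` type, the `validateSurge` function, and the error text are hypothetical names invented for this sketch, and MaxSurge is modeled as a plain integer even though the real field also accepts percentages.

```go
package main

import (
	"errors"
	"fmt"
)

// daemonSetSpec is a hypothetical, trimmed-down stand-in for the real API
// type. It keeps only the two fields the rule cares about.
type daemonSetSpec struct {
	MaxSurge  int   // 0 means surge is not requested
	HostPorts []int // host ports declared by the pod template; 0 means unset
}

// errSurgeWithHostPort is an illustrative error; the real message may differ.
var errSurgeWithHostPort = errors.New("maxSurge must be 0 when a hostPort is declared")

// validateSurge sketches the rule from the KEP text: when the feature gate is
// on, a non-zero MaxSurge combined with any declared HostPort is rejected.
// When the gate is off, the rule is bypassed entirely.
func validateSurge(spec daemonSetSpec, gateEnabled bool) error {
	if !gateEnabled {
		return nil
	}
	if spec.MaxSurge == 0 {
		return nil
	}
	for _, port := range spec.HostPorts {
		if port != 0 {
			return errSurgeWithHostPort
		}
	}
	return nil
}

func main() {
	conflicting := daemonSetSpec{MaxSurge: 1, HostPorts: []int{8080}}
	fmt.Println("gate on: ", validateSurge(conflicting, true))
	fmt.Println("gate off:", validateSurge(conflicting, false))
}
```

Note that the sketch only inspects declared HostPorts. As the added paragraph in the hunk explains, a HostNetwork workload that binds ports without declaring them passes validation and instead fails at runtime when the surged pod tries to bind.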
@@ -170,8 +175,8 @@ you need any help or guidance.
 
 * **How can this feature be enabled / disabled in a live cluster?**
   - [x] Feature gate (also fill in values in `kep.yaml`)
-    - Feature gate name:
-    - Components depending on the feature gate:
+    - Feature gate name: `DaemonSetUpdateSurge`
+    - Components depending on the feature gate: `kube-apiserver`, `kube-controller-manager`
   - [ ] Other
     - Describe the mechanism:
     - Will enabling / disabling the feature require downtime of the control
@@ -186,15 +191,18 @@ you need any help or guidance.
 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
 the enablement)?**
 
-Yes, when the feature gate is disabled the field is ignored and can be cleared.
-A workload using this alpha feature would no longer be able to surge and would
-fall back to the default MaxUnavailable value (which is minimum 1).
+Yes, when the feature gate is disabled the field is ignored and can be cleared by
+an end user. A workload using this alpha feature would no longer be able to surge
+and would fall back to the default MaxUnavailable value (which is minimum 1).
 
 * **What happens if we reenable the feature if it was previously rolled back?**
 
 The field would become active and whatever new values were present would cause
-the surge feature to become active. If the field were changed the user would have
-to use the new alpha field.
+the surge feature to become active. If the field name were changed old values
+would be lost and the controller would default to using maxUnavailable 1.
+
+To clear the field from etcd, disable the gate and perform a no-op PUT on every
+daemonset.
 
 * **Are there any tests for feature enablement/disablement?**
 
keps/sig-apps/1591-daemonset-surge/kep.yaml

Lines changed: 6 additions & 4 deletions
@@ -12,6 +12,8 @@ reviewers:
 approvers:
   - "@kow3ns"
   - "@janetkuo"
+prr-approvers:
+  - "@deads2k"
 editor: TBD
 creation-date: 2020-03-02
 last-updated: 2020-03-02
@@ -24,8 +26,8 @@ feature-gates:
       - kube-apiserver
       - kube-controller-manager
 disable-supported: true
-latest-milestone: "v1.20"
+latest-milestone: "v1.21"
 milestone:
-  alpha: "v1.20"
-  beta: "v1.21"
-  stable: "v1.23"
+  alpha: "v1.21"
+  beta: "v1.22"
+  stable: "v1.24"
