12
12
- [Test Plan](#test-plan)
13
13
- [Graduation Criteria](#graduation-criteria)
14
14
- [Alpha](#alpha)
15
+ - [Beta](#beta)
15
16
- [GA](#ga)
16
17
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
17
18
- [Version Skew Strategy](#version-skew-strategy)
19
+ - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
20
+ - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
21
+ - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
22
+ - [Monitoring Requirements](#monitoring-requirements)
23
+ - [Dependencies](#dependencies)
24
+ - [Scalability](#scalability)
25
+ - [Troubleshooting](#troubleshooting)
18
26
- [Implementation History](#implementation-history)
19
27
- [Drawbacks](#drawbacks)
20
28
- [Alternatives](#alternatives)
27
35
Items marked with (R) are required *prior to targeting to a milestone / release*.
28
36
29
37
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
30
- - [ ] (R) KEP approvers have approved the KEP status as `implementable`
31
- - [ ] (R) Design details are appropriately documented
32
- - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
33
- - [ ] (R) Graduation criteria is in place
34
- - [ ] (R) Production readiness review completed
38
+ - [X] (R) KEP approvers have approved the KEP status as `implementable`
39
+ - [X] (R) Design details are appropriately documented
40
+ - [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
41
+ - [X] (R) Graduation criteria is in place
42
+ - [X] (R) Production readiness review completed
35
43
- [ ] Production readiness review approved
36
44
- [ ] "Implementation History" section is up-to-date for milestone
37
45
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
@@ -115,12 +123,19 @@ E2E tests:
115
123
116
124
### Alpha
117
125
118
- Adds new field `allocateLoadBalancerNodePorts` to Service but not implemented, this allows for rollback.
126
+ * Adds new field `allocateLoadBalancerNodePorts` to Service, but the field is dropped unless an existing Service has the field set already.
127
+ * Only allow the field `allocateLoadBalancerNodePorts` to be set when the feature gate is on.
128
+ * There are sufficient unit tests exercising API strategy with the feature gate enabled / disabled.
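
The alpha behavior above (drop the field when the gate is off, unless the stored Service already uses it) can be sketched as follows. This is a minimal Python model for illustration only, not the kube-apiserver implementation: the function name `drop_disabled_fields`, the dict-based spec, and the `gate_enabled` parameter are all assumptions introduced here.

```python
import copy
from typing import Optional

FIELD = "allocateLoadBalancerNodePorts"


def drop_disabled_fields(new_spec: dict, old_spec: Optional[dict], gate_enabled: bool) -> dict:
    """Return the spec that would be persisted for a create (old_spec is None) or update.

    Hypothetical model of the API strategy described in the alpha criteria.
    """
    result = copy.deepcopy(new_spec)
    if gate_enabled:
        # Gate on: the field may be set freely.
        return result
    # Gate off: keep the field only if the stored object already uses it.
    already_in_use = old_spec is not None and old_spec.get(FIELD) is not None
    if not already_in_use:
        result.pop(FIELD, None)
    return result
```

With the gate off, a create request loses the field, while an update to a Service that already has the field set keeps whatever value the update carries.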
119
129
120
- ### GA
130
+ ### Beta
131
+
132
+ * E2E tests checking that node ports do not get allocated when `service.spec.allocateLoadBalancerNodePorts=false`.
133
+ * Feature gate is on by default.
121
134
122
- Feature is enabled when field is set.
135
+ ### GA
123
136
137
+ * Feature gate is on by default and locked.
138
+ * To safely handle rollback, there has been at least one prior release in which the apiserver understands the new field (covered in alpha).
124
139
125
140
### Upgrade / Downgrade Strategy
126
141
@@ -136,6 +151,172 @@ re-enabling node port should not cause any traffic disruptions.
136
151
Version skew from the control plane to kube-proxy should be trivial since kube-proxy's behavior is driven by the `nodePort` field
137
152
and not the `allocateLoadBalancerNodePorts` field.
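
To illustrate why skew is harmless, the sketch below models a proxy that decides which node ports to program purely from each port's `nodePort` value. It is a simplified Python model under stated assumptions, not kube-proxy code: the function name `node_ports_to_program` and the dict-shaped Service spec are hypothetical.

```python
def node_ports_to_program(service_spec: dict) -> list:
    """Return the node ports a proxy would program rules for.

    Only the per-port `nodePort` value is consulted; the
    `allocateLoadBalancerNodePorts` field is never read.
    """
    ports = service_spec.get("ports", [])
    return [p["nodePort"] for p in ports if p.get("nodePort", 0) > 0]
```

Because the field is ignored, a proxy that predates the field and one that knows about it produce the same result for the same Service.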
138
153
154
+ ## Production Readiness Review Questionnaire
155
+
156
+ ### Feature Enablement and Rollback
157
+
158
+ _This section must be completed when targeting alpha to a release._
159
+
160
+ * **How can this feature be enabled / disabled in a live cluster?**
161
+ - [X] Feature gate (also fill in values in `kep.yaml`)
162
+ - Feature gate name: ServiceLBNodePortControl
163
+ - Components depending on the feature gate: kube-apiserver
164
+ - [ ] Other
165
+ - Describe the mechanism:
166
+ - Will enabling / disabling the feature require downtime of the control
167
+ plane?
168
+ - Will enabling / disabling the feature require downtime or reprovisioning
169
+ of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
170
+
171
+ * **Does enabling the feature change any default behavior?**
172
+
173
+ No, enabling the feature gate but not setting `spec.allocateLoadBalancerNodePorts` will not
174
+ change any default behaviors in Service.
175
+
176
+ * **Can the feature be disabled once it has been enabled (i.e. can we roll back
177
+ the enablement)?**
178
+
179
+ Yes, if the feature gate is disabled, new Services cannot use the new field, but existing Services
180
+ already using the field will continue to have it set. Updates to the field on those existing Services are still allowed.
181
+
182
+ * **What happens if we reenable the feature if it was previously rolled back?**
183
+
184
+ The existing value for `spec.allocateLoadBalancerNodePorts` will remain intact since the API strategy
185
+ will not drop the field if an existing resource has it set.
186
+
187
+ * **Are there any tests for feature enablement/disablement?**
188
+
189
+ Yes, there will be unit tests for the Service API strategy which exercise the behavior
190
+ with the feature gate enabled and disabled.
191
+
192
+ ### Rollout, Upgrade and Rollback Planning
193
+
194
+ _This section must be completed when targeting beta graduation to a release._
195
+
196
+ * **How can a rollout fail? Can it impact already running workloads?**
197
+
198
+ * By default this should not impact any existing Services since we are not changing any default behaviors.
199
+ * Enabling this feature on new clusters can impact workloads if load balancers depend on node ports without users
200
+ being aware.
201
+
202
+ * **What specific metrics should inform a rollback?**
203
+
204
+ Metrics for node port counts will vary for Service LoadBalancers that set `spec.allocateLoadBalancerNodePorts=false`.
205
+ If load balancers are misbehaving at the same time as the node port allocation metric is decreasing, the user may want to
206
+ consider rolling back this feature.
207
+
208
+ * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
209
+
210
+ No, upgrade->downgrade->upgrade has not been tested yet. Like any new API field, on downgrade
211
+ any existing Services using the field will continue to have the field set. These Services
212
+ will not have node ports allocated. New Services cannot use the new field unless the feature
213
+ gate is enabled in the older version, where the feature is alpha.
214
+
215
+ Manual validation of this behavior should be done prior to promoting this feature to beta.
216
+
217
+ * **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
218
+ fields of API types, flags, etc.?**
219
+
220
+ No.
221
+
222
+ ### Monitoring Requirements
223
+
224
+ _This section must be completed when targeting beta graduation to a release._
225
+
226
+ * **How can an operator determine if the feature is in use by workloads?**
227
+
228
+ The feature is in use when a Service of type LoadBalancer has `spec.allocateLoadBalancerNodePorts=false` and has no node ports allocated.
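
One way an operator might run this check is to scan a Service list for LoadBalancer Services that set the field to `false`. The sketch below is a minimal Python example that assumes its input is the JSON produced by `kubectl get services --all-namespaces -o json`; the function name `services_without_node_ports` is hypothetical.

```python
import json


def services_without_node_ports(service_list_json: str) -> list:
    """Return "namespace/name" for LoadBalancer Services that opted out of node ports."""
    items = json.loads(service_list_json).get("items", [])
    found = []
    for svc in items:
        spec = svc.get("spec", {})
        # Match only an explicit false; an absent field means default behavior.
        if spec.get("type") == "LoadBalancer" and spec.get("allocateLoadBalancerNodePorts") is False:
            meta = svc.get("metadata", {})
            found.append(f'{meta.get("namespace")}/{meta.get("name")}')
    return found
```

An empty result suggests no workload currently relies on the feature.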
229
+
230
+ * **What are the SLIs (Service Level Indicators) an operator can use to determine
231
+ the health of the service?**
232
+
233
+ N/A
234
+
235
+ * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
236
+
237
+ N/A
238
+
239
+ * **Are there any missing metrics that would be useful to have to improve observability
240
+ of this feature?**
241
+
242
+ N/A
243
+
244
+ ### Dependencies
245
+
246
+ _This section must be completed when targeting beta graduation to a release._
247
+
248
+ * **Does this feature depend on any specific services running in the cluster?**
249
+
250
+ This feature is dependent on the Service LoadBalancer implementation of a cluster. This feature
251
+ should only be used if the load balancer implementation does not need node ports for the load balancer
252
+ data path.
253
+
254
+
255
+ ### Scalability
256
+
257
+ _For alpha, this section is encouraged: reviewers should consider these questions
258
+ and attempt to answer them._
259
+
260
+ _For beta, this section is required: reviewers must answer these questions._
261
+
262
+ _For GA, this section is required: approvers should be able to confirm the
263
+ previous answers based on experience in the field._
264
+
265
+ * **Will enabling / using this feature result in any new API calls?**
266
+ Describe them, providing:
267
+
268
+ No, enabling this feature should actually reduce the number of operations, since
269
+ the feature disables the existing node port allocation behavior.
270
+
271
+ * **Will enabling / using this feature result in introducing new API types?**
272
+
273
+ No
274
+
275
+ * **Will enabling / using this feature result in any new calls to the cloud
276
+ provider?**
277
+
278
+ No
279
+
280
+ * **Will enabling / using this feature result in increasing size or count of
281
+ the existing API objects?**
282
+
283
+ No
284
+
285
+ * **Will enabling / using this feature result in increasing time taken by any
286
+ operations covered by [existing SLIs/SLOs]?**
287
+
288
+ No
289
+
290
+ * **Will enabling / using this feature result in non-negligible increase of
291
+ resource usage (CPU, RAM, disk, IO, ...) in any components?**
292
+
293
+ No
294
+
295
+ ### Troubleshooting
296
+
297
+ The Troubleshooting section currently serves the `Playbook` role. We may consider
298
+ splitting it into a dedicated `Playbook` document (potentially with some monitoring
299
+ details). For now, we leave it here.
300
+
301
+ _This section must be completed when targeting beta graduation to a release._
302
+
303
+ * **How does this feature react if the API server and/or etcd is unavailable?**
304
+
305
+ Not any different from when node ports are used for load balancers.
306
+
307
+ * **What are other known failure modes?**
308
+
309
+ If `service.spec.allocateLoadBalancerNodePorts=false` is set but the load balancer implementation does depend on node ports, traffic through the load balancer can be disrupted.
310
+
311
+ * **What steps should be taken if SLOs are not being met to determine the problem?**
312
+
313
+ In a scenario where a user sets `service.spec.allocateLoadBalancerNodePorts=false` but the load balancer does require node ports,
314
+ the user can re-enable node ports for a Service by setting `service.spec.allocateLoadBalancerNodePorts` back to `true`.
315
+ This will trigger node port allocation from kube-apiserver.
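
The recovery step above could be scripted roughly as follows. This is a hedged sketch rather than a prescribed procedure: it only builds the argument list for a `kubectl patch` call with a JSON merge patch, and the function name `reenable_node_ports_command` plus the example Service name and namespace are assumptions for illustration.

```python
import json


def reenable_node_ports_command(name: str, namespace: str) -> list:
    """Build a kubectl command that sets allocateLoadBalancerNodePorts back to true."""
    patch = {"spec": {"allocateLoadBalancerNodePorts": True}}
    return [
        "kubectl", "patch", "service", name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", json.dumps(patch),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; for example, `reenable_node_ports_command("web", "default")` targets a hypothetical Service named `web` in the `default` namespace.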
316
+
317
+ [supported limits]: https://git.k8s.io/community/sig-scalability/configs-and-limits/thresholds.md
318
+ [existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
319
+
139
320
## Implementation History
140
321
141
322
- 2020-06-17: KEP is proposed as implementable