- [Proposal](#proposal)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
+ - [Following sections will be added](#following-sections-will-be-added)
+ - [Following fields will be moved (without any change in name, data-type and default values)](#following-fields-will-be-moved-without-any-change-in-name-data-type-and-default-values)
+ - [Following fields will be changed](#following-fields-will-be-changed)
+ - [Following fields will be added](#following-fields-will-be-added)
+ - [Following fields will have different default values](#following-fields-will-have-different-default-values)
+ - [Following fields will be dropped](#following-fields-will-be-dropped)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)

Items marked with (R) are required *prior to targeting to a milestone / release*.

- - [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- - [ ] (R) KEP approvers have approved the KEP status as `implementable`
- - [ ] (R) Design details are appropriately documented
- - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+ - [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+ - [X] (R) KEP approvers have approved the KEP status as `implementable`
+ - [X] (R) Design details are appropriately documented
+ - [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [ ] e2e Tests for all Beta API Operations (endpoints)
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
@@ -127,6 +133,55 @@ The mitigations to those risks:

## Design Details

+ ### Following sections will be added
+ | Field   | Comments                                            |
+ |---------|-----------------------------------------------------|
+ | Linux   | new section for Linux (platform-specific) options   |
+ | Windows | new section for Windows (platform-specific) options |
+
+ ### Following fields will be moved (without any change in name, data-type and default values)
+ | v1alpha1               | v1alpha2            | Comments                                                                       |
+ |------------------------|---------------------|--------------------------------------------------------------------------------|
+ | Conntrack              | Linux.Conntrack     | moved from root (generic) to linux (platform-specific) section                 |
+ | OOMScoreAdj            | Linux.OOMScoreAdj   | moved from root (generic) to linux (platform-specific) section                 |
+ | IPTables.MasqueradeAll | Linux.MasqueradeAll | moved from iptables (backend-specific) to linux (platform-specific) section    |
+ | NFTables.MasqueradeAll | Linux.MasqueradeAll | moved from nftables (backend-specific) to linux (platform-specific) section    |
+ | IPTables.SyncPeriod    | SyncPeriod          | moved from iptables (backend-specific) to root (generic) section               |
+ | NFTables.SyncPeriod    | SyncPeriod          | moved from nftables (backend-specific) to root (generic) section               |
+ | IPVS.SyncPeriod        | SyncPeriod          | moved from ipvs (backend-specific) to root (generic) section                   |
+ | IPTables.MinSyncPeriod | MinSyncPeriod       | moved from iptables (backend-specific) to root (generic) section               |
+ | NFTables.MinSyncPeriod | MinSyncPeriod       | moved from nftables (backend-specific) to root (generic) section               |
+ | IPVS.MinSyncPeriod     | MinSyncPeriod       | moved from ipvs (backend-specific) to root (generic) section                   |
+
+ ### Following fields will be changed
+ | v1alpha1           | v1alpha2                 | DataType     | Comments                                                                                                                        |
+ |--------------------|--------------------------|--------------|---------------------------------------------------------------------------------------------------------------------------------|
+ | ClusterCIDR        | DetectLocal.ClusterCIDRs | list[string] | list of CIDR ranges for detecting local traffic                                                                                 |
+ | BindAddress        | NodeIPOverride           | list[string] | list of primary node IPs                                                                                                        |
+ | MetricsBindAddress | MetricsBindAddresses     | list[string] | list of CIDR ranges containing valid node IPs on which to expose the metrics server, replacing the host:port (`ip:port`) format |
+ | HealthzBindAddress | HealthzBindAddresses     | list[string] | list of CIDR ranges containing valid node IPs on which to expose the healthz server, replacing the host:port (`ip:port`) format |
+
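The `ClusterCIDR` to `DetectLocal.ClusterCIDRs` change turns a single string into a list. The following is a minimal Go sketch of what such a conversion could look like, assuming the v1alpha1 value is a comma-separated string (as used for dual-stack pairs). The helper name `splitClusterCIDR` is hypothetical and is not taken from the kube-proxy code base.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// splitClusterCIDR is a hypothetical helper (not part of kube-proxy) that
// turns a comma-separated v1alpha1-style ClusterCIDR string into the list
// form described for DetectLocal.ClusterCIDRs. Each entry is validated as a
// CIDR; empty entries are skipped.
func splitClusterCIDR(clusterCIDR string) ([]string, error) {
	var cidrs []string
	for _, part := range strings.Split(clusterCIDR, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		if _, err := netip.ParsePrefix(part); err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", part, err)
		}
		cidrs = append(cidrs, part)
	}
	return cidrs, nil
}

func main() {
	cidrs, err := splitClusterCIDR("10.244.0.0/16,fd00:10:244::/56")
	if err != nil {
		panic(err)
	}
	fmt.Println(cidrs)
}
```

Validating each entry during conversion means a malformed value is reported when the configuration is loaded rather than later at rule-programming time; whether the real conversion validates at this point is an assumption.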
+ ### Following fields will be added
+ | Field                | DataType         | Default Value | Comments                                                                       |
+ |----------------------|------------------|---------------|--------------------------------------------------------------------------------|
+ | IPVS.MasqueradeBit   | integer (32-bit) | 14            | IPVS will use this field instead of IPTables.MasqueradeBit                     |
+ | Windows.RunAsService | boolean          | false         | new field for existing --windows-service command line flag                     |
+ | ConfigHardFail       | boolean          | true          | if set to true, kube-proxy will exit rather than just warning on config errors |
+ | MetricsBindPort      | integer (32-bit) | 10249         | port on which metrics server will be exposed                                   |
+ | HealthzBindPort      | integer (32-bit) | 10256         | port on which healthz server will be exposed                                   |
+
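With the bind addresses expressed as CIDR ranges and the port as a separate field, the listen addresses have to be derived from the node's IPs. The Go sketch below shows one plausible reading: keep every node IP that falls inside any configured CIDR and pair it with the port. The function name `bindAddrPorts` and the selection rule are assumptions for illustration; the KEP text does not spell out the exact matching semantics.

```go
package main

import (
	"fmt"
	"net/netip"
)

// bindAddrPorts is a hypothetical helper (not part of kube-proxy). It returns
// an ip:port pair for every node IP contained in at least one of the given
// CIDR ranges, combining the CIDR-list and port fields described above.
func bindAddrPorts(nodeIPs, cidrs []string, port uint16) ([]netip.AddrPort, error) {
	prefixes := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		prefixes = append(prefixes, p)
	}

	var out []netip.AddrPort
	for _, s := range nodeIPs {
		ip, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("invalid node IP %q: %w", s, err)
		}
		for _, p := range prefixes {
			if p.Contains(ip) {
				out = append(out, netip.AddrPortFrom(ip, port))
				break // one match is enough for this IP
			}
		}
	}
	return out, nil
}

func main() {
	// 10249 is the MetricsBindPort default listed in the table above.
	addrs, err := bindAddrPorts(
		[]string{"192.168.1.10", "10.0.0.5"},
		[]string{"192.168.0.0/16"},
		10249,
	)
	if err != nil {
		panic(err)
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```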
+ ### Following fields will have different default values
+ | Field                       | v1alpha1 (default) | v1alpha2 (default) |
+ |-----------------------------|--------------------|--------------------|
+ | IPTables.LocalhostNodePorts | true               | false              |
+ | BindAddressHardFail         | false              | true               |
+
+
+ ### Following fields will be dropped
+ | Field     | Comments                                |
+ |-----------|-----------------------------------------|
+ | PortRange | dropped as no longer used by kube-proxy |
+

### Test Plan

@@ -137,12 +192,18 @@ to implement this enhancement.
##### Prerequisite testing updates


- ##### Unit tests
-
+ ##### Unit tests
+ New tests will be added and existing ones modified in the following packages:
+ - `k8s.io/kubernetes/cmd/kubeadm/app/componentconfigs`: `2024-01-21` - `76%`
+ - `k8s.io/kubernetes/cmd/kubeadm/app/phases/addons/proxy`: `2024-01-21` - `78%`
+ - `k8s.io/kubernetes/cmd/kubeadm/app/util/config`: `2024-01-21` - `70.5%`
+ - `k8s.io/kubernetes/cmd/kubeadm/app/util/config/strict`: `2024-01-21` - `100%`
+ - `k8s.io/kubernetes/cmd/kube-proxy/app`: `2024-01-21` - `43.6%`
+ - `k8s.io/kubernetes/pkg/proxy/apis/config/scheme`: `2024-01-21` - `100%`
+ - `k8s.io/kubernetes/pkg/proxy/apis/config/validation`: `2024-01-21` - `84.2%`

##### Integration tests

-
##### e2e tests

### Graduation Criteria
@@ -156,9 +217,19 @@ The config should be considered graduated to beta if it:

### Upgrade / Downgrade Strategy

+ Users can use either the `v1alpha1` or the `v1alpha2` API. Since both only affect the
+ configuration of the proxy, there is no impact on running workloads.
+
+ The existing flags `--config` and `--write-config-to` can be used to convert an existing
+ v1alpha1 kube-proxy configuration to v1alpha2. `--config` can consume and decode any
+ supported version; `--write-config-to` always writes using the latest version.
+ ```bash
+ /usr/local/bin/kube-proxy --config old-v1alpha1.yaml --write-config-to new-v1alpha2.yaml
+ ```

### Version Skew Strategy

+ N/A

## Production Readiness Review Questionnaire

@@ -168,81 +239,66 @@ The config should be considered graduated to beta if it:

###### How can this feature be enabled / disabled in a live cluster?

- <!--
- Pick one of these and delete the rest.
-
- Documentation is available on [feature gate lifecycle] and expectations, as
- well as the [existing list] of feature gates.
-
- [feature gate lifecycle]: https://git.k8s.io/community/contributors/devel/sig-architecture/feature-gates.md
- [existing list]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
- -->
-
- - [ ] Feature gate (also fill in values in `kep.yaml`)
- - Feature gate name:
- - Components depending on the feature gate:
- - [x] Other
- - Describe the mechanism:
- - Will enabling / disabling the feature require downtime of the control
- plane?
- - Will enabling / disabling the feature require downtime or reprovisioning
- of a node?
+ Operators can use the config API by passing the `--config` command line flag to kube-proxy.
+ To disable it, operators can remove the `--config` flag and configure the proxy with
+ other command line flags.

###### Does enabling the feature change any default behavior?

- <!--
- Any change of default behavior may be surprising to users or break existing
- automations, so be extremely careful here.
- -->
+ No.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

- <!--
- Describe the consequences on existing workloads (e.g., if this is a runtime
- feature, can it break the existing applications?).
-
- Feature gates are typically disabled by setting the flag to `false` and
- restarting the component. No other changes should be necessary to disable the
- feature.
-
- NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
- -->
+ Yes, by removing the `--config` command line flag from kube-proxy.

###### What happens if we reenable the feature if it was previously rolled back?

+ N/A
+
###### Are there any tests for feature enablement/disablement?

- <!--
- The e2e framework does not currently support enabling or disabling feature
- gates. However, unit tests in each component dealing with managing data, created
- with and without the feature, are necessary. At the very least, think about
- conversion tests if API types are being modified.
+ The e2e framework does not currently support changing configuration files.

- Additionally, for features that are introducing a new API field, unit tests that
- are exercising the `switch` of feature gate itself (what happens if I disable a
- feature gate after having objects written with the new field) are also critical.
- You can take a look at one potential example of such test in:
- https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
- -->
+ There are extensive unit tests for all the API versions.

### Rollout, Upgrade and Rollback Planning

###### How can a rollout or rollback fail? Can it impact already running workloads?

+ A malformed configuration will cause the proxy to fail to start. Running
+ workloads are not affected.
+
###### What specific metrics should inform a rollback?

+ - `sync_proxy_rules_duration_seconds` reporting no samples or unusually high values.
+ - A spike in any of the following metrics:
+ - `network_programming_duration_seconds`
+ - `sync_proxy_rules_endpoint_changes_pending`
+ - `sync_proxy_rules_service_changes_pending`
+ - A spike in the difference between `sync_proxy_rules_last_queued_timestamp_seconds` and `sync_proxy_rules_last_timestamp_seconds`.
+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

+ N/A
+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

+ No.
+
### Monitoring Requirements

###### How can an operator determine if the feature is in use by workloads?

+ N/A
+
###### How can someone using this feature know that it is working for their instance?

+ N/A
+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

+ N/A
+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

- [ ] Metrics
@@ -271,17 +327,15 @@ No.

###### Will enabling / using this feature result in introducing new API types?

- Yes.
- [WIP]
+ Yes, `v1alpha2` will be introduced for kube-proxy.

###### Will enabling / using this feature result in any new calls to the cloud provider?

No.

###### Will enabling / using this feature result in increasing size or count of the existing API objects?

- Yes.
- [WIP]
+ No.

###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

@@ -300,19 +354,27 @@ No.

###### How does this feature react if the API server and/or etcd is unavailable?

+ N/A

###### What are other known failure modes?

+ None.

###### What steps should be taken if SLOs are not being met to determine the problem?

+ N/A

## Implementation History
-
+ - 2019-09-20: KEP introduced with motivation.
+ - 2023-11-17: KEP for v1alpha2 configuration sent for review, including proposal,
+   test plan, and PRR questionnaire.

## Drawbacks

+ N/A
+
## Alternatives

- ## Infrastructure Needed (Optional)
+ N/A

+ ## Infrastructure Needed (Optional)