
Commit 12cc497

Merge pull request #4337 from aroradaman/kube-proxy-config-v1alpha2-design
KEP-784: Update design-details
2 parents 1464eaf + c6895ad commit 12cc497


3 files changed

+126
-60
lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
kep-number: 784
2+
alpha:
3+
approver: "@wojtek-t"

keps/sig-network/784-kube-proxy-component-config/README.md

Lines changed: 119 additions & 57 deletions
@@ -9,6 +9,12 @@
99
- [Proposal](#proposal)
1010
- [Risks and Mitigations](#risks-and-mitigations)
1111
- [Design Details](#design-details)
12+
- [Following sections will be added](#following-sections-will-be-added)
13+
- [Following fields will be moved (without any change in name, data-type and default values)](#following-fields-will-be-moved-without-any-change-in-name-data-type-and-default-values)
14+
- [Following fields will be changed](#following-fields-will-be-changed)
15+
- [Following fields will be added](#following-fields-will-be-added)
16+
- [Following fields will have different default values](#following-fields-will-have-different-default-values)
17+
- [Following fields will be dropped](#following-fields-will-be-dropped)
1218
- [Test Plan](#test-plan)
1319
- [Prerequisite testing updates](#prerequisite-testing-updates)
1420
- [Unit tests](#unit-tests)
@@ -34,10 +40,10 @@
3440

3541
Items marked with (R) are required *prior to targeting to a milestone / release*.
3642

37-
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
38-
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
39-
- [ ] (R) Design details are appropriately documented
40-
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
43+
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
44+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
45+
- [X] (R) Design details are appropriately documented
46+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
4147
- [ ] e2e Tests for all Beta API Operations (endpoints)
4248
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
4349
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
@@ -127,6 +133,55 @@ The mitigations to those risks:
127133

128134
## Design Details
129135

136+
### Following sections will be added
137+
| Field | Comments |
138+
|---------|-----------------------------------------------------|
139+
| Linux | new section for linux (platform-specific) options |
140+
| Windows | new section for windows (platform-specific) options |
141+
142+
### Following fields will be moved (without any change in name, data-type and default values)
143+
| v1alpha1 | v1alpha2 | Comments |
144+
|------------------------|---------------------|------------------------------------------------------------------|
145+
| Conntrack | Linux.Conntrack | moved from root(generic) to linux (platform-specific) section |
146+
| OOMScoreAdj | Linux.OOMScoreAdj | moved from root(generic) to linux (platform-specific) section |
147+
| IPTables.MasqueradeAll | Linux.MasqueradeAll | moved from iptables (backend-specific) to linux (platform-specific) section |
148+
| NFTables.MasqueradeAll | Linux.MasqueradeAll | moved from nftables (backend-specific) to linux (platform-specific) section |
149+
| IPTables.SyncPeriod | SyncPeriod | moved from iptables (backend-specific) to root (generic) section |
150+
| NFTables.SyncPeriod | SyncPeriod | moved from nftables (backend-specific) to root (generic) section |
151+
| IPVS.SyncPeriod | SyncPeriod | moved from ipvs (backend-specific) to root (generic) section |
152+
| IPTables.MinSyncPeriod | MinSyncPeriod | moved from iptables (backend-specific) to root (generic) section |
153+
| NFTables.MinSyncPeriod | MinSyncPeriod | moved from nftables (backend-specific) to root (generic) section |
154+
| IPVS.MinSyncPeriod | MinSyncPeriod | moved from ipvs (backend-specific) to root (generic) section |
155+
156+
### Following fields will be changed
157+
| v1alpha1 | v1alpha2 | DataType | Comments |
158+
|--------------------|--------------------------|--------------|----------------------------------------------------------------------------------------------------------------|
159+
| ClusterCIDR | DetectLocal.ClusterCIDRs | list[string] | list of CIDR ranges for detecting local traffic |
160+
| BindAddress | NodeIPOverride | list[string] | list of primary node IPs |
161+
| MetricsBindAddress | MetricsBindAddresses | list[string] | list of CIDR ranges that contain valid node IPs to expose metrics server, instead of host port(ip:port) format |
162+
| HealthzBindAddress | HealthzBindAddresses | list[string] | list of CIDR ranges that contain valid node IPs to expose healthz server, instead of host port(ip:port) format |
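
As an illustration of the string-to-list changes in the table above, the following is a minimal sketch of how a v1alpha1 `ClusterCIDR` value (which may hold a comma-separated dual-stack pair) could be rewritten as a list. The function name `cluster_cidrs_yaml` is hypothetical, and the YAML keys `detectLocal` and `clusterCIDRs` are an assumed lower-camel-case serialization of `DetectLocal.ClusterCIDRs`; neither is confirmed by this KEP.

```bash
# Hypothetical helper: turn a comma-separated v1alpha1 ClusterCIDR value into
# a YAML list. The key names below are assumptions, not the final v1alpha2 schema.
cluster_cidrs_yaml() {
  printf 'detectLocal:\n  clusterCIDRs:\n'
  # Split on commas, one CIDR per line, and skip empty entries.
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r cidr; do
    if [ -n "$cidr" ]; then
      printf '    - "%s"\n' "$cidr"
    fi
  done
}

# Usage: cluster_cidrs_yaml "10.244.0.0/16,fd00:10:244::/56"
```

The actual conversion is expected to happen inside kube-proxy when it decodes a v1alpha1 file; this sketch only shows the shape of the change.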
163+
164+
### Following fields will be added
165+
| Field | DataType | Default Value | Comments |
166+
|----------------------|------------------|---------------|--------------------------------------------------------------------------------|
167+
| IPVS.MasqueradeBit | integer (32-bit) | 14 | IPVS will use this field instead of IPTables.MasqueradeBit |
168+
| Windows.RunAsService | boolean | false | new field for existing --windows-service command line flag |
169+
| ConfigHardFail | boolean | true | if set to true, kube-proxy will exit rather than just warning on config errors |
170+
| MetricsBindPort | integer (32-bit) | 10249 | port on which metrics server will be exposed |
171+
| HealthzBindPort | integer (32-bit) | 10256 | port on which healthz server will be exposed |
172+
173+
### Following fields will have different default values
174+
| Field | v1alpha1 (default) | v1alpha2 (default) |
175+
|-----------------------------|--------------------|--------------------|
176+
| IPTables.LocalhostNodePorts | true | false |
177+
| BindAddressHardFail | false | true |
178+
179+
180+
### Following fields will be dropped
181+
| Key | Comments |
182+
|-----------|-----------------------------------------|
183+
| PortRange | dropped as no longer used by kube-proxy |
184+
130185

131186
### Test Plan
132187

@@ -137,12 +192,18 @@ to implement this enhancement.
137192
##### Prerequisite testing updates
138193

139194

140-
##### Unit tests
141-
195+
##### Unit tests
196+
New tests will be added and existing ones modified in the following packages:
197+
- `k8s.io/kubernetes/cmd/kubeadm/app/componentconfigs`: `2024-01-21` - `76%`
198+
- `k8s.io/kubernetes/cmd/kubeadm/app/phases/addons/proxy`: `2024-01-21` - `78%`
199+
- `k8s.io/kubernetes/cmd/kubeadm/app/util/config`: `2024-01-21` - `70.5%`
200+
- `k8s.io/kubernetes/cmd/kubeadm/app/util/config/strict`: `2024-01-21` - `100%`
201+
- `k8s.io/kubernetes/cmd/kube-proxy/app`: `2024-01-21` - `43.6%`
202+
- `k8s.io/kubernetes/pkg/proxy/apis/config/scheme`: `2024-01-21` - `100%`
203+
- `k8s.io/kubernetes/pkg/proxy/apis/config/validation`: `2024-01-21` - `84.2%`
142204

143205
##### Integration tests
144206

145-
146207
##### e2e tests
147208

148209
### Graduation Criteria
@@ -156,9 +217,19 @@ The config should be considered graduated to beta if it:
156217

157218
### Upgrade / Downgrade Strategy
158219

220+
Users are able to use the `v1alpha1` or `v1alpha2` API. Since they only affect the
221+
configuration of the proxy, there is no impact to running workloads.
222+
223+
The existing flags `--config` and `--write-config-to` can be used to convert any existing
224+
v1alpha1 to v1alpha2 kube-proxy configuration. `--config` can consume and decode any
225+
supported version; `--write-config-to` always writes using the latest version.
226+
```bash
/usr/local/bin/kube-proxy --config old-v1alpha1.yaml --write-config-to new-v1alpha2.yaml
```
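
After running the conversion above, it may be useful to confirm which API version the written file declares. The helper below is a minimal sketch: `config_api_version` is a hypothetical name, and it assumes the file has a top-level `apiVersion:` key on its own line, as kube-proxy configuration files normally do.

```bash
# Hypothetical helper: print the apiVersion declared in a kube-proxy config file.
# Assumes a top-level "apiVersion:" key at the start of a line.
config_api_version() {
  # Print the value of the first apiVersion key, with any quotes stripped.
  awk '/^apiVersion:/ { print $2; exit }' "$1" | tr -d "\"'"
}

# Usage (file name taken from the command above):
#   config_api_version new-v1alpha2.yaml
```

If the conversion worked, the printed value should end in `v1alpha2`; the exact API group string is not stated in this KEP, so the sketch does not hard-code it.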
159229

160230
### Version Skew Strategy
161231

232+
N/A
162233

163234
## Production Readiness Review Questionnaire
164235

@@ -168,81 +239,66 @@ The config should be considered graduated to beta if it:
168239

169240
###### How can this feature be enabled / disabled in a live cluster?
170241

171-
<!--
172-
Pick one of these and delete the rest.
173-
174-
Documentation is available on [feature gate lifecycle] and expectations, as
175-
well as the [existing list] of feature gates.
176-
177-
[feature gate lifecycle]: https://git.k8s.io/community/contributors/devel/sig-architecture/feature-gates.md
178-
[existing list]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
179-
-->
180-
181-
- [ ] Feature gate (also fill in values in `kep.yaml`)
182-
- Feature gate name:
183-
- Components depending on the feature gate:
184-
- [x] Other
185-
- Describe the mechanism:
186-
- Will enabling / disabling the feature require downtime of the control
187-
plane?
188-
- Will enabling / disabling the feature require downtime or reprovisioning
189-
of a node?
242+
Operators can use the config API via the `--config` command-line flag for kube-proxy.
243+
To disable, operators can remove the `--config` flag and use other command-line flags
244+
to configure the proxy.
190245

191246
###### Does enabling the feature change any default behavior?
192247

193-
<!--
194-
Any change of default behavior may be surprising to users or break existing
195-
automations, so be extremely careful here.
196-
-->
248+
No
197249

198250
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
199251

200-
<!--
201-
Describe the consequences on existing workloads (e.g., if this is a runtime
202-
feature, can it break the existing applications?).
203-
204-
Feature gates are typically disabled by setting the flag to `false` and
205-
restarting the component. No other changes should be necessary to disable the
206-
feature.
207-
208-
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
209-
-->
252+
Yes, by removing the `--config` command-line flag from kube-proxy.
210253

211254
###### What happens if we reenable the feature if it was previously rolled back?
212255

256+
N/A
257+
213258
###### Are there any tests for feature enablement/disablement?
214259

215-
<!--
216-
The e2e framework does not currently support enabling or disabling feature
217-
gates. However, unit tests in each component dealing with managing data, created
218-
with and without the feature, are necessary. At the very least, think about
219-
conversion tests if API types are being modified.
260+
The e2e framework does not currently support changing configuration files.
220261

221-
Additionally, for features that are introducing a new API field, unit tests that
222-
are exercising the `switch` of feature gate itself (what happens if I disable a
223-
feature gate after having objects written with the new field) are also critical.
224-
You can take a look at one potential example of such test in:
225-
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
226-
-->
262+
There are extensive unit tests for all the API versions.
227263

228264
### Rollout, Upgrade and Rollback Planning
229265

230266
###### How can a rollout or rollback fail? Can it impact already running workloads?
231267

268+
A malformed configuration will cause the proxy to fail to start. Running
269+
workloads are not affected.
270+
232271
###### What specific metrics should inform a rollback?
233272

273+
- `sync_proxy_rules_duration_seconds` reporting no samples or unusually high values.
274+
- Spike in any of the following metrics:
275+
- `network_programming_duration_seconds`
276+
- `sync_proxy_rules_endpoint_changes_pending`
277+
- `sync_proxy_rules_service_changes_pending`
278+
- A spike in the difference between `sync_proxy_rules_last_queued_timestamp_seconds` and `sync_proxy_rules_last_timestamp_seconds`
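
The timestamp difference in the last item can be computed from a saved metrics scrape. The following is a minimal sketch, assuming the scrape is in Prometheus text format and that both gauges appear as unlabeled samples; `sync_queue_lag` is a hypothetical name. Metric names are matched by suffix, so any component prefix on the exposed name does not matter.

```bash
# Hypothetical helper: print (last queued - last synced) in seconds from a
# Prometheus text-format scrape saved to a file.
# Assumes both gauges are present as unlabeled samples.
sync_queue_lag() {
  awk '
    # "# HELP" and "# TYPE" lines are skipped because their first field is "#".
    $1 ~ /sync_proxy_rules_last_queued_timestamp_seconds$/ { queued = $2 }
    $1 ~ /sync_proxy_rules_last_timestamp_seconds$/        { synced = $2 }
    END { printf "%.3f\n", queued - synced }
  ' "$1"
}

# Usage: sync_queue_lag metrics.txt
```

A value that stays large and positive would suggest that queued syncs are not completing, which is the condition the item above describes.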
279+
234280
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
235281

282+
N/A
283+
236284
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
237285

286+
No.
287+
238288
### Monitoring Requirements
239289

240290
###### How can an operator determine if the feature is in use by workloads?
241291

292+
N/A
293+
242294
###### How can someone using this feature know that it is working for their instance?
243295

296+
N/A
297+
244298
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
245299

300+
N/A
301+
246302
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
247303

248304
- [ ] Metrics
@@ -271,17 +327,15 @@ No.
271327

272328
###### Will enabling / using this feature result in introducing new API types?
273329

274-
Yes.
275-
[WIP]
330+
Yes, `v1alpha2` will be introduced for kube-proxy.
276331

277332
###### Will enabling / using this feature result in any new calls to the cloud provider?
278333

279334
No.
280335

281336
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
282337

283-
Yes.
284-
[WIP]
338+
No.
285339

286340
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
287341

@@ -300,19 +354,27 @@ No.
300354

301355
###### How does this feature react if the API server and/or etcd is unavailable?
302356

357+
N/A
303358

304359
###### What are other known failure modes?
305360

361+
None.
306362

307363
###### What steps should be taken if SLOs are not being met to determine the problem?
308364

365+
N/A
309366

310367
## Implementation History
311-
368+
- 2019-09-20: KEP introduced with motivation.
369+
- 2023-11-17: KEP for v1alpha2 configuration sent for review, including proposal,
370+
test plan, and PRR questionnaire.
312371

313372
## Drawbacks
314373

374+
N/A
375+
315376
## Alternatives
316377

317-
## Infrastructure Needed (Optional)
378+
N/A
318379

380+
## Infrastructure Needed (Optional)

keps/sig-network/784-kube-proxy-component-config/kep.yaml

Lines changed: 4 additions & 3 deletions
@@ -1,4 +1,4 @@
1-
title: Kube Proxy component configuration graduation
1+
title: Kube Proxy component configuration updates and graduation
22
kep-number: 784
33
authors:
44
- "@rosti"
@@ -8,17 +8,18 @@ participating-sigs:
88
- sig-cluster-lifecycle
99
- sig-api-machinery
1010
- wg-component-standard
11-
status: provisional
11+
status: implementable
1212
creation-date: 2019-06-13
1313
reviewers:
1414
- "@danwinship"
1515
- "@aojea"
1616
- "@thockin"
17+
- "@neolit123"
1718
approvers:
1819
- "@danwinship"
1920
- "@aojea"
2021
- "@thockin"
21-
22+
- "@neolit123"
2223
see-also: []
2324
replaces: []
2425
