Skip to content

Commit fba79ff

Browse files
authored
Merge pull request kubernetes#2883 from rikatz/portranges-to-ga
KEP-2079: Move portRanges to GA
2 parents 321b82a + a462ca3 commit fba79ff

File tree

3 files changed

+113
-60
lines changed

3 files changed

+113
-60
lines changed

keps/prod-readiness/sig-network/2079.yaml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,3 +3,5 @@ alpha:
33
approver: "@wojtek-t"
44
beta:
55
approver: "@wojtek-t"
6+
stable:
7+
approver: "@wojtek-t"

keps/sig-network/2079-network-policy-port-range/README.md

Lines changed: 107 additions & 55 deletions
Original file line numberDiff line numberDiff line change
@@ -16,10 +16,13 @@
1616
- [Design Details](#design-details)
1717
- [Validations](#validations)
1818
- [Test Plan](#test-plan)
19+
- [Prerequisite testing updates](#prerequisite-testing-updates)
20+
- [Unit tests](#unit-tests)
21+
- [e2e tests](#e2e-tests)
1922
- [Graduation Criteria](#graduation-criteria)
2023
- [Alpha](#alpha)
2124
- [Beta](#beta)
22-
- [GA Graduation](#ga-graduation)
25+
- [GA](#ga)
2326
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
2427
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
2528
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
@@ -138,7 +141,7 @@ and also to the range 49152-65535 without allowing any other ports.
138141

139142
### Notes/Constraints/Caveats
140143

141-
* The technology used by the CNI provider might not support port range in a
144+
* The technology used by the CNI provider might not support port range in a
142145
trivial way as described in [#drawbacks]
143146

144147
### Risks and Mitigations
@@ -151,25 +154,25 @@ of the new field.
151154

152155
API changes to NetworkPolicy:
153156
* Add a new field called `EndPort` inside `NetworkPolicyPort` as the following:
154-
```
157+
```go
155158
// NetworkPolicyPort describes a port to allow traffic on
156159
type NetworkPolicyPort struct {
157-
// The protocol (TCP, UDP, or SCTP) which traffic must match. If not specified, this
158-
// field defaults to TCP.
159-
// +optional
160-
Protocol *v1.Protocol `json:"protocol,omitempty" protobuf:"bytes,1,opt,name=protocol,casttype=k8s.io/api/core/v1.Protocol"`
161-
162-
// The port on the given protocol. This can either be a numerical or named
160+
// The protocol (TCP, UDP, or SCTP) which traffic must match. If not specified, this
161+
// field defaults to TCP.
162+
// +optional
163+
Protocol *v1.Protocol `json:"protocol,omitempty" protobuf:"bytes,1,opt,name=protocol,casttype=k8s.io/api/core/v1.Protocol"`
164+
165+
// The port on the given protocol. This can either be a numerical or named
163166
// port on a pod. If this field is not provided, this matches all port names and
164167
// numbers, whether an endPort is defined or not.
165-
// +optional
166-
Port *intstr.IntOrString `json:"port,omitempty" protobuf:"bytes,2,opt,name=port"`
167-
168-
// EndPort defines the last port included in the port range.
169-
// Example:
170-
// endPort: 12345
171-
// +optional
172-
EndPort int32 `json:"port,omitempty" protobuf:"bytes,2,opt,name=endPort"`
168+
// +optional
169+
Port *intstr.IntOrString `json:"port,omitempty" protobuf:"bytes,2,opt,name=port"`
170+
171+
// EndPort defines the last port included in the port range.
172+
// Example:
173+
// endPort: 12345
174+
// +optional
175+
EndPort int32 `json:"port,omitempty" protobuf:"bytes,2,opt,name=endPort"`
173176
}
174177
```
175178

@@ -181,6 +184,12 @@ The `NetworkPolicyPort` will need to be validated, with the following scenarios:
181184

182185
### Test Plan
183186

187+
[X] I/we understand the owners of the involved components may require updates to
188+
existing tests to make this code solid enough prior to committing the changes necessary
189+
to implement this enhancement.
190+
191+
##### Prerequisite testing updates
192+
184193
Unit tests:
185194
* test API validation logic
186195
* test API strategy to ensure disabled fields
@@ -189,6 +198,16 @@ E2E tests:
189198
* Add e2e tests exercising only the API operations for port ranges. Data-path
190199
validation should be done by CNIs.
191200

201+
##### Unit tests
202+
203+
- `pkg/apis/networking/validation/validation`: `14/Jun/2022` - `92.5%`
204+
- `pkg/registry/networking/networkpolicy/strategy`: `14/Jun/2022` - `75.9%`
205+
206+
##### e2e tests
207+
208+
- Feature:NetworkPolicyEndPort: https://storage.googleapis.com/k8s-triage/index.html?text=EndPort#eaa4b8cdb7b461dccfa9
209+
210+
The flakes shown here are not related to this feature, per the tests logs
192211

193212
### Graduation Criteria
194213

@@ -203,11 +222,17 @@ validation should be done by CNIs.
203222
with generally positive feedback on its usage.
204223
- Feature Gate is enabled by Default.
205224

206-
#### GA Graduation
225+
#### GA
207226

208227
- At least **four** NetworkPolicy providers (or CNI providers) support the `EndPort` field
209228
- `EndPort` has been enabled by default for at least 1 minor release
210229

230+
The following are the CNIs that implement this feature:
231+
- Calico
232+
- Antrea
233+
- Openshift SDN
234+
- Kuberouter
235+
211236
### Upgrade / Downgrade Strategy
212237

213238
If upgraded no impact should happen as this is a new field.
@@ -221,17 +246,16 @@ start working incorrectly. This is a fail-closed failure, so it is acceptable.
221246
### Feature Enablement and Rollback
222247

223248

224-
* **How can this feature be enabled / disabled in a live cluster?**
249+
###### How can this feature be enabled / disabled in a live cluster?
225250
- [X] Feature gate (also fill in values in `kep.yaml`)
226251
- Feature gate name: NetworkPolicyEndPort
227252
- Components depending on the feature gate: Kubernetes API Server
228253

229-
* **Does enabling the feature change any default behavior?**
254+
###### Does enabling the feature change any default behavior?
230255
No
231256

232-
* **Can the feature be disabled once it has been enabled (i.e. can we roll back
233-
the enablement)?**
234-
257+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
258+
235259
Yes. One caveat here is that NetworkPolicies created with EndPort field set
236260
when the feature was enabled will continue to have that field set when the
237261
feature is disabled unless user removes it from the object.
@@ -247,57 +271,82 @@ start working incorrectly. This is a fail-closed failure, so it is acceptable.
247271
port range, which may break users, which is inevitable but satisfies the
248272
fail-closed requirement.
249273

250-
* **What happens if we reenable the feature if it was previously rolled back?**
274+
###### What happens if we reenable the feature if it was previously rolled back?
275+
251276
Nothing.
252277

253-
* **Are there any tests for feature enablement/disablement?**
278+
###### Are there any tests for feature enablement/disablement?
254279

255280
Yes and they can be found [here](https://github.com/kubernetes/kubernetes/blob/release-1.21/pkg/registry/networking/networkpolicy/strategy_test.go#L284)
256281

257282
### Rollout, Upgrade and Rollback Planning
258283

259-
_This section must be completed when targeting beta graduation to a release._
260-
* **How can a rollout fail? Can it impact already running workloads?**
284+
###### How can a rollout or rollback fail? Can it impact already running workloads?
261285
Not probably, but still there's the risk of some bug that fails validation,
262286
or conversion function crashes.
263287

264-
* **What specific metrics should inform a rollback?**
288+
###### What specific metrics should inform a rollback?
265289
The increase of 5xx http error count on Network Policies Endpoint
266290

267-
* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
291+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
292+
268293
Yes, with unit tests.
269-
There's still some need to make manual tests, that will be done in a follow up.
294+
Manual tests were also executed as the following:
295+
* Created a KinD cluster in v1.24 and Calico as a CNI
296+
* Created a Network Policy with `endPort` field to allow a Pod egress to ports from 70 to 90
297+
* Did a test against a target in port 80 - Worked
298+
* Disabled the Feature Gate
299+
* The Network Policy still worked fine
300+
* Changed the Network Policy so the range is 70 to 79, and the Network Policy was changed fine
301+
* Traffic started to be blocked, but could call port 78 as it is still within range
302+
* Removed `endPort` field, and wasn't able to re-add it as Feature gate was disabled
303+
* Re-enabled feature gate
304+
* Re-added `endPort` field with value of 90
305+
* Traffic started to flow/be accepted again
306+
307+
Per the manual tests, all worked as desired.
308+
309+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
270310

271-
* **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
272311
None
273312

274313
### Monitoring Requirements
275314

276-
_This section must be completed when targeting beta graduation to a release._
277-
* **How can an operator determine if the feature is in use by workloads?**
315+
###### How can an operator determine if the feature is in use by workloads?
316+
278317

279318
Operators can determine if NetworkPolicies are making use of EndPort creating
280319
an object specifying the range and validating if the traffic is allowed within
281-
the specified range
320+
the specified range.
321+
322+
Also Network Policy object now supports (as alpha) status/condition fields, so
323+
Network Policy providers can add a feedback to the user whether the policy was processed
324+
correctly or not. Providing this feedback is optional and depends on implementation
325+
by each NPP.
326+
327+
###### How can someone using this feature know that it is working for their instance?
282328

283-
* **How can someone using this feature know that it is working for their instance?
284329
- [x] Other
285330
- Details:
286331
The API Field must be present when a NetworkPolicy is created with that field.
287332
The feature working correctly depends on the CNI implementation, so the operator can
288333
look into CNI metrics to check if the rules are being applied correctly, like Calico
289334
that provides metrics like `felix_iptables_restore_errors` that can be used to
290335
verify if the amount of restoring errors raised after the feature being applied.
291-
We might need in a future to add some Status field that allows CNI providers to provide
292-
feedback about the functionality
336+
For NetworkPolicy Providers that doesn't support this feature, a new status field was added
337+
in Network Policy object allowing the providers to give feedback to users using conditions.
338+
Any NPP that does not support this feature should add a condition on the Network Policy
339+
object.
293340

294-
* **What are the SLIs (Service Level Indicators) an operator can use to determine
295-
the health of the service?**
341+
342+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
343+
296344
Operators can use metrics provided by the CNI to use as SLI, like
297345
`felix_iptables_restore_errors` from Calico to verify if the errors rate
298346
has raised.
299347

300-
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
348+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
349+
301350
- per-day percentage of API calls finishing with 5XX errors <= 1% is a reasonable SLO
302351

303352
* **Are there any missing metrics that would be useful to have to improve observability
@@ -307,52 +356,55 @@ of this feature?**
307356

308357
### Dependencies
309358

310-
* **Does this feature depend on any specific services running in the cluster?**
311-
Yes, a CNI supporting the new feature
359+
###### Does this feature depend on any specific services running in the cluster?
312360

361+
Yes, a CNI supporting the new feature
313362

314363
### Scalability
315364

316-
* **Will enabling / using this feature result in any new API calls?**
365+
###### Will enabling / using this feature result in any new API calls?
317366
No
318367

319-
* **Will enabling / using this feature result in introducing new API types?**
368+
###### Will enabling / using this feature result in introducing new API types?
369+
320370
No
321371

322-
* **Will enabling / using this feature result in any new calls to the cloud
323-
provider?**
372+
###### Will enabling / using this feature result in any new calls to the cloud provider?
373+
324374
No
325375

326-
* **Will enabling / using this feature result in increasing size or count of
327-
the existing API objects?**
376+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
377+
328378

329379
- API type(s): NetworkPolicyPorts
330380
- Estimated increase in size: 2 bytes for each new `EndPort` value specified + the field name/number in its serialized format
331381
- Estimated amount of new objects: N/A
332382

333-
* **Will enabling / using this feature result in increasing time taken by any
334-
operations covered by [existing SLIs/SLOs]?**
383+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
384+
335385
N/A
336386

337-
* **Will enabling / using this feature result in non-negligible increase of
338-
resource usage (CPU, RAM, disk, IO, ...) in any components?**
387+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
339388
It might get some increase of resource usage by the CNI while parsing the
340389
new field.
341390

342391
### Troubleshooting
343392

344-
* **How does this feature react if the API server and/or etcd is unavailable?**
393+
###### How does this feature react if the API server and/or etcd is unavailable?
394+
345395
As this feature is mainly used by CNI providers, the reaction with API server
346396
and/or etcd being unavailable will be the same as before.
347397

348-
* **What are other known failure modes?**
398+
###### What are other known failure modes?
349399
N/A
350400

351-
* **What steps should be taken if SLOs are not being met to determine the problem?**
401+
###### What steps should be taken if SLOs are not being met to determine the problem?
402+
352403
Remove EndPort field and check if the number of errors reduce, although this might
353404
lead to undesired Network Policy, blocking previously working rules.
354405

355406
## Implementation History
407+
- 2022-06-14 Propose GA graduation
356408
- 2021-05-11 Propose Beta graduation and add more Performance Review data
357409
- 2020-10-08 Initial [KEP PR](https://github.com/kubernetes/enhancements/pull/2079)
358410

keps/sig-network/2079-network-policy-port-range/kep.yaml

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -13,24 +13,23 @@ approvers:
1313
- "@thockin"
1414

1515
# The target maturity stage in the current dev cycle for this KEP.
16-
stage: beta
16+
stage: stable
1717

1818
# The most recent milestone for which work toward delivery of this KEP has been
1919
# done. This can be the current (upcoming) milestone, if it is being actively
2020
# worked on.
21-
latest-milestone: "v1.22"
21+
latest-milestone: "v1.25"
2222

2323
# The milestone at which this feature was, or is targeted to be, at each stage.
2424
milestone:
2525
alpha: "v1.21"
2626
beta: "v1.22"
27-
stable: "v1.23"
27+
stable: "v1.25"
2828

2929
# The following PRR answers are required at alpha release
3030
# List the feature gate name and the components for which it must be enabled
3131
feature-gates:
3232
- name: NetworkPolicyEndPort
3333
components:
3434
- kube-apiserver
35-
disable-supported: true
36-
35+
disable-supported: true

0 commit comments

Comments
 (0)