Commit be1770b

Merge pull request kubernetes#3196 from bridgetkromhout/1435-prr
KEP-1435: adding PRR for MixedProtocolLBService feature to move to beta
2 parents 9ffa073 + 98180d2 commit be1770b

3 files changed: +35 -20 lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+kep-number: 1435
+beta:
+  approver: "ehashman"

keps/sig-network/1435-mixed-protocol-lb/README.md

Lines changed: 27 additions & 16 deletions
@@ -297,7 +297,7 @@ Our feature does not introduce new values or new fields. It enables the usage of
 - Alibaba: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. If this feature is enabled in an API server and then the API server rollback is executed the CPI can still handle the Services with mixed protocol sets.
 - AWS: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. The situation is the same as with the Alibaba CPI.
 - Azure: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. The situation is the same as with the Alibaba CPI.
-- GCE: currently the GCE CPI assumes that a Service definition contains a single protocol value, as it assumes that the Service Controller already rejected Services with mixed protocols. While the Service Controller really did so a while ago, it does not do this anymore. It means a risk.
+- GCE: currently the GCE CPI assumes that a Service definition contains a single protocol value, as it assumes the apiserver already rejects Services with mixed protocols during validation. While the Service Controller really did so a while ago, it does not do this anymore. It means a risk. However, Google team members stated in the enhancement issue that we could proceed.
 - DigitalOcean: no risk. The current CPI accepts Services with TCP protocol only, i.e. after a K8s upgrade a user still cannot use this feature. Consequently, a rollback in the K8s version does not introduce any issues.
 - IBM Cloud VPC: no risk. The same situation like in the case of AWS.
 - IBM Cloud Classic: no risk. The CPI and NLB already supports TCP and UDP in the same Service definition. The same situation like in the case of Alibaba.
@@ -471,7 +471,7 @@ In the long term:

 ### Kube-proxy

-The kube-proxy should use the port status information from `Service.status.loadBalancer.ingress` in order not to allow traffic to those ports that could not be opened by the load balancer either.
+Kube-proxy will not block traffic based on the port status information from `Service.status.loadBalancer.ingress` that a load balancer controller sets. Because multi-protocol is a feature, clusterIP and NodePort traffic will work independent of the load balancer. Because some load balancers use NodePort and some use VIP_like LB traffic, there may exist packets to LBIP:LBPORT/wrong-protocol. Traffic will not be blocked based on variation in this implementation detail.


 ### Test Plan
@@ -484,9 +484,9 @@ Optionally, if the CPI supports that:

 ### Graduation Criteria

-From end user's perspective the graduation criteria are the feecback/bug correction and testing based.
+From end user's perspective the graduation criteria are feedback/bug correction and testing based.

-From CPI implementation perspective thet feature can be graduated to beta, as the cloud providers with managed K8s products can still decide whether they activate it for their managed clusters or not, depending on the status of their CPI implementation.
+From CPI implementation perspective the feature can be graduated to beta, as the cloud providers with managed K8s products can still decide whether they activate it for their managed clusters or not, depending on the status of their CPI implementation.

 Graduating to GA means, that the feature flag checking is removed from the code. It means, that all CPI implementations must be ready to deal with Services with mixed protocol configuration - either rejecting such Services properly or managing the cloud load balancers according to the Service definition.

@@ -497,7 +497,6 @@ Graduating to GA means, that the feature flag checking is removed from the code.
 #### Alpha -> Beta Graduation

 - All of the major clouds support this or indicate non-support properly
-- Kube-proxy does not proxy on ports that are in an error state

 #### Beta -> GA Graduation

@@ -531,12 +530,12 @@ _This section must be completed when targeting alpha to a release._

 * **How can this feature be enabled / disabled in a live cluster?**
 - [x] Feature gate (also fill in values in `kep.yaml`)
-- Feature gate name: MixedProtocolLBSVC
+- Feature gate name: MixedProtocolLBService
 - Components depending on the feature gate: Kubernetes API Server

 * **Does enabling the feature change any default behavior?**

-When the feature is enabled the Services with mixed protocols are not rejected anymore by the Kuber API server, and it is up to the CPI to handle those.
+When the feature is enabled the Services with mixed protocols are not rejected anymore by the Kubernetes API server, and it is up to the CPI to handle those.
 Please see the analysis at `API change and upgrade/downgrade situations`

 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
@@ -558,23 +557,29 @@ _This section must be completed when targeting beta graduation to a release._

 * **How can a rollout fail? Can it impact already running workloads?**

-TBD
+Enabling this feature gate moves the responsibility for handling mixed protocols from the Kubernetes API server to the specific CPI. See the provider-specific details at [API change and upgrade/downgrade situations](#api-change-and-upgradedowngrade-situations).

 * **What specific metrics should inform a rollback?**

-TBD
+As all providers either already have adjusted their error messages or intend to, enabling this feature gate may lead to a CPI error. If load balancer traffic only works correctly for one protocol and not for the other, that is another reason to roll back. In such cases, remediation would be achieved by disabling the feature gate.

 * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**

-TBD
+This feature uses an existing value in existing fields, but with a different logic. If we create a Service with mixed protocol and then we roll back the API server to a version that does not implement this feature, the clients will still get the Service with mixed protocols when using the rolled-back API. If the client (CPI implementation) has been roll backed as well, then the client may encounter a Service setup that it does not support. All clouds either support this or adjusted their errors reported, per the enhancement issue.
+

 ### Monitoring Requirements

 _This section must be completed when targeting beta graduation to a release._

 * **How can an operator determine if the feature is in use by workloads?**

-TBD
+After checking to see if a Service of `type:LoadBalancer` that uses two different protocols on the same port was created and validated by the API server, the operator can then check the specifics of the [API change and upgrade/downgrade situations](#api-change-and-upgradedowngrade-situations) for their specific cloud provider; this covers object persistence and availability.
+
+The e2e tests shall check that
+- a multi-protocol Service triggers the creation of a multi-protocol cloud load balancer
+Optionally, if the CPI supports that:
+- the CPI sets the new Conditions and or Port Status in the Load Balancer Service after creating the cloud load balancer

 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 the health of the service?**
@@ -587,12 +592,13 @@ the health of the service?**

 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

-TBD
+Before this KEP, expectations of performance for this specific feature were consistent across providers. In this KEP we loosen validation on the Services API, which allows the creation of Services with two protocols where before the API server only allowed one protocol per Service. Depending on CPI and cloud provider API implementation, setup of additional ports or listeners with different protocols may take more time than was previously required for single-protocol Services.
+

 * **Are there any missing metrics that would be useful to have to improve observability
 of this feature?**

-TBD
+N/A

 ### Dependencies

@@ -606,7 +612,7 @@ _This section must be completed when targeting beta graduation to a release._

 * **Will enabling / using this feature result in any new API calls?**

-If a CPI supports the management of the new Conditions and PortStatus in the LoadBalancer Service the managemenof of those fileds will mean additional traffic on the API
+If a CPI supports the management of the new Conditions and PortStatus in the LoadBalancer Service the management of those fields will mean additional traffic on the API

 * **Will enabling / using this feature result in introducing new API types?**

@@ -636,20 +642,25 @@ resource usage (CPU, RAM, disk, IO, ...) in any components?**

 * **How does this feature react if the API server and/or etcd is unavailable?**

+The CPI sets the Conditions and/or PortStatus on the Service.Status object. If the API service is not available, the CPI cannot update the status. If the CPI updates the status and then later the API server becomes unavailable, the Status is stored with the Service object and will be available the next time the API server starts.
+
+It's possible for the CPI to set the new Conditions and/or PortStatus in the Load Balancer Service after creating the cloud load balancer, but this feature will not be triggerable without API server response.
+
 * **What are other known failure modes?**

-TBD
+Cloud providers will need to provide their intended responses; in most cases they intend to initially indicate non-support, then add support later.

 * **What steps should be taken if SLOs are not being met to determine the problem?**

-TBD
+Enabling this feature gate moves the responsibility for handling mixed protocols from the Kubernetes API server to the specific CPI. Depending on CPI and cloud provider API implementation, setup of additional ports or listeners with different protocols may take more time than was previously required for single-protocol Services. Diagnosis for any performance impacts will be specific to the out-of-tree cloud providers.

 [supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
 [existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos

 ## Implementation History

 - the `Proposal` section being merged, signaling agreement on a proposed design: 14th July 2020
+- Move from alpha to beta after agreement from all listed providers, April 2022

 ## Drawbacks

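The monitoring answer in the README diff above suggests first checking whether any Service of `type: LoadBalancer` with mixed protocols exists, and the KEP also mentions optional per-port status set by the CPI. As a minimal sketch of that check (not part of this commit), the Python below scans Service objects shaped like the JSON from `kubectl get services --all-namespaces -o json`. The function names, the sample data, and the error string are hypothetical; only the `spec.type`, `spec.ports[].protocol`, and `status.loadBalancer.ingress[].ports[]` field paths are taken from the Service API.

```python
def mixed_protocol_lb_services(service_list):
    """Return (namespace, name, protocols) for each LoadBalancer Service
    whose ports use more than one distinct protocol."""
    found = []
    for svc in service_list.get("items", []):
        spec = svc.get("spec", {})
        if spec.get("type") != "LoadBalancer":
            continue
        # An omitted protocol is treated as TCP, the API default.
        protocols = sorted({p.get("protocol", "TCP") for p in spec.get("ports", [])})
        if len(protocols) > 1:
            meta = svc.get("metadata", {})
            found.append((meta.get("namespace"), meta.get("name"), protocols))
    return found


def port_errors(svc):
    """Return (port, protocol, error) for every port a CPI marked with an error."""
    ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return [
        (p.get("port"), p.get("protocol"), p["error"])
        for ing in ingress
        for p in ing.get("ports") or []
        if p.get("error")
    ]


# Hypothetical sample data; the error string is invented for illustration.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "dns"},
            "spec": {
                "type": "LoadBalancer",
                "ports": [
                    {"name": "dns-udp", "port": 53, "protocol": "UDP"},
                    {"name": "dns-tcp", "port": 53, "protocol": "TCP"},
                ],
            },
            "status": {
                "loadBalancer": {
                    "ingress": [
                        {
                            "ip": "192.0.2.10",
                            "ports": [
                                {"port": 53, "protocol": "TCP"},
                                {
                                    "port": 53,
                                    "protocol": "UDP",
                                    "error": "example.com/UnsupportedProtocol",
                                },
                            ],
                        }
                    ]
                }
            },
        },
        {
            "metadata": {"namespace": "default", "name": "web"},
            "spec": {
                "type": "LoadBalancer",
                "ports": [{"port": 80}, {"port": 443, "protocol": "TCP"}],
            },
        },
    ]
}

for namespace, name, protocols in mixed_protocol_lb_services(SAMPLE):
    print(f"{namespace}/{name} uses {', '.join(protocols)}")
```

To run this against a live cluster, one option is to save the `kubectl` output to a file, load it with `json.load`, and pass the result to `mixed_protocol_lb_services`. Whether `port_errors` returns anything depends on the CPI, since the KEP describes setting Conditions and PortStatus as optional.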
keps/sig-network/1435-mixed-protocol-lb/kep.yaml

Lines changed: 5 additions & 4 deletions
@@ -2,6 +2,7 @@ title: Different protocols in the same service definition with type=loadbalancer
22
kep-number: 1435
33
authors:
44
5+
- "@bridgetkromhout"
56
owning-sig: sig-network
67
participating-sigs:
78
- sig-cloud-provider
@@ -19,18 +20,18 @@ replaces:
 - "/keps/sig-network/ 20200103-mixed-protocol-lb"

 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta

 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.20"
+latest-milestone: "v1.24"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.20"
-  beta: "v1.21"
-  stable: "v1.22"
+  beta: "v1.24"
+  stable: "TBD"

 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
