keps/sig-network/1435-mixed-protocol-lb/README.md (27 additions, 16 deletions)
@@ -297,7 +297,7 @@ Our feature does not introduce new values or new fields. It enables the usage of
 - Alibaba: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. If this feature is enabled in an API server and then the API server rollback is executed the CPI can still handle the Services with mixed protocol sets.
 - AWS: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. The situation is the same as with the Alibaba CPI.
 - Azure: no risk. The current CPI and LB already supports the mixed protocols in the same Service definition. The situation is the same as with the Alibaba CPI.
-- GCE: currently the GCE CPI assumes that a Service definition contains a single protocol value, as it assumes that the Service Controller already rejected Services with mixed protocols. While the Service Controller really did so a while ago, it does not do this anymore. It means a risk.
+- GCE: currently the GCE CPI assumes that a Service definition contains a single protocol value, as it assumes the apiserver already rejects Services with mixed protocols during validation. While the Service Controller really did so a while ago, it does not do this anymore. It means a risk. However, Google team members stated in the enhancement issue that we could proceed.
 - DigitalOcean: no risk. The current CPI accepts Services with TCP protocol only, i.e. after a K8s upgrade a user still cannot use this feature. Consequently, a rollback in the K8s version does not introduce any issues.
 - IBM Cloud VPC: no risk. The same situation like in the case of AWS.
 - IBM Cloud Classic: no risk. The CPI and NLB already supports TCP and UDP in the same Service definition. The same situation like in the case of Alibaba.
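The Service shape these providers have to handle is one that exposes several protocols behind the same cloud load balancer. A minimal sketch of such a Service (the name, selector, and ports are hypothetical, not taken from the KEP):

```yaml
# A type: LoadBalancer Service mixing TCP and UDP on the same port, which the
# relaxed validation described in this KEP allows the API server to accept.
apiVersion: v1
kind: Service
metadata:
  name: dns-mixed-example
spec:
  type: LoadBalancer
  selector:
    app: dns
  ports:
  - name: dns-tcp
    protocol: TCP
    port: 53
    targetPort: 5353
  - name: dns-udp
    protocol: UDP
    port: 53
    targetPort: 5353
```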
@@ -471,7 +471,7 @@ In the long term:
 
 ### Kube-proxy
 
-The kube-proxy should use the port status information from `Service.status.loadBalancer.ingress` in order not to allow traffic to those ports that could not be opened by the load balancer either.
+Kube-proxy will not block traffic based on the port status information from `Service.status.loadBalancer.ingress` that a load balancer controller sets. Because multi-protocol support is a feature of the Service itself, clusterIP and NodePort traffic works independently of the load balancer. Because some load balancers forward traffic via NodePorts and some deliver VIP-like LB traffic directly, packets may arrive at LBIP:LBPORT with the wrong protocol; traffic will not be blocked based on variation in this implementation detail.
 
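For reference, the per-port status a load balancer controller can publish on the Service, and which kube-proxy deliberately does not use for filtering, looks roughly like the sketch below (the address and the error reason are hypothetical):

```yaml
# Hypothetical status written by a CPI after provisioning: the TCP listener
# was created, the UDP listener was not, so the CPI records an error for it.
# kube-proxy still programs clusterIP and NodePort rules for both ports.
status:
  loadBalancer:
    ingress:
    - ip: 203.0.113.10
      ports:
      - port: 53
        protocol: TCP
      - port: 53
        protocol: UDP
        error: ProtocolNotSupported   # illustrative reason, not a defined constant
```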
 
 ### Test Plan
@@ -484,9 +484,9 @@ Optionally, if the CPI supports that:
 
 ### Graduation Criteria
 
-From end user's perspective the graduation criteria are the feecback/bug correction and testing based.
+From the end user's perspective, the graduation criteria are based on feedback, bug correction, and testing.
 
-From CPI implementation perspective thet feature can be graduated to beta, as the cloud providers with managed K8s products can still decide whether they activate it for their managed clusters or not, depending on the status of their CPI implementation.
+From the CPI implementation perspective, the feature can be graduated to beta, as the cloud providers with managed K8s products can still decide whether they activate it for their managed clusters or not, depending on the status of their CPI implementation.
 
 Graduating to GA means, that the feature flag checking is removed from the code. It means, that all CPI implementations must be ready to deal with Services with mixed protocol configuration - either rejecting such Services properly or managing the cloud load balancers according to the Service definition.
 
@@ -497,7 +497,6 @@ Graduating to GA means, that the feature flag checking is removed from the code.
 #### Alpha -> Beta Graduation
 
 - All of the major clouds support this or indicate non-support properly
-- Kube-proxy does not proxy on ports that are in an error state
 
 #### Beta -> GA Graduation
 
@@ -531,12 +530,12 @@ _This section must be completed when targeting alpha to a release._
 
 * **How can this feature be enabled / disabled in a live cluster?**
   - [x] Feature gate (also fill in values in `kep.yaml`)
-    - Feature gate name: MixedProtocolLBSVC
+    - Feature gate name: MixedProtocolLBService
     - Components depending on the feature gate: Kubernetes API Server
 
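As an illustration of how an operator might turn the gate on: the flag that matters is `--feature-gates=MixedProtocolLBService=true` on kube-apiserver. On a kubeadm-managed control plane that could be expressed roughly as below (kubeadm is shown only as an example; other distributions configure the API server differently):

```yaml
# Sketch of a kubeadm ClusterConfiguration enabling the gate on the API server.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "MixedProtocolLBService=true"
```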
 * **Does enabling the feature change any default behavior?**
 
-When the feature is enabled the Services with mixed protocols are not rejected anymore by the Kuber API server, and it is up to the CPI to handle those.
+When the feature is enabled, Services with mixed protocols are no longer rejected by the Kubernetes API server, and it is up to the CPI to handle those.
 Please see the analysis at `API change and upgrade/downgrade situations`
 
 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
@@ -558,23 +557,29 @@ _This section must be completed when targeting beta graduation to a release._
 
 * **How can a rollout fail? Can it impact already running workloads?**
 
-TBD
+Enabling this feature gate moves the responsibility for handling mixed protocols from the Kubernetes API server to the specific CPI. See the provider-specific details at [API change and upgrade/downgrade situations](#api-change-and-upgradedowngrade-situations).
 
 * **What specific metrics should inform a rollback?**
 
-TBD
+As all providers either have already adjusted their error messages or intend to, enabling this feature gate may lead to a CPI error. If load balancer traffic only works correctly for one protocol and not for the other, that is another reason to roll back. In such cases, remediation is achieved by disabling the feature gate.
 
 * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
 
-TBD
+This feature uses an existing value in existing fields, but with different logic. If we create a Service with mixed protocols and then roll back the API server to a version that does not implement this feature, clients will still get the Service with mixed protocols when using the rolled-back API. If the client (a CPI implementation) has been rolled back as well, it may encounter a Service setup that it does not support. All clouds either support this or have adjusted the errors they report, per the enhancement issue.
+
 
 ### Monitoring Requirements
 
 _This section must be completed when targeting beta graduation to a release._
 
 * **How can an operator determine if the feature is in use by workloads?**
 
-TBD
+After checking whether a Service of `type: LoadBalancer` that uses two different protocols on the same port was created and validated by the API server, the operator can check the specifics of [API change and upgrade/downgrade situations](#api-change-and-upgradedowngrade-situations) for their specific cloud provider; this covers object persistence and availability.
+
+The e2e tests shall check that
+- a multi-protocol Service triggers the creation of a multi-protocol cloud load balancer
+Optionally, if the CPI supports that:
+- the CPI sets the new Conditions and/or Port Status in the Load Balancer Service after creating the cloud load balancer
 
 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 the health of the service?**
@@ -587,12 +592,13 @@ the health of the service?**
 
 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
 
-TBD
+Before this KEP, expectations of performance for this specific feature were consistent across providers. In this KEP we loosen validation on the Services API, which allows the creation of Services with two protocols where the API server previously allowed only one protocol per Service. Depending on the CPI and cloud provider API implementation, setting up additional ports or listeners with different protocols may take more time than was previously required for single-protocol Services.
+
 
 * **Are there any missing metrics that would be useful to have to improve observability
 of this feature?**
 
-TBD
+N/A
 
 ### Dependencies
 
@@ -606,7 +612,7 @@ _This section must be completed when targeting beta graduation to a release._
 
 * **Will enabling / using this feature result in any new API calls?**
 
-If a CPI supports the management of the new Conditions and PortStatus in the LoadBalancer Service the managemenof of those fileds will mean additional traffic on the API
+If a CPI supports the management of the new Conditions and PortStatus in the LoadBalancer Service, the management of those fields will mean additional traffic on the API.
 
 * **Will enabling / using this feature result in introducing new API types?**
 
@@ -636,20 +642,25 @@ resource usage (CPU, RAM, disk, IO, ...) in any components?**
 
 * **How does this feature react if the API server and/or etcd is unavailable?**
 
+The CPI sets the Conditions and/or PortStatus on the Service.Status object. If the API server is not available, the CPI cannot update the status. If the CPI updates the status and the API server later becomes unavailable, the status is stored with the Service object and will be available the next time the API server starts.
+
+It is possible for the CPI to set the new Conditions and/or PortStatus in the Load Balancer Service after creating the cloud load balancer, but this cannot be triggered without an API server response.
+
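For the Conditions mentioned above, a rough sketch of what a CPI might record when it cannot honor a mixed-protocol Service is shown below; the condition type, reason, and message are illustrative, not constants mandated by this KEP:

```yaml
# Hypothetical Service status written by a CPI that does not support
# TCP and UDP behind the same cloud load balancer.
status:
  conditions:
  - type: LoadBalancerPortsError
    status: "True"
    reason: MixedProtocolNotSupported
    message: "the cloud load balancer does not support TCP and UDP on the same port"
    lastTransitionTime: "2021-03-01T12:00:00Z"
```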
 
 * **What are other known failure modes?**
 
-TBD
+Cloud providers will need to provide their intended responses; in most cases they intend to initially indicate non-support and add support later.
 
 * **What steps should be taken if SLOs are not being met to determine the problem?**
 
-TBD
+Enabling this feature gate moves the responsibility for handling mixed protocols from the Kubernetes API server to the specific CPI. Depending on the CPI and cloud provider API implementation, setting up additional ports or listeners with different protocols may take more time than was previously required for single-protocol Services. Diagnosis of any performance impact will be specific to the out-of-tree cloud providers.
0 commit comments