Changes to `keps/sig-api-machinery/4020-unknown-version-interoperability-proxy/README.md`: 77 additions, 24 deletions.
If none of those approvers are still appropriate, then changes to that list
should be approved by the remaining approvers and/or the owning SIG (or
SIG Architecture for cross-cutting KEPs).
-->

# KEP-4020: Unknown Version Interoperability Proxy

<!--
A table of contents is helpful for quickly jumping to sections of a KEP and for
Items marked with (R) are required *prior to targeting to a milestone / release*.

## Summary

When a cluster has multiple apiservers at mixed versions (such as during an
upgrade/downgrade or when runtime-config changes and a rollout happens), not
every apiserver can serve every resource at every version.

To fix this, we will add a filter to the handler chain in the aggregator which
proxies clients to an apiserver that is capable of handling their request.
To prevent server-side request forgeries we will not give control over information

### Aggregation Layer

1. A new filter will be added to the [handler chain] of the aggregation layer. This filter will maintain an internal map with the key being the group-version-resource and the value being a list of server IDs of apiservers that are capable of serving that group-version-resource.

1. This internal map is populated using an informer for StorageVersion objects. An event handler will be added for this informer that will get the apiserver ID of the requested group-version-resource and update the internal map accordingly.

2. This filter will pass on the request to the next handler in the local aggregator chain, if:

   1. It is a non-resource request
StorageVersion API currently tells us whether a particular StorageVersion can be

* TODO: We need to find a place to store and retrieve the destination apiserver's host and port information given the server's ID.

We will use the already existing [masterlease reconciler](https://github.com/kubernetes/kubernetes/blob/master/pkg/controlplane/reconcilers/lease.go) to store/retrieve the IPs and ports for kube-apiservers. Major reasons to use this are:

1. the masterlease reconciler already stores kube-apiserver IPs
2. this information is not exposed to users in an API that can be used maliciously
3. existing code to handle the lifecycle of the masterleases is convenient

How the masterlease reconciler will be used is as follows:

1. We will use the already existing IP in Endpoints.Subsets.Addresses of the masterlease by default.

2. For users with network configurations that would not allow Endpoints.Subsets.Addresses to be reachable from a kube-apiserver, we will introduce a new --advertise-peer-ip flag to kube-apiserver. We will store its value as an annotation on the masterlease and use it to route the request to the right destination server.

3. We will also expose the IP and port information of the kube-apiservers as annotations on the APIServer identity lease object for visibility/debugging purposes.

4. We will also use an egress dialer for network connections made to peer kube-apiservers. For this, we will create a new type for the network context to be used for peer kube-apiserver connections ([xref](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/apis/apiserver/types.go#L55-L71)).
#### Proxy transport between apiservers and authn

For the mTLS between source and destination apiservers, we will do the following:

1. For server authentication by the client (source apiserver): the client needs to validate the server certs (presented by the destination apiserver), for which it will

   1. look at the CA bundle of the authority that signed those certs. We will introduce a new flag --peer-ca-file that must be passed to the kube-apiserver to verify the other kube-apiserver's server certs
   2. look at the ServerName `kubernetes.default.svc` for SNI to verify server certs against

2. For client authentication by the server (destination apiserver): the destination apiserver will check the source apiserver certs to determine that the proxy request is from an authenticated client. The destination apiserver will use requestheader authentication (and NOT client cert authentication) for this, using the kube-aggregator proxy client cert/key and the --requestheader-client-ca-file passed to the apiserver upon bootstrap.
In the first alpha phase, the integration tests are expected to be added for:

- The behavior with the feature gate turned on/off
- Validation where an apiserver tries to serve a request that has already been proxied once
- Validation where an apiserver tries to call a peer but actually calls itself (to simulate a networking configuration where this happens by accident), and the request fails
- Components depending on the feature gate: kube-apiserver
- [ ] Other
  - Describe the mechanism:
  - Will enabling / disabling the feature require downtime of the control

<!--
Any change of default behavior may be surprising to users or break existing
automations, so be extremely careful here.
-->

Yes, requests for built-in resources at the time when a cluster is at mixed versions will be served with a default 503 error instead of a 404 error, if the request is unable to be served.
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

<!--
feature.
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
-->

Yes, disabling the feature will result in requests for built-in resources in a cluster at mixed versions being served with a default 404 error in the case when the request is unable to be served locally.

###### What happens if we reenable the feature if it was previously rolled back?

The request for built-in resources will be proxied to the apiserver capable of serving it, or else be served with a 503 error.

###### Are there any tests for feature enablement/disablement?

<!--
You can take a look at one potential example of such test in:
-->

Unit test and integration test will be introduced in the alpha implementation.

### Rollout, Upgrade and Rollback Planning
<!--
rollout. Similarly, consider large clusters and how enablement/disablement
will rollout across nodes.
-->

The proxy to a remote apiserver can fail if there are network restrictions in place that do not allow an apiserver to talk to a remote apiserver. In this case, the request will fail with a 503 error.

###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->

- The apiserver_request_total metric, which will tell us if there is a spike in the number of errors seen, meaning the feature is not working as expected
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

<!--
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->

Upgrade and rollback will be tested before the feature goes to Beta.

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

<!--
Even if applying deprecation policies, they may still surprise some users.
-->

No.

### Monitoring Requirements
<!--
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->

The following metrics could be used to see if the feature is in use:

- kubernetes_uvip_count

###### How can someone using this feature know that it is working for their instance?

<!--
and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->

- Metrics like kubernetes_uvip_count can be used to check how many requests were proxied to a remote apiserver
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

<!--
These goals will help you determine what you need to measure (SLIs) in the next
question.
-->

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

<!--
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
implementation difficulties, etc.).
-->

No. We are open to input.
### Dependencies

<!--
This section must be completed when targeting beta to a release.
-->

###### Does this feature depend on any specific services running in the cluster?

No, but it does depend on

- the `StorageVersion` feature that generates objects with a `storageVersion.status.serverStorageVersions[*].apiServerID` field, which is used to find the remote apiserver's network location.
- the `APIServerIdentity` feature in kube-apiserver that creates a lease object for APIServerIdentity, which we will use to store the network location of the remote apiserver for visibility/debugging.

<!--
Think about both cluster-level services (e.g. metrics-server) as well
Focusing mostly on:
heartbeats, leader election, etc.)
-->

No.

###### Will enabling / using this feature result in introducing new API types?

<!--
Describe them, providing:
- Supported number of objects per namespace (for namespace-scoped objects)
-->

No.

###### Will enabling / using this feature result in any new calls to the cloud provider?

<!--
Describe them, providing:
- Estimated increase:
-->

No.

###### Will enabling / using this feature result in increasing size or count of the existing API objects?

<!--
Describe them, providing:
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
-->

No.

###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

<!--
Think through this both in small and large cases, again with respect to the