- [Alternative 2](#alternative-2)
- [Alternative 3](#alternative-3)
- [Alternative 4](#alternative-4)
<!-- /toc -->

## Release Signoff Checklist
Items marked with (R) are required *prior to targeting to a milestone / release*.

- [x] (R) Design details are appropriately documented
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
  - [x] e2e Tests for all Beta API Operations (endpoints)
  - [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
  - [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [x] (R) Graduation criteria is in place
  - [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [x] (R) Production readiness review completed
- [x] (R) Production readiness review approved
- [x] "Implementation History" section is up-to-date for milestone
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

<!--
To simplify the model, keep it backwards compatible, and prevent it from evolving into something
different that collides with other APIs, like Gateway APIs, we add the following constraints:

- a ServiceCIDR will be immutable after creation.
- a ServiceCIDR can only be deleted if there are no Service IPs associated to it (enforced by a finalizer).
- there can be overlapping ServiceCIDRs.
- the apiserver will periodically ensure that a "default" ServiceCIDR exists to cover the service CIDR flags
until it verifies that the consumer associated to that IP has been deleted too.

The IPAddress will be deleted and an event generated if the controller determines that the IPAddress
is an orphan [(see the Allocator section)](#allocator).

#### Resizing Service IP Ranges
One of the most common problems users may have is how to scale up or scale down their Service CIDRs.
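With multiple ServiceCIDR objects, growing the Service IP space becomes additive: an administrator can simply create another ServiceCIDR. A minimal sketch, assuming the networking.k8s.io/v1beta1 API as released; the object name and range below are illustrative:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: ServiceCIDR
metadata:
  name: extra-service-cidr   # illustrative name
spec:
  cidrs:
    - 10.96.64.0/18          # illustrative additional range
```

Once the object is ready, new Services can receive ClusterIPs from the added range; per the constraints above, deleting it again is blocked by the finalizer while any Service IP from the range is still in use.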
type ParentReference struct {
	Namespace string
	// Name is the name of the referent
	Name string
}
```

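For illustration, an IPAddress claimed for a Service might look like the following. This is a sketch assuming the networking.k8s.io/v1beta1 API as released, where the full parent reference also carries group and resource fields; the namespace and name values are just examples:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: IPAddress
metadata:
  name: 10.96.0.10           # the IP itself is the object name
spec:
  parentRef:
    group: ""                # core API group
    resource: services
    namespace: kube-system
    name: kube-dns
```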
#### Allocator
##### Integration tests

Tests will be added to verify:

- API servers using the old and new allocators at the same time
Files:

- test/integration/servicecidr/allocator_test.go
- test/integration/servicecidr/migration_test.go
- test/integration/servicecidr/servicecidr_test.go
- test/integration/servicecidr/feature_enable_disable_test.go
- test/integration/servicecidr/perf_test.go

Links:

https://storage.googleapis.com/k8s-triage/index.html?test=servicecidr%7Csynthetic_controlplane_test

##### e2e tests
https://testgrid.k8s.io/sig-network-kind#sig-network-kind,%20alpha,%20beta&include-filter-by-regex=ServiceCIDRs
https://testgrid.k8s.io/kops-misc#kops-aws-sig-network-beta

In addition to the specific e2e tests that validate the new functionality of adding
new ServiceCIDRs and creating Services with IPs from the new ranges, all
the existing e2e tests that exercise Services in one way or another also exercise
the new APIs.

Inspecting the kube-apiserver log of any job run with the feature enabled, for example
https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-network-kind-alpha-beta-features/1866163383959556096/artifacts/kind-control-plane/pods/kube-system_kube-apiserver-kind-control-plane_89ea5ffb5eb9e46fc7a038252629d04c/kube-apiserver/0.log,
shows that the ServiceCIDR and IPAddress objects are constantly exercised:

```sh
$ grep -c networking.k8s.io/v1beta1/servicecidrs 0.log
553

$ grep -c networking.k8s.io/v1beta1/ipaddresses 0.log
400
```

### Graduation Criteria
#### Alpha
#### GA

- Examples of real-world usage:
  - It is available in GKE (https://cloud.google.com/kubernetes-engine/docs/how-to/use-beta-apis) since 1.31 and used in production clusters (numbers cannot be disclosed).
  - [A non-GKE blog post](https://engineering.doit.com/scaling-kubernetes-how-to-seamlessly-expand-service-ip-ranges-246f392112f8) explains how to use the ServiceCIDR feature in GKE.
  - It can be used by OSS users with installers that allow setting the feature gates and enabling the beta APIs; automated testing with kops, see kubernetes/test-infra#33864 and the e2e tests section.
  - It is being tested by the community ([spidernet-io/spiderpool#4089 (comment)](https://github.com/spidernet-io/spiderpool/pull/4089#issuecomment-2505499043)) since it went beta in 1.31.
- More rigorous forms of testing—e.g., downgrade tests and scalability tests
  - It is tested internally in GKE as part of the release.
  - It will inherit all the project testing (scalability, e2e, ...) after graduating.
- Allowing time for feedback
  - The feature went beta in 1.31; it has been tested by different projects and enabled in one platform, [with only one bug reported](https://github.com/kubernetes/kubernetes/issues/127588).

**Note:** Generally we also wait at least two releases between beta and GA/stable, because there's
no opportunity for user feedback, or even bug reports, in back-to-back releases.
| 1.31 | Beta off | Alpha off |
| 1.32 | Beta on | Alpha off |
| 1.33 | GA on | Beta off |
| 1.34 | GA (there are no bitmaps running) | GA (also delete old bitmap) |
| 1.36 | remove feature gate | GA |
| 1.37 | --- | remove feature gate |

## Production Readiness Review Questionnaire
- [x] Feature gate (also fill in values in `kep.yaml`)
  - Feature gate name: MultiCIDRServiceAllocator
  - Components depending on the feature gate: kube-apiserver, kube-controller-manager
  - Feature gate name: DisableAllocatorDualWrite
  - Components depending on the feature gate: kube-apiserver

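For clusters configured from raw component flags, enabling the feature during beta looked roughly like this (an illustrative fragment, not a complete invocation; installers such as kubeadm or kops expose their own options for gates and runtime config, and kube-controller-manager needs the gate as well):

```sh
kube-apiserver \
  --feature-gates=MultiCIDRServiceAllocator=true \
  --runtime-config=networking.k8s.io/v1beta1=true \
  ... # remaining apiserver flags omitted
```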
###### Does enabling the feature change any default behavior?

###### Are there any tests for feature enablement/disablement?

Tests for feature enablement/disablement are implemented as integration tests, see `test/integration/servicecidr/feature_enable_disable.go`.
The feature adds new API objects; no new API fields are added to existing objects.
### Rollout, Upgrade and Rollback Planning

###### How can a rollout or rollback fail? Can it impact already running workloads?

test/integration/servicecidr/allocator_test.go TestMigrateService

2. start an apiserver with the new feature enabled
3. the reconciler must detect this stored service and create the corresponding IPAddress

test/integration/servicecidr/feature_enable_disable.go TestEnableDisableServiceCIDR
1. start apiserver with the feature disabled
2. create new services and assert they are correct
- [x] Docs (`k/website`) update PR(s):
  - https://github.com/kubernetes/website/pull/37620
  - https://github.com/kubernetes/website/pull/43469
- [x] Beta
  - [x] KEP (`k/enhancements`) update PR(s):
    - https://github.com/kubernetes/enhancements/pull/4645
  - [x] Code (`k/k`) update PR(s):
    - https://github.com/kubernetes/kubernetes/pull/122047
    - https://github.com/kubernetes/kubernetes/pull/125021
  - [x] Docs (`k/website`) update(s):
    - https://github.com/kubernetes/website/pull/46947
- [ ] Stable
  - [ ] KEP (`k/enhancements`) update PR(s): https://github.com/kubernetes/enhancements/pull/4983
  - [ ] Code (`k/k`) update PR(s): https://github.com/kubernetes/kubernetes/pull/128971

## Drawbacks

Several alternatives were proposed in the original PR but discarded for different reasons.

#### Alternative 1

From Daniel Smith:

> Each apiserver watches services and keeps an in-memory structure of free IPs.
> Instead of an allocation table, let's call it a "lock list". It's just a list of (IP, lock expiration timestamp). When an apiserver wants to do something with an IP, it adds an item to the list with a timestamp e.g. 30 seconds in the future (we can do this in a way that fails if the item is already there, in which case we abort). Then, we go use it. Then, we let the lock expire. (We can delete the lock early if using the IP fails.)
> (The above could be optimized in etcd by making an entry per-IP rather than a single list.)
> So, to make a safe allocation, apiserver comes up with a candidate IP address (either randomly or because the user requested it). Check it against the in-memory structure. If that passes, we look for a lock entry. If none is found, we add a lock entry. Then we use it in a service. Then we delete the lock entry (or wait for it to expire).
>
> Nothing special needs to be done for deletion, I think it's fine if it takes a while for individual apiservers to mark an IP as safe for reuse.

<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672090253>

Problem: strong dependency on etcd.
#### Alternative 2
From Tim Hockin:

> make IP-allocations real resources. In the limit this would mean potentially 10s of thousands of tiny instances, but they don't change often. This would help in other areas where I want to allow APIs to specify IPs without worrying about hijacking important "real" IPs. I imagine it would work something like PVC<->PV binding. The problem here is that at least service IPs have always been synchronous to CREATE and changing that is a pretty significant behavioral difference that users WILL be able to observe. I don't think we have any precedent for "nested" transactions on user-visible APIs?

<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672135245>

Problem: incompatible change in behavior.
#### Alternative 3
From Wojtek Tyczynski:

> keep storing a bitmap in-memory of kube-apiserver (have that propagated via watching Service objects)
> store just the hash from that bitmap in etcd
> whenever you want to allocate/release IP:
> (a) get the current hash from etcd
> (b) compute the hash from your in-memory representation
> (c) if those two are not equal, you're not synced - wait/retry/...
> (d) if they are equal, then if this is an incorrect operation against the in-memory representation return conflict/bad request/...
> (e) otherwise, apply in-memory and write to etcd having the current version as precondition (as in GuaranteedUpdate)

<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672764247>

Problems:

> If somehow an inconsistent state gets recorded in etcd, then you're permanently stuck here. And the failure mode is really bad (can't make any more services) and requires etcd-level access to fix. So, this is not a workable solution, I think.
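Steps (a)-(e) amount to optimistic concurrency keyed on a hash of the bitmap. A minimal sketch with toy types (the in-process `hashStore` stands in for etcd; names are assumptions):

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// hashStore is a toy stand-in for the etcd-held hash of the bitmap.
type hashStore struct{ hash [32]byte }

// allocator is one apiserver's in-memory bitmap, one bit per IP offset.
type allocator struct{ bitmap []byte }

func (a *allocator) hash() [32]byte { return sha256.Sum256(a.bitmap) }

// allocate follows steps (a)-(e): compare hashes, reject when out of sync
// or when the bit is already set, otherwise flip the bit and store the new
// hash (a real version would guard the etcd write with a precondition, as
// in GuaranteedUpdate).
func (a *allocator) allocate(s *hashStore, offset int) error {
	if a.hash() != s.hash {
		return errors.New("not synced, retry") // steps (a)-(c)
	}
	byteIdx, bit := offset/8, byte(1)<<(offset%8)
	if a.bitmap[byteIdx]&bit != 0 {
		return errors.New("conflict: already allocated") // step (d)
	}
	a.bitmap[byteIdx] |= bit
	s.hash = a.hash() // step (e)
	return nil
}

func main() {
	a := &allocator{bitmap: make([]byte, 32)} // room for 256 IPs
	s := &hashStore{hash: a.hash()}
	fmt.Println(a.allocate(s, 10)) // first allocation succeeds
	fmt.Println(a.allocate(s, 10)) // second allocation conflicts
}
```

As the quoted objection notes, the weak point is that any divergence between an apiserver's bitmap and the stored hash blocks all further allocations until fixed out of band.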
#### Alternative 4
From Antonio Ojea:

> instead of using a bitmap allocator I've created a new API object that has a set of IPs, so the IPs are tracked in the API object

> // IPRangeSpec defines the desired state of IPRange
> type IPRangeSpec struct {
>     // Range represent the IP range in CIDR format
>     // i.e. 10.0.0.0/16 or 2001:db2::/64
>     // +kubebuilder:validation:MaxLength=128
>     // +kubebuilder:validation:MinLength=8
>     Range string `json:"range,omitempty"`
>
>     // +optional
>     // Addresses represent the IP addresses of the range and its status.
>     // Each address may be associated to one kubernetes object (i.e. Services)
>     // +listType=set
>     Addresses []string `json:"addresses,omitempty"`
> }

> // IPRangeStatus defines the observed state of IPRange
> type IPRangeStatus struct {
>     // Free represent the number of IP addresses that are not allocated in the Range
>     // +optional
>     Free int64 `json:"free,omitempty"`

<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-755737620>

Problems: updates to the set of addresses within a single object are problematic and difficult to handle.