Skip to content

Commit 9fc5833

Browse files
committed
KEP-1880: Promote ServiceCIDR to GA
1 parent 08eee32 commit 9fc5833

File tree

3 files changed

+128
-66
lines changed

3 files changed

+128
-66
lines changed

keps/prod-readiness/sig-network/1880.yaml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,6 @@ kep-number: 1880
22
alpha:
33
approver: "@johnbelamaric"
44
beta:
5+
approver: "@soltysh"
6+
stable:
57
approver: "@soltysh"

keps/sig-network/1880-multiple-service-cidrs/README.md

Lines changed: 123 additions & 63 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,6 @@
5050
- [Alternative 2](#alternative-2)
5151
- [Alternative 3](#alternative-3)
5252
- [Alternative 4](#alternative-4)
53-
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
5453
<!-- /toc -->
5554

5655
## Release Signoff Checklist
@@ -63,20 +62,20 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
6362
- [x] (R) Design details are appropriately documented
6463
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
6564
(including test refactors)
66-
- [ ] e2e Tests for all Beta API Operations (endpoints)
67-
- [ ] (R) Ensure GA e2e tests for meet requirements for [Conformance
65+
- [x] e2e Tests for all Beta API Operations (endpoints)
66+
- [x] (R) Ensure GA e2e tests for meet requirements for [Conformance
6867
Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
69-
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
68+
- [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
7069
- [x] (R) Graduation criteria is in place
71-
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by
70+
- [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by
7271
[Conformance
7372
Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
7473
- [x] (R) Production readiness review completed
7574
- [x] (R) Production readiness review approved
76-
- [ ] "Implementation History" section is up-to-date for milestone
77-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to
75+
- [x] "Implementation History" section is up-to-date for milestone
76+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to
7877
[kubernetes.io]
79-
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list
78+
- [x] Supporting documentation—e.g., additional design documents, links to mailing list
8079
discussions/SIG meetings, relevant PRs/issues, release notes
8180

8281
<!--
@@ -151,7 +150,7 @@ capacity.
151150
To simplify the model, make it backwards compatible and to avoid that it can evolve into something
152151
different and collide with other APIs, like Gateway APIs, we are adding the following constraints:
153152

154-
- a ServiceCIDR will be immutable after creation (to be revisited before Beta).
153+
- a ServiceCIDR will be immutable after creation.
155154
- a ServiceCIDR can only be deleted if there are no Service IP associated to it (enforced by finalizer).
156155
- there can be overlapping ServiceCIDRs.
157156
- the apiserver will periodically ensure that a "default" ServiceCIDR exists to cover the service CIDR flags
@@ -403,20 +402,6 @@ until it verifies that the consumer associated to that IP has been deleted too.
403402
The IPAddress will be deleted and an event generated if the controller determines that the IPAddress
404403
is orphan [(see Allocator section)](#allocator)
405404

406-
- IPAddress referencing recreated Object (different UID)
407-
408-
1. User created Gateway "foo"
409-
2. Gateway controller allocated IP and ref -> "foo"
410-
3. User deleted gateway "foo"
411-
4. Gateway controller doesn't delete the IP (leave it for GC)
412-
5. User creates a new Gateway "foo"
413-
6. apiserver repair loop finds the IP with a valid ref to "foo"
414-
415-
If the new gateway is created before the apiserver observes the delete, apiserver will find that gateway "foo"
416-
still exists and can't release the IP. It can't peek inside "foo" to see if that is the right IP because it is
417-
a type it does not know. If it knew the UID it could see that "foo" UID was different and release the IP.
418-
The IPAddress will use the UID to reference the parent to avoid problems in this scenario.
419-
420405
#### Resizing Service IP Ranges
421406

422407
One of the most common problems users may have is how to scale up or scale down their Service CIDRs.
@@ -496,10 +481,7 @@ type ParentReference struct {
496481
Namespace string
497482
// Name is the name of the referent
498483
Name string
499-
// UID is the uid of the referent
500-
UID string
501484
}
502-
503485
```
504486

505487
#### Allocator
@@ -580,14 +562,6 @@ make this code solid enough prior to committing the changes necessary to impleme
580562

581563
##### Integration tests
582564

583-
<!--
584-
This question should be filled when targeting a release.
585-
For Alpha, describe what tests will be added to ensure proper quality of the enhancement.
586-
587-
For Beta and GA, add links to added tests together with links to k8s-triage for those tests:
588-
https://storage.googleapis.com/k8s-triage/index.html
589-
-->
590-
591565
There will be added tests to verify:
592566

593567
- API servers using the old and new allocators at same time
@@ -603,6 +577,12 @@ Files:
603577
- test/integration/servicecidr/allocator_test.go
604578
- test/integration/servicecidr/migration_test.go
605579
- test/integration/servicecidr/servicecidr_test.go
580+
- test/integration/servicecidr/feature_enable_disable_test.go
581+
- test/integration/servicecidr/perf_test.go
582+
583+
Links:
584+
585+
https://storage.googleapis.com/k8s-triage/index.html?test=servicecidr%7Csynthetic_controlplane_test
606586

607587
##### e2e tests
608588

@@ -611,6 +591,23 @@ to use the new ServiceCIDR ranges allocated.
611591

612592
https://testgrid.k8s.io/sig-network-kind#sig-network-kind,%20alpha,%20beta&include-filter-by-regex=ServiceCIDRs
613593

594+
https://testgrid.k8s.io/kops-misc#kops-aws-sig-network-beta
595+
596+
In addition to the specific e2e test that validate the new functionality of allowing
597+
to add new ServiceCIDRs and create new Services using the IPs of the new range, all
598+
the existing e2e tests that exercises Services in one way or another are also exercising
599+
the new APIs.
600+
601+
If we take a job of an execution of any job with the feature enabled, per example, https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-network-kind-alpha-beta-features/1866163383959556096/artifacts/kind-control-plane/pods/kube-system_kube-apiserver-kind-control-plane_89ea5ffb5eb9e46fc7a038252629d04c/kube-apiserver/0.log , we can see the ServiceCIDR and IPAddress objects are constantly exercised:
602+
603+
```sh
604+
$ grep -c networking.k8s.io/v1beta1/servicecidrs 0.log
605+
553
606+
607+
$ grep -c networking.k8s.io/v1beta1/ipaddresses 0.log
608+
400
609+
```
610+
614611
### Graduation Criteria
615612

616613
#### Alpha
@@ -635,9 +632,16 @@ https://testgrid.k8s.io/sig-network-kind#sig-network-kind,%20alpha,%20beta&inclu
635632

636633
#### GA
637634

638-
- 2 examples of real-world usage
635+
- examples of real-world usage:
636+
- It is available in GKE https://cloud.google.com/kubernetes-engine/docs/how-to/use-beta-apis since 1.31 and used in production clusters (numbers can not be disclosed)
637+
- [Non-GKE blog](https://engineering.doit.com/scaling-kubernetes-how-to-seamlessly-expand-service-ip-ranges-246f392112f8) about how to use the ServiceCIDR feature in GKE.
638+
- It can be used by OSS users with installers that allow to set the feature gates and enable the beta apis, automated testing with kops, see kubernetes/test-infra#33864 and e2e tests section.
639+
- It is being tested by the community [spidernet-io/spiderpool#4089 (comment)](https://github.com/spidernet-io/spiderpool/pull/4089#issuecomment-2505499043) since it went beta in 1.31.
639640
- More rigorous forms of testing—e.g., downgrade tests and scalability tests
641+
- It is tested internally in GKE as part of the release.
642+
- It will inherit all the project testing (scalability, e2e, ...) after being graduated.
640643
- Allowing time for feedback
644+
- The feature was beta in 1.31, it has been tested by different projects and enabled in one platform [with only one bug reported](https://github.com/kubernetes/kubernetes/issues/127588).
641645

642646
**Note:** Generally we also wait at least two releases between beta and GA/stable, because there's
643647
no opportunity for user feedback, or even bug reports, in back-to-back releases.
@@ -697,8 +701,9 @@ it will be safe to disable the dual-write mode.
697701
| 1.31 | Beta off | Alpha off |
698702
| 1.32 | Beta on | Alpha off |
699703
| 1.33 | GA on | Beta off |
700-
| 1.34 | GA (there are no bitmaps running) | GA on (also delete old bitmap)|
701-
| 1.35 | remove feature gate | remove feature gate |
704+
| 1.34 | GA (there are no bitmaps running) | GA (also delete old bitmap)|
705+
| 1.36 | remove feature gate | GA |
706+
| 1.37 | --- | remove feature gate |
702707

703708
## Production Readiness Review Questionnaire
704709

@@ -708,7 +713,9 @@ it will be safe to disable the dual-write mode.
708713

709714
- [x] Feature gate (also fill in values in `kep.yaml`)
710715
- Feature gate name: MultiCIDRServiceAllocator
711-
- Components depending on the feature gate: kube-apiserver, kube-controller-manager
716+
- Components depending on the feature gate: kube-apiserver, kube-controller-manager
717+
- Feature gate name: DisableAllocatorDualWrite
718+
- Components depending on the feature gate: kube-apiserver,
712719

713720
###### Does enabling the feature change any default behavior?
714721

@@ -734,26 +741,10 @@ restoring the cluster to a working state.
734741

735742
###### Are there any tests for feature enablement/disablement?
736743

737-
<!--
738-
The e2e framework does not currently support enabling or disabling feature
739-
gates. However, unit tests in each component dealing with managing data, created
740-
with and without the feature, are necessary. At the very least, think about
741-
conversion tests if API types are being modified.
742-
743-
Additionally, for features that are introducing a new API field, unit tests that
744-
are exercising the `switch` of feature gate itself (what happens if I disable a
745-
feature gate after having objects written with the new field) are also critical.
746-
You can take a look at one potential example of such test in:
747-
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
748-
-->
749-
750744
Tests for feature enablement/disablement are implemented as integration test, see `test/integration/servicecidr/feature_enable_disable.go`
751745
The feature add new API objects, no new API fields to existing objects are added.
752-
### Rollout, Upgrade and Rollback Planning
753746

754-
<!--
755-
This section must be completed when targeting beta to a release.
756-
-->
747+
### Rollout, Upgrade and Rollback Planning
757748

758749
###### How can a rollout or rollback fail? Can it impact already running workloads?
759750

@@ -786,7 +777,6 @@ test/integration/servicecidr/allocator_test.go TestMigrateService
786777
2. start an apiserver with the new feature enable
787778
3. the reconciler must detect this stored service and create the corresponding IPAddress
788779

789-
To be added in Beta https://github.com/kubernetes/kubernetes/pull/122047
790780
test/integration/servicecidr/feature_enable_disable.go TestEnableDisableServiceCIDR
791781
1. start apiserver with the feature disabled
792782
2. create new services and assert are correct
@@ -959,6 +949,17 @@ This feature contain repair controllers in the apiserver that makes rollbacks sa
959949
- [x] Docs (`k/website`) update PR(s):
960950
- https://github.com/kubernetes/website/pull/37620
961951
- https://github.com/kubernetes/website/pull/43469
952+
- [x] Beta
953+
- [x] KEP (`k/enhancements`) update PR(s):
954+
- https://github.com/kubernetes/enhancements/pull/4645
955+
- [x] Code (`k/k`) update PR(s):
956+
- https://github.com/kubernetes/kubernetes/pull/122047
957+
- https://github.com/kubernetes/kubernetes/pull/125021
958+
- [x] Docs (`k/website`) update(s):
959+
- https://github.com/kubernetes/website/pull/46947
960+
- [ ] Stable
961+
- [ ] KEP (`k/enhancements`) update PR(s): https://github.com/kubernetes/enhancements/pull/4983
962+
- [ ] Code (`k/k`) update PR(s): https://github.com/kubernetes/kubernetes/pull/128971
962963

963964
## Drawbacks
964965

@@ -977,24 +978,83 @@ Several alternatives were proposed in the original PR but discarded by different
977978

978979
#### Alternative 1
979980

981+
From Daniel Smith:
982+
983+
> Each apiserver watches services and keeps an in-memory structure of free IPs.
984+
> Instead of an allocation table, let's call it a "lock list". It's just a list of (IP, lock > expiration timestamp). When an apiserver wants to do something with an IP, it adds an item > to the list with a timestamp e.g. 30 seconds in the future (we can do this in a way that fails if the item is already there, in which case we abort). Then, we go use it. Then, we let the lock expire. (We can delete the lock early if using the IP fails.)
985+
> (The above could be optimized in etcd by making an entry per-IP rather than a single list.)
986+
> So, to make a safe allocation, apiserver comes up with a candidate IP address (either randomly or because the user requested it). Check it against the in-memory structure. If that passes, we look for a lock entry. If none is found, we add a lock entry. Then we use it in a service. Then we delete the lock entry (or wait for it to expire).
987+
988+
> Nothing special needs to be done for deletion, I think it's fine if it takes a while for individual apiservers to mark an IP as safe for reuse.
989+
980990
<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672090253>
981991

992+
Problem: Strong dependency on etcd
993+
982994
#### Alternative 2
983995

996+
From Tim Hockin:
997+
998+
> make IP-allocations real
999+
> resources. In the limit this would mean potentially 10s of thousands of
1000+
> tiny instances, but they don't change often. This would help in other
1001+
> areas where I want to allow APIs to specify IPs without worrying about
1002+
> hijacking important "real" IPs. I imagine it would work something like
1003+
> PVC<->PV binding. The problem here is that at least service IPs have
1004+
> always been synchronous to CREATE and changing that i a pretty significant
1005+
> behavioral difference thath users WILL be able to observe. I don't think
1006+
> we have any precedent for "nested" transactions on user-visible APIs?
1007+
9841008
<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672135245>
9851009

1010+
Problem: Incompatible change in behavior.
1011+
9861012
#### Alternative 3
9871013

1014+
From Wojtek Tyczynski
1015+
1016+
> keep storing a bitmap in-memory of kube-apiserver (have that propagated via watching Service objects)
1017+
> store just the hash from that bitmap in etcd
1018+
> whenever you want to allocate/release IP:
1019+
> (a) get the current hash from etcd
1020+
> (b) compute the hash from your in-memory representation
1021+
> (c) if those two are not equal, you're not synced - wait/retry/...
1022+
> (d) if they are equal, then if this is incorrect operation against in-memory operation return conflct/bad request/...
1023+
> (e) otherwise, apply in-memory and write to etcd having the current version is precondtion > (as in GuaranteedUpdate)
1024+
9881025
<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-672764247>
9891026

1027+
Problems:
1028+
1029+
> If somehow an inconsistent state gets recorded in etcd, then you're permanently stuck here. And the failure mode is really bad (can't make any more services) and requires etcd-level access to fix. So, this is not a workable solution, I think.
1030+
9901031
#### Alternative 4
9911032

992-
<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-755737620>
1033+
From Antonio Ojea
9931034

994-
## Infrastructure Needed (Optional)
1035+
> instead of using a bitmap allocator I've created a new API object that has a set of IPs, so the IPs are tracked in the API object
9951036
996-
<!--
997-
Use this section if you need things from the project/SIG. Examples include a
998-
new subproject, repos requested, or GitHub details. Listing these here allows a
999-
SIG to get the process for these resources started right away.
1000-
-->
1037+
> // IPRangeSpec defines the desired state of IPRange
1038+
> type IPRangeSpec struct {
1039+
> // Range represent the IP range in CIDR format
1040+
> // i.e. 10.0.0.0/16 or 2001:db2::/64
1041+
> // +kubebuilder:validation:MaxLength=128
1042+
> // +kubebuilder:validation:MinLength=8
1043+
> Range string `json:"range,omitempty"`
1044+
1045+
> // +optional
1046+
> // Addresses represent the IP addresses of the range and its status.
1047+
> // Each address may be associated to one kubernetes object (i.e. Services)
1048+
> // +listType=set
1049+
> Addresses []string `json:"addresses,omitempty"`
1050+
>}
1051+
1052+
> // IPRangeStatus defines the observed state of IPRange
1053+
> type IPRangeStatus struct {
1054+
> // Free represent the number of IP addresses that are not allocated in the Range
1055+
> // +optional
1056+
> Free int64 `json:"free,omitempty"`
1057+
1058+
<https://github.com/kubernetes/enhancements/pull/1881#issuecomment-755737620>
1059+
1060+
Problems: Updates to the set within the object are problematic and difficult to handle.

keps/sig-network/1880-multiple-service-cidrs/kep.yaml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -20,18 +20,18 @@ see-also:
2020
replaces:
2121

2222
# The target maturity stage in the current dev cycle for this KEP.
23-
stage: beta
23+
stage: stable
2424

2525
# The most recent milestone for which work toward delivery of this KEP has been
2626
# done. This can be the current (upcoming) milestone, if it is being actively
2727
# worked on.
28-
latest-milestone: "v1.31"
28+
latest-milestone: "v1.33"
2929

3030
# The milestone at which this feature was, or is targeted to be, at each stage.
3131
milestone:
3232
alpha: "v1.27"
3333
beta: "v1.31"
34-
stable:
34+
stable: "v1.33"
3535

3636
# The following PRR answers are required at alpha release
3737
# List the feature gate name and the components for which it must be enabled

0 commit comments

Comments
 (0)