Commit 22045d3

Merge pull request kubernetes#4175 from aojea/update_service_cidrs
KEP-1880: Update service cidrs and clarify some behaviors
2 parents 25ada4e + 7ef3c8e commit 22045d3

2 files changed

+27
-70
lines changed

keps/sig-network/1880-multiple-service-cidrs/README.md

Lines changed: 24 additions & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,8 @@ Implement a new allocation logic for Services IPs that:
135135
services.spec.IPFamily and services.spec.IPFamilyPolicy, a simple webhook or an admission plugin
136136
can set these fields to the desired default, so the allocation logic doesn't have to handle it.
137137
- Removing the apiserver flags that define the service IP CIDRs, though that may be possible in the future.
138+
- Any admin or cluster-wide process related to Services, like automating the default Service CIDR range, though
139+
this KEP will implement the behaviors and primitives necessary to perform those kinds of operations automatically.
138140

139141
## Proposal
140142

@@ -162,6 +164,8 @@ different and collide with other APIs, like Gateway APIs, we are adding the foll
162164

163165
The new allocator logic can be used by other APIs, like Gateway API.
164166

167+
The new well-defined behaviors and objects will allow future developments to perform admin and cluster-wide operations on the Service ranges.
168+
165169
### User Stories
166170

167171
#### Story 1
@@ -244,15 +248,15 @@ This default IP family is used in cases where a Service creation doesn't provide
244248
the Service to Single Stack with an IP chosen from the default IP family defined.
245249

246250
The current implementation doesn't guarantee consistency for Service IP ranges and default IP families across
247-
multiple apiservers.
251+
multiple apiservers, see https://github.com/kubernetes/kubernetes/issues/114743.
248252

249253
[![](https://mermaid.ink/img/pako:eNqFU0FuwjAQ_MrKZ_iAD0gpFIlbJKSectnaG1jJsVPbQaKIv9dpEhICtM4tM7OZnXEuQjlNQopAXw1ZRRvGg8eqsJAOqug8ZIAB1p4wEuzJn1hRB9foIyuu0UZ4owPbjvQMLJ2nV2iW7z7QsMbIzj7C--QBD536KWHLlsPx1fQNqaRPM4pemsFytZr6lU-Xsy69cSfy99Sd5cjJ7TfBLoctVmzOUDIZHf7UZcY41X5kbZoQye_yMBia8GDZeRvjkhBisk-H8-P4KSv3lLamrfPTIF6xN97VoDngpyF9sz_YGZmdn7uCJJxmZd3BnWLWmQSKSvd3ykQIjVIU-sDaM-N3Q27NyeSwrXjk36DfLjMJHUQmEJTI5p_J0xsjwVMKKI6SMbQ5zxCGaYPAJZD37d0axFPJzJzVbcTDA5MjFqIiXyHr9CdeWqwQ8UgVFSKphaYSGxMLUdhrojZ1ipreNafVhCwxLb0Q2ES3P1slZPQNDaT-b-5Z1x-6EVE6)](https://mermaid.live/edit#pako:eNqFU0FuwjAQ_MrKZ_iAD0gpFIlbJKSectnaG1jJsVPbQaKIv9dpEhICtM4tM7OZnXEuQjlNQopAXw1ZRRvGg8eqsJAOqug8ZIAB1p4wEuzJn1hRB9foIyuu0UZ4owPbjvQMLJ2nV2iW7z7QsMbIzj7C--QBD536KWHLlsPx1fQNqaRPM4pemsFytZr6lU-Xsy69cSfy99Sd5cjJ7TfBLoctVmzOUDIZHf7UZcY41X5kbZoQye_yMBia8GDZeRvjkhBisk-H8-P4KSv3lLamrfPTIF6xN97VoDngpyF9sz_YGZmdn7uCJJxmZd3BnWLWmQSKSvd3ykQIjVIU-sDaM-N3Q27NyeSwrXjk36DfLjMJHUQmEJTI5p_J0xsjwVMKKI6SMbQ5zxCGaYPAJZD37d0axFPJzJzVbcTDA5MjFqIiXyHr9CdeWqwQ8UgVFSKphaYSGxMLUdhrojZ1ipreNafVhCwxLb0Q2ES3P1slZPQNDaT-b-5Z1x-6EVE6)
250254

251255
### New allocation model
252256

253257
The new allocation model requires:
254258

255-
- 2 new API objects ServiceCIDR and IPAddress in networking.k8s.io/v1alpha1, see <https://groups.google.com/g/kubernetes-sig-api-machinery/c/S0KuN_PJYXY/m/573BLOo4EAAJ>. The ServiceCIDR will be protected with a finalizer, the IPAddress object doesn't need a finalizer because the APIserver always release and delete the IPAddress after the Service has been deleted.
259+
- 2 new API objects, ServiceCIDR and IPAddress, in the group `networking.k8s.io`, see <https://groups.google.com/g/kubernetes-sig-api-machinery/c/S0KuN_PJYXY/m/573BLOo4EAAJ>. The ServiceCIDR will be protected with a finalizer; the IPAddress object doesn't need a finalizer because the apiserver always releases and deletes the IPAddress after the Service has been deleted.
256260
- 1 new allocator implementing current `allocator.Interface`, that runs in each apiserver, and uses the new ServiceCIDRs objects to allocate IPs for Services.
257261
- 1 new repair loop that runs in the apiserver and reconciles the Services with the IPAddresses: it repairs
258262
Services, garbage-collects orphan IPAddresses, and handles the upgrade from the old allocators.
@@ -279,12 +283,7 @@ and Services.
279283
In order to be completely backwards compatible, the bootstrap process will remain the same, the
280284
difference is that instead of creating a bitmap based on the flags, it will create a new
281285
ServiceCIDR object from the flags (flags configuration removal is out of scope of this KEP)
282-
...
283-
284-
```
285-
<<[UNRESOLVED bootstrap>>
286-
Option 1:
287-
... with a special well-known name `kubernetes`.
286+
with a special well-known name `kubernetes`.
288287

289288
The new bootstrap process will be:
290289

@@ -309,68 +308,25 @@ All the apiservers will be synchronized on the ServiceCIDR and default Service c
309308
Changes on the configuration imply manual removal of the ServiceCIDR and default Service, automatically
310309
the rest of the apiservers will race and the winner will set the configuration of the cluster.
311310

312-
Pros:
313-
- Simple to implement
314-
- Align with current behavior of kubernetes.default, though this can be a Con as well, since this
315-
the existing behavior was unexpected
316-
Cons:
317-
- Requires manual intervention
318-
319-
Option 2:
320-
... with a special label `networking.kubernetes.io/service-cidr-from-flags` set to `"true"`.
321-
322-
It now has to handle the possibility of multiple ServiceCIDR with the special label, and
323-
also updating the configuration, per example, from single-stack to dual-stack.
324-
325-
The new bootstrap process will be:
311+
This behavior aligns with the current behavior of kubernetes.default, which makes it consistent and easier to reason
312+
about, and allows future developments to use it to implement more complex operations at the admin cluster level.
326313

327-
```
328-
at startup:
329-
read_flags
330-
if invalid flags
331-
exit
332-
run default-service-ip-range controller
333-
run kubernetes.default service loop (it uses the first ip from the subnet defined in the flags)
334-
run service-repair loop (reconcile services, ipaddresses)
335-
run apiserver
336-
337-
controller:
338-
watch all ServiceCIDR objects labelled from-flags
339-
ignore being-deleted ranges
340-
wait for first sync
341-
342-
controller on_event:
343-
if no default range matching exactly my flags
344-
log
345-
create a ServiceCIDR from my flags
346-
generateName: "from-flags-"
347-
from-flags label: "true"
348-
else if multiple
349-
log
350-
if multiple ranges match exactly my flags (or a single-family subset of)
351-
log
352-
delete all subsets, leaving the largest set that exactly matches on at least on family
353-
endif
354-
endif
355-
if kubernetes.default does not exist
356-
create it
357-
```
314+
#### The special "default" ServiceCIDR
358315

359-
Pros:
360-
- Automatically handles conflicts, no admin operation required
361-
Cons:
362-
- Complex to implement
316+
The `kubernetes.default` Service is expected to be covered by a valid range. Each apiserver
317+
ensures that a ServiceCIDR object exists to cover its own flag-defined ranges. If someone were to force-delete
318+
the ServiceCIDR covering `kubernetes.default`, it would be treated the same as before: any apiserver will try
319+
to recreate the Service from its configured default Service CIDR flag-defined range.
363320

364-
<<[/UNRESOLVED]>>
365-
```
366-
367-
#### The special "default" ServiceCIDR
321+
This well-known and established behavior allows administrators to replace the `kubernetes.default` Service through a
322+
series of operations, for example:
323+
1. Initial state: 2 kube-apiservers with default ServiceCIDR 10.0.0.0/24
324+
2. Apiservers will create the `kubernetes.default` Service with ClusterIP 10.0.0.1.
325+
3. Upgrade the kube-apiservers and change the service-cidr flag to 192.168.0.0/24.
326+
4. Delete the ServiceCIDR objects and the `kubernetes.default` Service.
327+
5. The kube-apiservers will recreate the `kubernetes.default` Service with ClusterIP 192.168.0.1.
368328

369-
The `kubernetes.default` Service is expected to be covered by a valid range. Each apiserver will
370-
ensure that a ServiceCIDR object exists to cover its own flag-defined ranges, so this should
371-
be true in normal cases. If someone were to force-delete the ServiceCIDR covering `kubernetes.default` it
372-
would be treated the same as any Service in the repair loop, which will generate warnings about
373-
orphaned Service IPs.
329+
Note this can also be used to switch the IP family of the cluster.
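
The ClusterIPs in the example steps (10.0.0.1, then 192.168.0.1) follow from `kubernetes.default` taking the first IP of the flag-defined range. A minimal sketch of that arithmetic, assuming Go's standard `net/netip` package; `firstServiceIP` is a hypothetical helper written for illustration, not the function the apiserver uses:

```go
package main

import (
	"fmt"
	"net/netip"
)

// firstServiceIP is a hypothetical helper: it returns the first address after
// the network address of a Service CIDR, which is the address the example
// steps assign to kubernetes.default.
func firstServiceIP(cidr string) (netip.Addr, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, fmt.Errorf("invalid service CIDR %q: %w", cidr, err)
	}
	// Masked() zeroes the host bits; Next() steps to the following address.
	first := prefix.Masked().Addr().Next()
	if !first.IsValid() || !prefix.Contains(first) {
		return netip.Addr{}, fmt.Errorf("service CIDR %q has no usable address", cidr)
	}
	return first, nil
}
```

Called from a `main` function, `firstServiceIP("10.0.0.0/24")` yields 10.0.0.1 and `firstServiceIP("192.168.0.0/24")` yields 192.168.0.1, matching steps 2 and 5.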
374330

375331
#### Service IP Allocation
376332

@@ -481,7 +437,8 @@ deletion can take some time, one allocator can successfully allocate an IP addre
481437
inside of the old Service CIDR).
482438

483439
There is one controller that will periodically check that the 1-on-1 relation between IPAddresses and Services is
484-
correct, and will start sending events to warn the user that it has to fix/recreate the corresponding Service.
440+
correct, and will start sending events to warn the user that it has to fix/recreate the corresponding Service,
441+
keeping the same behavior that exists today.
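
The 1-on-1 check described above can be illustrated as follows. This is only a sketch under simplifying assumptions: Services and IPAddresses are reduced to plain string maps, and `findMismatches` is a hypothetical name, not the controller's actual code or API types.

```go
package main

import "sort"

// findMismatches is a hypothetical illustration of the 1-on-1 check.
// services maps a Service key (namespace/name) to its ClusterIP.
// ipOwners maps an allocated IP to the Service key that references it.
// It returns the Services without a matching IPAddress and the IPs whose
// owning Service is gone or now uses a different ClusterIP.
func findMismatches(services, ipOwners map[string]string) (missing, orphaned []string) {
	for svc, ip := range services {
		if owner, ok := ipOwners[ip]; !ok || owner != svc {
			missing = append(missing, svc)
		}
	}
	for ip, owner := range ipOwners {
		if clusterIP, ok := services[owner]; !ok || clusterIP != ip {
			orphaned = append(orphaned, ip)
		}
	}
	// Sort for deterministic output, since map iteration order is random.
	sort.Strings(missing)
	sort.Strings(orphaned)
	return missing, orphaned
}
```

In this simplified model, entries in `missing` correspond to the Services the user would be warned about, and entries in `orphaned` to IPAddresses that are candidates for garbage collection.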
485442

486443
#### API
487444

keps/sig-network/1880-multiple-service-cidrs/kep.yaml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -25,13 +25,13 @@ stage: alpha
2525
# The most recent milestone for which work toward delivery of this KEP has been
2626
# done. This can be the current (upcoming) milestone, if it is being actively
2727
# worked on.
28-
latest-milestone: "v1.27"
28+
latest-milestone: "v1.29"
2929

3030
# The milestone at which this feature was, or is targeted to be, at each stage.
3131
milestone:
3232
alpha: "v1.27"
33-
beta: "v1.29"
34-
stable: "v1.31"
33+
beta: "v1.30"
34+
stable: "v1.32"
3535

3636
# The following PRR answers are required at alpha release
3737
# List the feature gate name and the components for which it must be enabled
