@@ -266,15 +266,58 @@ Currently, the Service CIDRs are configured independently in each kube-apiserver
the bootstrap process, the apiserver uses the first IP of each range to create the special
"kubernetes.default" Service. It also starts a reconcile loop that synchronizes the state of the
bitmap used by the internal allocators with the IPs assigned to the Services.
+ This "kubernetes.default" Service is never updated: the first apiserver wins and assigns the
+ ClusterIP from its configured `--service-cluster-ip-range`; other apiservers with different ranges will
+ not try to change the IP. If the apiserver that created the Service no longer works, the
+ admin has to delete the Service so that other apiservers can recreate it with their own ranges.
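+
+ For illustration, with `--service-cluster-ip-range=10.96.0.0/12` the Service created by the winning
+ apiserver looks roughly like this (a sketch; the port numbers depend on the apiserver's `--secure-port`):
+
+ ```yaml
+ apiVersion: v1
+ kind: Service
+ metadata:
+   name: kubernetes
+   namespace: default
+ spec:
+   clusterIP: 10.96.0.1   # first IP of the configured range, never updated afterwards
+   ports:
+   - name: https
+     port: 443
+     protocol: TCP
+     targetPort: 6443
+   type: ClusterIP
+ ```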

- With current implementation, each kube-apiserver can boot with different ranges configured.
+ With the current implementation, each kube-apiserver can boot with different ranges configured without errors,
+ but the cluster will not work correctly, see https://github.com/kubernetes/kubernetes/issues/114743 .
There is no conflict resolution; each apiserver keeps overwriting and deleting the other apiservers' allocator bitmaps
and Services.

In order to be completely backwards compatible, the bootstrap process will remain the same; the
difference is that instead of creating a bitmap based on the flags, it will create a new
ServiceCIDR object from the flags (flags configuration removal is out of scope of this KEP)
- with a special label `networking.kubernetes.io/service-cidr-from-flags` set to `"true"`.
+ ...
+
+ ```
+ <<[UNRESOLVED bootstrap ]>>
+ Option 1:
+ ... with a special well-known name `kubernetes`.
+
+ The new bootstrap process will be:
+
+ ```
+ at startup:
+   read_flags
+   if invalid flags
+     exit
+   run default-service-ip-range controller
+   run kubernetes.default service loop (it uses the first ip from the subnet defined in the flags)
+   run service-repair loop (reconcile services, ipaddresses)
+   run apiserver
+
+ controller:
+   if ServiceCIDR `kubernetes` does not exist
+     create it and create the kubernetes.default service (avoid races)
+   else
+     keep watching to handle finalizers and recreate if needed
+ ```
+
+ All the apiservers will synchronize on the ServiceCIDR and default Service created by the first one to win.
+ Changing the configuration implies manually removing the ServiceCIDR and the default Service; the rest of
+ the apiservers will then race and the winner will set the new configuration of the cluster.
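+
+ A hypothetical sketch of that manual intervention (the `servicecidrs` resource name is assumed
+ from this option's well-known `kubernetes` object):
+
+   kubectl delete servicecidr kubernetes        # drop the ServiceCIDR created from the old flags
+   kubectl delete service kubernetes -n default # drop the default Service so it can be recreated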
+
+ Pros:
+ - Simple to implement
+ - Aligns with the current behavior of kubernetes.default, though this can be a Con as well, since this
+   behavior may not be what users expect
+ Cons:
+ - Requires manual intervention
+
+ Option 2:
+ ... with a special label `networking.kubernetes.io/service-cidr-from-flags` set to `"true"`.

It now has to handle the possibility of multiple ServiceCIDRs with the special label, and
also updating the configuration, for example, from single-stack to dual-stack.
@@ -287,6 +330,8 @@ at startup:
  if invalid flags
    exit
  run default-service-ip-range controller
+   run kubernetes.default service loop (it uses the first ip from the subnet defined in the flags)
+   run service-repair loop (reconcile services, ipaddresses)
  run apiserver

controller:
@@ -311,6 +356,14 @@ controller on_event:
    create it
```

+ Pros:
+ - Automatically handles conflicts, no admin operation required
+ Cons:
+ - Complex to implement
+
+ <<[/UNRESOLVED]>>
+ ```
+
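+ To make the two options concrete, a hypothetical sketch of the flags-derived object each one would
+ create for `--service-cluster-ip-range=10.96.0.0/12` (the `spec.cidrs` list shown here is illustrative;
+ the exact schema is defined elsewhere in this KEP):
+
+ ```yaml
+ # Option 1: a single well-known object, found by name
+ apiVersion: networking.k8s.io/v1alpha1
+ kind: ServiceCIDR
+ metadata:
+   name: kubernetes
+ spec:
+   cidrs:
+   - 10.96.0.0/12
+ ---
+ # Option 2: any number of objects, found by the special label
+ apiVersion: networking.k8s.io/v1alpha1
+ kind: ServiceCIDR
+ metadata:
+   name: kubernetes-default-from-flags   # hypothetical generated name
+   labels:
+     networking.kubernetes.io/service-cidr-from-flags: "true"
+ spec:
+   cidrs:
+   - 10.96.0.0/12
+ ```
+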
#### The special "default" ServiceCIDR

The `kubernetes.default` Service is expected to be covered by a valid range. Each apiserver will