  - [Kube-Proxy](#kube-proxy)
  - [EndpointSlice Controller](#endpointslice-controller)
  - [Heuristics](#heuristics)
- - [Proportional CPU Heuristic](#proportional-cpu-heuristic)
- - [Assumptions](#assumptions)
- - [Identifying Zones](#identifying-zones)
+ - [Identifying Zones](#identifying-zones)
  - [Excluding Control Plane Nodes](#excluding-control-plane-nodes)
- - [Example](#example)
  - [Overload](#overload)
  - [Handling Node Updates](#handling-node-updates)
+ - [Proportional CPU Heuristic](#proportional-cpu-heuristic)
+ - [Assumptions](#assumptions)
+ - [Example](#example)
+ - [PreferZone Heuristic](#preferzone-heuristic)
+ - [Assumptions](#assumptions-1)
+ - [Example](#example-1)
  - [Additional Heuristics](#additional-heuristics)
  - [Future Expansion](#future-expansion)
  - [Test Plan](#test-plan)
@@ -295,7 +298,7 @@ implemented directly by kube-proxy.
  ### EndpointSlice Controller

  When the `TopologyAwareHints` feature gate is enabled and the annotation is set
- to `Auto` or `ProportionalByCore` for a Service, the EndpointSlice controller
+ to `Auto` or `ProportionalZoneCPU` for a Service, the EndpointSlice controller
  will add hints to EndpointSlices. These hints will indicate where an endpoint
  should be consumed by proxy implementations to enable topology aware routing.

@@ -306,27 +309,15 @@ This KEP starts with the following heuristics:
  | Heuristic Name | Description |
  |-|-|
  | Auto | EndpointSlice controller and/or underlying dataplane can choose the heuristic used. |
- | ProportionalByCore | Endpoints will be allocated to each zone proportionally, based on the allocatable Node CPU cores in each zone. |
+ | ProportionalZoneCPU | Endpoints will be allocated to each zone proportionally, based on the allocatable Node CPU cores in each zone. |
+ | PreferZone | Hints are always populated to represent the zone the endpoint is in. |

  In the future, additional heuristics may be added. Until that point, "Auto" will
  be the only configurable value. In most clusters, that will translate to
- `ProportionalByCore` unless the underlying dataplane has a better approach
+ `ProportionalZoneCPU` unless the underlying dataplane has a better approach
  available.

- ### Proportional CPU Heuristic
- #### Assumptions
-
- - Incoming traffic is proportional to the number of allocatable CPU cores in a
- zone. Although this is an imperfect metric, it is the best available way of
- predicting how much traffic will be received in a zone. If we are unable to
- derive the number of allocatable cores in a zone we will fall back to the
- number of nodes in that zone.
- - Service capacity is proportional to the number of endpoints in a zone. This
- assumes that each endpoint has equivalent capacity. Although this is not
- always true, it usually is. We can explore ways to deal with variable capacity
- endpoints in the future.
-
- #### Identifying Zones
+ ### Identifying Zones

  The EndpointSlice controller reads the standard `topology.kubernetes.io/zone`
  label on Nodes to determine which zone a Pod is running in. Kube-Proxy would be
@@ -340,23 +331,6 @@ calculating allocatable cores in a zone:
  * `node-role.kubernetes.io/control-plane`
  * `node-role.kubernetes.io/master`

- #### Example
-
- zone-a: 20 CPU cores
- zone-b: 16 CPU cores
- zone-c: 14 CPU cores
-
- In this scenario, the following proportion of endpoints would be allocated for
- each Service:
-
- zone-a: 40%
- zone-b: 32%
- zone-c: 28%
-
- When allocating endpoints to meet this distribution, keeping endpoints in the
- same zone will be prioritized. When same-zone endpoints are exhausted, endpoints
- will be taken from zones that have excess capacity.
-
  #### Overload

  Overload is a key concept for this proposal. This occurs when there are less
@@ -393,6 +367,57 @@ of the following scenarios:
  2. A new Node results in a Service that is able to achieve an endpoint
  distribution below 20% for the first time.

+ ### Proportional CPU Heuristic
+
+ #### Assumptions
+
+ - Incoming traffic is proportional to the number of allocatable CPU cores in a
+ zone. Although this is an imperfect metric, it is the best available way of
+ predicting how much traffic will be received in a zone. If we are unable to
+ derive the number of allocatable cores in a zone we will fall back to the
+ number of nodes in that zone.
+ - Service capacity is proportional to the number of endpoints in a zone. This
+ assumes that each endpoint has equivalent capacity. Although this is not
+ always true, it usually is. We can explore ways to deal with variable capacity
+ endpoints in the future.
+ #### Example
+
+ zone-a: 20 CPU cores
+ zone-b: 16 CPU cores
+ zone-c: 14 CPU cores
+
+ In this scenario, the following proportion of endpoints would be allocated for
+ each Service:
+
+ zone-a: 40%
+ zone-b: 32%
+ zone-c: 28%
+
+ When allocating endpoints to meet this distribution, keeping endpoints in the
+ same zone will be prioritized. When same-zone endpoints are exhausted, endpoints
+ will be taken from zones that have excess capacity.
+
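The proportions in the example above follow from dividing each zone's allocatable cores by the cluster total. The Python sketch below illustrates that arithmetic only; it is not the controller's implementation. The `zone_cpu_ratios` function and the dict-based node shape (`labels`, `allocatable_cpu`) are hypothetical stand-ins, skipping nodes without a zone label is an assumption, and the fallback to node counts is left out.

```python
from collections import defaultdict

ZONE_LABEL = "topology.kubernetes.io/zone"
CONTROL_PLANE_LABELS = (
    "node-role.kubernetes.io/control-plane",
    "node-role.kubernetes.io/master",
)


def zone_cpu_ratios(nodes):
    """Return each zone's share of allocatable CPU cores.

    `nodes` is a hypothetical list of dicts with a `labels` mapping and an
    `allocatable_cpu` number, standing in for real Node objects.
    """
    cores = defaultdict(float)
    for node in nodes:
        labels = node["labels"]
        # Nodes carrying a control plane label do not count toward a zone.
        if any(label in labels for label in CONTROL_PLANE_LABELS):
            continue
        zone = labels.get(ZONE_LABEL)
        if zone is None:
            continue  # Assumption: nodes without a zone label are skipped.
        cores[zone] += node["allocatable_cpu"]
    total = sum(cores.values())
    if total == 0:
        return {}
    return {zone: cpu / total for zone, cpu in cores.items()}


# The three zones from the example, plus one control plane node that is ignored.
nodes = [
    {"labels": {ZONE_LABEL: "zone-a"}, "allocatable_cpu": 20},
    {"labels": {ZONE_LABEL: "zone-b"}, "allocatable_cpu": 16},
    {"labels": {ZONE_LABEL: "zone-c"}, "allocatable_cpu": 14},
    {
        "labels": {
            ZONE_LABEL: "zone-a",
            "node-role.kubernetes.io/control-plane": "",
        },
        "allocatable_cpu": 4,
    },
]
ratios = zone_cpu_ratios(nodes)
for zone in sorted(ratios):
    print(f"{zone}: {ratios[zone]:.0%}")
```

With 50 allocatable cores in total, the ratios match the 40%, 32%, and 28% split listed in the example.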
+ ### PreferZone Heuristic
+
+ #### Assumptions
+
+ - Endpoints are distributed per zone proportionally to the expected traffic capacity.
+
+ This heuristic will route traffic to the endpoints existing in the zone without any overflow.
+ Dataplanes will fall back to cluster-wide routing if there are no endpoints with hints for the
+ zone the dataplane is running in.
+ There is a risk of blackholing traffic or traffic imbalance if the endpoint distribution is incorrect.
+
+ #### Example
+
+ zone-a: 2 endpoints
+ zone-b: 0 endpoints
+ zone-c: 3 endpoints
+
+ In this scenario, traffic generated in zone-a or zone-c will be routed only to the endpoints existing
+ in their corresponding zone. Traffic from zone-b, since it does not have any endpoints, will fall back to
+ cluster-wide routing and will be routed to endpoints in zone-a and zone-c.
+
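The routing decision in this example can be sketched in a few lines of Python. This is an illustration of the described behavior under simplifying assumptions, not kube-proxy code: `eligible_endpoints` and the `(name, hinted_zone)` pair representation are hypothetical, and each hint is assumed to equal the zone the endpoint runs in.

```python
def eligible_endpoints(endpoints, proxy_zone):
    """Pick the endpoints a dataplane running in `proxy_zone` would use.

    `endpoints` is a hypothetical list of (name, hinted_zone) pairs.
    """
    same_zone = [name for name, zone in endpoints if zone == proxy_zone]
    if same_zone:
        # Hints exist for this zone: keep traffic in-zone, with no overflow.
        return same_zone
    # No endpoint is hinted for this zone: fall back to cluster-wide routing.
    return [name for name, _ in endpoints]


# zone-a has 2 endpoints, zone-b has none, zone-c has 3.
endpoints = [
    ("a-1", "zone-a"),
    ("a-2", "zone-a"),
    ("c-1", "zone-c"),
    ("c-2", "zone-c"),
    ("c-3", "zone-c"),
]
print(eligible_endpoints(endpoints, "zone-a"))
print(eligible_endpoints(endpoints, "zone-b"))
```

A dataplane in zone-a sees only its two local endpoints, while one in zone-b sees all five. The sketch also makes the imbalance risk visible: zone-a traffic is never spread beyond two endpoints, however much of it there is.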
  ### Additional Heuristics
  To enable additional heuristics to be added in the future, we will:
