- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Standard Heuristic Implementation (kube-proxy dataplane)](#standard-heuristic-implementation-kube-proxy-dataplane)
- - [<code>Default</code> and <code>PreferEqualSpread</code>](#default-and-preferequalspread)
- - [<code>PreferZone</code>](#preferzone)
+ - [<code>Default</code> and <code>Spread</code>](#default-and-spread)
+ - [<code>Zone</code>](#zone)
- [Changes within kube-proxy](#changes-within-kube-proxy)
- [Status Reporting](#status-reporting)
- [Condition usage by other implementations](#condition-usage-by-other-implementations)
- [Alternatives](#alternatives)
- [Repurpose the existing topology annotation to recognize additional values](#repurpose-the-existing-topology-annotation-to-recognize-additional-values)
- [Reuse the fields internal/externalTrafficPolicy to offer these routing preferences](#reuse-the-fields-internalexternaltrafficpolicy-to-offer-these-routing-preferences)
+ - [Granular Routing Controls](#granular-routing-controls)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->

@@ -125,11 +126,11 @@ prioritize local nodes, then zone, then anywhere).

### Topology Aware Routing and InternalTrafficPolicy

- TopologyAwareRouting together with InternalTrafficPolicy were meant to be the
- successors of `topologyKeys` and allow implementations to be more flexible.
+ Topology aware routing together with `internalTrafficPolicy` were meant to be
+ the successors of `topologyKeys` and allow implementations to be more flexible.

* TopologyAwareRouting:
- * Exposes the annotation service.kubernetes.io/topology-mode. When this
+ * Responds to the annotation service.kubernetes.io/topology-mode. When this
annotation is set to Auto, an implementation-specific heuristic is used to
route the traffic.
* **Goal:** The aim with Auto was to allow implementations to be as smart as
@@ -155,7 +156,7 @@ successors of `topologyKeys` and allow implementations to be more flexible.
* **Limitation:** Lacks failover; traffic is dropped if no local endpoint exists.

Note that while the initial proposal of InternalTrafficPolicy proposed a
- PreferLocal policy, it was dropped later on. This meant that now
+ Local policy, it was dropped later on. This meant that now
TopologyAwareRouting in conjunction with InternalTrafficPolicy didn’t exactly
allow users to express a much desired use case from topologyKeys which is
"prefer node-local, failover to same zone, then route anywhere". While this
@@ -188,17 +189,17 @@ such a preference in future refinements.
decisions.

* **Mandatory and Uniform Implementation Support:** Kubernetes implementations
- are not required to support all standard heuristics (e.g., PreferZone,
- ProportionalZoneCPU). Even when standard heuristics are supported, their
- precise behavior and interpretation might vary across implementations.
+ are not required to support all standard heuristics (e.g., `Zone`, `Spread`).
+ Even when standard heuristics are supported, their precise behavior and
+ interpretation might vary across implementations.

* **Replacement of Traffic Policies:** The new field is complementary to
InternalTrafficPolicy and ExternalTrafficPolicy. It does not aim to substitute
their role in enforcing strict traffic locality.

* **Immediate Support for All Possible Heuristics:** The initial implementation
focuses on a core set of heuristics. Addition of new heuristics (like
- `PreferLocal` for Node local preference) could be explored in future
+ `Local` for Node local preference) could be explored in future
refinements.

## Proposal
@@ -212,9 +213,9 @@ The field will support the following initial values:
* `Default`: Indicates no specific routing preference. The user delegates the
routing decision to the implementation, allowing it to apply its best-effort
strategy.
- * `PreferEqualSpread`: Encourages an equal distribution of traffic across
+ * `Spread`: Encourages an equal distribution of traffic across
endpoints, potentially spanning multiple zones (or regions).
- * `PreferZone`: Encourages routing traffic to endpoints within the same zone as
+ * `Zone`: Encourages routing traffic to endpoints within the same zone as
the client. If no endpoints are available within the zone, traffic should be
routed to other zones.
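
As a rough illustration of the three initial values listed above, the following Go sketch models them as typed string constants. The type name `RoutingPreference`, the constant names, and the `IsStandard` helper are assumptions made for this example; the proposal itself only names the `routingPreference` field and its values.

```go
package main

import "fmt"

// RoutingPreference is a hypothetical Go type for the proposed
// routingPreference field; the name is an assumption of this sketch.
type RoutingPreference string

const (
	// No specific preference; the implementation applies its best-effort strategy.
	RoutingPreferenceDefault RoutingPreference = "Default"
	// Encourage an equal distribution of traffic across endpoints.
	RoutingPreferenceSpread RoutingPreference = "Spread"
	// Encourage routing to endpoints in the client's zone, with failover.
	RoutingPreferenceZone RoutingPreference = "Zone"
)

// IsStandard reports whether p is one of the three initial values.
func IsStandard(p RoutingPreference) bool {
	switch p {
	case RoutingPreferenceDefault, RoutingPreferenceSpread, RoutingPreferenceZone:
		return true
	}
	return false
}

func main() {
	// "Other" is a made-up value used only to show the negative case.
	for _, p := range []RoutingPreference{"Default", "Spread", "Zone", "Other"} {
		fmt.Printf("%q standard=%v\n", p, IsStandard(p))
	}
}
```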
@@ -229,14 +230,14 @@ reserved for potential future standardization.

NOTE: Implementations reserve the right to refine the behavior associated with
any heuristic, including standard heuristics. This means the behavior enabled
- by values such as `Default` or `PreferZone` might evolve over time. Such
- refinements could improve the implementation's ability to honor the original
- intent of the heuristic, even if the specific mechanisms change. For example,
- in the case of PreferZone, an implementation might initially route traffic
- within a zone with equal probability. A future improvement could introduce
- load-aware routing within the zone to further optimize performance while still
- adhering to the core principle of zonal preference. The decision of what
- constitutes an "improvement" remains at the discretion of the implementation.
+ by values such as `Default` or `Zone` might evolve over time, and some
+ evolutions might interpret the heuristic goals slightly differently. For
+ example, in the case of `Zone`, an implementation might initially route
+ traffic within the zone without considering endpoint overload, while a future
+ refinement could introduce feedback mechanisms to detect overload and route
+ traffic outside the zone when necessary, optimizing overall performance. The
+ decision of what constitutes an "improvement" remains at the discretion of the
+ implementation.

### User Stories

@@ -254,7 +255,7 @@ NOTE: Implementations reserve the right to refine the behavior associated with
* **Requirement:** I want my application to primarily receive traffic from
endpoints within the same zone for performance or cost reasons. However,
I want to avoid connection failures if no local endpoints are available.
- * **Solution:** Set `routingPreference=PreferZone`
+ * **Solution:** Set `routingPreference=Zone`
* **Effect:** The Kubernetes implementation will aim to prioritize routing
traffic to endpoints in the same zone as the client. If no endpoints are
available within the zone, traffic will be routed to other zones. It's
@@ -266,7 +267,7 @@ NOTE: Implementations reserve the right to refine the behavior associated with
* **Requirement:** I prioritize application availability and want to minimize the
risk of outages due to localized overload. I'm willing to accept potentially
higher costs associated with cross-zone traffic distribution.
- * **Solution:** Set `routingPreference=PreferEqualSpread`
+ * **Solution:** Set `routingPreference=Spread`
* **Effect:** The Kubernetes implementation will try to distribute traffic as
equally as possible across endpoints, potentially spanning multiple zones or
regions. This can improve resilience but might lead to increased network
@@ -295,16 +296,17 @@ NOTE: Implementations reserve the right to refine the behavior associated with

### Notes/Constraints/Caveats (Optional)

- None.
+ This proposal is our third attempt at an API revolving around such a
+ configuration. There's a non-zero chance that we may need to revisit this again.

### Risks and Mitigations

- * **Risk:** Having a routing preference like `PreferZone` comes at the risk of
+ * **Risk:** Having a routing preference like `Zone` comes at the risk of
endpoints in certain zones being overloaded if the originating traffic is
skewed towards a particular zone.

**Mitigation:**
- * Emphasize in the documentation that the `PreferZone` preference is
+ * Emphasize in the documentation that the `Zone` preference is
designed for low-latency or monetary-cost reasons, with the understanding
that it can lead to overload within zones.
* Recommend approaches like having deployments per zone which can scale
@@ -314,7 +316,9 @@ None.
annotation might encounter differences in exact routing behavior:
* The new field doesn't support a routing preference that is exactly equivalent
to using `service.kubernetes.io/topology-mode=Auto` from the old annotation.
- * If both field and the annotation are set, the annotation will take precedence.
+ * If both the field and the annotation are set, the annotation will take
+ precedence. (However, this behavior is temporary, as the annotation will be
+ deprecated and removed in future releases.)

**Mitigation:** Properly document the suggested migration paths with
limitations.
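
The precedence rule described in this risk can be sketched in Go as follows. This is a minimal illustration, not kube-proxy or controller code: the function `effectiveSetting`, its return values, and the choice to treat an empty field as `Default` are assumptions of this sketch. Only the annotation key and the rule that the annotation wins come from the text above.

```go
package main

import "fmt"

// Annotation key taken from the proposal text.
const topologyModeAnnotation = "service.kubernetes.io/topology-mode"

// effectiveSetting returns which setting drives routing and its value.
// Hypothetical helper: if the annotation is present and non-empty it takes
// precedence; otherwise the routingPreference field is used.
func effectiveSetting(annotations map[string]string, routingPreference string) (source, value string) {
	if v, ok := annotations[topologyModeAnnotation]; ok && v != "" {
		return "annotation", v
	}
	if routingPreference == "" {
		// Assumption of this sketch: an unset field behaves like Default.
		return "field", "Default"
	}
	return "field", routingPreference
}

func main() {
	both := map[string]string{topologyModeAnnotation: "Auto"}
	src, val := effectiveSetting(both, "Zone")
	fmt.Println(src, val)

	src, val = effectiveSetting(nil, "Zone")
	fmt.Println(src, val)
}
```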
@@ -325,24 +329,24 @@ None.

kube-proxy (along with EndpointSlice controller, within kube-controller-manager
as the control plane) will support the three standard routing preferences
- (`Default`, `PreferEqualSpread`, `PreferZone`).
+ (`Default`, `Spread`, `Zone`).

- #### `Default` and `PreferEqualSpread`
+ #### `Default` and `Spread`
* Initially, kube-proxy will treat the `Default` preference the same as
- `PreferEqualSpread`
+ `Spread`
* This leverages existing implementation, requiring no major changes.

- #### `PreferZone`
+ #### `Zone`
* This preference will be implemented by the use of Hints within EndpointSlices.
* We already use Hints to implement `service.kubernetes.io/topology-mode=Auto`.
Similarly, we’ll use the same Hints within the EndpointSlice to implement the
- PreferZone heuristic – the hints will match the zone of the endpoint itself.
+ `Zone` heuristic – the hints will match the zone of the endpoint itself.
* While it may seem redundant to populate the hints here since kube-proxy can
already derive the zone hint from the endpoint's zone (as they would be the
same), we will still use this for implementation simply because it is easier
to implement and provides a better design. Consider an
alternative implementation where kube-proxy reads
- `routingPreference=PreferZone` field and then constructs the route rules
+ `routingPreference=Zone` field and then constructs the route rules
accordingly. This means some extra logic needs to be baked into the kube-proxy
which could have just as easily been implemented by an already existing
extensibility mechanism (i.e. EndpointSlice hints).
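
The hint population described above ("the hints will match the zone of the endpoint itself") might look roughly like the Go sketch below. The `Endpoint` struct is a simplified stand-in for the EndpointSlice endpoint type, and `populateZoneHints` is a hypothetical helper, not the EndpointSlice controller's actual code. Leaving endpoints without a known zone unhinted is an assumption of this sketch.

```go
package main

import "fmt"

// Endpoint is a simplified stand-in for an EndpointSlice endpoint.
type Endpoint struct {
	Address  string
	Zone     string   // zone the endpoint runs in ("" if unknown)
	ForZones []string // zone hints consumed by the dataplane
}

// populateZoneHints returns a copy of eps in which every endpoint with a
// known zone is hinted for exactly that zone.
func populateZoneHints(eps []Endpoint) []Endpoint {
	out := make([]Endpoint, len(eps))
	for i, ep := range eps {
		if ep.Zone != "" {
			ep.ForZones = []string{ep.Zone}
		} else {
			// Assumption: no zone means no hint can be derived.
			ep.ForZones = nil
		}
		out[i] = ep
	}
	return out
}

func main() {
	eps := []Endpoint{
		{Address: "10.0.0.1", Zone: "zone-a"},
		{Address: "10.0.0.2", Zone: "zone-b"},
	}
	for _, ep := range populateZoneHints(eps) {
		fmt.Printf("%s zone=%s hints=%v\n", ep.Address, ep.Zone, ep.ForZones)
	}
}
```

Because the controller writes the hints, kube-proxy needs no knowledge of the `routingPreference` field; it only reads the hints it is given.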
@@ -373,7 +377,7 @@ set to "Auto"
**New behaviour:** Irrespective of what the annotation
`service.kubernetes.io/topology-aware-hints` or field `routingPreference` are
set to (or even if they are not set at all), kube-proxy will always consider
- EndpointSlice hints.
+ EndpointSlice hints (assuming this feature gate is enabled).
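
A minimal Go sketch of this hint-driven selection is shown below. It is not kube-proxy's implementation: `Endpoint` and `endpointsForZone` are hypothetical names, and the fallbacks (ignore hints when any endpoint lacks one, and use all endpoints when none is hinted for the node's zone) are assumptions based on the expectations stated in this section.

```go
package main

import "fmt"

// Endpoint is a simplified stand-in for an EndpointSlice endpoint.
type Endpoint struct {
	Address  string
	ForZones []string // zone hints; empty means the endpoint has no hint
}

// endpointsForZone picks the endpoints a proxy on a node in nodeZone would
// use. It consults only the hints, never the annotation or the field.
func endpointsForZone(eps []Endpoint, nodeZone string) []Endpoint {
	if nodeZone == "" {
		return eps // node zone unknown: hints cannot be applied
	}
	var local []Endpoint
	for _, ep := range eps {
		if len(ep.ForZones) == 0 {
			return eps // a hint is missing: ignore hints entirely
		}
		for _, z := range ep.ForZones {
			if z == nodeZone {
				local = append(local, ep)
				break
			}
		}
	}
	if len(local) == 0 {
		return eps // nothing hinted for this zone: route anywhere
	}
	return local
}

func main() {
	eps := []Endpoint{
		{Address: "10.0.0.1", ForZones: []string{"zone-a"}},
		{Address: "10.0.0.2", ForZones: []string{"zone-b"}},
	}
	fmt.Println(endpointsForZone(eps, "zone-a"))
	fmt.Println(endpointsForZone(eps, "zone-c"))
}
```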

NOTE: The expectation remains that *all* endpoints within an EndpointSlice must
have corresponding hints for kube-proxy to utilize them. This avoids scenarios
@@ -521,14 +525,14 @@ The following packages will also see minor changes:
##### Integration tests

* Verify that if both the annotation `service.kubernetes.io/topology-mode=Auto`
- and field `routingPreference=PreferZone` are configured, precedence is given
+ and field `routingPreference=Zone` are configured, precedence is given
to the annotation.

##### e2e tests

* Verify that EndpointSlice hints are correctly populated when
- `routingPreference=PreferZone`
- * Verify through probes that when `routingPreference=PreferZone`, requests
+ `routingPreference=Zone`
+ * Verify through probes that when `routingPreference=Zone`, requests
originating from a zone which has service pods get sent to a pod in the same
zone. For requests originating from zones with no service pods, requests
should not get blackholed and should rather be forwarded to any service pod
@@ -868,8 +872,8 @@ along the following lines:
### Reuse the fields internal/externalTrafficPolicy to offer these routing preferences

This has been a major topic of discussion in the past, with questions around
- which field would be appropriate to support a heuristic like PreferZone. If we
- were to in fact use this approach we would be faced with the dilemma of choosing
+ which field would be appropriate to support a heuristic like `Zone`. If we were
+ to in fact use this approach we would be faced with the dilemma of choosing
between two less-than-ideal options:

* Dilute purpose and sacrifice semantic expectation
@@ -884,14 +888,14 @@ between two less-than-ideal options:
node since the logs would then report an incorrect log source.” Values like
"Local" mandate that traffic must remain within the Node boundary.

- * **Problem:** Introducing routing preferences like "PreferZone" would dilute this
+ * **Problem:** Introducing routing preferences like `Zone` would dilute this
clear semantic meaning and could create potential misinterpretations. Using
a separate field dedicated to routing preferences avoids this confusion and
maintains consistency.

* Become inflexible or rigid

- * Alternatively, if we introduce "PreferZone" without diluting the meaning of
+ * Alternatively, if we introduce `Zone` without diluting the meaning of
the existing fields, we'd need to create extremely specific and inflexible
rules for how it works across all implementations.

@@ -902,6 +906,39 @@ between two less-than-ideal options:
Given the above, introducing a new dedicated field seems to be better than
picking one of the two bad options.

+ ### Granular Routing Controls
+
+ One approach to routing control would be introducing numerous configuration
+ fields, either directly in the Service API or within a separate, dedicated API.
+ This offers users maximum precision in defining routing behaviors based on
+ factors like location, weighted preferences, and other criteria. This approach
+ can be seen as a revisited, and potentially expanded, version of the
+ `topologyKeys` concept (and hence would suffer from some of the downsides of
+ `topologyKeys`, as stated previously).
+
+ In some sense, the approach is indeed very tempting. The reasons why an option
+ like `routingPreference` might be a better choice are:
+
+ * Introducing numerous configuration options within the Service API (or a
+ separate API type) could sacrifice some of the core simplicity of the
+ Service API. Future, more complex needs could (and should) be explored within
+ the Gateway API.
+
+ * The `routingPreference` field elegantly balances control and abstraction.
+ Users can influence behavior with high-level heuristics (`Zone`, `Spread`)
+ while implementations handle the underlying complexity. Heuristics can flag
+ potential issues and guide users towards safe configurations. Using
+ independent fields increases the risk of unintended consequences, as
+ interactions between seemingly unrelated settings can create unexpected and
+ potentially damaging routing behavior. Additionally, even simple routing
+ adjustments might require tweaking multiple fields, adding complexity for the
+ user.
+
+ * Rigid API contracts with granular fields can hinder an implementation's
+ ability to introduce innovative routing strategies that don't fit the
+ predefined mold. `routingPreference` encourages flexibility by treating
+ preferences as hints, allowing for sophisticated, implementation-specific
+ algorithms that can evolve over time.

## Infrastructure Needed (Optional)
