Commit dcaa20b

fixup! fixup! Add KEP 4444: Routing Preference for Services
1 parent 02a82a4 commit dcaa20b

1 file changed

+77
-40
lines changed

keps/sig-network/4444-service-routing-preference/README.md

@@ -19,8 +19,8 @@
1919
- [Risks and Mitigations](#risks-and-mitigations)
2020
- [Design Details](#design-details)
2121
- [Standard Heuristic Implementation (kube-proxy dataplane)](#standard-heuristic-implementation-kube-proxy-dataplane)
22-
- [<code>Default</code> and <code>PreferEqualSpread</code>](#default-and-preferequalspread)
23-
- [<code>PreferZone</code>](#preferzone)
22+
- [<code>Default</code> and <code>Spread</code>](#default-and-spread)
23+
- [<code>Zone</code>](#zone)
2424
- [Changes within kube-proxy](#changes-within-kube-proxy)
2525
- [Status Reporting](#status-reporting)
2626
- [Condition usage by other implementations](#condition-usage-by-other-implementations)
@@ -47,6 +47,7 @@
4747
- [Alternatives](#alternatives)
4848
- [Repurpose the existing topology annotation to recognize additional values](#repurpose-the-existing-topology-annotation-to-recognize-additional-values)
4949
- [Reuse the fields internal/externalTrafficPolicy to offer these routing preferences](#reuse-the-fields-internalexternaltrafficpolicy-to-offer-these-routing-preferences)
50+
- [Granular Routing Controls](#granular-routing-controls)
5051
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
5152
<!-- /toc -->
5253

@@ -125,11 +126,11 @@ prioritize local nodes, then zone, then anywhere).
125126

126127
### Topology Aware Routing and InternalTrafficPolicy
127128

128-
TopologyAwareRouting together with InternalTrafficPolicy were meant to be the
129-
successors of `topologyKeys` and allow implementations to be more flexible.
129+
Topology aware routing together with `internalTrafficPolicy` were meant to be
130+
the successors of `topologyKeys` and allow implementations to be more flexible.
130131

131132
* TopologyAwareRouting:
132-
* Exposes the annotation service.kubernetes.io/topology-mode. When this
133+
* Responds to the annotation service.kubernetes.io/topology-mode. When this
133134
annotation is set to Auto, an implementation specific heuristic is used to
134135
route the traffic.
135136
* **Goal:** The aim with Auto was to allow implementations to be as smart as
@@ -155,7 +156,7 @@ successors of `topologyKeys` and allow implementations to be more flexible.
155156
* **Limitation:** Lacks failover; traffic is dropped if no local endpoint exists.
156157

157158
Note that while the initial proposal of InternalTrafficPolicy proposed a
158-
PreferLocal policy, it was dropped later on. This meant that now
159+
Local policy, it was dropped later on. This meant that now
159160
TopologyAwareRouting in conjunction with InternalTrafficPolicy didn’t exactly
160161
allow users to express a much desired use case from topologyKeys which is
161162
"prefer node-local, failover to same zone, then route anywhere". While this
@@ -188,17 +189,17 @@ such a preference in future refinements.
188189
decisions.
189190

190191
* **Mandatory and Uniform Implementation Support:** Kubernetes implementations
191-
are not required to support all standard heuristics (e.g., PreferZone,
192-
ProportionalZoneCPU). Even when standard heuristics are supported, their
193-
precise behavior and interpretation might vary across implementations.
192+
are not required to support all standard heuristics (e.g., `Zone`, `Spread`).
193+
Even when standard heuristics are supported, their precise behavior and
194+
interpretation might vary across implementations.
194195

195196
* **Replacement of Traffic Policies:** The new field is complementary to
196197
InternalTrafficPolicy and ExternalTrafficPolicy. It does not aim to substitute
197198
their role in enforcing strict traffic locality.
198199

199200
* **Immediate Support for All Possible Heuristics:** The initial implementation
200201
focuses on a core set of heuristics. Addition of new heuristics (like
201-
`PreferLocal` for Node local preference) could be explored in future
202+
`Local` for Node local preference) could be explored in future
202203
refinements.
203204

204205
## Proposal
@@ -212,9 +213,9 @@ The field will support the following initial values:
212213
* `Default`: Indicates no specific routing preference. The user delegates the
213214
routing decision to the implementation, allowing it to apply its best-effort
214215
strategy.
215-
* `PreferEqualSpread`: Encourages an equal distribution of traffic across
216+
* `Spread`: Encourages an equal distribution of traffic across
216217
endpoints, potentially spanning multiple zones (or regions).
217-
* `PreferZone`: Encourages routing traffic to endpoints within the same zone as
218+
* `Zone`: Encourages routing traffic to endpoints within the same zone as
218219
the client. If no endpoints are available within the zone, traffic should be
219220
routed to other zones.
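
The intent of the three values can be sketched as a small endpoint-selection function. This is only an illustrative model, not kube-proxy code: the function name `select_endpoints`, the dictionary shape of an endpoint, and the choice to treat `Default` like `Spread` (mirroring what this KEP describes for kube-proxy initially) are all assumptions made for the example.

```python
def select_endpoints(endpoints, client_zone, preference):
    """Return the candidate endpoints for a client under a routing preference.

    Hypothetical model: each endpoint is a dict with a "zone" key.
    """
    if preference == "Zone":
        # Prefer endpoints in the client's own zone.
        same_zone = [ep for ep in endpoints if ep["zone"] == client_zone]
        if same_zone:
            return same_zone
        # No endpoint in the client's zone: fail over to other zones
        # rather than dropping traffic.
    # "Spread" (and, in this sketch, "Default"): consider every endpoint,
    # regardless of zone.
    return list(endpoints)


endpoints = [
    {"name": "a", "zone": "zone-1"},
    {"name": "b", "zone": "zone-2"},
]
in_zone = select_endpoints(endpoints, "zone-1", "Zone")
failover = select_endpoints(endpoints, "zone-3", "Zone")
```

Here `in_zone` contains only endpoint `a`, while `failover` contains both endpoints because `zone-3` has none of its own.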
220221

@@ -229,14 +230,14 @@ reserved for potential future standardization.
229230

230231
NOTE: Implementations reserve the right to refine the behavior associated with
231232
any heuristic, including standard heuristics. This means the behavior enabled
232-
by values such as `Default` or `PreferZone` might evolve over time. Such
233-
refinements could improve the implementation's ability to honor the original
234-
intent of the heuristic, even if the specific mechanisms change. For example,
235-
in the case of PreferZone, an implementation might initially route traffic
236-
within a zone with equal probability. A future improvement could introduce
237-
load-aware routing within the zone to further optimize performance while still
238-
adhering to the core principle of zonal preference. The decision of what
239-
constitutes an "improvement" remains at the discretion of the implementation.
233+
by values such as `Default` or `Zone` might evolve over time, and some
234+
evolutions might interpret the heuristic goals slightly differently. For
235+
example, in the case of `Zone`, an implementation might initially route
236+
traffic within the zone without considering endpoint overload, while a future
237+
refinement could introduce feedback mechanisms to detect overload and route
238+
traffic outside the zone when necessary, optimizing overall performance. The
239+
decision of what constitutes an "improvement" remains at the discretion of the
240+
implementation.
240241

241242
### User Stories
242243

@@ -254,7 +255,7 @@ NOTE: Implementations reserve the right to refine the behavior associated with
254255
* **Requirement:** I want my application to primarily receive traffic from
255256
endpoints within the same zone for performance or cost reasons. However,
256257
I want to avoid connection failures if no local endpoints are available.
257-
* **Solution:** Set `routingPreference=PreferZone`
258+
* **Solution:** Set `routingPreference=Zone`
258259
* **Effect:** The Kubernetes implementation will aim to prioritize routing
259260
traffic to endpoints in the same zone as the client. If no endpoints are
260261
available within the zone, traffic will be routed to other zones. It's
@@ -266,7 +267,7 @@ NOTE: Implementations reserve the right to refine the behavior associated with
266267
* **Requirement:** I prioritize application availability and want to minimize the
267268
risk of outages due to localized overload. I'm willing to accept potentially
268269
higher costs associated with cross-zone traffic distribution.
269-
* **Solution:** Set `routingPreference=PreferEqualSpread`
270+
* **Solution:** Set `routingPreference=Spread`
270271
* **Effect:** The Kubernetes implementation will try to distribute traffic as
271272
equally as possible across endpoints, potentially spanning multiple zones or
272273
regions. This can improve resilience but might lead to increased network
@@ -295,16 +296,17 @@ NOTE: Implementations reserve the right to refine the behavior associated with
295296

296297
### Notes/Constraints/Caveats (Optional)
297298

298-
None.
299+
This proposal is our third attempt at an API revolving around such a
300+
configuration. There's a non-zero chance that we may need to revisit this again.
299301

300302
### Risks and Mitigations
301303

302-
* **Risk:** Having a routing preference like `PreferZone` comes at the risk of
304+
* **Risk:** Having a routing preference like `Zone` comes at the risk of
303305
endpoints in certain zones being overloaded if the originating traffic is
304306
skewed towards a particular zone.
305307

306308
**Mitigation:**
307-
* Emphasize in the documentation that the `PreferZone` preference is
309+
* Emphasize in the documentation that the `Zone` preference is
308310
designed for low-latency or monetary-cost reasons, with the understanding
309311
that it can lead to overload within zones.
310312
* Recommend approaches like having deployments per zone which can scale
@@ -314,7 +316,9 @@ None.
314316
annotation might encounter differences in exact routing behavior:
315317
* The new field doesn't support a routing preference that is exactly similar
316318
to using `service.kubernetes.io/topology-mode=Auto` from the old annotation.
317-
* If both field and the annotation are set, the annotation will take precedence.
319+
* If both field and the annotation are set, the annotation will take
320+
precedence. (However, this behavior is temporary as the annotation will be
321+
deprecated and removed in future releases)
318322

319323
**Mitigation:** Properly document the suggested migration paths with
320324
limitations.
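
The precedence rule between the annotation and the field could be modelled roughly as below. This is a hedged sketch: the helper name `effective_routing_source` is hypothetical, and treating any present annotation value as "set" and an unset field as `Default` are assumptions of the example that the text above does not spell out.

```python
TOPOLOGY_MODE_ANNOTATION = "service.kubernetes.io/topology-mode"


def effective_routing_source(annotations, routing_preference):
    """Return (source, value) describing which setting drives routing.

    Assumption: any value present under the annotation key counts as "set".
    """
    mode = annotations.get(TOPOLOGY_MODE_ANNOTATION)
    if mode is not None:
        # The annotation wins whenever it is present, even if the
        # routingPreference field is also set.
        return ("annotation", mode)
    if routing_preference is not None:
        return ("field", routing_preference)
    # Assumption: an unset field behaves like "Default".
    return ("field", "Default")


both_set = effective_routing_source({TOPOLOGY_MODE_ANNOTATION: "Auto"}, "Zone")
field_only = effective_routing_source({}, "Zone")
```

With both settings present, `both_set` resolves to the annotation; `field_only` resolves to the field.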
@@ -325,24 +329,24 @@ None.
325329

326330
kube-proxy (along with EndpointSlice controller, within kube-controller-manager
327331
as the control plane) will support the three standard routing preferences
328-
(`Default`, `PreferEqualSpread`, `PreferZone`).
332+
(`Default`, `Spread`, `Zone`).
329333

330-
#### `Default` and `PreferEqualSpread`
334+
#### `Default` and `Spread`
331335
* Initially, kube-proxy will treat the `Default` preference the same as
332-
`PreferEqualSpread`
336+
`Spread`
333337
* This leverages existing implementation, requiring no major changes.
334338

335-
#### `PreferZone`
339+
#### `Zone`
336340
* This preference will be implemented by the use of Hints within EndpointSlices.
337341
* We already use Hints to implement `service.kubernetes.io/topology-mode=Auto`
338342
Similarly, we’ll use the same Hints within the EndpointSlice to implement the
339-
PreferZone heuristic – the hints will match the zone of the endpoint itself.
343+
`Zone` heuristic – the hints will match the zone of the endpoint itself.
340344
* While it may seem redundant to populate the hints here since kube-proxy can
341345
already derive the zone hint from the endpoints zone (as they would be the
342346
same), we will still use this for implementation simply because of the reason
343347
that it’s easier to implement and provides a better design. Consider an
344348
alternative implementation where kube-proxy reads
345-
`routingPreference=PreferZone` field and then constructs the route rules
349+
`routingPreference=Zone` field and then constructs the route rules
346350
accordingly. This means some extra logic needs to be baked into the kube-proxy
347351
which could have just as easily been implemented by an already existing
348352
extensibility mechanism (i.e. EndpointSlice hints)
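
A minimal sketch of this division of labour follows, assuming endpoints are plain dictionaries that always carry a `zone` key. The `hints.forZones[].name` shape follows the EndpointSlice API, but the function names `populate_zone_hints` and `usable_endpoints` are hypothetical, and the consumer side is a simplified model rather than the actual kube-proxy logic.

```python
def populate_zone_hints(endpoints):
    """Controller side: hint each endpoint for its own zone (the `Zone` heuristic)."""
    hinted = []
    for ep in endpoints:
        ep = dict(ep)  # copy so the caller's data is not mutated
        ep["hints"] = {"forZones": [{"name": ep["zone"]}]}
        hinted.append(ep)
    return hinted


def usable_endpoints(endpoints, node_zone):
    """Proxy side: filter by hints without knowing which heuristic produced them."""
    # Hints are only honoured if every endpoint carries them.
    if not all(ep.get("hints", {}).get("forZones") for ep in endpoints):
        return list(endpoints)
    matching = [
        ep
        for ep in endpoints
        if any(z["name"] == node_zone for z in ep["hints"]["forZones"])
    ]
    # Assumption: with no endpoint hinted for this zone, fall back to all endpoints.
    return matching or list(endpoints)


hinted = populate_zone_hints(
    [{"name": "a", "zone": "zone-1"}, {"name": "b", "zone": "zone-2"}]
)
local = usable_endpoints(hinted, "zone-2")
```

In this model the proxy never reads `routingPreference`; it only consumes hints, which is the design point made above.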
@@ -373,7 +377,7 @@ set to "Auto"
373377
**New behaviour:** Irrespective of what the annotation
374378
`service.kubernetes.io/topology-aware-hints` or field `routingPreference` are
375379
set to (or even if they are not set at all), kube-proxy will always consider
376-
EndpointSlice hints.
380+
EndpointSlice hints (assuming this feature-gate is enabled).
377381

378382
NOTE: The expectation remains that *all* endpoints within an EndpointSlice must
379383
have corresponding hints for kube-proxy to utilize them. This avoids scenarios
@@ -521,14 +525,14 @@ The following packages will also see minor changes:
521525
##### Integration tests
522526

523527
* Verify that if both the annotation `service.kubernetes.io/topology-mode=Auto`
524-
and field `routingPreference=PreferZone` are configured, precedence is given
528+
and field `routingPreference=Zone` are configured, precedence is given
525529
to the annotation.
526530

527531
##### e2e tests
528532

529533
* Verify that EndpointSlice hints are correctly populated when
530-
`routingPreference=PreferZone`
531-
* Verify through probes that when `routingPreference=PreferZone`, requests
534+
`routingPreference=Zone`
535+
* Verify through probes that when `routingPreference=Zone`, requests
532536
originating from a zone which has service pods get sent to a pod in the same
533537
zone. For requests originating from zones with no service pods, requests
534538
should not get blackholed and should rather be forwarded to any service pod
@@ -868,8 +872,8 @@ along the following lines:
868872
### Reuse the fields internal/externalTrafficPolicy to offer these routing preferences
869873

870874
This has been a major topic of discussion in the past, with questions around
871-
which field would be appropriate to support a heuristic like PreferZone. If we
872-
were to in fact use this approach we would be faced with the dilemma of choosing
875+
which field would be appropriate to support a heuristic like `Zone`. If we were
876+
to in fact use this approach we would be faced with the dilemma of choosing
873877
between two less-than-ideal options:
874878

875879
* Dilute purpose and sacrifice semantic expectation
@@ -884,14 +888,14 @@ between two less-than-ideal options:
884888
node since the logs would then report an incorrect log source.”. Values like
885889
"Local" mandate that traffic must remain within the Node boundary.
886890

887-
* **Problem:** Introducing routing preferences like "PreferZone" would dilute this
891+
* **Problem:** Introducing routing preferences like `Zone` would dilute this
888892
clear semantic meaning and could create potential misinterpretations. Using
889893
a separate field dedicated to routing preferences avoids this confusion and
890894
maintains consistency.
891895

892896
* Become inflexible or rigid
893897

894-
* Alternatively, if we introduce "PreferZone" without diluting the meaning of
898+
* Alternatively, if we introduce `Zone` without diluting the meaning of
895899
the existing fields, we'd need to create extremely specific and inflexible
896900
rules for how it works across all implementations.
897901

@@ -902,6 +906,39 @@ between two less-than-ideal options:
902906
Given the above, introducing a new dedicated field seems to be better than
903907
picking one of the two bad options.
904908

909+
### Granular Routing Controls
910+
911+
One approach to routing control would be introducing numerous configuration
912+
fields, either directly in the Service API or within a separate, dedicated API.
913+
This offers users maximum precision in defining routing behaviors based on
914+
factors like location, weighted preferences, and other criteria. This approach
915+
can be seen as a revisited, and potentially expanded, version of the
916+
`topologyKeys` concept (and hence would suffer from some of the downsides of
917+
`topologyKeys`, as stated previously.)
918+
919+
In some sense, the approach is indeed very tempting. The reason why an option
920+
like `routingPreference` might be a better option is:
921+
922+
* Introducing numerous configuration options within the Service API (or a
923+
separate API type) could be sacrificing some of the core simplicity of the
924+
Service API. Future, more complex needs could (and should) be explored within
925+
the Gateway API.
926+
927+
* The `routingPreference` field elegantly balances control and abstraction.
928+
Users can influence behavior with high-level heuristics (`Zone`, `Spread`)
929+
while implementations handle the underlying complexity. Heuristics can flag
930+
potential issues and guide users towards safe configurations. Using
931+
independent fields increases the risk of unintended consequences, as
932+
interactions between seemingly unrelated settings can create unexpected and
933+
potentially damaging routing behavior. Additionally, even simple routing
934+
adjustments might require tweaking multiple fields, adding complexity for the
935+
user.
936+
937+
* Rigid API contracts with granular fields can hinder an implementation's
938+
ability to introduce innovative routing strategies that don't fit the
939+
predefined mold. `routingPreference` encourages flexibility by treating
940+
preferences as hints, allowing for sophisticated, implementation-specific
941+
algorithms that can evolve over time.
905942

906943
## Infrastructure Needed (Optional)
907944
