Commit 27b1053

Merge pull request kubernetes#2706 from robscott/topology-1.22
Updating Topology Hints KEP for 1.22
2 parents 338d64e + 5329d2e commit 27b1053

2 files changed: +54 -41 lines changed

keps/sig-network/2433-topology-aware-hints/README.md

Lines changed: 53 additions & 40 deletions
@@ -24,6 +24,7 @@
 - [Test Plan](#test-plan)
 - [Controller Unit Tests](#controller-unit-tests)
 - [Kube-Proxy Unit Tests](#kube-proxy-unit-tests)
+- [e2e Tests](#e2e-tests)
 - [Observability](#observability)
 - [Graduation Criteria](#graduation-criteria)
 - [Version Skew Strategy](#version-skew-strategy)
@@ -75,15 +76,9 @@ routing at zone level but could be expanded to include region.
 
 In the short term, this is taking the place of two closely related KEPs that
 were never implemented. These KEPs relate to EndpointSlice subsetting and are
-still relevant, just deferred to a later point in time. For more info on this
-transition refer to the following resources:
-
-- [Doc: Updates to Topology in Kubernetes
-1.21](https://docs.google.com/document/d/1ZzUoFY1SrdjVefl7gVOJZJLt1I1LHttw8pcX95nlgMY/edit)
-- [KEP 2004: Topology Aware
-Subsetting](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2004-topology-aware-subsetting).
-- [KEP 2030: Topology Aware
-Proxying](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2030-topology-aware-proxying).
+still relevant, just deferred to a later point in time. This
+[doc](https://docs.google.com/document/d/1ZzUoFY1SrdjVefl7gVOJZJLt1I1LHttw8pcX95nlgMY/edit)
+has more info on this transition.
 
 ## Motivation
 
@@ -151,8 +146,7 @@ hints help ensure that each zone will have a single endpoint to consume by
 adding a hint to the third endpoint that it should be consumed by "zone-c".
 
 This functionality will be enabled by a `TopologyAwareHints` feature gate along
-with the `trafficPolicy` field on Service that will be added as part of KEP
-2086.
+with a new Service annotation.
 
 ### Risks and Mitigations
 
@@ -196,20 +190,24 @@ updated to read the same information to identify which zone it is running in.
 
 ### Configuration
 
-The new Service `trafficPolicy` field will be expanded to support a new value:
-
-- `PreferZone`: When there are a sufficient number of endpoints for the Service,
-the EndpointSlice controller will add topology hints for each endpoint that
-will ensure a proportional amounts are available to each zone in a cluster.
+A new `service.kubernetes.io/topology-aware-routing` annotation can be used to
+enable or disable Topology Aware Routing (and by extension, hints) for a
+Service. This may be set to "Auto" or "Disabled". Any other value is treated as
+"Disabled".
 
-A future KEP will explore changing the default value of this field to
-`PreferZone`.
+The previous `service.kubernetes.io/topology-aware-hints` annotation will
+continue to be supported as a means of configuring this feature.
 
 #### Interoperability
 
-Validation will ensure that `trafficPolicy` can not be set to `PreferZone` when
-the deprecated `topologyKeys` field is also set. This will be true until the
-`topologyKeys` field is removed in the future.
+Topology hints will be ignored if the TopologyKeys field has at least one entry.
+This field is deprecated and will be removed soon.
+
+Both ExternalTrafficPolicy and InternalTrafficPolicy will be given precedence
+over topology aware routing. For example, if `ExternalTrafficPolicy=Local` and
+topology was enabled, external traffic would be routed using the
+ExternalTrafficPolicy configuration while internal traffic would be routed with
+topology.
 
 #### Feature Gate
 
@@ -250,7 +248,6 @@ type ForZone struct {
 }
 ```
 
-
 #### Future API Expansion
 This approach would allow for future API expansion that enabled specifying
 multiple zones per endpoint with weights. That level of complexity may never be
@@ -277,7 +274,7 @@ conditions are true:
 
 - Kube-Proxy is able to determine the zone it is running within (likely based
 on node labels).
-- The `trafficPolicy` field is set to `PreferZone` for the Service.
+- The annotation is set to `Auto`.
 - At least one endpoint for the Service has a hint pointing to the zone
 Kube-Proxy is running within.
 - All endpoints for the Service have zone hints.
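
The conditions visible in this hunk can be sketched as a single filter. This is an illustrative sketch only: the `endpoint` type and `filterByZoneHints` are hypothetical names rather than kube-proxy's API, the feature gate is assumed to be on, and any conditions outside this hunk are not modeled. Returning the full endpoint list when a condition fails follows the "Endpoints not filtered" rows in the Kube-Proxy unit test table.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified stand-in for an EndpointSlice
// endpoint; forZones holds the zone names carried by its hints.
type endpoint struct {
	address  string
	forZones []string
}

// filterByZoneHints returns only the endpoints hinted for nodeZone when every
// listed condition holds, and the unfiltered list otherwise.
func filterByZoneHints(endpoints []endpoint, nodeZone string, annotationAuto bool) []endpoint {
	// Kube-Proxy must know its own zone and the annotation must be "Auto".
	if nodeZone == "" || !annotationAuto {
		return endpoints
	}
	var matching []endpoint
	for _, ep := range endpoints {
		// All endpoints must have zone hints.
		if len(ep.forZones) == 0 {
			return endpoints
		}
		for _, zone := range ep.forZones {
			if zone == nodeZone {
				matching = append(matching, ep)
				break
			}
		}
	}
	// At least one endpoint must be hinted for this zone.
	if len(matching) == 0 {
		return endpoints
	}
	return matching
}

func main() {
	eps := []endpoint{
		{address: "10.0.0.1", forZones: []string{"zone-a"}},
		{address: "10.0.0.2", forZones: []string{"zone-b"}},
		{address: "10.0.0.3", forZones: []string{"zone-c"}},
	}
	for _, ep := range filterByZoneHints(eps, "zone-b", true) {
		fmt.Println(ep.address) // prints 10.0.0.2
	}
}
```
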
@@ -293,10 +290,10 @@ had not yet propagated to all of them.
 
 ### EndpointSlice Controller
 
-When the `TopologyAwareHints` feature gate is enabled and the `trafficPolicy`
-field is set to `PreferZone` for a Service, the EndpointSlice controller will
-add hints to EndpointSlices. These hints will indicate where an endpoint should
-be consumed by proxy implementations to enable topology aware routing.
+When the `TopologyAwareHints` feature gate is enabled and the annotation is set
+to `Auto` for a Service, the EndpointSlice controller will add hints to
+EndpointSlices. These hints will indicate where an endpoint should be consumed
+by proxy implementations to enable topology aware routing.
 
 The EndpointSlice controller will determine how many endpoints should be
 available for each zone based on the proportion of CPU cores in each zone. If
@@ -370,13 +367,10 @@ In the future we may expand this functionality if needed. This could include:
 #### Controller Unit Tests
 | Test Description | Expected Result |
 | :--- | :--- |
-| Feature Gate On, TrafficPolicy == 'PreferZone', 2+ zones | Hints set |
-| Feature Gate On, TrafficPolicy == 'PreferZone', 1 zone | No hints set |
-| Feature Gate On, TrafficPolicy == 'Local', 2+ zones | No hints |
-| Feature Gate On, TrafficPolicy Unset, 2+ zones | No hints |
-| Feature Gate Off, TrafficPolicy == 'PreferZone', 2+ zones | No hints |
-| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
-| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
+| Feature On, 2+ zones | Hints set |
+| Feature Off, 2+ zones | No hints |
+| Feature On, 1 zone | No hints set |
+| Feature On, ExternalTrafficPolicy == 'Local', 2+ zones | No hints |
 | 2 endpoints, 3 zones | No hints |
 | 3 endpoints, 3 zones | Hints set |
 | 4 endpoints, 3 zones | No hints |
@@ -393,10 +387,28 @@ In the future we may expand this functionality if needed. This could include:
 #### Kube-Proxy Unit Tests
 | Test Description | Expected Result |
 | :--- | :--- |
-| Feature Gate On, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints filtered |
-| Feature Gate On, TrafficPolicy == 'Local', hints matching zone | Endpoints not filtered |
-| Feature Gate Off, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints not filtered |
-| Feature Gate On, TrafficPolicy == 'PreferZone', no hints matching zone | Endpoints not filtered |
+| Feature On, hints matching zone | Endpoints filtered |
+| Feature On, ExternalTrafficPolicy == 'Local', hints matching zone | Endpoints not filtered |
+| Feature Off, hints matching zone | Endpoints not filtered |
+| Feature On, no hints matching zone | Endpoints not filtered |
+
+### e2e Tests
+This represents the largest and most uncertain part of the testing effort. We
+need to find a way to run e2e tests on multizone clusters. To limit flakiness,
+those clusters likely need to have a consistent distribution of nodes across
+zones. This will enable us to write predictable tests for topology aware
+routing.
+
+At a minimum, we likely want the following test:
+
+- 3 zone cluster, with 1 equivalent node per zone
+- Deploy a single pod to each node with a daemonset
+- Create a Service that targets that daemonset
+- Make requests from each zone and ensure that the request is routed to a pod in
+the same zone
+
+We'll likely need more tests to properly vet this feature, but this one should
+be straightforward to write and unlikely to be flaky.
 
 ### Observability
 We can reuse some of the metrics of EndpointSlice Controller that we already
@@ -435,6 +447,7 @@ EndpointSliceSyncs = metrics.NewCounterVec(
 
 ### Graduation Criteria
 - Alpha should provide basic functionality covered with tests described above.
+- Interoperability with Internal and External TrafficPolicy fields.
 
 ### Version Skew Strategy
 This KEP requires updates to both the EndpointSlice Controller and kube-proxy.
@@ -448,7 +461,7 @@ Thus there could be two potential version skew scenarios:
 of the new controller functionality.
 
 Each scenario described above will end up behaving as if this feature is not
-enabled even if the `trafficPolicy` has been set on Service.
+enabled even if the annotation has been set on the Service.
 
 ## Production Readiness Review Questionnaire
 
@@ -467,7 +480,7 @@ enabled even if the `trafficPolicy` has been set on Service.
 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
 the enablement)?**
 Yes. It can easily be disabled universally by turning off the feature gate or
-setting the `trafficPolicy` field to some other value for a Service.
+setting the annotation to some other value for a Service.
 
 * **What happens if we reenable the feature if it was previously rolled back?**
 EndpointSlices hints will be added again resulting in changes to existing

keps/sig-network/2433-topology-aware-hints/kep.yaml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ stage: alpha
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
 # latest-milestone: "v1.21"
-latest-milestone: "0.0"
+latest-milestone: "v1.22"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
