This new field will intersect with externalTrafficPolicy in the following ways:
-* if `externalTrafficPolicy=Cluster`, traffic will be routed based on `trafficPolicy` for external sources
-* if `externalTrafficPolicy=Local`, `externalTrafficPolicy` will take precedence over `trafficPolicy`, but only for external sources.
+This field will be independent from externalTrafficPolicy. In other words, internalTrafficPolicy only applies to traffic originating from internal sources.

 Proposed changes to kube-proxy:
-* when `trafficPolicy=Cluster`, default to existing behavior today.
-* when `trafficPolicy=Topology`, use topology hints from EndpointSlice API.
-* when `trafficPolicy=PreferLocal`, route to endpoints in EndpointSlice that matches the local node's topology (topology defined by `kubernetes.io/hostname`),
-fall back to "Cluster" behavior if there are no local endpoints.
-* when `trafficPolicy=Local`, route to endpoints in EndpointSlice that matches the local node's topology, drop traffic if none exist.
+* when `internalTrafficPolicy=Cluster`, default to existing behavior today.
+* when `internalTrafficPolicy=Local`, route to endpoints in EndpointSlice that match the local node's topology, drop traffic if none exist.
+
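As a rough illustration of the two `internalTrafficPolicy` values listed above, the endpoint selection could be sketched as follows. This is a minimal Python sketch under stated assumptions, not kube-proxy code: the `Endpoint` class and `select_endpoints` function are hypothetical names, and `node_name` is assumed to mirror the EndpointSlice `nodeName` field.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical stand-in for a single EndpointSlice endpoint."""
    address: str
    node_name: str  # assumed to mirror the EndpointSlice nodeName field


def select_endpoints(endpoints: List[Endpoint],
                     internal_traffic_policy: str,
                     local_node: str) -> List[Endpoint]:
    """Return the endpoints eligible for internal traffic seen on local_node."""
    if internal_traffic_policy == "Local":
        # Only endpoints on this node qualify; an empty result means "drop".
        return [ep for ep in endpoints if ep.node_name == local_node]
    # "Cluster" (the default): existing behavior, every endpoint is eligible.
    return list(endpoints)
```

An empty list returned for `Local` corresponds to the "drop traffic if none exist" case in the proposal.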
+Overlap with topology-aware routing:
+
+| ExternalTrafficPolicy | InternalTrafficPolicy | Topology | External Result | Internal Result |
+| - | - | - | - | - |
+| - | - | Auto | Topology | Topology |
+| Local | - | Auto | Local | Topology |
+| Local | Local | Auto | Local | Local |
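One way to read the table above is that a `Local` policy wins for its own traffic class, and topology hints apply otherwise. The sketch below encodes that reading in Python. The function names are hypothetical, `None` stands for an unset column (`-`), and the final `Cluster` fallback is an assumption, since the table only lists rows where Topology is `Auto`.

```python
from typing import Optional, Tuple


def effective_routing(policy: Optional[str], topology: Optional[str]) -> str:
    """Resolve routing for one traffic class (external or internal)."""
    if policy == "Local":
        return "Local"      # a Local policy takes precedence over topology hints
    if topology == "Auto":
        return "Topology"   # otherwise topology-aware routing applies
    return "Cluster"        # assumption: this case is not covered by the table


def resolve(external_policy: Optional[str],
            internal_policy: Optional[str],
            topology: Optional[str]) -> Tuple[str, str]:
    """Return (external result, internal result) for a Service."""
    return (effective_routing(external_policy, topology),
            effective_routing(internal_policy, topology))
```

Each traffic class is resolved independently, which matches the statement that `internalTrafficPolicy` only applies to traffic from internal sources.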

 ### Test Plan

 Unit tests:
-* unit tests validating API strategy/validation for when `trafficPolicy` is set on Service.
-* unit tests exercising kube-proxy behavior when `trafficPolicy` is set to all possible values.
+* unit tests validating API strategy/validation for when `internalTrafficPolicy` is set on Service.
+* unit tests exercising kube-proxy behavior when `internalTrafficPolicy` is set to all possible values.

 E2E test:
-* e2e tests validating default behavior with kube-proxy did not change when `trafficPolicy` defaults to `Cluster`. Existing tests should cover this.
-* e2e tests validating that traffic is preferred to local endpoints when `trafficPolicy` is set to `PreferLocal`.
-* e2e tests validating that traffic is only sent to node-local endpoints when `trafficPolicy` is set to `Local`.
+* e2e tests validating default behavior with kube-proxy did not change when `internalTrafficPolicy` defaults to `Cluster`. Existing tests should cover this.
+* e2e tests validating that traffic is only sent to node-local endpoints when `internalTrafficPolicy` is set to `Local`.

 ### Graduation Criteria

 Alpha:
-* feature gate `ServiceTrafficPolicy` _must_ be enabled for apiserver to accept values for `spec.trafficPolicy`. Otherwise field is dropped.
-* kube-proxy handles traffic routing for 4 initial internal traffic policies `Cluster`, `Topology`, `PreferLocal` and `Local`.
+* feature gate `ServiceInternalTrafficPolicy` _must_ be enabled for apiserver to accept values for `spec.internalTrafficPolicy`. Otherwise field is dropped.
+* kube-proxy handles traffic routing for 2 initial internal traffic policies, `Cluster` and `Local`.
 * Unit tests as defined in "Test Plan" section above. E2E tests are nice to have but not required for Alpha.
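The "otherwise field is dropped" behavior in the Alpha criteria can be pictured with a small sketch. This is a simplified Python illustration, not the apiserver implementation: `drop_disabled_fields` is a hypothetical name, the Service spec is modeled as a plain dict, and any special handling of Services that already have the field set is deliberately left out.

```python
from typing import Any, Dict


def drop_disabled_fields(spec: Dict[str, Any], gate_enabled: bool) -> Dict[str, Any]:
    """Return a copy of spec with internalTrafficPolicy removed if the gate is off."""
    cleaned = dict(spec)  # shallow copy so the caller's dict is not mutated
    if not gate_enabled:
        # Gate disabled: the field is silently dropped rather than rejected.
        cleaned.pop("internalTrafficPolicy", None)
    return cleaned
```

With the gate enabled the spec passes through unchanged, so the default of `Cluster` is unaffected.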

+Beta:
+* integration tests exercising API behavior for `spec.internalTrafficPolicy` field of Service.
+* e2e tests exercising kube-proxy routing when `internalTrafficPolicy` is `Local`.
+* feature gate `ServiceInternalTrafficPolicy` is enabled by default.
+* consensus on how internalTrafficPolicy overlaps with topology-aware routing.

 ### Upgrade / Downgrade Strategy

@@ -187,18 +185,12 @@ _This section must be completed when targeting alpha to a release._

 * **How can this feature be enabled / disabled in a live cluster?**
 - [X] Feature gate (also fill in values in `kep.yaml`)
 - Components depending on the feature gate: kube-apiserver, kube-proxy
-- [ ] Other
-- Describe the mechanism:
-- Will enabling / disabling the feature require downtime of the control
-plane?
-- Will enabling / disabling the feature require downtime or reprovisioning
-of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).

 * **Does enabling the feature change any default behavior?**

-No, enabling the feature does not change any default behavior since the default value of `trafficPolicy` is `Cluster`.
+No, enabling the feature does not change any default behavior since the default value of `internalTrafficPolicy` is `Cluster`.

 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
 the enablement)?**
@@ -207,54 +199,57 @@ Yes, the feature gate can be disabled, but Service resource that have set the ne

 * **What happens if we reenable the feature if it was previously rolled back?**

-New Services should be able to set the `trafficPolicy` field. Existing Services that have the field set already should not be impacted.
+New Services should be able to set the `internalTrafficPolicy` field. Existing Services that have the field set will begin to apply the policy again.

 * **Are there any tests for feature enablement/disablement?**

-There will be unit tests to verify that apiserver will drop the field when the `ServiceTrafficPolicy` feature gate is disabled.
+There will be unit tests to verify that apiserver will drop the field when the `ServiceInternalTrafficPolicy` feature gate is disabled.

 ### Rollout, Upgrade and Rollback Planning

 _This section must be completed when targeting beta graduation to a release._

 * **How can a rollout fail? Can it impact already running workloads?**

-TBD for beta.
+Rollout should have minimal impact because the default value of `internalTrafficPolicy` is `Cluster`, which is the default behavior today.

 * **What specific metrics should inform a rollback?**

-TBD for beta.
+A metric representing Services being black-holed will be added. This metric can inform rollback.

 * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**

-TBD for beta.
+No, but this will be manually tested prior to beta. Automated testing will be done if the test tooling is available.

 * **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
 fields of API types, flags, etc.?**

-TBD for beta.
+No.

 ### Monitoring Requirements

 _This section must be completed when targeting beta graduation to a release._

 * **How can an operator determine if the feature is in use by workloads?**

-TBD for beta.
+* Check Service to see if `internalTrafficPolicy` is set to `Local`.
+* A per-node "blackhole" metric will be added to kube-proxy which represents Services whose traffic is being intentionally dropped (internalTrafficPolicy=Local and no endpoints).
+
+TODO: add metric name once it's decided
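To make the intended meaning of the per-node "blackhole" metric concrete, the sketch below counts Services that would be black-holed on one node. It is an illustrative Python sketch only: the metric name is still undecided, `count_blackholed_services` is a hypothetical helper, and the two input dicts (policy per Service and node-local endpoint count per Service) are assumed shapes, not kube-proxy data structures.

```python
from typing import Dict


def count_blackholed_services(policies: Dict[str, str],
                              local_endpoint_counts: Dict[str, int]) -> int:
    """Count Services on this node with internalTrafficPolicy=Local and no local endpoints."""
    return sum(
        1
        for name, policy in policies.items()
        # A Service missing from the counts dict is treated as having zero local endpoints.
        if policy == "Local" and local_endpoint_counts.get(name, 0) == 0
    )
```

Services using `Cluster` never contribute to the count, since their traffic can still reach endpoints on other nodes.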

 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 the health of the service?**

-TBD for beta.
+They can check the "blackhole" metric when internalTrafficPolicy=Local and there are no endpoints.

 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

-TBD for beta.
+This will depend on Service topology and whether `internalTrafficPolicy=Local` is being used.

 * **Are there any missing metrics that would be useful to have to improve observability
 of this feature?**

-TBD for beta.
+A new metric will be added to represent Services that are being "blackholed" (internalTrafficPolicy=Local and no endpoints).

 ### Dependencies

@@ -267,7 +262,7 @@ _This section must be completed when targeting beta graduation to a release._
 a cloud provider API, or upon an external software-defined storage or network
 control plane.

-TBD for beta.
+No.


 ### Scalability
@@ -325,22 +320,26 @@ _This section must be completed when targeting beta graduation to a release._

 * **How does this feature react if the API server and/or etcd is unavailable?**

-TBD for beta.
+Services will not be able to update their internal traffic policy.

 * **What are other known failure modes?**

-TBD for beta.
+A Service's `internalTrafficPolicy` is set to `Local` but there are no node-local endpoints.

 * **What steps should be taken if SLOs are not being met to determine the problem?**

-TBD for beta.
+* check Service for internal traffic policy
+* check EndpointSlice to ensure nodeName is set correctly