Commit 6576fa6

Merge pull request #3835 from alexanderConstantinescu/kep-3458-follow-up
KEP 3458: address left over comments / KEP requirements
2 parents b680006 + 9fdb91e commit 6576fa6

3 files changed (+56, -20 lines)
Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+kep-number: 3458
+beta:
+  approver: "@wojtek-t"
keps/sig-network/3458-remove-transient-node-predicates-from-service-controller/README.md

Lines changed: 51 additions & 18 deletions
@@ -95,6 +95,24 @@ concerns cluster ingress connectivity. Users of Kubernetes will also see a
 reduction in the amount of cloud API calls, for what concerns calls stemming
 from syncing load balancers with the Kubernetes cluster state.
 
+Addressing b) is not useful for ingress load balancing. A load balancer needs to
+know whether the networking data plane is healthy, and that is determined by the
+configured health check. Each cloud provider defines its own health check, and
+no two are the same. The following describes what the health check looks like on
+the three major public cloud providers:
+
+- GCP: probes port 10256 (Kube-proxy's healthz port)
+- AWS: if ELB, probes the first `NodePort` defined on the service spec
+- Azure: probes all `NodePort`s defined on the service spec
+
+All clouds try to ascertain whether traffic can be forwarded to the endpoint,
+which is a completely valid health check for load balancer services. Each of
+these approaches has drawbacks, but cloud providers themselves are deemed best
+suited to determine the right mechanism for their load balancers and their
+cloud's mode of operation. Their mechanism is beyond the scope of this KEP,
+i.e. this KEP does not attempt to "align them".
+
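As an illustration of the GCP-style check above, here is a minimal sketch (not part of the KEP; `nodeHealthy` and the node IP are hypothetical) of a probe against Kube-proxy's healthz endpoint on port 10256:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// nodeHealthy reports whether Kube-proxy on the given node answers its
// healthz probe, mirroring what a GCP load balancer health check does:
// an HTTP 200 from port 10256 means the node can forward traffic.
func nodeHealthy(nodeIP string) bool {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(fmt.Sprintf("http://%s:10256/healthz", nodeIP))
	if err != nil {
		return false // unreachable or timed out: fail the health check
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	fmt.Println(nodeHealthy("10.0.0.1")) // hypothetical node IP
}
```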
 ### Goals
 
 - Stop re-configuring the load balancers' node set for cases a) and b) above
@@ -170,6 +188,8 @@ from syncing load balancers with the Kubernetes cluster state.
 The service controller in the KCCM currently has a set of tests validating
 expected syncs caused by Node predicates; these will need to be updated.
 
+- `k8s.io/cloud-provider/controllers/service`: `08/Feb/2023` - `67.7%`
+
 #### Integration tests
 
 Kubernetes is mostly tested via unit tests and e2e, not integration, and this is
@@ -248,21 +268,28 @@ Behavior will be restored back immediately.
 
 ###### Are there any tests for feature enablement/disablement?
 
-N/A
+Not needed, since enablement/disablement is triggered by changing an in-memory
+boolean variable.
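To illustrate the in-memory toggle, here is a minimal sketch (the `includeNode` predicate and the gate map are hypothetical; only the gate name `StableLoadBalancerNodeSet` comes from this commit's kep.yaml) of how such a boolean changes the node-set decision:

```go
package main

import "fmt"

// featureGates stands in for the component's parsed --feature-gates flags.
var featureGates = map[string]bool{"StableLoadBalancerNodeSet": true}

// includeNode decides whether a node belongs in the load balancers' node set.
func includeNode(ready, taintedForDeletion bool) bool {
	if featureGates["StableLoadBalancerNodeSet"] {
		// Gate on: transient NotReady states and the CA's deletion taint
		// no longer drop the node; only deleting the Node object does.
		return true
	}
	// Gate off (legacy behavior): a NotReady node or one tainted with
	// ToBeDeletedByClusterAutoscaler is removed, triggering a re-sync.
	return ready && !taintedForDeletion
}

func main() {
	fmt.Println(includeNode(false, false)) // true while the gate is enabled
}
```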

 ### Rollout, Upgrade and Rollback Planning
 
 ###### How can a rollout or rollback fail? Can it impact already running workloads?
 
-Rollout and rollback are not expected to fail.
+If a cluster has a lot of Nodes which are currently `NotReady` (in the order of
+hundreds) and a rollout is triggered, it is expected that all of these nodes
+will be added at once to every load balancer. That might have cloud API
+rate-limiting impacts on the service controller.
 
 ###### What specific metrics should inform a rollback?
 
-Performance degradation by the CA when downscaling / flat out inability to delete VMs.
+Performance degradation by the CA when downscaling, or a flat-out inability to
+delete VMs. This should be informed by the metric `nodesync_error_rate`
+mentioned in [Monitoring requirements](#monitoring-requirements).
 
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 
-N/A
+Once the change is implemented, the author will work with Kubernetes vendors to
+test the upgrade/downgrade scenario in a cloud environment.
 
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
 
@@ -294,29 +321,33 @@ post the implementation and rollout of the change.
 
 ###### How can someone using this feature know that it is working for their instance?
 
-Yes, when a node transitions from `Ready` <-> `NotReady` the load balancers are
-not re-synced and the load balancers' node set will remain unchanged for
-`externalTrafficPolicy: Cluster` services. On downscaling by the CA, the node
-will remain in the load balancers' set for a longer period of time, until
-the Node object in Kubernetes is fully deleted.
+- By observing no change in the metric `load_balancer_sync_count` when a Node
+  transitions between `Ready` <-> `NotReady` or when a Node is tainted with
+  `ToBeDeletedByClusterAutoscaler`. This is because this KEP proposes that we
+  stop syncing load balancers as a consequence of these events.
+
+- By observing no change w.r.t. active ingress connections for an
+  `externalTrafficPolicy: Cluster` service that pass through a Node which is
+  transitioning between `Ready` <-> `NotReady`, i.e. no impact on new or
+  established connections, given that Kube-proxy is healthy when the Node
+  transitions state like this and isn't impacted.
 
 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
 
+No
+
+###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+
 Total amount of load balancer re-syncs should be reduced, leading to fewer cloud
 provider API calls. Also, more subtly: connections will get a chance to
 gracefully terminate when the CA downscales cluster nodes. For services with
 `externalTrafficPolicy: Cluster`, connections "traversing" a "nexthop" node
 might not be impacted by that Node's readiness state anymore.
 
-###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
-
-<!--
-Pick one more of these and delete the rest.
--->
-
 - [X] Metrics
   - Events: The KCCM triggers events when syncing load balancers. The amount of
     these events should be reduced.
+  - Metrics: `load_balancer_sync_count`
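As a sketch of how an operator-facing counter like `load_balancer_sync_count` is typically exposed (assuming the standard Prometheus `client_golang` library; the metric name comes from the KEP text, everything else here is illustrative), an operator would expect this counter to stay flat across `Ready` <-> `NotReady` transitions once the feature is enabled:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// loadBalancerSyncCount mirrors the KEP's load_balancer_sync_count metric:
// it should only move when a load balancer is actually (re-)synced.
var loadBalancerSyncCount = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "load_balancer_sync_count",
	Help: "Number of load balancer syncs performed by the service controller.",
})

func main() {
	prometheus.MustRegister(loadBalancerSyncCount)
	loadBalancerSyncCount.Inc() // incremented on each load balancer sync
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```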
 
 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?
 
@@ -326,9 +357,7 @@ N/A
 
 ###### Does this feature depend on any specific services running in the cluster?
 
-It depends on:
-
-- having a `type: LoadBalancer` service with `externalTrafficPolicy: Cluster`
+No
 
 ### Scalability
 
@@ -356,6 +385,10 @@ No
 
 No
 
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No
+
 ### Troubleshooting
 
 ###### How does this feature react if the API server and/or etcd is unavailable?

keps/sig-network/3458-remove-transient-node-predicates-from-service-controller/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -12,13 +12,13 @@ approvers:
 creation-date: "2022-08-07"
 last-updated: v1.27
 status: implementable
-stage: stable
+stage: beta
 latest-milestone: "v1.27"
 milestone:
   beta: "v1.27"
   stable: "v1.29"
 feature-gates:
   - name: StableLoadBalancerNodeSet
     components:
-      - kube-controller-manager
+      - cloud-controller-manager
 disable-supported: true
