
Commit 3181159

KEP-3836 v1.30 documentation
1 parent 0110b43 commit 3181159

File tree

2 files changed: +81 -0 lines changed

content/en/docs/concepts/services-networking/service.md

Lines changed: 77 additions & 0 deletions
@@ -619,6 +619,83 @@ You can integrate with [Gateway](https://gateway-api.sigs.k8s.io/) rather than S
can define your own (provider specific) annotations on the Service that specify the equivalent detail.
{{< /note >}}

#### Node liveness impact on load balancer traffic

Load balancer health checks are critical to modern applications. They are used
to determine which server (virtual machine, or IP address) the load balancer
should dispatch traffic to. The Kubernetes APIs do not define how health checks
have to be implemented for Kubernetes-managed load balancers; instead, the
cloud providers (and the people implementing the integration code) decide on
the behavior. Load balancer health checks are extensively used within the
context of supporting the `externalTrafficPolicy` field for Services. If
`Cluster` is specified, all nodes are eligible load balancing targets _as long
as_ the node is not being deleted and kube-proxy is healthy. In this mode,
load balancer health checks are configured to target the service proxy's
readiness port and path. In the case of kube-proxy this evaluates to
`${NODE_IP}:10256/healthz`. kube-proxy will return either HTTP code 200 or
503. kube-proxy's load balancer health check endpoint returns 200 if:
1. kube-proxy is healthy, meaning:
   - it's able to progress programming the network and isn't timing out while
     doing so (the timeout is defined as **2 × `iptables.syncPeriod`**); and
2. the node is not being deleted (there is no deletion timestamp set for the
   Node).
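
As a concrete illustration, you can probe this endpoint yourself from a machine
with network access to the node. This is a minimal sketch: `${NODE_IP}` is a
placeholder for the node's IP address, and 10256 assumes kube-proxy's default
healthz port.

```shell
# Probe kube-proxy's load balancer health check endpoint on a node.
# 10256 is kube-proxy's default healthz port; adjust this if your cluster
# overrides --healthz-bind-address.
curl -i "http://${NODE_IP}:10256/healthz"

# A healthy kube-proxy on a node that is not being deleted answers 200;
# an unhealthy kube-proxy, or a node with a deletion timestamp, answers 503.
```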
The reason kube-proxy returns 503 and marks the node as not eligible while it
is being deleted is that kube-proxy supports connection draining for
terminating nodes. A couple of important things occur from the point of view
of a Kubernetes-managed load balancer when a node _is being_ / _is_ deleted.
While deleting:

* kube-proxy will start failing its readiness probe, essentially marking the
  node as not eligible for load balancer traffic. The failing health check
  causes load balancers that support connection draining to allow existing
  connections to terminate and to block new connections from being
  established.
When deleted:

* The service controller in the Kubernetes cloud controller manager removes
  the node from the referenced set of eligible targets. Removing an instance
  from the load balancer's set of backend targets immediately terminates all
  of its connections. This is also the reason kube-proxy first fails the
  health check while the node is being deleted.
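
To observe this sequence in practice, you can correlate the Node's deletion
timestamp with the health check's HTTP status. This is a sketch; `${NODE_NAME}`
and `${NODE_IP}` are placeholders:

```shell
# A non-empty deletionTimestamp means the node is currently draining ...
kubectl get node "${NODE_NAME}" -o jsonpath='{.metadata.deletionTimestamp}'

# ... and while it drains, the health check already reports 503 even though
# the Node object still exists.
curl -s -o /dev/null -w '%{http_code}\n' "http://${NODE_IP}:10256/healthz"
```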
It is important for Kubernetes vendors to note that if a vendor configures the
kube-proxy readiness probe as a liveness probe, kube-proxy will restart
continuously while a node is being deleted, until the deletion completes.
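
This is why kube-proxy also exposes a separate `/livez` path on the same port,
which reflects only kube-proxy's own health and ignores node deletion (it is
the endpoint behind the `proxy_livez_total` metric mentioned below). A sketch,
again assuming the default port:

```shell
# /livez reports only kube-proxy's own health; it does not start failing when
# the node is being deleted, making it the safer liveness probe target.
curl -i "http://${NODE_IP}:10256/livez"
```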
Users deploying kube-proxy can inspect both the readiness and liveness state
by evaluating the metrics `proxy_healthz_total` and `proxy_livez_total`. Each
metric publishes two series, one labeled with the 200 code and one labeled
with the 503 code.
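
For example, the counters can be read from kube-proxy's metrics endpoint (a
sketch; 10249 is kube-proxy's default metrics port, which by default binds to
localhost on the node):

```shell
# Scrape kube-proxy's metrics endpoint and filter the health check counters.
# Each counter appears twice, once per HTTP status code label.
curl -s "http://localhost:10249/metrics" | grep -E 'proxy_(healthz|livez)_total'
```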
For Services with `externalTrafficPolicy: Local`, kube-proxy will return 200
if:

1. kube-proxy is healthy/ready, and
2. it has a local endpoint on the node in question.
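
To see which nodes satisfy the second condition for a given Service, you can
list the node names behind its EndpointSlices (a sketch; `${SERVICE_NAME}` is
a placeholder). Note that for `Local` Services the load balancer health check
targets the Service's `spec.healthCheckNodePort` rather than port 10256.

```shell
# List the nodes hosting endpoints of the Service; with
# externalTrafficPolicy: Local, only these nodes answer 200.
kubectl get endpointslices \
  -l "kubernetes.io/service-name=${SERVICE_NAME}" \
  -o jsonpath='{range .items[*].endpoints[*]}{.nodeName}{"\n"}{end}'
```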
Node deletion does **not** have an impact on kube-proxy's return code as far
as load balancer health checks are concerned. The reason for this is that
draining deleting nodes could cause an ingress outage if all of a Service's
endpoints happened to be running on those nodes.
It's important to note that the configuration of load balancer health checks
is specific to each cloud provider, meaning that different cloud providers
configure the health check in different ways. The three main cloud providers
do so as follows:
* AWS: if using an ELB, probes the first NodePort defined on the Service spec.
* Azure: probes all NodePorts defined on the Service spec.
* GCP: probes port 10256 (kube-proxy's healthz port).
There are drawbacks and benefits to each method, so none can be considered
fully right. It is important to mention, however, that connection draining
via kube-proxy can only occur for cloud providers that configure the health
checks to target kube-proxy. Also note that configuring health checks to
target the application might cause ingress downtime should the application
experience issues that are unrelated to networking. The recommendation is
therefore that cloud providers configure load balancer health checks to
target the service proxy's healthz port.
#### Load balancers with mixed protocol types

{{< feature-state feature_gate_name="MixedProtocolLBService" >}}

content/en/docs/reference/command-line-tools-reference/feature-gates/kube-proxy-draining-terminating-nodes.md

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ stages:
  - stage: alpha
    defaultValue: false
    fromVersion: "1.28"
    toVersion: "1.30"
  - stage: beta
    defaultValue: true
    fromVersion: "1.30"
---
Implement connection draining for
terminating nodes for `externalTrafficPolicy: Cluster` services.
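
Per the stages above, the gate is enabled by default from v1.30 on. On v1.28
and v1.29 it can be enabled explicitly; a sketch using kube-proxy's standard
feature-gates flag (the rest of the invocation is omitted):

```shell
# Opt in to connection draining for terminating nodes while the gate is
# still alpha (v1.28-v1.29); from v1.30 onwards it is enabled by default.
kube-proxy --feature-gates=KubeProxyDrainingTerminatingNodes=true
```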
