@@ -488,6 +488,67 @@ route to ready node-local endpoints. If the traffic policy is `Local` and there
are no node-local endpoints, the kube-proxy does not forward any traffic for the
relevant Service.
+ If `Cluster` is specified, all nodes are eligible load balancing targets _as long as_
+ the node is not being deleted and kube-proxy is healthy. In this mode, load balancer
+ health checks are configured to target the service proxy's readiness port and path.
+ In the case of kube-proxy, this evaluates to `${NODE_IP}:10256/healthz`. kube-proxy
+ will return either HTTP status code 200 or 503. kube-proxy's load balancer health
+ check endpoint returns 200 if:
+
+ 1. kube-proxy is healthy, meaning:
+    - it's able to progress programming the network and isn't timing out while doing
+      so (the timeout is defined to be: **2 × `iptables.syncPeriod`**); and
+ 2. the node is not being deleted (there is no deletion timestamp set for the Node).
+
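+ As a rough illustration, the following Go sketch probes this endpoint the way a
+ load balancer health check would and interprets the status codes as described
+ above. It is not part of kube-proxy; the node address is a made-up example.
+
+ ```go
+ // Illustrative client for kube-proxy's load balancer health check endpoint.
+ package main
+
+ import (
+     "fmt"
+     "net/http"
+     "time"
+ )
+
+ func main() {
+     nodeIP := "192.0.2.10" // hypothetical ${NODE_IP}; replace with a real node address
+     client := &http.Client{Timeout: 2 * time.Second}
+
+     resp, err := client.Get("http://" + nodeIP + ":10256/healthz")
+     if err != nil {
+         fmt.Println("probe failed, treat the node as not eligible:", err)
+         return
+     }
+     defer resp.Body.Close()
+
+     switch resp.StatusCode {
+     case http.StatusOK: // 200: kube-proxy is healthy and the node is not being deleted
+         fmt.Println("node is eligible for load balancer traffic")
+     case http.StatusServiceUnavailable: // 503: unhealthy, or the node is being deleted
+         fmt.Println("node is not eligible (unhealthy or draining)")
+     default:
+         fmt.Printf("unexpected status: %d\n", resp.StatusCode)
+     }
+ }
+ ```
+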
+ kube-proxy returns 503 and marks the node as not eligible while it is being
+ deleted because kube-proxy supports connection draining for terminating nodes.
+ A couple of important things occur from the point of view of a Kubernetes-managed
+ load balancer when a node _is being_ / _is_ deleted.
+
+ While deleting:
+
+ * kube-proxy will start failing its readiness probe, essentially marking the
+   node as not eligible for load balancer traffic. The failing health check
+   causes load balancers that support connection draining to allow existing
+   connections to terminate and to block new connections from being
+   established.
+
+ When deleted:
+
+ * The service controller in the Kubernetes cloud controller manager removes the
+   node from the referenced set of eligible targets. Removing any instance from
+   the load balancer's set of backend targets immediately terminates all
+   connections. This is also the reason kube-proxy first fails the health check
+   while the node is being deleted.
+
+ It's important for Kubernetes vendors to note that if they configure the
+ kube-proxy readiness probe as a liveness probe, kube-proxy will restart
+ continuously while a node is being deleted, until the node has been fully
+ deleted. kube-proxy exposes a `/livez` path which, as opposed to `/healthz`,
+ does **not** consider the Node's deleting state, only kube-proxy's progress
+ programming the network. `/livez` is therefore the recommended path for anyone
+ looking to define a livenessProbe for kube-proxy.
+
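+ To make that difference concrete, here is a small, hypothetical Go sketch that
+ queries both paths (assuming `/livez` is served on the same health check port,
+ 10256): if `/healthz` reports 503 while `/livez` still reports 200, kube-proxy
+ itself is fine and the node is most likely being drained for deletion.
+
+ ```go
+ // Compare /healthz and /livez to tell "draining for node deletion"
+ // apart from "kube-proxy unhealthy". Illustrative sketch only.
+ package main
+
+ import (
+     "fmt"
+     "net/http"
+     "time"
+ )
+
+ func status(url string) int {
+     client := &http.Client{Timeout: 2 * time.Second}
+     resp, err := client.Get(url)
+     if err != nil {
+         return 0 // unreachable
+     }
+     resp.Body.Close()
+     return resp.StatusCode
+ }
+
+ func main() {
+     base := "http://192.0.2.10:10256" // hypothetical node address
+     healthz, livez := status(base+"/healthz"), status(base+"/livez")
+     fmt.Println("healthz:", healthz, "livez:", livez)
+     if healthz == 503 && livez == 200 {
+         fmt.Println("kube-proxy looks fine; the node is probably being drained for deletion")
+     }
+ }
+ ```
+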
+ Users deploying kube-proxy can inspect both the readiness and liveness state by
+ evaluating the `proxy_healthz_total` and `proxy_livez_total` metrics. Both
+ metrics publish two series, one with the 200 label and one with the 503 label.
+
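+ For example, one way to look at these counters is to read kube-proxy's metrics
+ endpoint directly. The Go sketch below assumes kube-proxy's default
+ `metricsBindAddress` (`127.0.0.1:10249`) and simply prints the matching series.
+
+ ```go
+ // Print kube-proxy's proxy_healthz_total and proxy_livez_total series from
+ // its Prometheus metrics endpoint. Illustrative sketch only.
+ package main
+
+ import (
+     "bufio"
+     "fmt"
+     "net/http"
+     "strings"
+ )
+
+ func main() {
+     // Assumes the default metricsBindAddress (127.0.0.1:10249), so this is
+     // expected to run on the node itself.
+     resp, err := http.Get("http://127.0.0.1:10249/metrics")
+     if err != nil {
+         fmt.Println("could not read metrics:", err)
+         return
+     }
+     defer resp.Body.Close()
+
+     scanner := bufio.NewScanner(resp.Body)
+     for scanner.Scan() {
+         line := scanner.Text()
+         if strings.HasPrefix(line, "proxy_healthz_total") ||
+             strings.HasPrefix(line, "proxy_livez_total") {
+             fmt.Println(line)
+         }
+     }
+ }
+ ```
+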
+ For `Local` Services, kube-proxy will return 200 if:
+
+ 1. kube-proxy is healthy/ready, and
+ 2. the Service has a local endpoint on the node in question.
+
+ Unlike in the `Cluster` case, node deletion does **not** have an impact on
+ kube-proxy's return code for load balancer health checks. The reason for this
+ is that deleting nodes could otherwise cause an ingress outage if all of the
+ Service's endpoints were simultaneously running on those nodes.
+
+ The Kubernetes project recommends that cloud provider integration code
+ configure load balancer health checks that target the service proxy's healthz
+ port. If you are using or implementing your own virtual IP implementation
+ that people can use instead of kube-proxy, you should set up a similar health
+ checking port with logic that matches the kube-proxy implementation.
+
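+ As a starting point, such a health check server could look roughly like the Go
+ sketch below: `/healthz` returns 200 only while the proxy is making progress and
+ the node is not being deleted, while `/livez` ignores node deletion. The state
+ fields are placeholders for whatever your implementation actually tracks (for a
+ `Local` Service you would additionally require a local endpoint on the node).
+
+ ```go
+ // Minimal sketch of a kube-proxy-style health check server for a custom
+ // service proxy. The state fields are placeholders; a real implementation
+ // would derive them from its sync loop and from the Node object.
+ package main
+
+ import (
+     "log"
+     "net/http"
+     "sync/atomic"
+ )
+
+ type healthState struct {
+     proxyHealthy atomic.Bool // true while network programming makes progress in time
+     nodeDeleting atomic.Bool // true once the Node has a deletion timestamp
+ }
+
+ func (s *healthState) healthz(w http.ResponseWriter, _ *http.Request) {
+     // Load balancer health check: fail while the node is being deleted so
+     // that load balancers can drain connections before the node goes away.
+     if s.proxyHealthy.Load() && !s.nodeDeleting.Load() {
+         w.WriteHeader(http.StatusOK)
+         return
+     }
+     w.WriteHeader(http.StatusServiceUnavailable)
+ }
+
+ func (s *healthState) livez(w http.ResponseWriter, _ *http.Request) {
+     // Liveness: only reflects whether the proxy itself is making progress,
+     // regardless of node deletion.
+     if s.proxyHealthy.Load() {
+         w.WriteHeader(http.StatusOK)
+         return
+     }
+     w.WriteHeader(http.StatusServiceUnavailable)
+ }
+
+ func main() {
+     state := &healthState{}
+     state.proxyHealthy.Store(true) // a real proxy would update this from its sync loop
+
+     mux := http.NewServeMux()
+     mux.HandleFunc("/healthz", state.healthz)
+     mux.HandleFunc("/livez", state.livez)
+
+     // 10256 mirrors kube-proxy's default health check port.
+     log.Fatal(http.ListenAndServe(":10256", mux))
+ }
+ ```
+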
### Traffic to terminating endpoints

{{< feature-state for_k8s_version="v1.28" state="stable" >}}