modules/nodes-edge-remote-workers-strategies.adoc: 11 additions & 8 deletions
@@ -112,15 +112,13 @@ spec:
<2> Specify the frequency that the kubelet checks the status of a node associated with this `MachineConfig` object. The default value is `10s`. If you change this default, the `node-status-report-frequency` value is changed to the same value.
<3> Specify the frequency that the kubelet reports the status of a node associated with this `MachineConfig` object. The default value is `1m`.
-The `node-status-update-frequency` parameter works with the `node-monitor-grace-period` and `pod-eviction-timeout` parameters.
+The `node-status-update-frequency` parameter works with the `node-monitor-grace-period` parameter.
* The `node-monitor-grace-period` parameter specifies how long {product-title} waits before marking a node associated with a `MachineConfig` object as `Unhealthy` if the controller manager does not receive the node heartbeat. Workloads on the node continue to run after this time. If the remote worker node rejoins the cluster after `node-monitor-grace-period` expires, pods continue to run. New pods can be scheduled to that node. The `node-monitor-grace-period` interval is `40s`. The `node-status-update-frequency` value must be lower than the `node-monitor-grace-period` value.
-* The `pod-eviction-timeout` parameter specifies the amount of time {product-title} waits after marking a node that is associated with a `MachineConfig` object as `Unreachable` to start marking pods for eviction. Evicted pods are rescheduled on other nodes. If the remote worker node rejoins the cluster after `pod-eviction-timeout` expires, the pods running on the remote worker node are terminated because the node controller has evicted the pods on-premise. Pods can then be rescheduled to that node. The `pod-eviction-timeout` interval is `5m0s`.
-
[NOTE]
====
-Modifying the `node-monitor-grace-period` and `pod-eviction-timeout` parameters is not supported.
+Modifying the `node-monitor-grace-period` parameter is not supported.
====
--
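For orientation, the following is a minimal sketch of the kind of `KubeletConfig` object that the callouts above describe. The object name and the machine config pool label are placeholders, and the sketch assumes the hyphenated parameter names from the callouts are set directly under `spec.kubeletConfig`; the full example at the top of this module (outside this hunk) is authoritative for the exact form.

[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: example-remote-worker-kubelet    # placeholder name
spec:
  machineConfigPoolSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker    # placeholder pool label
  kubeletConfig:
    node-status-update-frequency: "10s"    # default 10s; how often the kubelet checks node status
    node-status-report-frequency: "1m"     # default 1m; how often the kubelet reports node status
----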
@@ -133,7 +131,12 @@ A taint with the `NoExecute` effect affects pods that are running on the node in
* Pods that do not tolerate the taint are queued for eviction.
* Pods that tolerate the taint without specifying a `tolerationSeconds` value in their toleration specification remain bound forever.
-* Pods that tolerate the taint with a specified `tolerationSeconds` value remain bound for the specified amount of time. After the time elapses, the pods are queued for eviction.
+* Pods that tolerate the taint with a specified `tolerationSeconds` value remain bound for the specified amount of time. After the time elapses, the pods are queued for eviction.
+
+[NOTE]
+====
+Unless tolerations are explicitly set, Kubernetes automatically adds a toleration for `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` with `tolerationSeconds=300`, meaning that pods remain bound for 5 minutes if either of these taints is detected.
+====
You can delay or avoid pod eviction by configuring pod tolerations with the `NoExecute` effect for the `node.kubernetes.io/unreachable` and `node.kubernetes.io/not-ready` taints.
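For reference, the tolerations that Kubernetes injects by default (as described in the note above) look roughly like the following when you inspect a running pod; this is a sketch, and field order can vary:

[source,yaml]
----
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300    # pods stay bound for 5 minutes after the node becomes not ready
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300    # pods stay bound for 5 minutes after the node becomes unreachable
----

The snippet that follows shows how to override these defaults with an explicit `tolerationSeconds` value.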
@@ -148,14 +151,14 @@ tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute" <2>
-  tolerationSeconds: 600
+  tolerationSeconds: 600 <3>
...
----
<1> The `NoExecute` effect without `tolerationSeconds` lets pods remain forever if the control plane cannot reach the node.
<2> The `NoExecute` effect with `tolerationSeconds`: 600 lets pods remain for 10 minutes if the control plane marks the node as `Unhealthy`.
+<3> You can specify your own `tolerationSeconds` value.
-{product-title} uses the `tolerationSeconds` value after the `pod-eviction-timeout` value elapses.
You can use replica sets, deployments, and replication controllers. The scheduler can reschedule these pods onto other nodes after the node is disconnected for five minutes. Rescheduling onto other nodes can be beneficial for some workloads, such as REST APIs, where an administrator can guarantee a specific number of pods are running and accessible.
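To make this concrete, the following is a minimal sketch of a Deployment whose pod template carries the `NoExecute` tolerations from the snippet above; the name, labels, image, and replica count are placeholders.

[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rest-api    # placeholder name
spec:
  replicas: 3       # the number of pods the scheduler keeps running on reachable nodes
  selector:
    matchLabels:
      app: rest-api
  template:
    metadata:
      labels:
        app: rest-api
    spec:
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 600    # evict and reschedule 10 minutes after the node becomes unreachable
      containers:
      - name: rest-api
        image: registry.example.com/rest-api:1.0    # placeholder image
----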
operators/operator_sdk/osdk-leader-election.adoc: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ During the lifecycle of an Operator, it is possible that there may be more than
There are two different leader election implementations to choose from, each with its own trade-off:
-Leader-for-life:: The leader pod only gives up leadership, using garbage collection, when it is deleted. This implementation precludes the possibility of two instances mistakenly running as leaders, a state also known as split brain. However, this method can be subject to a delay in electing a new leader. For example, when the leader pod is on an unresponsive or partitioned node, the link:https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/#options[`pod-eviction-timeout`] dictates how long it takes for the leader pod to be deleted from the node and step down, with a default of `5m`. See the link:https://godoc.org/github.com/operator-framework/operator-sdk/pkg/leader[Leader-for-life] Go documentation for more.
+Leader-for-life:: The leader pod only gives up leadership, using garbage collection, when it is deleted. This implementation precludes the possibility of two instances mistakenly running as leaders, a state also known as split brain. However, this method can be subject to a delay in electing a new leader. For example, when the leader pod is on an unresponsive or partitioned node, you can specify `node.kubernetes.io/unreachable` and `node.kubernetes.io/not-ready` tolerations on the leader pod and use the `tolerationSeconds` value to dictate how long it takes for the leader pod to be deleted from the node and step down. These tolerations are added to the pod by default on admission with a `tolerationSeconds` value of 5 minutes. See the link:https://godoc.org/github.com/operator-framework/operator-sdk/pkg/leader[Leader-for-life] Go documentation for more.
Leader-with-lease:: The leader pod periodically renews the leader lease and gives up leadership when it cannot renew the lease. This implementation allows for a faster transition to a new leader when the existing leader is isolated, but there is a possibility of split brain in link:https://github.com/kubernetes/client-go/blob/30b06a83d67458700a5378239df6b96948cb9160/tools/leaderelection/leaderelection.go#L21-L24[certain situations]. See the link:https://godoc.org/github.com/kubernetes-sigs/controller-runtime/pkg/leaderelection[Leader-with-lease] Go documentation for more.
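For the Leader-for-life case described above, the step-down delay is controlled by the tolerations on the leader pod itself. The following is a minimal sketch of the relevant fragment of an Operator deployment's pod template; the 60-second value is only an illustration of trading faster leader failover against tolerance for brief node outages.

[source,yaml]
----
# Fragment of the Operator (leader) pod template spec; the values are illustrative.
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 60    # delete the leader pod about 1 minute after the node becomes unreachable
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 60
----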