modules/eco-node-health-check-operator-about.adoc
10 additions & 10 deletions
@@ -6,11 +6,11 @@
[id="about-node-health-check-operator_{context}"]
= About the Node Health Check Operator

-The Node Health Check Operator deploys the `NodeHealthCheck` controller to detect the health of a node in the cluster. The `NodeHealthCheck` controller creates the `NodeHealthCheck` custom resource (CR), which defines a set of criteria and thresholds to determine the node's health.
+The Node Health Check Operator deploys the `NodeHealthCheck` controller to detect the health of a node in the cluster. The `NodeHealthCheck` controller creates the `NodeHealthCheck` custom resource (CR), which defines a set of criteria and thresholds to determine the node's health.

-The Node Health Check Operator also installs the Poison Pill Operator as a default remediation provider.
+The Node Health Check Operator also installs the Self Node Remediation Operator as a default remediation provider.

-When the Node Health Check Operator detects an unhealthy node, it creates a remediation CR that triggers the remediation provider. For example, the controller creates the `PoisonPillRemediation` CR, which triggers the Poison Pill Operator to remediate the unhealthy node.
+When the Node Health Check Operator detects an unhealthy node, it creates a remediation CR that triggers the remediation provider. For example, the controller creates the `SelfNodeRemediation` CR, which triggers the Self Node Remediation Operator to remediate the unhealthy node.
The `NodeHealthCheck` CR resembles the following YAML file:
During the upgrade process, nodes in the cluster might become temporarily unavailable and get identified as unhealthy. In the case of worker nodes, when the Operator detects that the cluster is upgrading, it stops remediating new unhealthy nodes to prevent such nodes from rebooting.
====
-<3> Specifies a remediation template from the remediation provider. For example, from the Poison Pill Operator.
+<3> Specifies a remediation template from the remediation provider. For example, from the Self Node Remediation Operator.
<4> Specifies a `selector` that matches labels or expressions that you want to check. The default value is empty, which selects all nodes.
-<5> Specifies a list of the conditions that determine whether a node is considered unhealthy.
+<5> Specifies a list of the conditions that determine whether a node is considered unhealthy.
<6> Specifies the timeout duration for a node condition. If a condition is met for the duration of the timeout, the node will be remediated. Long timeouts can result in long periods of downtime for a workload on an unhealthy node.
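
The YAML that callouts <3> through <6> annotate is collapsed in this diff view. For orientation only, a minimal sketch of a `NodeHealthCheck` CR with those four fields might look like the following. The API group, resource name, `minHealthy` value, template name, namespace, and durations are assumptions for illustration and are not taken from this change; check them against the installed CRD before use.

[source,yaml]
----
# Hypothetical example; names and values are illustrative assumptions.
apiVersion: remediation.medik8s.io/v1alpha1
kind: NodeHealthCheck
metadata:
  name: nodehealthcheck-sample            # placeholder name
spec:
  minHealthy: 51%                         # assumed value
  remediationTemplate:                    # callout <3>: template from the remediation provider
    apiVersion: self-node-remediation.medik8s.io/v1alpha1
    kind: SelfNodeRemediationTemplate
    name: self-node-remediation-resource-deletion-template   # assumed template name
    namespace: openshift-operators        # assumed namespace
  selector:                               # callout <4>: empty selector matches all nodes
    matchExpressions:
      - key: node-role.kubernetes.io/worker
        operator: Exists
  unhealthyConditions:                    # callout <5>: conditions that mark a node unhealthy
    - type: Ready
      status: "False"
      duration: 300s                      # callout <6>: timeout before remediation starts
    - type: Ready
      status: Unknown
      duration: 300s
----

In this sketch, a worker node whose `Ready` condition stays `False` or `Unknown` for 300 seconds would be handed to the Self Node Remediation Operator. A shorter `duration` remediates sooner but raises the chance of rebooting a node that would have recovered on its own.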
-Use the Node Health Check Operator to deploy the `NodeHealthCheck` controller. The controller identifies unhealthy nodes and uses the Poison Pill Operator to remediate the unhealthy nodes.
+Use the Node Health Check Operator to deploy the `NodeHealthCheck` controller. The controller identifies unhealthy nodes and uses the Self Node Remediation Operator to remediate the unhealthy nodes.