Commit e83203f: Address review comments
1 parent dc9ec4d commit e83203f

1 file changed, 11 insertions(+), 11 deletions(-)

content/en/blog/_posts/2022-05-20-non-graceful-node-shutdown.md

Lines changed: 11 additions & 11 deletions
@@ -7,13 +7,13 @@ slug: kubernetes-1-24-non-graceful-node-shutdown-alpha
 
 **Authors** Xing Yang and Yassine Tijani (VMware)
 
-Kubernetes v1.24 introduces alpha support for [Non-Graceful Node Shutdown](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2268-non-graceful-shutdown). This feature allows stateful workloads to failover to a different node after the original node is shutdown or in a non-recoverable state such as the hardware failure or broken OS.
+Kubernetes v1.24 introduces alpha support for [Non-Graceful Node Shutdown](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2268-non-graceful-shutdown). This feature allows stateful workloads to failover to a different node after the original node is shutdown or in a non-recoverable state such as hardware failure or broken OS.
 
 ## How is this different from Graceful Node Shutdown
 
 You might have heard about the [Graceful Node Shutdown](/docs/concepts/architecture/nodes/#graceful-node-shutdown) capability of Kubernetes,
 and are wondering how the Non-Graceful Node Shutdown feature is different from that. Graceful Node Shutdown
-allows Kubernetes to detect when a node is shutting down cleanly, and handle that situation appropriately.
+allows Kubernetes to detect when a node is shutting down cleanly, and handles that situation appropriately.
 A Node Shutdown can be "graceful" only if the node shutdown action can be detected by the kubelet ahead
 of the actual shutdown. However, there are cases where a node shutdown action may not be detected by
 the kubelet. This could happen either because the shutdown command does not trigger the systemd inhibitor
@@ -23,15 +23,15 @@ locks mechanism that kubelet relies upon, or because of a configuration error
 Graceful node shutdown relies on Linux-specific support. The kubelet does not watch for upcoming
 shutdowns on Windows nodes (this may change in a future Kubernetes release).
 
-When a node is shutdown but without the kubelet detecting it, Pods on that node
-also shut down ungracefully. For stateless apps, that's often not a problem (a ReplicaSet adds a new Pod once
-the cluster detects that the affected node or Pod has failed). For stateful apps, the story is more complicated.
-If you use a StatefulSet and have a Pod from that StatefulSet on a node that fails uncleanly, that affected Pod
-will be marked as terminating; the StatefulSet cannot create a replacement Pod because the existing Pod
+When a node is shutdown but without the kubelet detecting it, pods on that node
+also shut down ungracefully. For stateless apps, that's often not a problem (a ReplicaSet adds a new pod once
+the cluster detects that the affected node or pod has failed). For stateful apps, the story is more complicated.
+If you use a StatefulSet and have a pod from that StatefulSet on a node that fails uncleanly, that affected pod
+will be marked as terminating; the StatefulSet cannot create a replacement pod because the pod
 still exists in the cluster.
 As a result, the application running on the StatefulSet may be degraded or even offline. If the original, shut
-down node comes up again, the kubelet on that original node reports in, deletes the existing Pods, and
-the control plane makes a replacement Pod for that StatefulSet on a different running node.
+down node comes up again, the kubelet on that original node reports in, deletes the existing pods, and
+the control plane makes a replacement pod for that StatefulSet on a different running node.
 If the original node has failed and does not come up, those stateful pods would be stuck in a
 terminating status on that failed node indefinitely.
 
@@ -54,7 +54,7 @@ taint following a shutdown that the kubelet did not detect and handle in advance
 can use that taint is when the node is in a non-recoverable state due to a hardware failure or a broken OS.
 The values you set for that taint can be `node.kubernetes.io/out-of-service=nodeshutdown: "NoExecute"`
 or `node.kubernetes.io/out-of-service=nodeshutdown: "NoSchedule"`.
-Provided you have enabled the feature gate as I mentioned earlier, setting the out-of-service taint on a Node
+Provided you have enabled the feature gate mentioned earlier, setting the out-of-service taint on a Node
 means that pods on the node will be deleted unless there are matching tolerations on the pods.
 Persistent volumes attached to the shutdown node will be detached, and for StatefulSets, replacement pods will
 be created successfully on a different running node.
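For context, the out-of-service taint that this hunk discusses is applied with `kubectl taint`. A minimal sketch of the apply step, assuming a cluster with the `NodeOutOfServiceVolumeDetach` feature gate enabled; `<node-name>` is a placeholder, not a node from this post:

```shell
# Apply the out-of-service taint to a node you have verified is shut down.
# <node-name> is a placeholder; substitute the name of the affected Node.
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```

With the `NoExecute` effect, pods on that node without a matching toleration are deleted, which is what lets the StatefulSet controller create replacements elsewhere.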
@@ -70,7 +70,7 @@ web-1 1/1 Running 0 10m 10.244.1.7 k8s-node-433-1639279804
 
 Note: Before applying the out-of-service taint, you **must** verify that a node is already in shutdown or power off state (not in the middle of restarting), either because the user intentionally shut it down or the node is down due to hardware failures, OS issues, etc.
 
-Once all the workload Pods that are linked to the out-of-service node are moved to a new running node, and the shutdown node has been recovered, you should remove
+Once all the workload pods that are linked to the out-of-service node are moved to a new running node, and the shutdown node has been recovered, you should remove
 that taint on the affected node after the node is recovered.
 If you know that the node will not return to service, you could instead delete the node from the cluster.
 
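The taint removal step described in the last hunk uses the standard `kubectl taint` untaint syntax (a trailing hyphen). A sketch, again with `<node-name>` as a placeholder:

```shell
# Remove the out-of-service taint once workloads have moved and the node is recovered.
# The trailing "-" tells kubectl to remove the taint rather than add it.
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute-
```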