@@ -454,50 +454,6 @@ Message: Pod was terminated in response to imminent node shutdown.
{{< /note >}}

- ## Non Graceful node shutdown {#non-graceful-node-shutdown}
-
- {{< feature-state state="beta" for_k8s_version="v1.26" >}}
-
- A node shutdown action may not be detected by kubelet's Node Shutdown Manager,
- either because the command does not trigger the inhibitor locks mechanism used by
- kubelet, or because of a user error, i.e., the ShutdownGracePeriod and
- ShutdownGracePeriodCriticalPods are not configured properly. Please refer to the
- [Graceful Node Shutdown](#graceful-node-shutdown) section above for more details.
-
- When a node is shut down but not detected by kubelet's Node Shutdown Manager, the pods
- that are part of a StatefulSet will be stuck in terminating status on
- the shutdown node and cannot move to a new running node. This is because kubelet on
- the shutdown node is not available to delete the pods, so the StatefulSet cannot
- create a new pod with the same name. If there are volumes used by the pods, the
- VolumeAttachments will not be deleted from the original shutdown node, so the volumes
- used by these pods cannot be attached to a new running node. As a result, the
- application running on the StatefulSet cannot function properly. If the original
- shutdown node comes up, the pods will be deleted by kubelet and new pods will be
- created on a different running node. If the original shutdown node does not come up,
- these pods will be stuck in terminating status on the shutdown node forever.
-
- To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
- or `NoSchedule` effect to a Node, marking it out-of-service.
- If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
- is enabled on `kube-controller-manager`, and a Node is marked out-of-service with this taint, the
- pods on the node will be forcefully deleted if there are no matching tolerations on it, and volume
- detach operations for the pods terminating on the node will happen immediately. This allows the
- Pods on the out-of-service node to recover quickly on a different node.
-
- During a non-graceful shutdown, Pods are terminated in two phases:
-
- 1. Force delete the Pods that do not have matching `out-of-service` tolerations.
- 2. Immediately perform detach volume operations for such pods.
-
- {{< note >}}
- - Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
-   that the node is already in a shutdown or power-off state (not in the middle of
-   restarting).
- - The user is required to manually remove the out-of-service taint after the pods are
-   moved to a new node and the user has checked that the shutdown node has been
-   recovered, since the user was the one who originally added the taint.
- {{< /note >}}
-
### Pod Priority based graceful node shutdown {#pod-priority-graceful-node-shutdown}

{{< feature-state state="alpha" for_k8s_version="v1.23" >}}
@@ -596,6 +552,50 @@ the feature is Beta and is enabled by default.
Metrics `graceful_shutdown_start_time_seconds` and `graceful_shutdown_end_time_seconds`
are emitted under the kubelet subsystem to monitor node shutdowns.
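For illustration, one way to read these kubelet metrics is through the API server's node proxy; this is a sketch, assuming `<node-name>` is replaced with a real node and that your credentials allow access to the kubelet metrics endpoint:

```shell
# Illustrative only: fetch one node's kubelet metrics via the API server proxy
# and filter for the graceful-shutdown timestamps.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" | grep graceful_shutdown
```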
+ ## Non Graceful node shutdown {#non-graceful-node-shutdown}
+
+ {{< feature-state state="beta" for_k8s_version="v1.26" >}}
+
+ A node shutdown action may not be detected by kubelet's Node Shutdown Manager,
+ either because the command does not trigger the inhibitor locks mechanism used by
+ kubelet, or because of a user error, i.e., the ShutdownGracePeriod and
+ ShutdownGracePeriodCriticalPods are not configured properly. Please refer to the
+ [Graceful Node Shutdown](#graceful-node-shutdown) section above for more details.
+
+ When a node is shut down but not detected by kubelet's Node Shutdown Manager, the pods
+ that are part of a StatefulSet will be stuck in terminating status on
+ the shutdown node and cannot move to a new running node. This is because kubelet on
+ the shutdown node is not available to delete the pods, so the StatefulSet cannot
+ create a new pod with the same name. If there are volumes used by the pods, the
+ VolumeAttachments will not be deleted from the original shutdown node, so the volumes
+ used by these pods cannot be attached to a new running node. As a result, the
+ application running on the StatefulSet cannot function properly. If the original
+ shutdown node comes up, the pods will be deleted by kubelet and new pods will be
+ created on a different running node. If the original shutdown node does not come up,
+ these pods will be stuck in terminating status on the shutdown node forever.
+
+ To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
+ or `NoSchedule` effect to a Node, marking it out-of-service.
+ If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+ is enabled on `kube-controller-manager`, and a Node is marked out-of-service with this taint, the
+ pods on the node will be forcefully deleted if there are no matching tolerations on it, and volume
+ detach operations for the pods terminating on the node will happen immediately. This allows the
+ Pods on the out-of-service node to recover quickly on a different node.
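For example, marking an already powered-off node out-of-service could look like the following sketch; `<node-name>` is a placeholder, and the taint value (`nodeshutdown`) is only illustrative:

```shell
# Illustrative only: taint the node that is already shut down so that its stuck Pods
# can be force deleted and their volumes detached.
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```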
+
+ During a non-graceful shutdown, Pods are terminated in two phases:
+
+ 1. Force delete the Pods that do not have matching `out-of-service` tolerations.
+ 2. Immediately perform detach volume operations for such pods.
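One way to observe the outcome of these two phases (a sketch, assuming `kubectl` access and a placeholder node name) is to check that the stuck Pods are gone and that no `VolumeAttachment` still references the shut-down node:

```shell
# Illustrative only: Pods still bound to the shut-down node should disappear
# once they are force deleted.
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>

# VolumeAttachment objects for that node should also be removed.
kubectl get volumeattachments | grep <node-name>
```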
+
+ {{< note >}}
+ - Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
+   that the node is already in a shutdown or power-off state (not in the middle of
+   restarting).
+ - The user is required to manually remove the out-of-service taint after the pods are
+   moved to a new node and the user has checked that the shutdown node has been
+   recovered, since the user was the one who originally added the taint.
+ {{< /note >}}
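As the note above says, the taint has to be removed manually once the node has recovered and the Pods are running elsewhere; a minimal sketch, again with a placeholder node name:

```shell
# Illustrative only: remove the out-of-service taint (the trailing "-" deletes
# every taint with this key, whatever its value and effect).
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service-
```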
+
## Swap memory management {#swap-memory}
{{< feature-state state="alpha" for_k8s_version="v1.22" >}}