Commit d41e075

Merge pull request #63936 from aireilly/OCPBUGS-18111
OCPBUGS-18111 - Move misplaced SNO reboot topic
2 parents 461cd99 + 4d56435 commit d41e075

4 files changed: 31 additions and 31 deletions

modules/sno-clusters-reboot-without-drain.adoc

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+// Module included in the following assemblies:
+//
+// * nodes/nodes/nodes-nodes-working.adoc
+
+:_content-type: CONCEPT
+[id="sno-clusters-reboot-without-drain_{context}"]
+= Handling errors in {sno} clusters when the node reboots without draining application pods
+
+In {sno} clusters and in {product-title} clusters in general, a situation can arise where a node reboot occurs without first draining the node. This can occur where an application pod requesting devices fails with the `UnexpectedAdmissionError` error. `Deployment`, `ReplicaSet`, or `DaemonSet` errors are reported because the application pods that require those devices start before the pod serving those devices. You cannot control the order of pod restarts.
+
+While this behavior is to be expected, it can cause a pod to remain on the cluster even though it has failed to deploy successfully. The pod continues to report `UnexpectedAdmissionError`. This issue is mitigated by the fact that application pods are typically included in a `Deployment`, `ReplicaSet`, or `DaemonSet`. If a pod is in this error state, it is of little concern because another instance should be running. Belonging to a `Deployment`, `ReplicaSet`, or `DaemonSet` guarantees the successful creation and execution of subsequent pods and ensures the successful deployment of the application.
+
+There is ongoing work upstream to ensure that such pods are gracefully terminated. Until that work is resolved, run the following command for a {sno} cluster to remove the failed pods:
+
+[source,terminal,subs="+quotes"]
+----
+$ oc delete pods --field-selector status.phase=Failed -n _<POD_NAMESPACE>_
+----
+
+[NOTE]
+====
+The option to drain the node is unavailable for {sno} clusters.
+====

modules/ztp-sno-node-reboot-scenarios.adoc

Lines changed: 0 additions & 23 deletions
This file was deleted.

nodes/nodes/nodes-nodes-working.adoc

Lines changed: 8 additions & 3 deletions
@@ -6,20 +6,26 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-As an administrator, you can perform a number of tasks to make your clusters more efficient.
+As an administrator, you can perform several tasks to make your clusters more efficient.
 
 // The following include statements pull in the module files that comprise
 // the assembly. Include any combination of concept, procedure, or reference
 // modules required to cover the user story. You can also include other
 // assemblies.
 
-
 include::modules/nodes-nodes-working-evacuating.adoc[leveloffset=+1]
 
 include::modules/nodes-nodes-working-updating.adoc[leveloffset=+1]
 
 include::modules/nodes-nodes-working-marking.adoc[leveloffset=+1]
 
+include::modules/sno-clusters-reboot-without-drain.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-evacuating_nodes-nodes-working[Understanding how to evacuate pods on nodes]
+
 == Deleting nodes
 
 include::modules/nodes-nodes-working-deleting.adoc[leveloffset=+2]
@@ -31,4 +37,3 @@ include::modules/nodes-nodes-working-deleting.adoc[leveloffset=+2]
 see xref:../../machine_management/manually-scaling-machineset.adoc#machineset-manually-scaling-manually-scaling-machineset[Manually scaling a MachineSet].
 
 include::modules/nodes-nodes-working-deleting-bare-metal.adoc[leveloffset=+2]
-

scalability_and_performance/ztp_far_edge/ztp-reference-cluster-configuration-for-vdu.adoc

Lines changed: 0 additions & 5 deletions
@@ -86,12 +86,7 @@ include::modules/ztp-sno-du-configuring-lvms.adoc[leveloffset=+2]
 
 include::modules/ztp-sno-du-disabling-network-diagnostics.adoc[leveloffset=+2]
 
-include::modules/ztp-sno-node-reboot-scenarios.adoc[leveloffset=+2]
-
 [role="_additional-resources"]
 .Additional resources
 
-* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-evacuating_nodes-nodes-working[Understanding how to evacuate pods on nodes
-]
-
 * xref:../../scalability_and_performance/ztp_far_edge/ztp-deploying-far-edge-sites.adoc#ztp-deploying-far-edge-sites[Deploying far edge sites using ZTP]
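
Reviewer sketch (not part of this commit): the relocated module tells the reader to run `oc delete pods --field-selector status.phase=Failed` in the affected namespace. It may be useful to list the failed pods first and confirm why they failed before deleting them. The wrapper below is a minimal sketch of that idea. The function names `list_failed_pods` and `delete_failed_pods` are hypothetical, and the sketch assumes that the admission failure is recorded in the pod's `.status.reason` field.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical and do not appear in the
# commit. Assumes the `oc` client is installed and logged in to the cluster.

# List pods in the Failed phase in one namespace, together with the reason
# recorded in the pod status (assumed here to be UnexpectedAdmissionError for
# the pods described in the module).
list_failed_pods() {
    namespace="$1"
    oc get pods -n "$namespace" \
        --field-selector status.phase=Failed \
        -o custom-columns=NAME:.metadata.name,REASON:.status.reason
}

# Delete the failed pods, using the same selector as the command in the module.
delete_failed_pods() {
    namespace="$1"
    oc delete pods --field-selector status.phase=Failed -n "$namespace"
}
```

After sourcing the file, `list_failed_pods <POD_NAMESPACE>` followed by `delete_failed_pods <POD_NAMESPACE>` gives a review step before the cleanup. Note that the selector matches every pod in the `Failed` phase, not only those rejected with `UnexpectedAdmissionError`.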
