Commit 9ea46dd

Merge pull request #29501 from aireilly/MGMT-298
Updates for MGMT-298
2 parents d802758 + 01882ae commit 9ea46dd

11 files changed

+175
-29
lines changed

_topic_map.yml

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -2768,16 +2768,16 @@ Topics:
27682768
- Name: Node maintenance
27692769
Dir: node_maintenance
27702770
Topics:
2771-
- Name: Automatic renewal of TLS certificates
2772-
File: virt-automatic-certificates
2773-
- Name: Managing node labeling for obsolete CPU models
2774-
File: virt-managing-node-labeling-obsolete-cpu-models
2775-
- Name: Node maintenance mode
2776-
File: virt-node-maintenance
2771+
- Name: About node maintenance
2772+
File: virt-about-node-maintenance
27772773
- Name: Setting a node to maintenance mode
27782774
File: virt-setting-node-maintenance
27792775
- Name: Resuming a node from maintenance mode
27802776
File: virt-resuming-node
2777+
- Name: Automatic renewal of TLS certificates
2778+
File: virt-automatic-certificates
2779+
- Name: Managing node labeling for obsolete CPU models
2780+
File: virt-managing-node-labeling-obsolete-cpu-models
27812781
# Node Networking
27822782
- Name: Node networking
27832783
Dir: node_network

modules/virt-automatic-certificates-renewal.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
// * virt/node_maintenance/virt-automatic-certificates.adoc
44

55
[id="virt-automatic-certificates-renewal_{context}"]
6-
= Automatic renewal of TLS certificates
6+
= TLS certificates automatic renewal schedules
77

88
TLS certificates are automatically deleted and replaced according to the following schedule:
99

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
// Module included in the following assemblies:
2+
//
3+
// virt/node_maintenance/virt-node-maintenance.adoc
4+
5+
[id="virt-checking_status_of_node_maintenance_cr_tasks_{context}"]
6+
= Checking the status of current NodeMaintenance CR tasks
7+
8+
You can check the status of current `NodeMaintenance` CR tasks.
9+
10+
.Prerequisites
11+
12+
* Install the {product-title} CLI `oc`.
13+
* Log in as a user with `cluster-admin` privileges.
14+
15+
.Procedure
16+
17+
* Check the status of current node maintenance tasks by running the following command:
18+
+
19+
[source,terminal]
20+
----
21+
$ oc get NodeMaintenance -o yaml
22+
----
23+
+
24+
.Example output
25+
+
26+
[source,yaml]
27+
----
28+
apiVersion: v1
29+
items:
30+
- apiVersion: nodemaintenance.kubevirt.io/v1beta1
31+
kind: NodeMaintenance
32+
metadata:
33+
...
34+
spec:
35+
nodeName: node-1.example.com
36+
reason: Node maintenance
37+
status:
38+
evictionPods: 3 <1>
39+
pendingPods:
40+
- pod-example-workload-0
41+
- httpd
42+
- httpd-manual
43+
phase: Running
44+
lastError: "Last failure message" <2>
45+
totalpods: 5
46+
...
47+
----
48+
<1> `evictionPods` is the number of pods scheduled for eviction.
49+
<2> `lastError` records the latest eviction error, if any.
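
As a rough illustration of how the status fields in the example output might be read, the following Python sketch condenses one `status` block into a single line. The helper name `summarize_maintenance` is hypothetical and not part of this commit or of any `oc` tooling; the field names and sample values are copied from the example output above. In practice, the dictionary could come from parsing the output of `oc get NodeMaintenance -o json` with `json.loads`.

```python
# Hypothetical helper: condenses the status block of one NodeMaintenance item.
# Field names (evictionPods, pendingPods, phase, lastError, totalpods) are
# taken from the example output in this module; nothing else is assumed.
def summarize_maintenance(status):
    pending = status.get("pendingPods") or []
    parts = [
        f"phase={status.get('phase', 'Unknown')}",
        f"evictionPods={status.get('evictionPods', 0)}",
        f"pending={len(pending)}",
        f"totalpods={status.get('totalpods', 0)}",
    ]
    # lastError is only reported when an eviction error was recorded.
    if status.get("lastError"):
        parts.append(f"lastError={status['lastError']!r}")
    return " ".join(parts)


# Sample values copied from the example output above.
example_status = {
    "evictionPods": 3,
    "pendingPods": ["pod-example-workload-0", "httpd", "httpd-manual"],
    "phase": "Running",
    "lastError": "Last failure message",
    "totalpods": 5,
}

print(summarize_maintenance(example_status))
```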
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
// Module included in the following assemblies:
2+
//
3+
// virt/node_maintenance/virt-about-node-maintenance.adoc
4+
5+
[id="virt-maintaining-bare-metal-nodes_{context}"]
6+
= Maintaining bare metal nodes
7+
8+
When you deploy {product-title} on bare metal infrastructure, you must take additional considerations into account compared to deploying on cloud infrastructure. Unlike cloud environments, where cluster nodes are considered ephemeral, reprovisioning a bare metal node requires significantly more time and effort for maintenance tasks.
9+
10+
When a bare metal node fails, for example, because of a fatal kernel error or a network interface card (NIC) hardware failure, workloads on the failed node must be restarted elsewhere on the cluster while the problem node is repaired or replaced. Node maintenance mode allows cluster administrators to gracefully power down nodes, moving workloads to other parts of the cluster and ensuring that workloads are not interrupted. Detailed progress and node status details are provided during maintenance.
11+
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
// Module included in the following assemblies:
2+
//
3+
// virt/node_maintenance/virt-resuming-node.adoc
4+
5+
[id="virt-resuming-node-from-maintenance-mode-with-cr_{context}"]
6+
= Resuming a node from maintenance mode that was initiated with a NodeMaintenance CR
7+
8+
You can resume a node by deleting the `NodeMaintenance` CR.
9+
10+
.Prerequisites
11+
12+
* Install the {product-title} CLI `oc`.
13+
* Log in to the cluster as a user with `cluster-admin` privileges.
14+
15+
.Procedure
16+
17+
* When your node maintenance task is complete, delete the active `NodeMaintenance` CR:
18+
+
19+
[source,terminal]
20+
----
21+
$ oc delete -f nodemaintenance-cr.yaml
22+
----
23+
+
24+
.Example output
25+
+
26+
[source,terminal]
27+
----
28+
nodemaintenance.nodemaintenance.kubevirt.io "maintenance-example" deleted
29+
----
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
1+
// Module included in the following assemblies:
2+
//
3+
// virt/node_maintenance/virt-node-maintenance.adoc
4+
5+
[id="virt-setting-node-to-maintenance-mode-with-cr_{context}"]
6+
= Setting a node to maintenance mode with a NodeMaintenance custom resource
7+
8+
You can put a node into maintenance mode with a `NodeMaintenance` custom resource (CR). When you apply a `NodeMaintenance` CR, all allowed pods are evicted and the node is rendered unschedulable. Evicted pods are queued to be moved to another node in the cluster.
9+
10+
.Prerequisites
11+
12+
* Install the {product-title} CLI `oc`.
13+
* Log in to the cluster as a user with `cluster-admin` privileges.
14+
15+
.Procedure
16+
17+
. Create the following node maintenance CR, and save the file as `nodemaintenance-cr.yaml`:
18+
+
19+
[source,yaml]
20+
----
21+
apiVersion: nodemaintenance.kubevirt.io/v1beta1
22+
kind: NodeMaintenance
23+
metadata:
24+
name: maintenance-example <1>
25+
spec:
26+
nodeName: node-1.example.com <2>
27+
reason: "Node maintenance" <3>
28+
----
29+
<1> Node maintenance CR name
30+
<2> The name of the node to be put into maintenance mode
31+
<3> Plain text description of the reason for maintenance
32+
+
33+
. Apply the node maintenance CR by running the following command:
34+
+
35+
[source,terminal]
36+
----
37+
$ oc apply -f nodemaintenance-cr.yaml
38+
----
39+
40+
. Check the progress of the maintenance task by running the following command, replacing `<node-name>` with the name of your node:
41+
+
42+
[source,terminal]
43+
----
44+
$ oc describe node <node-name>
45+
----
46+
+
47+
.Example output
48+
+
49+
[source,terminal]
50+
----
51+
Events:
52+
Type Reason Age From Message
53+
---- ------ ---- ---- -------
54+
Normal NodeNotSchedulable 61m kubelet Node node-1.example.com status is now: NodeNotSchedulable
55+
----
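
For clusters where the CR file is generated rather than written by hand, a minimal Python sketch of the same manifest follows. The function name `build_node_maintenance` is hypothetical; the `apiVersion`, `kind`, and field names are taken from the CR in this module. The sketch emits JSON instead of YAML only to avoid a third-party YAML dependency; `oc apply -f` accepts JSON manifests as well.

```python
import json


# Hypothetical helper: builds the NodeMaintenance CR shown in this module
# as a plain dictionary.
def build_node_maintenance(name, node_name, reason):
    return {
        "apiVersion": "nodemaintenance.kubevirt.io/v1beta1",
        "kind": "NodeMaintenance",
        "metadata": {"name": name},  # Node maintenance CR name
        "spec": {
            "nodeName": node_name,  # Node to be put into maintenance mode
            "reason": reason,  # Plain text reason for maintenance
        },
    }


# Values copied from the example CR in this module.
manifest = build_node_maintenance(
    "maintenance-example", "node-1.example.com", "Node maintenance"
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the printed JSON to a file and passing that file to `oc apply -f` would be one way to use it; the file name is up to you.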
Lines changed: 11 additions & 13 deletions
@@ -1,25 +1,23 @@
11
// Module included in the following assemblies:
2-
//
3-
// * virt/node_maintenance/virt-node-maintenance.adoc
4-
// * virt/node_maintenance/virt-setting-node-maintenance.adoc
2+
// virt/node_maintenance/virt-about-node-maintenance.adoc
53

64
[id="virt-understanding-node-maintenance_{context}"]
75
= Understanding node maintenance mode
86

9-
Placing a node into maintenance marks the node as unschedulable and drains all
10-
the virtual machines and pods from it. Virtual machine instances that have a
11-
`LiveMigrate` eviction strategy are live migrated to another node without loss
12-
of service. This eviction strategy is configured by default in virtual machine
13-
created from common templates but must be configured manually for custom
14-
virtual machines.
7+
Nodes can be placed into maintenance mode using the `oc adm` utility, or using `NodeMaintenance` custom resources (CRs).
158

16-
Virtual machine instances without an eviction strategy will be deleted on the
17-
node and recreated on another node.
9+
Placing a node into maintenance marks the node as unschedulable and drains all the virtual machines and pods from it. Virtual machine instances that have a `LiveMigrate` eviction strategy are live migrated to another node without loss of service. This eviction strategy is configured by default in virtual machines created from common templates but must be configured manually for custom virtual machines.
10+
11+
Virtual machine instances without an eviction strategy are shut down. Virtual machines with a `RunStrategy` of `Running` or `RerunOnFailure` are recreated on another node. Virtual machines with a `RunStrategy` of `Manual` are not automatically restarted.
1812

1913
[IMPORTANT]
2014
====
21-
Virtual machines must have a persistent volume claim (PVC) with a shared
22-
ReadWriteMany (RWX) access mode to be live migrated.
15+
Virtual machines must have a persistent volume claim (PVC) with a shared `ReadWriteMany` (RWX) access mode to be live migrated.
2316
====
2417

18+
When installed as part of OpenShift Virtualization, the Node Maintenance Operator watches for new or deleted `NodeMaintenance` CRs. When a new `NodeMaintenance` CR is detected, no new workloads are scheduled and the node is cordoned off from the rest of the cluster. All pods that can be evicted are evicted from the node. When a `NodeMaintenance` CR is deleted, the node that is referenced in the CR is made available for new workloads.
2519

20+
[NOTE]
21+
====
22+
Using a `NodeMaintenance` CR for node maintenance tasks achieves the same results as the `oc adm cordon` and `oc adm drain` commands, but uses standard {product-title} custom resource processing.
23+
====
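
The eviction rules described in this module can be read as a small decision table. The Python sketch below restates them; the function name `vm_outcome_on_maintenance` is hypothetical, and the strategy values are the ones named in the module text, not an exhaustive list. It does not model the `ReadWriteMany` PVC requirement for live migration.

```python
# Hypothetical decision table restating the module text. It covers only the
# eviction strategy and RunStrategy values that the text mentions.
def vm_outcome_on_maintenance(eviction_strategy, run_strategy):
    if eviction_strategy == "LiveMigrate":
        return "live migrated to another node"
    # Without an eviction strategy, the virtual machine instance is shut down.
    if run_strategy in ("Running", "RerunOnFailure"):
        return "shut down, then recreated on another node"
    if run_strategy == "Manual":
        return "shut down, not automatically restarted"
    return "shut down; restart behavior not described in this module"


print(vm_outcome_on_maintenance("LiveMigrate", "Manual"))
print(vm_outcome_on_maintenance(None, "RerunOnFailure"))
print(vm_outcome_on_maintenance(None, "Manual"))
```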

virt/live_migration/virt-live-migration.adoc

Lines changed: 0 additions & 1 deletion
@@ -11,6 +11,5 @@ include::modules/virt-updating-access-mode-for-live-migration.adoc[leveloffset=+
1111
.Additional resources:
1212

1313
* xref:../../virt/live_migration/virt-migrate-vmi.adoc#virt-migrate-vmi[Migrating a virtual machine instance to another node]
14-
* xref:../../virt/node_maintenance/virt-node-maintenance.adoc#virt-node-maintenance[Node maintenance mode]
1514
* xref:../../virt/live_migration/virt-live-migration-limits.adoc#virt-live-migration-limits[Live migration limiting]
1615
* xref:../../virt/virtual_machines/virtual_disks/virt-storage-defaults-for-datavolumes.adoc#virt-storage-defaults-for-datavolumes[Storage defaults for data volumes]
Lines changed: 4 additions & 3 deletions
@@ -1,13 +1,14 @@
1-
[id="virt-node-maintenance"]
2-
= Node maintenance mode
1+
[id="virt-about-node-maintenance"]
2+
= About node maintenance
33
include::modules/virt-document-attributes.adoc[]
44
:context: virt-node-maintenance
5-
65
toc::[]
76

87
include::modules/virt-understanding-node-maintenance.adoc[leveloffset=+1]
8+
include::modules/virt-maintaining-bare-metal-nodes.adoc[leveloffset=+1]
99

1010
.Additional resources:
1111

12+
* xref:../../virt/virtual_machines/virt-create-vms.adoc#virt-about-runstrategies-vms_virt-create-vms[About RunStrategies for virtual machines]
1213
* xref:../../virt/live_migration/virt-live-migration.adoc#virt-live-migration[Virtual machine live migration]
1314
* xref:../../virt/live_migration/virt-configuring-vmi-eviction-strategy.adoc#virt-configuring-vmi-eviction-strategy[Configuring virtual machine eviction strategy]

virt/node_maintenance/virt-resuming-node.adoc

Lines changed: 4 additions & 2 deletions
@@ -6,10 +6,12 @@ include::modules/common-attributes.adoc[]
66

77
toc::[]
88

9-
Resuming a node brings it out of maintenance mode and schedulable again.
9+
Resuming a node brings it out of maintenance mode and makes it schedulable again.
1010

11-
Resume a node from maintenance from either the web console or the CLI.
11+
Resume a node from maintenance mode by using the web console or the CLI, or by deleting the `NodeMaintenance` custom resource.
1212

1313
include::modules/virt-resuming-node-maintenance-web.adoc[leveloffset=+1]
1414
include::modules/virt-resuming-node-maintenance-cli.adoc[leveloffset=+1]
15+
include::modules/virt-resuming-node-from-maintenance-mode-with-cr.adoc[leveloffset=+1]
16+
1517
