Commit 09f0301

Merge pull request #70056 from jeana-redhat/OSDOCS-9244-Nutanix-failure-domain-MAPI
OSDOCS-9244: Machine API updates for Nutanix failure domain support
2 parents 0789116 + ead5db2 commit 09f0301

9 files changed, with 338 additions and 48 deletions

machine_management/control_plane_machine_management/cpmso-configuration.adoc

Lines changed: 6 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -89,6 +89,12 @@ Some sections of the control plane machine set CR are provider-specific. The fol
8989
//Sample Nutanix provider specification
9090
include::modules/cpmso-yaml-provider-spec-nutanix.adoc[leveloffset=+2]
9191

92+
//Failure domains for Nutanix clusters
93+
include::modules/mapi-failure-domain-nutanix.adoc[leveloffset=+2]
94+
[role="_additional-resources"]
95+
.Additional resources
96+
* xref:../../post_installation_configuration/adding-nutanix-failure-domains.adoc#adding-failure-domains-to-an-existing-nutanix-cluster[Adding failure domains to an existing Nutanix cluster]
97+
9298
[id="cpmso-sample-yaml-vsphere_{context}"]
9399
== Sample YAML for configuring VMware vSphere clusters
94100

machine_management/control_plane_machine_management/cpmso-resiliency.adoc

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,8 @@ include::modules/cpmso-failure-domains-provider.adoc[leveloffset=+2]
2323

2424
* xref:../../machine_management/control_plane_machine_management/cpmso-configuration.adoc#cpmso-yaml-failure-domain-azure_cpmso-configuration[Sample Microsoft Azure failure domain configuration]
2525

26+
* xref:../../post_installation_configuration/adding-nutanix-failure-domains.adoc#adding-failure-domains-to-an-existing-nutanix-cluster[Adding failure domains to an existing Nutanix cluster]
27+
2628
* xref:../../machine_management/control_plane_machine_management/cpmso-configuration.adoc#cpmso-yaml-failure-domain-openstack_cpmso-configuration[Sample {rh-openstack-first} failure domain configuration]
2729

2830
//Balancing control plane machines

machine_management/creating_machinesets/creating-machineset-nutanix.adoc

Lines changed: 6 additions & 0 deletions
@@ -16,3 +16,9 @@ include::modules/machineset-yaml-nutanix.adoc[leveloffset=+1]
1616

1717
//Creating a compute machine set
1818
include::modules/machineset-creating.adoc[leveloffset=+1]
19+
20+
//Failure domains for Nutanix clusters
21+
include::modules/mapi-failure-domain-nutanix.adoc[leveloffset=+1]
22+
[role="_additional-resources"]
23+
.Additional resources
24+
* xref:../../post_installation_configuration/adding-nutanix-failure-domains.adoc#adding-failure-domains-to-an-existing-nutanix-cluster[Adding failure domains to an existing Nutanix cluster]

modules/cpmso-failure-domains-provider.adoc

Lines changed: 4 additions & 9 deletions
@@ -23,15 +23,14 @@ The control plane machine set concept of a failure domain is analogous to existi
2323
|X
2424
|link:https://cloud.google.com/compute/docs/regions-zones[zone]
2525

26-
|Nutanix
27-
//link:https://portal.nutanix.com/page/documents/details?targetId=Web-Console-Guide-Prism-v6_1:arc-failure-modes-c.html[Availability domain]
28-
|
29-
|Not applicable ^[1]^
30-
3126
|Microsoft Azure
3227
|X
3328
|link:https://learn.microsoft.com/en-us/azure/azure-web-pubsub/concept-availability-zones[Azure availability zone]
3429

30+
|Nutanix
31+
|X
32+
|link:https://portal.nutanix.com/page/documents/solutions/details?targetId=RA-2147-Nutanix-for-Enterprise-Edge:failure-domain-considerations.html[failure domain]
33+
3534
|VMware vSphere
3635
|
3736
|Not applicable
@@ -40,9 +39,5 @@ The control plane machine set concept of a failure domain is analogous to existi
4039
|X
4140
|link:https://docs.openstack.org/nova/2023.2/admin/availability-zones.html[OpenStack Nova availability zones] and link:https://docs.openstack.org/cinder/2023.2/admin/availability-zone-type.html[OpenStack Cinder availability zones]
4241
|====
43-
[.small]
44-
--
45-
1. Nutanix has a failure domain concept, but {product-title} {product-version} does not include support for this feature.
46-
--
4742

4843
The failure domain configuration in the control plane machine set custom resource (CR) is platform-specific. For more information about failure domain parameters in the CR, see the sample failure domain configuration for your provider.

modules/machineset-modifying.adoc

Lines changed: 16 additions & 14 deletions
@@ -19,6 +19,8 @@ By default, the {product-title} router pods are deployed on compute machines.
1919
Because the router is required to access some cluster resources, including the web console, do not scale the compute machine set to `0` unless you first relocate the router pods.
2020
====
2121

22+
The output examples in this procedure use the values for an AWS cluster.
23+
2224
.Prerequisites
2325

2426
* Your {product-title} cluster uses the Machine API.
@@ -34,7 +36,7 @@ Because the router is required to access some cluster resources, including the w
3436
$ oc edit machineset <machine_set_name> -n openshift-machine-api
3537
----
3638

37-
. Note the value of the `spec.replicas` field, as you need it when scaling the machine set to apply the changes.
39+
. Note the value of the `spec.replicas` field, because you need it when scaling the machine set to apply the changes.
3840
+
3941
[source,yaml]
4042
----
@@ -58,7 +60,7 @@ spec:
5860
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name>
5961
----
6062
+
61-
.Example output
63+
.Example output for an AWS cluster
6264
[source,text]
6365
----
6466
NAME PHASE TYPE REGION ZONE AGE
@@ -75,7 +77,7 @@ $ oc annotate machine/<machine_name_original_1> \
7577
machine.openshift.io/delete-machine="true"
7678
----
7779

78-
. Scale the compute machine set to twice the number of replicas by running the following command:
80+
. To create replacement machines with the new configuration, scale the compute machine set to twice the number of replicas by running the following command:
7981
+
8082
[source,terminal]
8183
----
@@ -92,7 +94,7 @@ $ oc scale --replicas=4 \// <1>
9294
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name>
9395
----
9496
+
95-
.Example output
97+
.Example output for an AWS cluster
9698
[source,text]
9799
----
98100
NAME PHASE TYPE REGION ZONE AGE
@@ -104,7 +106,7 @@ NAME PHASE TYPE REGION ZONE
104106
+
105107
When the new machines are in the `Running` phase, you can scale the compute machine set to the original number of replicas.
106108

107-
. Scale the compute machine set to the original number of replicas by running the following command:
109+
. To remove the machines that were created with the old configuration, scale the compute machine set to the original number of replicas by running the following command:
108110
+
109111
[source,terminal]
110112
----
@@ -116,14 +118,21 @@ $ oc scale --replicas=2 \// <1>
116118

117119
.Verification
118120

121+
* To verify that a machine created by the updated machine set has the correct configuration, examine the relevant fields in the CR for one of the new machines by running the following command:
122+
+
123+
[source,terminal]
124+
----
125+
$ oc describe machine <machine_name_updated_1> -n openshift-machine-api
126+
----
127+
119128
* To verify that the compute machines without the updated configuration are deleted, list the machines that are managed by the updated compute machine set by running the following command:
120129
+
121130
[source,terminal]
122131
----
123132
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name>
124133
----
125134
+
126-
.Example output while deletion is in progress
135+
.Example output while deletion is in progress for an AWS cluster
127136
[source,text]
128137
----
129138
NAME PHASE TYPE REGION ZONE AGE
@@ -133,17 +142,10 @@ NAME PHASE TYPE REGION ZONE
133142
<machine_name_updated_2> Running m6i.xlarge us-west-1 us-west-1a 5m41s
134143
----
135144
+
136-
.Example output when deletion is complete
145+
.Example output when deletion is complete for an AWS cluster
137146
[source,text]
138147
----
139148
NAME PHASE TYPE REGION ZONE AGE
140149
<machine_name_updated_1> Running m6i.xlarge us-west-1 us-west-1a 6m30s
141150
<machine_name_updated_2> Running m6i.xlarge us-west-1 us-west-1a 6m30s
142-
----
143-
144-
* To verify that a machine created by the updated machine set has the correct configuration, examine the relevant fields in the CR for one of the new machines by running the following command:
145-
+
146-
[source,terminal]
147-
----
148-
$ oc describe machine <machine_name_updated_1> -n openshift-machine-api
149151
----
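
Reviewer note on the scaling steps in `modules/machineset-modifying.adoc`: the procedure says to scale back down once the new machines are in the `Running` phase. The following is a minimal sketch of that check, not part of this commit. The helper names `count_phase` and `ready_to_scale_down`, the sample machine names, and the rule "at least twice the original replica count is `Running`" are assumptions made for illustration. The sketch parses text shaped like the example output and does not contact a cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# count_phase <phase>
# Hypothetical helper: reads `oc get machines` output on stdin and prints
# how many rows show the given value in the PHASE column (second field).
count_phase() {
  awk -v phase="$1" 'NR > 1 && $2 == phase { n++ } END { print n + 0 }'
}

# ready_to_scale_down <original_replicas>
# Assumption: succeeds when at least twice the original replica count is
# Running, that is, the replacement machines have joined the originals.
ready_to_scale_down() {
  local running
  running="$(count_phase Running)"
  [ "$running" -ge "$(( $1 * 2 ))" ]
}

# Sample rows with placeholder machine names. On a cluster, pipe the
# `oc get -n openshift-machine-api machines -l ...` command from the
# procedure into the helper instead.
sample_output='NAME PHASE TYPE REGION ZONE AGE
machine-original-1 Running m6i.xlarge us-west-1 us-west-1a 4h
machine-original-2 Running m6i.xlarge us-west-1 us-west-1a 4h
machine-updated-1 Provisioned m6i.xlarge us-west-1 us-west-1a 55s
machine-updated-2 Provisioning m6i.xlarge us-west-1 us-west-1a 55s'

if printf '%s\n' "$sample_output" | ready_to_scale_down 2; then
  echo "ready to scale down"
else
  echo "wait: replacement machines are not all Running"
fi
```

Counting rows by phase keeps the check independent of machine names, which differ on every cluster.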

modules/mapi-failure-domain-nutanix.adoc

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * machine_management/cpmso-configuration.adoc
4+
// * machine_management/creating_machinesets/creating-machineset-nutanix.adoc
5+
6+
:_mod-docs-content-type: REFERENCE
7+
[id="mapi-failure-domain-nutanix_{context}"]
8+
= Failure domains for Nutanix clusters
9+
10+
To add or update the failure domain configuration on a Nutanix cluster, you must make coordinated changes to several resources.
11+
The following actions are required:
12+
13+
. Modify the cluster infrastructure custom resource (CR).
14+
15+
. Modify the cluster control plane machine set CR.
16+
17+
. Modify or replace the compute machine set CRs.
18+
19+
For more information, see "Adding failure domains to an existing Nutanix cluster" in the _Post-installation configuration_ content.

modules/post-installation-adding-nutanix-failure-domains-compute-machines.adoc renamed to modules/post-installation-adding-nutanix-failure-domains-compute-machines-edit.adoc

Lines changed: 43 additions & 19 deletions
@@ -3,15 +3,10 @@
33
// * post_installation_configuration/adding-nutanix-failure-domains.adoc
44

55
:_mod-docs-content-type: PROCEDURE
6-
[id="post-installation-adding-nutanix-failure-domains-compute-machines_{context}"]
7-
= Distributing compute machines across failure domains
6+
[id="post-installation-adding-nutanix-failure-domains-compute-machines-edit_{context}"]
7+
= Editing compute machine sets to implement failure domains
88

9-
You can distribute compute machines across Nutanix failure domains by performing either of the following tasks:
10-
11-
* Modifying existing compute machine sets.
12-
* Creating new compute machine sets.
13-
14-
The following procedure details how to distribute compute machines across failure domains by modifying existing compute machine sets. For more information on creating a compute machine set, see "Additional resources".
9+
To distribute compute machines across Nutanix failure domains by using an existing compute machine set, you update the compute machine set with your configuration and then use scaling to replace the existing compute machines.
1510

1611
.Prerequisites
1712

@@ -25,20 +20,32 @@ The following procedure details how to distribute compute machines across failur
2520
----
2621
$ oc describe infrastructures.config.openshift.io cluster
2722
----
23+
2824
. For each failure domain (`platformSpec.nutanix.failureDomains`), note the cluster's UUID, name, and subnet object UUID. These values are required to add a failure domain to a compute machine set.
25+
2926
. List the compute machine sets in your cluster by running the following command:
3027
+
3128
[source,terminal]
3229
----
3330
$ oc get machinesets -n openshift-machine-api
3431
----
32+
+
33+
.Example output
34+
[source,terminal]
35+
----
36+
NAME DESIRED CURRENT READY AVAILABLE AGE
37+
<machine_set_name_1> 1 1 1 1 55m
38+
<machine_set_name_2> 1 1 1 1 55m
39+
----
40+
3541
. Edit the first compute machine set by running the following command:
3642
+
3743
[source,terminal]
3844
----
39-
$ oc edit machineset <machineset_name> -n openshift-machine-api
45+
$ oc edit machineset <machine_set_name_1> -n openshift-machine-api
4046
----
41-
. Configure the compute machine set to use the first failure domain by adding the following to the `spec.template.spec.providerSpec.value` stanza:
47+
48+
. Configure the compute machine set to use the first failure domain by updating the following in the `spec.template.spec.providerSpec.value` stanza.
4249
+
4350
[NOTE]
4451
====
@@ -54,7 +61,7 @@ metadata:
5461
creationTimestamp: null
5562
labels:
5663
machine.openshift.io/cluster-api-cluster: <cluster_name>
57-
name: <machineset_name>
64+
name: <machine_set_name_1>
5865
namespace: openshift-machine-api
5966
spec:
6067
replicas: 2
@@ -75,14 +82,27 @@ spec:
7582
uuid: <prism_element_network_uuid_1>
7683
# ...
7784
----
78-
. Note the value of `spec.replicas`, as you need it when scaling the machine set to apply the changes.
85+
86+
. Note the value of `spec.replicas`, because you need it when scaling the compute machine set to apply the changes.
87+
7988
. Save your changes.
89+
8090
. List the machines that are managed by the updated compute machine set by running the following command:
8191
+
8292
[source,terminal]
8393
----
84-
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name>
94+
$ oc get -n openshift-machine-api machines \
95+
-l machine.openshift.io/cluster-api-machineset=<machine_set_name_1>
96+
----
97+
+
98+
.Example output
99+
[source,text]
100+
----
101+
NAME PHASE TYPE REGION ZONE AGE
102+
<machine_name_original_1> Running AHV Unnamed Development-STS 4h
103+
<machine_name_original_2> Running AHV Unnamed Development-STS 4h
85104
----
105+
86106
. For each machine that is managed by the updated compute machine set, set the `delete` annotation by running the following command:
87107
+
88108
[source,terminal]
@@ -91,30 +111,34 @@ $ oc annotate machine/<machine_name_original_1> \
91111
-n openshift-machine-api \
92112
machine.openshift.io/delete-machine="true"
93113
----
94-
. Scale the compute machine set to twice the number of replicas by running the following command:
114+
115+
. To create replacement machines with the new configuration, scale the compute machine set to twice the number of replicas by running the following command:
95116
+
96117
[source,terminal]
97118
----
98119
$ oc scale --replicas=<twice_the_number_of_replicas> \// <1>
99-
machineset <machine_set_name> \
120+
machineset <machine_set_name_1> \
100121
-n openshift-machine-api
101122
----
102123
<1> For example, if the original number of replicas in the compute machine set is `2`, scale the replicas to `4`.
124+
103125
. List the machines that are managed by the updated compute machine set by running the following command:
104126
+
105127
[source,terminal]
106128
----
107-
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name>
129+
$ oc get -n openshift-machine-api machines -l machine.openshift.io/cluster-api-machineset=<machine_set_name_1>
108130
----
109131
+
110132
When the new machines are in the `Running` phase, you can scale the compute machine set to the original number of replicas.
111-
. Scale the compute machine set to the original number of replicas by running the following command:
133+
134+
. To remove the machines that were created with the old configuration, scale the compute machine set to the original number of replicas by running the following command:
112135
+
113136
[source,terminal]
114137
----
115138
$ oc scale --replicas=<original_number_of_replicas> \// <1>
116-
machineset <machine_set_name> \
139+
machineset <machine_set_name_1> \
117140
-n openshift-machine-api
118141
----
119-
<1> For example, if the original number of replicas in the compute machine set is `2`, scale the replicas to `2`.
142+
<1> For example, if the original number of replicas in the compute machine set was `2`, scale the replicas to `2`.
143+
120144
. As required, continue to modify machine sets to reference the additional failure domains that are available to the deployment.
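
Reviewer note: the annotate, scale-up, and scale-down steps above can be sketched as a dry-run script that only prints the `oc` commands from the procedure in order. This is an illustration under stated assumptions, not part of this commit. The function name `plan_replacement`, its argument order, and the sample machine set and machine names are hypothetical. Nothing is executed against a cluster.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the oc commands for replacing the machines that
# one compute machine set manages. It does not run them.
set -euo pipefail

NAMESPACE="openshift-machine-api"

# plan_replacement <machine_set_name> <original_replicas> <machine_name>...
# Hypothetical helper; the name and argument order are illustrative only.
plan_replacement() {
  local machine_set="$1" replicas="$2"
  shift 2

  # Mark each original machine so that it is removed on the scale-down.
  local machine
  for machine in "$@"; do
    printf 'oc annotate machine/%s -n %s machine.openshift.io/delete-machine="true"\n' \
      "$machine" "$NAMESPACE"
  done

  # Scale to twice the original count to create replacement machines.
  printf 'oc scale --replicas=%d machineset %s -n %s\n' \
    "$(( replicas * 2 ))" "$machine_set" "$NAMESPACE"

  # Scale back to the original count after the new machines are Running.
  printf 'oc scale --replicas=%d machineset %s -n %s\n' \
    "$replicas" "$machine_set" "$NAMESPACE"
}

# Placeholder names; substitute the values noted during the procedure.
plan_replacement machine-set-1 2 machine-original-1 machine-original-2
```

Printing the commands rather than running them leaves room to confirm the replica counts and to wait for the `Running` phase between the two `oc scale` calls.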
