
Commit d01ed62

Merge pull request #1095 from klgill/docs-osprh20612-HAadoptiondocs
new content for HA adoption
2 parents 57437af + 77b0df3 commit d01ed62

7 files changed (+224, -0 lines changed)

docs_user/assemblies/assembly_adopting-the-data-plane.adoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,5 +28,7 @@ include::../modules/proc_performing-a-fast-forward-upgrade-on-compute-services.a
2828

2929
include::../modules/proc_adopting-networker-services-to-the-data-plane.adoc[leveloffset=+1]
3030

31+
include::../modules/proc_enabling-high-availability-for-instances.adoc[leveloffset=+1]
32+
3133
ifdef::parent-context[:context: {parent-context}]
3234
ifndef::parent-context[:!context:]

docs_user/assemblies/assembly_preparing-an-instance-HA-deployment-for-adoption.adoc

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
ifdef::context[:parent-context: {context}]
3+
4+
[id="preparing-an-instance-HA-deployment-for-adoption_{context}"]
5+
6+
:context: preparing-instance-HA
7+
8+
= Preparing an Instance HA deployment for adoption
9+
10+
[role="_abstract"]
11+
To enable the high availability for Compute instances (Instance HA) service after you adopt the {rhos_long_noacro} ({rhos_acro}) {rhos_curr_ver} data plane, perform the following preparation tasks:
12+
13+
* Create a fencing configuration file to use after you adopt the {rhos_acro} data plane.
14+
* Prevent Pacemaker from monitoring or recovering the Compute nodes.
15+
16+
include::../modules/proc_maintaining-the-instance-ha-functionality-after-adoption.adoc[leveloffset=+1]
17+
18+
include::../modules/proc_preventing-pacemaker-from-monitoring-compute-nodes.adoc[leveloffset=+1]
19+
20+
ifdef::parent-context[:context: {parent-context}]
21+
ifndef::parent-context[:!context:]

docs_user/assemblies/assembly_rhoso-180-adoption-overview.adoc

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ include::../assemblies/assembly_storage-requirements.adoc[leveloffset=+1]
3434

3535
include::../assemblies/assembly_red-hat-ceph-storage-prerequisites.adoc[leveloffset=+1]
3636

37+
include::../assemblies/assembly_preparing-an-instance-HA-deployment-for-adoption.adoc[leveloffset=+1]
38+
3739
include::../modules/con_comparing-configuration-files-between-deployments.adoc[leveloffset=+1]
3840

3941
ifdef::parent-context[:context: {parent-context}]

docs_user/modules/con_adoption-process-overview.adoc

Lines changed: 1 addition & 0 deletions
@@ -27,3 +27,4 @@ Post-adoption tasks::
2727
* Optional: Run tempest to verify that the entire adoption process is working properly. For more information, see link:{defaultURL}/validating_and_troubleshooting_the_deployed_cloud/index[Validating and troubleshooting the deployed cloud].
2828
* Optional: Perform a minor update from RHEL 9.2 to 9.4. You can perform a minor update any time after you complete the adoption procedure. For more information, see link:{defaultURL}/updating_your_environment_to_the_latest_maintenance_release/index[Updating your environment to the latest maintenance release].
2929
* Optional: Verify that you migrated all services from the Controller nodes, and then power off the nodes. If any services are still running in the Controller nodes, such as Open Virtual Networking (ML2/OVN), {object_storage_first_ref}, or {Ceph}, do not power off the nodes.
30+
* If you enabled the high availability for Compute instances (Instance HA) service, remove the Pacemaker components from your Compute nodes. For more information, see xref:enabling-high-availability-for-instances_data-plane[Enabling the high availability for Compute instances service].

docs_user/modules/proc_enabling-high-availability-for-instances.adoc

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
[id="enabling-high-availability-for-instances_{context}"]
3+
4+
= Enabling the high availability for Compute instances service
5+
6+
[role="_abstract"]
7+
To enable the high availability for Compute instances (Instance HA) service, you create the following resources:
8+
9+
* Fencing secret.
10+
* Configuration map. The configuration map is created automatically when you deploy the Instance HA resource, but you can also create it manually. You must create the configuration map manually if you want to disable the Instance HA service.
11+
* Instance HA resource.
12+
13+
14+
.Prerequisites
15+
16+
* You have created the `fencing-secret.yaml` configuration file. For more information, see xref:maintaining-instance-ha-functionality-after-adoption_preparing-instance-HA[Maintaining the Instance HA functionality after adoption].
17+
* You have disabled Pacemaker on your Compute nodes. For more information, see xref:preventing-pacemaker-from-monitoring-compute-nodes_preparing-instance-HA[Preventing Pacemaker from monitoring Compute nodes].
18+
19+
.Procedure
20+
21+
. Create the secret:
22+
+
23+
----
24+
$ oc apply -f fencing-secret.yaml -n openstack
25+
----
26+
27+
. Optional: Create the Instance HA configuration map and set the `DISABLED` parameter to `false`. For example:
28+
+
29+
----
30+
$ cat << EOF > iha-cm.yaml
31+
kind: ConfigMap
32+
metadata:
33+
  name: instanceha-0-config
34+
  namespace: openstack
35+
apiVersion: v1
36+
data:
37+
  config.yaml: |
38+
    config:
39+
      EVACUABLE_TAG: "evacuable"
40+
      TAGGED_IMAGES: "true"
41+
      TAGGED_FLAVORS: "true"
42+
      TAGGED_AGGREGATES: "true"
43+
      SMART_EVACUATION: "false"
44+
      DELTA: "30"
45+
      DELAY: "0"
46+
      POLL: "45"
47+
      THRESHOLD: "50"
48+
      WORKERS: "4"
49+
      RESERVED_HOSTS: "false"
50+
      LEAVE_DISABLED: "false"
51+
      CHECK_KDUMP: "false"
52+
      LOGLEVEL: "info"
53+
      DISABLED: "false"
54+
EOF
55+
----
56+
57+
.. Apply the configuration:
58+
+
59+
----
60+
$ oc apply -f iha-cm.yaml -n openstack
61+
----
62+
+
63+
[NOTE]
64+
If you want to restrict which Compute nodes are evacuated, create host aggregates and tag them with the value of the `EVACUABLE_TAG` parameter. Alternatively, you can set the `TAGGED_AGGREGATES` parameter to `false` to enable monitoring and evacuation of all your Compute nodes. For more information about Instance HA service parameters, see link:{ha-for-instances}/assembly_deploying-and-configuring-the-high-availability-for-compute-instances-service_instance-ha#proc_editing-the-instance-ha-service-parameters_instance-ha[Editing the Instance HA service parameters] in _Configuring high availability for instances_.
65+
66+
. Create an Instance HA resource and reference the fencing secret and configuration map. For example:
67+
+
68+
----
69+
$ cat << EOF > iha.yaml
70+
apiVersion: instanceha.openstack.org/v1beta1
71+
kind: InstanceHa
72+
metadata:
73+
  name: instanceha-0
74+
  namespace: openstack
75+
spec:
76+
  caBundleSecretName: combined-ca-bundle
77+
  instanceHaConfigMap: <instanceha-0-config>
78+
  fencingSecret: fencing-secret
79+
EOF
80+
----
81+
+
82+
* `<instanceha-0-config>`: Specifies the name of the Instance HA configuration map that you created. Leave this field blank to have the `infra-operator` create a configuration map automatically. You can then edit the values as needed.
83+
84+
. Deploy the Instance HA resource:
85+
+
86+
----
87+
$ oc apply -f iha.yaml -n openstack
88+
----
89+
90+
.Next steps
91+
92+
* After you complete the {rhos_long_noacro} adoption, remove the Pacemaker components from the Compute nodes. You must run the following commands on each Compute node:
93+
+
94+
----
95+
$ sudo systemctl stop pacemaker_remote
96+
$ sudo systemctl stop pcsd
97+
$ sudo systemctl stop pcsd-ruby.service
98+
$ sudo systemctl disable pacemaker_remote
99+
$ sudo systemctl disable pcsd
100+
$ sudo systemctl disable pcsd-ruby.service
101+
$ sudo dnf remove pacemaker pacemaker-remote pcs pcsd -y
102+
----
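
The abstract of this procedure notes that the `DISABLED` parameter in the configuration map controls whether the Instance HA service is disabled. As a minimal sketch, assuming the `iha-cm.yaml` layout shown in the optional step above, the following shell helper prints the value of `DISABLED` so that you can confirm it before you apply the file. The function name `iha_disabled_value` is hypothetical and is not part of any Red Hat tooling.

```shell
# Hypothetical helper: print the value of the DISABLED key from a
# configuration map file such as iha-cm.yaml.
# The pattern is anchored at the start of the key, so LEAVE_DISABLED
# does not match.
iha_disabled_value() {
  sed -n 's/^[[:space:]]*DISABLED:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1"
}
```

For example, `iha_disabled_value iha-cm.yaml` prints the configured value, which you can compare against `false` before you run `oc apply`.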

docs_user/modules/proc_maintaining-the-instance-ha-functionality-after-adoption.adoc

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
[id="maintaining-instance-ha-functionality-after-adoption_{context}"]
3+
4+
= Maintaining the Instance HA functionality after adoption
5+
6+
[role="_abstract"]
7+
To maintain the high availability for Compute instances (Instance HA) functionality after you adopt {rhos_long_noacro} {rhos_curr_ver}, create a fencing configuration file to use in your adopted environment.
8+
9+
.Procedure
10+
11+
. Gather the fencing information from the `fencing.yaml` file in your {rhos_prev_long} ({OpenStackShort}) {rhos_prev_ver} cluster.
12+
13+
. Retrieve the {OpenStackShort} {rhos_prev_ver} stonith configuration from any of your overcloud Controller nodes:
14+
+
15+
----
16+
$ sudo pcs config
17+
----
18+
+
19+
.Sample output
20+
+
21+
----
22+
Stonith Devices:
23+
...
24+
Resource: stonith-fence_ipmilan-525400dde4f7 (class=stonith
25+
type=fence_ipmilan)
26+
Attributes: stonith-fence_ipmilan-525400dde4f7-instance_attributes
27+
delay=20
28+
ipaddr=172.16.0.1
29+
ipport=6231
30+
lanplus=true
31+
login=admin
32+
passwd=password
33+
pcmk_host_list=compute-1
34+
Operations:
35+
monitor: stonith-fence_ipmilan-525400dde4f7-monitor-interval-60s
36+
interval=60s
37+
Resource: stonith-fence_ipmilan-525400819ad3 (class=stonith
38+
type=fence_ipmilan)
39+
Attributes: stonith-fence_ipmilan-525400819ad3-instance_attributes
40+
delay=20
41+
ipaddr=172.16.0.1
42+
ipport=6230
43+
lanplus=true
44+
login=admin
45+
passwd=password
46+
pcmk_host_list=compute-0
47+
Operations:
48+
monitor: stonith-fence_ipmilan-525400819ad3-monitor-interval-60s
49+
interval=60s
50+
...
51+
----
52+
53+
. Generate the fencing configuration file:
54+
+
55+
* To install the script that automatically generates this file, see link:https://access.redhat.com/solutions/7123932[How do I automatically generate fencing secret for RHOSO18 instanceha from a osp17.1 cluster that I want to adopt?].
56+
* To create the fencing configuration file manually, see link:{ha-for-instances}/assembly_deploying-and-configuring-the-high-availability-for-compute-instances-service_instance-ha#proc_configuring-the-fencing-of-compute-nodes_instance-ha[Configuring the fencing of Compute nodes] in _Configuring high availability for instances_.
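
When you create the fencing configuration file manually, it can help to first condense the `pcs config` output into one line per Compute node. The following is a minimal sketch, assuming the output lists one `key=value` attribute per line as in the sample output above. The function name `summarize_stonith` is hypothetical, and the sketch deliberately omits the `passwd` value. It does not generate the fencing secret itself.

```shell
# Hypothetical helper: read `pcs config` output on stdin and print
# "<host> <ipaddr> <ipport> <login>" for each stonith resource.
summarize_stonith() {
  awk '
    # Print the attributes collected for the previous resource, then reset.
    function emit() {
      if (host != "") print host, ipaddr, ipport, login
      host = ipaddr = ipport = login = ""
    }
    /Resource:/ { emit() }
    {
      line = $0
      gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding whitespace
      n = index(line, "=")
      if (n == 0) next
      key = substr(line, 1, n - 1)
      val = substr(line, n + 1)
      if (key == "pcmk_host_list") host = val
      else if (key == "ipaddr") ipaddr = val
      else if (key == "ipport") ipport = val
      else if (key == "login") login = val
    }
    END { emit() }
  '
}
```

You can pipe the command from the earlier step into the helper, for example `sudo pcs config | summarize_stonith`. If your `pcs` version prints several attributes on one line, the parsing needs to be adapted.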

docs_user/modules/proc_preventing-pacemaker-from-monitoring-compute-nodes.adoc

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
1+
:_mod-docs-content-type: PROCEDURE
2+
[id="preventing-pacemaker-from-monitoring-compute-nodes_{context}"]
3+
4+
= Preventing Pacemaker from monitoring Compute nodes
5+
6+
[role="_abstract"]
7+
You must prevent Pacemaker from monitoring your Compute nodes during the adoption. Otherwise, if a network issue occurs during the adoption, Pacemaker attempts to reboot the Compute nodes to recover them, which breaks the adoption.
8+
9+
.Procedure
10+
11+
. Retrieve the names of the Compute remote resources:
12+
+
13+
----
14+
$ sudo pcs stonith |grep -B1 stonith-fence_compute-fence-nova |grep Target |awk -F ': ' '{print $2}'
15+
----
16+
17+
. Disable stonith for the cluster, and then disable each Compute remote resource:
18+
+
19+
----
20+
$ sudo pcs property set stonith-enabled=false
21+
$ sudo pcs resource disable <compute_remote_resource>
22+
----
23+
+
24+
* Replace `<compute_remote_resource>` with the name of the Compute remote resource in your environment.
25+
26+
. Retrieve the names of the Compute stonith resources:
27+
+
28+
----
29+
$ sudo pcs stonith |grep Level |grep fence_compute |awk '{print $4}' |awk -F ',' '{print $1}' |sort |uniq
30+
----
31+
32+
. Remove the Compute node `pacemaker_remote` and fencing resources:
33+
+
34+
----
35+
$ sudo pcs stonith disable <compute_stonith_resource>
36+
$ sudo pcs stonith delete <compute_stonith_resource>
37+
$ sudo pcs resource delete <compute_remote_resource>
38+
----
39+
+
40+
* Replace `<compute_stonith_resource>` with the name of the Compute stonith resource in your environment.
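
If your environment has several Compute nodes, you repeat the removal commands for every resource name that the retrieval steps returned. The following is a minimal dry-run sketch, assuming you collected the names into two whitespace-separated lists. The function name `print_cleanup_commands` is hypothetical. The helper only prints the commands from this procedure so that you can review them before you run anything on a Controller node.

```shell
# Hypothetical dry-run helper.
#   $1: whitespace-separated Compute remote resource names
#   $2: whitespace-separated Compute stonith resource names
# Prints the removal commands in the order that the procedure uses.
print_cleanup_commands() {
  remotes="$1"
  stoniths="$2"
  for s in $stoniths; do
    echo "sudo pcs stonith disable $s"
    echo "sudo pcs stonith delete $s"
  done
  for r in $remotes; do
    echo "sudo pcs resource delete $r"
  done
}
```

For example, `print_cleanup_commands "$remotes" "$stoniths"` lists one disable and one delete command for each stonith resource, followed by one delete command for each remote resource.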
