articles/sap/workloads/high-availability-guide-rhel-pacemaker.md
Lines changed: 27 additions & 22 deletions
Original file line number
Diff line number
Diff line change
@@ -43,12 +43,12 @@ Read the following SAP Notes and articles first:
43
43
## Overview
44
44
45
45
> [!IMPORTANT]
46
-
> Pacemaker clusters that span multiple Virtual networks(VNets)/subnets are not covered by standard support policies.
46
+
> Pacemaker clusters that span multiple virtual networks (VNets)/subnets aren't covered by standard support policies.
47
47
48
48
There are two options available on Azure for configuring fencing in a Pacemaker cluster for RHEL: the Azure fence agent, which restarts a failed node via the Azure APIs, or an SBD device.
49
49
50
50
> [!IMPORTANT]
51
-
> In Azure, RHEL high availability cluster with storage based fencing (fence_sbd) uses software-emulated watchdog. It is important to review [Software-Emulated Watchdog Known Limitations](https://access.redhat.com/articles/7034141) and [Support Policies for RHEL High Availability Clusters - sbd and fence_sbd](https://access.redhat.com/articles/2800691) when selecting SBD as the fencing mechanism.
51
+
> In Azure, a RHEL high availability cluster with storage-based fencing (fence_sbd) uses a software-emulated watchdog. It's important to review [Software-Emulated Watchdog Known Limitations](https://access.redhat.com/articles/7034141) and [Support Policies for RHEL High Availability Clusters - sbd and fence_sbd](https://access.redhat.com/articles/2800691) when selecting SBD as the fencing mechanism.
52
52
53
53
### Use an SBD device
54
54
@@ -66,9 +66,9 @@ You can configure the SBD device by using either of two options:
66
66

67
67
68
68
> [!IMPORTANT]
69
-
> When you're planning to deploy and configure Linux pacemaker cluster nodes and SBD devices, do not allow the routing between your virtual machines and the VMs that are hosting the SBD devices to pass through any other devices, such as a [network virtual appliance (NVA)](https://azure.microsoft.com/solutions/network-appliances/).
69
+
> When you're planning to deploy and configure Linux pacemaker cluster nodes and SBD devices, don't allow the routing between your virtual machines and the VMs that are hosting the SBD devices to pass through any other devices, such as a [network virtual appliance (NVA)](https://azure.microsoft.com/solutions/network-appliances/).
70
70
>
71
-
> Maintenance events and other issues with the NVA can have a negative impact on the stability and reliability of the overall cluster configuration. For more information, see [user-defined routing rules](../../virtual-network/virtual-networks-udr-overview.md).
71
+
> Maintenance events and other issues with the NVA can have a negative effect on the stability and reliability of the overall cluster configuration. For more information, see [user-defined routing rules](../../virtual-network/virtual-networks-udr-overview.md).
72
72
73
73
* SBD with Azure shared disk
74
74
@@ -105,7 +105,7 @@ You first need to create the iSCSI target virtual machines. You can share iSCSI
105
105
106
106
1. Deploy virtual machines that run on a supported RHEL OS version, and connect to them via SSH. The VMs don't have to be large. VM sizes such as Standard_E2s_v3 or Standard_D2s_v3 are sufficient. Be sure to use Premium storage for the OS disk.
107
107
108
-
2. It isn't necessary to use RHEL for SAP with HA and Update Services, or RHEL for SAP Apps OS image for the iSCSI target server. A standard RHEL OS image can be used instead. However, be aware that the support life cycle varies between different OS product releases.
108
+
2. It isn't necessary to use RHEL for SAP with HA and Update Services, or RHEL for SAP Apps OS image for the iSCSI target server. A standard RHEL OS image can be used instead. However, the support life cycle varies between different OS product releases.
109
109
110
110
3. Run the following commands on all iSCSI target virtual machines.
111
111
@@ -376,7 +376,8 @@ On the cluster nodes, connect and discover iSCSI device that was created in the
376
376
[...]
377
377
SBD_STARTMODE=always
378
378
[...]
379
-
SBD_DELAY_START=yes
379
+
# In some cases, a longer delay than the default "msgwait" seconds is needed. So, set a specific delay value, in seconds. See `man sbd` for more information.
380
+
SBD_DELAY_START=216
380
381
[...]
381
382
```
382
383
@@ -397,12 +398,13 @@ On the cluster nodes, connect and discover iSCSI device that was created in the
397
398
398
399
```bash
399
400
sudo mkdir /etc/systemd/system/sbd.service.d
400
-
echo -e "[Service]\nTimeoutSec=144" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
401
+
echo -e "[Service]\nTimeoutSec=259" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
401
402
sudo systemctl daemon-reload
402
403
403
404
systemctl show sbd | grep -i timeout
404
-
# TimeoutStartUSec=2min 24s
405
-
# TimeoutStopUSec=2min 24s
405
+
# TimeoutStartUSec=4min 19s
406
+
# TimeoutStopUSec=4min 19s
407
+
# TimeoutAbortUSec=4min 19s
406
408
```
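The `TimeoutSec=259` value and the `4min 19s` reported by `systemctl show` are the same duration (259 = 4 × 60 + 19). As a minimal sketch for cross-checking other values, the hypothetical helper `fmt_timeout` below (not part of sbd or systemd) renders a whole number of seconds in that minutes-and-seconds style. It only covers values from 1 second up to just under an hour.

```bash
# Hypothetical helper, not part of sbd or systemd: print a number of seconds
# in the "Xmin Ys" style that `systemctl show` uses for timeouts under an hour.
fmt_timeout() {
  total=$1
  min=$(( total / 60 ))
  sec=$(( total % 60 ))
  if [ "$min" -gt 0 ] && [ "$sec" -gt 0 ]; then
    echo "${min}min ${sec}s"
  elif [ "$min" -gt 0 ]; then
    echo "${min}min"
  else
    echo "${sec}s"
  fi
}

fmt_timeout 259   # prints: 4min 19s
fmt_timeout 144   # prints: 2min 24s
```

If the output of `systemctl show sbd | grep -i timeout` doesn't match the value written to the drop-in file, the drop-in likely wasn't picked up, and `sudo systemctl daemon-reload` is worth rerunning.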
407
409
408
410
## SBD with an Azure shared disk
@@ -507,7 +509,7 @@ foreach ($vmName in $vmNames) {
507
509
sudo vi /etc/sysconfig/sbd
508
510
```
509
511
510
-
2. Change the property of the SBD device, enable the pacemaker integration, and change the start mode of SBD
512
+
2. Change the property of the SBD device, enable the pacemaker integration, change the start mode of SBD, and adjust the `SBD_DELAY_START` value.
511
513
512
514
```bash
513
515
[...]
@@ -517,7 +519,8 @@ foreach ($vmName in $vmNames) {
517
519
[...]
518
520
SBD_STARTMODE=always
519
521
[...]
520
-
SBD_DELAY_START=yes
522
+
# In some cases, a longer delay than the default "msgwait" seconds is needed. So, set a specific delay value, in seconds. See `man sbd` for more information.
523
+
SBD_DELAY_START=216
521
524
[...]
522
525
```
523
526
@@ -538,12 +541,13 @@ foreach ($vmName in $vmNames) {
538
541
539
542
```bash
540
543
sudo mkdir /etc/systemd/system/sbd.service.d
541
-
echo -e "[Service]\nTimeoutSec=144" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
544
+
echo -e "[Service]\nTimeoutSec=259" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
542
545
sudo systemctl daemon-reload
543
546
544
547
systemctl show sbd | grep -i timeout
545
-
# TimeoutStartUSec=2min 24s
546
-
# TimeoutStopUSec=2min 24s
548
+
# TimeoutStartUSec=4min 19s
549
+
# TimeoutStopUSec=4min 19s
550
+
# TimeoutAbortUSec=4min 19s
547
551
```
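The diff doesn't state why `TimeoutSec` moves from 144 to 259, but the values are consistent with the assumption that the service timeout must stay larger than `SBD_DELAY_START` (259 > 216), so that systemd doesn't give up on the sbd service while it's still delaying. Under that assumption, the following sketch compares the two settings. `check_sbd_timeouts` is a hypothetical helper, and it expects `SBD_DELAY_START` to be an unquoted number of seconds rather than `yes` or `no`.

```bash
# Hypothetical helper: succeed only if TimeoutSec in the systemd drop-in is
# larger than SBD_DELAY_START in the sbd sysconfig file.
# Assumption: both values are plain, unquoted integers (seconds).
check_sbd_timeouts() {
  sysconfig=$1
  dropin=$2
  delay=$(sed -n 's/^SBD_DELAY_START=//p' "$sysconfig" | tail -n 1)
  timeout=$(sed -n 's/^TimeoutSec=//p' "$dropin" | tail -n 1)

  # Reject missing or non-numeric values instead of guessing.
  for v in "$delay" "$timeout"; do
    case "$v" in
      ''|*[!0-9]*) echo "missing or non-numeric value" >&2; return 2 ;;
    esac
  done

  if [ "$timeout" -gt "$delay" ]; then
    echo "OK: TimeoutSec=$timeout > SBD_DELAY_START=$delay"
  else
    echo "WARN: TimeoutSec=$timeout <= SBD_DELAY_START=$delay"
    return 1
  fi
}
```

On a cluster node, it could be called as `check_sbd_timeouts /etc/sysconfig/sbd /etc/systemd/system/sbd.service.d/sbd_delay_start.conf`, using the two file paths from the steps above.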
548
552
549
553
## Azure fence agent configuration
@@ -574,7 +578,7 @@ The fencing device uses either a managed identity for Azure resource or a servic
574
578
1. Make a note of the **Value**. It's used as the **password** for the service principal.
575
579
1. Select **Overview**. Make a note of the **Application ID**. It's used as the username (**login ID** in the following steps) of the service principal.
576
580
577
-
---
581
+
---
578
582
579
583
2. Create a custom role for the fence agent
580
584
@@ -618,7 +622,7 @@ The fencing device uses either a managed identity for Azure resource or a servic
618
622
619
623
Make sure to assign the role for both cluster nodes.
620
624
621
-
---
625
+
---
622
626
623
627
## Cluster installation
624
628
@@ -799,7 +803,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
799
803
2. **[1]** For the SBD device configured using iSCSI target servers or Azure shared disk, run the following commands.
800
804
801
805
```bash
802
-
sudo pcs property set stonith-timeout=144
806
+
sudo pcs property set stonith-timeout=210
803
807
sudo pcs property set stonith-enabled=true
804
808
805
809
# Replace the device IDs with your device ID.
@@ -812,7 +816,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
812
816
813
817
```bash
814
818
sudo pcs cluster stop --all
815
-
819
+
816
820
# Starting the cluster takes time because "SBD_DELAY_START" is set.
817
821
sudo pcs cluster start --all
818
822
```
@@ -838,7 +842,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
838
842
> When using Azure government cloud, you must specify the `cloud=` option when configuring the fence agent. For example, `cloud=usgov` for the Azure US government cloud. For details on Red Hat support on Azure government cloud, see [Support Policies for RHEL High Availability Clusters - Microsoft Azure Virtual Machines as Cluster Members](https://access.redhat.com/articles/3131341).
839
843
840
844
> [!TIP]
841
-
> The option `pcmk_host_map` is *only* required in the command if the RHEL hostnames and the Azure VM names are *not* identical. Specify the mapping in the format **hostname:vm-name**. For more information, see [What format should I use to specify node mappings to fencing devices in pcmk_host_map?](https://access.redhat.com/solutions/2619961).
845
+
> The option `pcmk_host_map` is *only* required in the command if the RHEL hostnames and the Azure VM names aren't identical. Specify the mapping in the format **hostname:vm-name**. For more information, see [What format should I use to specify node mappings to fencing devices in pcmk_host_map?](https://access.redhat.com/solutions/2619961).
842
846
843
847
#### [Managed identity](#tab/msi)
844
848
@@ -859,7 +863,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
@@ -897,7 +901,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
897
901
op monitor interval=3600
898
902
```
899
903
900
-
---
904
+
---
901
905
902
906
If you're using a fencing device based on service principal configuration, read [Change from SPN to MSI for Pacemaker clusters by using Azure fencing](https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/sap-on-azure-high-availability-change-from-spn-to-msi-for/ba-p/3609278) and learn how to convert to managed identity configuration.
903
907
@@ -1008,6 +1012,7 @@ The following Red Hat KB articles contain important information about configurin
1008
1012
* For information on how to change the default timeout, see [How do I configure kdump for use with the RHEL 6, 7, 8 HA Add-On?](https://access.redhat.com/articles/67570).
1009
1013
* For information on how to reduce failover delay when you use `fence_kdump`, see [Can I reduce the expected delay of failover when adding fence_kdump configuration?](https://access.redhat.com/solutions/5512331).
1010
1014
1015
+
1011
1016
Run the following optional steps to add `fence_kdump` as a first-level fencing configuration, in addition to the Azure fence agent configuration.
1012
1017
1013
1018
1. **[A]** Verify that `kdump` is active and configured.