Commit 41773e9

Update SBD_DELAY_START, TimeoutSec, stonith-timeout value in pacemaker setup. Change in order constraint for HANA on ANF
1 parent 038b6e8 commit 41773e9

3 files changed, +158 -127 lines changed


articles/sap/workloads/high-availability-guide-rhel-pacemaker.md

Lines changed: 21 additions & 16 deletions
@@ -376,7 +376,8 @@ On the cluster nodes, connect and discover iSCSI device that was created in the
 [...]
 SBD_STARTMODE=always
 [...]
-SBD_DELAY_START=yes
+# # In some cases, a longer delay than the default "msgwait" seconds is needed. So, set a specific delay value, in seconds. See, `man sbd` for more information.
+SBD_DELAY_START=216
 [...]
 ```

@@ -397,12 +398,13 @@ On the cluster nodes, connect and discover iSCSI device that was created in the

 ```bash
 sudo mkdir /etc/systemd/system/sbd.service.d
-echo -e "[Service]\nTimeoutSec=144" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
+echo -e "[Service]\nTimeoutSec=259" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
 sudo systemctl daemon-reload

 systemctl show sbd | grep -i timeout
-# TimeoutStartUSec=2min 24s
-# TimeoutStopUSec=2min 24s
+# TimeoutStartUSec=4min 19s
+# TimeoutStopUSec=4min 19s
+# TimeoutAbortUSec=4min 19s
 ```

 ## SBD with an Azure shared disk
@@ -507,7 +509,7 @@ foreach ($vmName in $vmNames) {
 sudo vi /etc/sysconfig/sbd
 ```

-2. Change the property of the SBD device, enable the pacemaker integration, and change the start mode of SBD
+2. Change the property of the SBD device, enable the pacemaker integration, change the start mode of SBD, and adjust SBD_DELAY_START value.

 ```bash
 [...]
@@ -517,7 +519,8 @@ foreach ($vmName in $vmNames) {
 [...]
 SBD_STARTMODE=always
 [...]
-SBD_DELAY_START=yes
+# In some cases, a longer delay than the default "msgwait" seconds is needed. So, set a specific delay value, in seconds. See, `man sbd` for more information.
+SBD_DELAY_START=216
 [...]
 ```

@@ -538,12 +541,13 @@ foreach ($vmName in $vmNames) {

 ```bash
 sudo mkdir /etc/systemd/system/sbd.service.d
-echo -e "[Service]\nTimeoutSec=144" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
+echo -e "[Service]\nTimeoutSec=259" | sudo tee /etc/systemd/system/sbd.service.d/sbd_delay_start.conf
 sudo systemctl daemon-reload

 systemctl show sbd | grep -i timeout
-# TimeoutStartUSec=2min 24s
-# TimeoutStopUSec=2min 24s
+# TimeoutStartUSec=4min 19s
+# TimeoutStopUSec=4min 19s
+# TimeoutAbortUSec=4min 19s
 ```

 ## Azure fence agent configuration
@@ -574,7 +578,7 @@ The fencing device uses either a managed identity for Azure resource or a servic
 1. Make a note of the **Value**. It's used as the **password** for the service principal.
 1. Select **Overview**. Make a note of the **Application ID**. It's used as the username (**login ID** in the following steps) of the service principal.

----
+---

 2. Create a custom role for the fence agent

@@ -618,7 +622,7 @@ The fencing device uses either a managed identity for Azure resource or a servic

 Make sure to assign the role for both cluster nodes.

----
+---

 ## Cluster installation

@@ -799,7 +803,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
 2. **[1]** For the SBD device configured using iSCSI target servers or Azure shared disk, run the following commands.

 ```bash
-sudo pcs property set stonith-timeout=144
+sudo pcs property set stonith-timeout=210
 sudo pcs property set stonith-enabled=true

 # Replace the device IDs with your device ID.
@@ -812,7 +816,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in

 ```bash
 sudo pcs cluster stop --all
-
+
 # It would take time to start the cluster as "SBD_DELAY_START" is set to "yes"
 sudo pcs cluster start --all
 ```
@@ -859,7 +863,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
 subscriptionId="subscription id" pcmk_host_map="prod-cl1-0:prod-cl1-0-vm-name;prod-cl1-1:prod-cl1-1-vm-name" \
 power_timeout=240 pcmk_reboot_timeout=900 pcmk_monitor_timeout=120 pcmk_monitor_retries=4 pcmk_action_limit=3 \
 op monitor interval=3600
-
+
 # Run following command if you are setting up fence agent on (two-node cluster and pacemaker version less than 2.0.4-6.el8)
 sudo pcs stonith create rsc_st_azure fence_azure_arm msi=true resourceGroup="resource group" \
 subscriptionId="subscription id" pcmk_host_map="prod-cl1-0:prod-cl1-0-vm-name;prod-cl1-1:prod-cl1-1-vm-name" \
@@ -888,7 +892,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
 pcmk_host_map="prod-cl1-0:prod-cl1-0-vm-name;prod-cl1-1:prod-cl1-1-vm-name" \
 power_timeout=240 pcmk_reboot_timeout=900 pcmk_monitor_timeout=120 pcmk_monitor_retries=4 pcmk_action_limit=3 \
 op monitor interval=3600
-
+
 # Run following command if you are setting up fence agent on (two-node cluster and pacemaker version less than 2.0.4-6.el8)
 sudo pcs stonith create rsc_st_azure fence_azure_arm username="login ID" password="password" \
 resourceGroup="resource group" tenantId="tenant ID" subscriptionId="subscription id" \
@@ -897,7 +901,7 @@ Based on the selected fencing mechanism, follow only one section for relevant in
 op monitor interval=3600
 ```

----
+---

 If you're using a fencing device based on service principal configuration, read [Change from SPN to MSI for Pacemaker clusters by using Azure fencing](https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/sap-on-azure-high-availability-change-from-spn-to-msi-for/ba-p/3609278) and learn how to convert to managed identity configuration.

@@ -1008,6 +1012,7 @@ The following Red Hat KB articles contain important information about configurin
 * For information on how to change the default timeout, see [How do I configure kdump for use with the RHEL 6, 7, 8 HA Add-On?](https://access.redhat.com/articles/67570).
 * For information on how to reduce failover delay when you use `fence_kdump`, see [Can I reduce the expected delay of failover when adding fence_kdump configuration?](https://access.redhat.com/solutions/5512331).

+
 Run the following optional steps to add `fence_kdump` as a first-level fencing configuration, in addition to the Azure fence agent configuration.

 1. **[A]** Verify that `kdump` is active and configured.
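
Reviewer note: the new values in this diff can be sanity-checked with simple arithmetic. The sketch below is illustrative only and is not part of the commit. The helper names `to_systemd_span` and `delay_fits_timeout` are hypothetical. The first renders a seconds value the way the `systemctl show` comments in the diff display it; it handles values under one hour only. The second checks that `SBD_DELAY_START` is shorter than `TimeoutSec`, on the assumption that systemd would otherwise give up on the unit before the delay elapses.

```bash
#!/usr/bin/env bash

# Hypothetical helper: format seconds like the "4min 19s" comments in the diff.
# Assumption: input is a whole number of seconds below 3600.
to_systemd_span() {
  local total=$1
  local min=$(( total / 60 ))
  local sec=$(( total % 60 ))
  if (( min > 0 && sec > 0 )); then
    echo "${min}min ${sec}s"
  elif (( min > 0 )); then
    echo "${min}min"
  else
    echo "${sec}s"
  fi
}

# Hypothetical helper: succeeds when the start delay is strictly
# shorter than the systemd unit timeout.
delay_fits_timeout() {
  local delay=$1 timeout=$2
  (( delay < timeout ))
}

to_systemd_span 144   # old TimeoutSec, prints: 2min 24s
to_systemd_span 259   # new TimeoutSec, prints: 4min 19s

if delay_fits_timeout 216 259; then
  echo "SBD_DELAY_START=216 fits within TimeoutSec=259"
fi
```

The printed spans match the `TimeoutStartUSec` and `TimeoutStopUSec` comments before and after the change, which suggests those comments were updated consistently with the new `TimeoutSec` value.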
