Commit a716cc3

Merge pull request #245819 from dennispadia/depadia-sleshanaupdate
Update priority-fencing-delay parameter in SLES HANA document
2 parents 2cbe3d9 + 2cf8178 commit a716cc3

2 files changed

+46
-22
lines changed

articles/sap/workloads/high-availability-guide-suse.md

Lines changed: 9 additions & 2 deletions
@@ -989,7 +989,7 @@ The following tests are a copy of the test cases in the best practices guides of
989989
rsc_sap_NW1_ERS02 (ocf::heartbeat:SAPInstance): Started nw1-cl-0
990990
```
991991

992-
Execute firewall rule to drop communication on one of the nodes
992+
Execute firewall rule to block the communication on one of the nodes.
993993

994994
```bash
995995
# Execute iptable rule on nw1-cl-0 (10.0.0.5) to block the incoming and outgoing traffic to nw1-cl-1 (10.0.0.6)
@@ -1000,7 +1000,14 @@ The following tests are a copy of the test cases in the best practices guides of
10001000

10011001
When configuring a fencing device, it's recommended to configure [`pcmk_delay_max`](https://www.suse.com/support/kb/doc/?id=000019110) property. So, in the event of split-brain scenario, the cluster introduces a random delay up to the `pcmk_delay_max` value, to the fencing action on each node. The node with the shortest delay will be selected for fencing.
10021002
1003-
Additionally, in ENSA 2 configuration, to prioritize the node hosting the ASCS resource over the other node during a split brain scenario, it's recommended to configure [`priority-fencing-delay`](https://documentation.suse.com/sle-ha/15-SP3/single-html/SLE-HA-administration/#pro-ha-storage-protect-fencing) property in the cluster. Enabling priority-fencing-delay property allows the cluster to introduce an extra delay in the fencing action specifically on the node hosting the ASCS resource, allowing the ASCS node to win the fence race.
1003+
Additionally, in ENSA 2 configuration, to prioritize the node hosting the ASCS resource over the other node during a split brain scenario, it's recommended to configure [`priority-fencing-delay`](https://documentation.suse.com/sle-ha/15-SP3/single-html/SLE-HA-administration/#pro-ha-storage-protect-fencing) property in the cluster. Enabling priority-fencing-delay property allows the cluster to introduce an additional delay in the fencing action specifically on the node hosting the ASCS resource, allowing the ASCS node to win the fence race.
1004+
1005+
Execute below command to delete the firewall rule.
1006+
1007+
```bash
1008+
# A reboot clears any iptables rules that were not persisted. If the rules are still in place, remove them with the following command.
1009+
iptables -D INPUT -s 10.0.0.6 -j DROP; iptables -D OUTPUT -d 10.0.0.6 -j DROP
1010+
```
10041011

10051012
1. Test manual restart of ASCS instance
10061013

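Reviewer sketch (not part of the commit): the cleanup comment added above notes that the rules may already be gone after a reboot. One way to make the cleanup step safe to rerun is to check for each rule with `iptables -C` before deleting it. The helper names `remove_rule_if_present` and `unblock_peer` and the `IPTABLES` override variable are assumptions made for illustration; they don't appear in the article.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the IPTABLES variable are hypothetical.
# Override IPTABLES (for example IPTABLES="echo iptables") for a dry run.
IPTABLES="${IPTABLES:-iptables}"

# Delete a DROP rule only if it is still present.
# "iptables -C" returns 0 when a matching rule exists in the chain.
remove_rule_if_present() {
  local chain="$1" flag="$2" peer="$3"
  if $IPTABLES -C "$chain" "$flag" "$peer" -j DROP >/dev/null 2>&1; then
    $IPTABLES -D "$chain" "$flag" "$peer" -j DROP
  else
    echo "no $chain rule for $peer; nothing to remove"
  fi
}

# Remove both directions of the block for one peer address.
unblock_peer() {
  local peer="$1"
  remove_rule_if_present INPUT  -s "$peer"
  remove_rule_if_present OUTPUT -d "$peer"
}

# Example (run as root on the node where the rules were added):
#   unblock_peer 10.0.0.6
```

With this shape the step can be run whether or not the node rebooted during the test, which is the case the added comment is trying to cover.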
articles/sap/workloads/sap-hana-high-availability.md

Lines changed: 37 additions & 20 deletions
@@ -207,7 +207,7 @@ Replace `<placeholders>` with the values for your SAP HANA installation.
207207
sudo mkfs.xfs /dev/vg_hana_log_<HANA SID>/hana_log
208208
sudo mkfs.xfs /dev/vg_hana_shared_<HANA SID>/hana_shared
209209
```
210-
210+
211211
1. Create the mount directories and copy the universally unique identifier (UUID) of all the logical volumes:
212212

213213
```bash
@@ -481,13 +481,13 @@ With susChkSrv implemented, an immediate and configurable action is executed. Th
481481
provider = SAPHanaSR
482482
path = /usr/share/SAPHanaSR
483483
execution_order = 1
484-
484+
485485
[ha_dr_provider_suschksrv]
486486
provider = susChkSrv
487487
path = /usr/share/SAPHanaSR
488488
execution_order = 3
489489
action_on_lost = fence
490-
490+
491491
[trace]
492492
ha_dr_saphanasr = info
493493
```
@@ -599,6 +599,8 @@ sudo crm configure ms msl_SAPHana_<HANA SID>_HDB<instance number> rsc_SAPHana_<H
599599
meta notify="true" clone-max="2" clone-node-max="1" \
600600
target-role="Started" interleave="true"
601601
602+
sudo crm resource meta msl_SAPHana_<HANA SID>_HDB<instance number> set priority 100
603+
602604
sudo crm configure primitive rsc_ip_<HANA SID>_HDB<instance number> ocf:heartbeat:IPaddr2 \
603605
meta target-role="Started" \
604606
operations \$id="rsc_ip_<HANA SID>_HDB<instance number>-operations" \
@@ -620,6 +622,8 @@ sudo crm configure order ord_SAPHana_<HANA SID>_HDB<instance number> Optional: c
620622
# Clean up the HANA resources. The HANA resources might have failed because of a known issue.
621623
sudo crm resource cleanup rsc_SAPHana_<HANA SID>_HDB<instance number>
622624
625+
sudo crm configure property priority-fencing-delay=30
626+
623627
sudo crm configure property maintenance-mode=false
624628
sudo crm configure rsc_defaults resource-stickiness=1000
625629
sudo crm configure rsc_defaults migration-threshold=5000
@@ -848,31 +852,44 @@ stonith-sbd (stonith:external/sbd): Started hn1-db-1
848852
rsc_nc_HN1_HDB03 (ocf::heartbeat:azure-lb): Started hn1-db-1
849853
```
850854
851-
### Test the Azure fencing agent
855+
### Blocking network communication
852856
853-
You can test the setup of the Azure fencing agent (not the *SBD*) by disabling the network interface on the `hn1-db-0` node:
857+
Resource state before starting the test:
854858
855-
```bash
856-
sudo ifdown eth0
857-
```
859+
```bash
860+
Online: [ hn1-db-0 hn1-db-1 ]
861+
862+
Full list of resources:
863+
stonith-sbd (stonith:external/sbd): Started hn1-db-1
864+
Clone Set: cln_SAPHanaTopology_HN1_HDB03 [rsc_SAPHanaTopology_HN1_HDB03]
865+
Started: [ hn1-db-0 hn1-db-1 ]
866+
Master/Slave Set: msl_SAPHana_HN1_HDB03 [rsc_SAPHana_HN1_HDB03]
867+
Masters: [ hn1-db-1 ]
868+
Slaves: [ hn1-db-0 ]
869+
Resource Group: g_ip_HN1_HDB03
870+
rsc_ip_HN1_HDB03 (ocf::heartbeat:IPaddr2): Started hn1-db-1
871+
rsc_nc_HN1_HDB03 (ocf::heartbeat:azure-lb): Started hn1-db-1
872+
```
858873
859-
The VM now restarts or stops, depending on your cluster configuration.
874+
Execute firewall rule to block the communication on one of the nodes.
860875
861-
If you set the `stonith-action` setting to `off`, the VM is stopped and the resources are migrated to the running VM.
876+
```bash
877+
# Execute iptable rule on hn1-db-1 (10.0.0.6) to block the incoming and outgoing traffic to hn1-db-0 (10.0.0.5)
878+
iptables -A INPUT -s 10.0.0.5 -j DROP; iptables -A OUTPUT -d 10.0.0.5 -j DROP
879+
```
862880
863-
After you start the VM again, the SAP HANA resource fails to start as secondary if you set `AUTOMATED_REGISTER="false"`. In this case, configure the HANA instance as secondary by running this command:
881+
When cluster nodes can't communicate with each other, there's a risk of a split-brain scenario. In such situations, the cluster nodes try to fence each other simultaneously, resulting in a fence race.
864882
865-
```bash
866-
su - <hana sid>adm
883+
When configuring a fencing device, it's recommended to configure [`pcmk_delay_max`](https://www.suse.com/support/kb/doc/?id=000019110) property. So, in the event of split-brain scenario, the cluster introduces a random delay up to the `pcmk_delay_max` value, to the fencing action on each node. The node with the shortest delay will be selected for fencing.
867884

868-
# Stop the HANA instance, just in case it is running
869-
sapcontrol -nr <instance number> -function StopWait 600 10
870-
hdbnsutil -sr_register --remoteHost=hn1-db-1 --remoteInstance=<instance number> --replicationMode=sync --name=<site 1>
885+
Additionally, to ensure that the node running the HANA master takes priority and wins the fence race in a split brain scenario, it's recommended to set [`priority-fencing-delay`](https://documentation.suse.com/sle-ha/15-SP3/single-html/SLE-HA-administration/#pro-ha-storage-protect-fencing) property in the cluster configuration. By enabling priority-fencing-delay property, the cluster can introduce an additional delay in the fencing action specifically on the node hosting HANA master resource, allowing the node to win the fence race.
871886
872-
# Switch back to root and clean up the failed state
873-
exit
874-
crm resource cleanup msl_SAPHana_<HANA SID>_HDB<instance number> hn1-db-0
875-
```
887+
Execute below command to delete the firewall rule.
888+
889+
```bash
890+
# A reboot clears any iptables rules that were not persisted. If the rules are still in place, remove them with the following command.
891+
iptables -D INPUT -s 10.0.0.5 -j DROP; iptables -D OUTPUT -d 10.0.0.5 -j DROP
892+
```
876893
877894
### Test SBD fencing
878895

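Reviewer sketch (not part of the commit): the diff sets `priority-fencing-delay=30` and the added text recommends also configuring `pcmk_delay_max`. As I understand the Pacemaker documentation, `priority-fencing-delay` should be significantly greater than the largest `pcmk_delay_max` in use, with twice that value suggested as a safe margin. The check below is a minimal illustration of that rule of thumb. The function name `delay_is_safe` is hypothetical, and the `pcmk_delay_max` value of 15 in the example is an assumption, since the diff doesn't show the value configured on the fencing device.

```bash
#!/usr/bin/env bash
# Sketch only: delay_is_safe is a hypothetical helper, not a cluster command.
# Both arguments are plain integers in seconds.
# Succeeds when priority-fencing-delay is at least twice pcmk_delay_max.
delay_is_safe() {
  local priority_fencing_delay="$1" pcmk_delay_max="$2"
  [ "$priority_fencing_delay" -ge $(( 2 * pcmk_delay_max )) ]
}

# Example: 30 comes from the diff; 15 is an assumed pcmk_delay_max.
if delay_is_safe 30 15; then
  echo "priority-fencing-delay leaves enough margin"
else
  echo "priority-fencing-delay is too close to pcmk_delay_max"
fi
```

If the random delay could exceed the priority delay, the node running the HANA master might still lose the fence race, which is the outcome the new parameter is meant to prevent.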