Commit dd5d909

Merge pull request #247 from mihaelablendea/master
Added STONITH sample
2 parents aab5a48 + 3aadfa5 commit dd5d909

5 files changed

+170
-0
lines changed

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# Configure STONITH with the ilo3 fencing agent
## Test all the agents before configuring STONITH

```bash
sudo fence_ilo3 -a dl380g7-07-ilo -l Administrator -p 'Password!12' --action=status --verbose
sudo fence_ilo3 -a dl380g7-08-ilo -l Administrator -p 'Password!12' --action=status --verbose
sudo fence_ilo3 -a dl380g7-09-ilo -l Administrator -p 'Password!12' --action=status --verbose
```

>[!NOTE]
>Check whether the user name and password for the device include any special characters that the bash shell could misinterpret. Enclosing user names and passwords in single quotation marks prevents the shell from interpreting these characters.

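To illustrate the note, here is a minimal sketch of how quoting protects a password that contains `!`. The variable name `ilo_password` is illustrative and not part of the original sample; the value is the sample password used in the commands above.

```bash
# Single quotes keep every character literal, including "!" and "$".
ilo_password='Password!12'

# Expand the variable inside double quotes so the value reaches the
# command as one unmodified argument.
printf '%s\n' "$ilo_password"

# In an interactive bash session, a literal "!" that is unquoted or inside
# double quotes can trigger history expansion, so prefer single quotes
# when typing such values directly on the command line.
```

The `printf` line prints the password unchanged, which is a quick way to confirm what the fencing agent would receive.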
## Create the STONITH fencing devices

```bash
sudo pcs stonith create fence_dl380g7-07 fence_ilo3 ipaddr=dl380g7-07-ilo login="Administrator" passwd='Password!12' pcmk_host_list=dl380g7-07
sudo pcs stonith create fence_dl380g7-08 fence_ilo3 ipaddr=dl380g7-08-ilo login="Administrator" passwd='Password!12' pcmk_host_list=dl380g7-08
sudo pcs stonith create fence_dl380g7-09 fence_ilo3 ipaddr=dl380g7-09-ilo login="Administrator" passwd='Password!12' pcmk_host_list=dl380g7-09
```

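Because the three commands differ only in the node name, they could be generated in a loop. The following is a sketch under stated assumptions: the helper name `stonith_create_cmd` is hypothetical, each node's iLO host name is assumed to be `<node>-ilo` as in the sample, and the loop only prints the commands (a dry run) so that they can be reviewed before anything is run.

```bash
# Sample credentials from the commands above; single quotes keep "!" literal.
ilo_password='Password!12'

# Hypothetical helper: print the "pcs stonith create" command for one node.
# $1 = cluster node name; the iLO host is assumed to be "<node>-ilo".
# The password is wrapped in single quotes in the printed command, which
# is only safe if the password itself contains no single quote.
stonith_create_cmd() {
  printf 'sudo pcs stonith create fence_%s fence_ilo3 ipaddr=%s-ilo login="Administrator" passwd=%s pcmk_host_list=%s\n' \
    "$1" "$1" "'$ilo_password'" "$1"
}

# Dry run: print one command per node for review.
for node in dl380g7-07 dl380g7-08 dl380g7-09; do
  stonith_create_cmd "$node"
done
```

Printing first and running second keeps a typo in a node name from creating a misconfigured fencing device.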
## Enable fencing

```bash
sudo pcs property set stonith-enabled=true
```

## Check fencing configuration

```bash
sudo pcs stonith --full
```

The following shows the output:
```
Resource: fence_dl380g7-08 (class=stonith type=fence_ilo3)
Attributes: ipaddr=dl380g7-08-ilo login=Administrator passwd=Password!12
Operations: monitor interval=60s (fence_dl380g7-08-monitor-interval-60s)
Resource: fence_dl380g7-09 (class=stonith type=fence_ilo3)
Attributes: ipaddr=dl380g7-09-ilo login=Administrator passwd=Password!12 pcmk_host_list=dl380g7-09
Operations: monitor interval=60s (fence_dl380g7-09-monitor-interval-60s)
Resource: fence_dl380g7-07 (class=stonith type=fence_ilo3)
Attributes: ipaddr=dl380g7-07-ilo login=Administrator passwd=Password!12 pcmk_host_list=dl380g7-07
Operations: monitor interval=60s (fence_dl380g7-07-monitor-interval-60s)
```

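In the sample output, the attributes shown for `fence_dl380g7-08` do not include `pcmk_host_list`, so it may be worth checking each device for that attribute. The following sketch scans text in the format shown above; the function name `missing_host_list` is hypothetical, and the embedded sample is a shortened copy of the output.

```bash
# Hypothetical helper: print stonith resources whose Attributes line has
# no pcmk_host_list. Reads "pcs stonith --full" style text on stdin.
missing_host_list() {
  awk '
    /^ *Resource:/   { name = $2 }                              # remember current resource
    /^ *Attributes:/ { if ($0 !~ /pcmk_host_list=/) print name }
  '
}

# Demonstration on a shortened copy of the sample output above.
missing_host_list <<'EOF'
Resource: fence_dl380g7-08 (class=stonith type=fence_ilo3)
Attributes: ipaddr=dl380g7-08-ilo login=Administrator passwd=Password!12
Resource: fence_dl380g7-09 (class=stonith type=fence_ilo3)
Attributes: ipaddr=dl380g7-09-ilo login=Administrator passwd=Password!12 pcmk_host_list=dl380g7-09
EOF
# prints: fence_dl380g7-08
```

On a cluster node, the same check could be run as `sudo pcs stonith --full | missing_host_list`.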
## Test the configuration

1. Fence a node with `pcs stonith fence <nodeName>`:

```bash
sudo pcs stonith fence dl380g7-09
```

Then check the cluster status:

```bash
sudo pcs status
```

The following shows the output:
```
Cluster name: sqlcluster
Stack: corosync
Current DC: dl380g7-08 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Fri May 12 09:46:58 2017 Last change: Fri May 12 09:46:55 2017 by root via cibadmin on dl380g7-08

3 nodes and 7 resources configured

Online: [ dl380g7-07 dl380g7-08 ]
OFFLINE: [ dl380g7-09 ]

Full list of resources:

Master/Slave Set: ag_cluster-master [ag_cluster]
Masters: [ dl380g7-08 ]
Slaves: [ dl380g7-07 ]
Stopped: [ dl380g7-09 ]
virtualip (ocf::heartbeat:IPaddr2): Started dl380g7-08
fence_dl380g7-08 (stonith:fence_ilo3): Started dl380g7-07
fence_dl380g7-09 (stonith:fence_ilo3): Started dl380g7-07
fence_dl380g7-07 (stonith:fence_ilo3): Started dl380g7-08
```

2. Crash a node by running `echo c > /proc/sysrq-trigger` as root on that node, then check the cluster status from a surviving node:

```bash
sudo pcs status
```

The following shows the output:
```
Cluster name: sqlcluster
Stack: corosync
Current DC: dl380g7-08 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Fri May 12 10:00:52 2017 Last change: Fri May 12 09:58:01 2017 by root via cibadmin on dl380g7-08

3 nodes and 7 resources configured

Online: [ dl380g7-07 dl380g7-08 ]
OFFLINE: [ dl380g7-09 ]

Full list of resources:

Master/Slave Set: ag_cluster-master [ag_cluster]
Masters: [ dl380g7-08 ]
Slaves: [ dl380g7-07 ]
Stopped: [ dl380g7-09 ]
virtualip (ocf::heartbeat:IPaddr2): Started dl380g7-08
fence_dl380g7-08 (stonith:fence_ilo3): Started dl380g7-08
fence_dl380g7-09 (stonith:fence_ilo3): Started dl380g7-07
fence_dl380g7-07 (stonith:fence_ilo3): Started dl380g7-08
```

Check the system log for the fencing messages:

```bash
sudo cat /var/log/messages
```

The following shows the output:
```
May 12 09:58:38 dl380g7-08 pengine[30024]: warning: Node dl380g7-09 will be fenced because the node is no longer part of the cluster
May 12 09:58:38 dl380g7-08 pengine[30024]: warning: Action fence_dl380g7-09_stop_0 on dl380g7-09 is unrunnable (offline)
May 12 09:58:38 dl380g7-08 pengine[30024]: notice: Move fence_dl380g7-09#011(Started dl380g7-09 -> dl380g7-07)
May 12 09:58:38 dl380g7-08 crmd[30025]: notice: Initiating start operation fence_dl380g7-09_start_0 on dl380g7-07
May 12 09:58:38 dl380g7-08 stonith-ng[30021]: notice: Client crmd.30025.62ff454d wants to fence (reboot) 'dl380g7-09' with device '(any)'
May 12 09:58:39 dl380g7-08 stonith-ng[30021]: notice: fence_dl380g7-07 can not fence (reboot) dl380g7-09: static-list
May 12 09:58:39 dl380g7-08 stonith-ng[30021]: notice: fence_dl380g7-09 can fence (reboot) dl380g7-09: static-list
May 12 09:58:40 dl380g7-08 crmd[30025]: notice: Initiating monitor operation fence_dl380g7-09_monitor_60000 on dl380g7-07
```

3. Take down the network between the nodes by disabling the appropriate network interface:

```bash
sudo ifdown eth0
```
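
After each of the tests above, the fenced or crashed node should appear in the `OFFLINE` list of `pcs status`. As a rough sketch, that list could be extracted with a small filter; the function name `offline_nodes` is hypothetical, and it assumes the `OFFLINE: [ ... ]` line format shown in the sample outputs.

```bash
# Hypothetical helper: print each node that "pcs status" reports as OFFLINE,
# one per line. Reads pcs status text on standard input.
offline_nodes() {
  awk '/^ *OFFLINE:/ {
    for (i = 1; i <= NF; i++)
      if ($i != "OFFLINE:" && $i != "[" && $i != "]") print $i
  }'
}

# Demonstration on two lines copied from the sample output above.
printf '%s\n' 'Online: [ dl380g7-07 dl380g7-08 ]' 'OFFLINE: [ dl380g7-09 ]' | offline_nodes
# prints: dl380g7-09
```

On a surviving cluster node, this could be used as `sudo pcs status | offline_nodes`.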
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
# Samples for STONITH Configuration in a Pacemaker Cluster

Pacemaker cluster vendors require STONITH to be enabled and a fencing device to be configured for a supported cluster setup. When the cluster resource manager cannot determine the state of a node or of a resource on a node, fencing is used to bring the cluster to a known state again.

Resource-level fencing mainly ensures that no data is corrupted during an outage; it is set up by configuring a resource. You can use resource-level fencing, for instance, with DRBD (Distributed Replicated Block Device) to mark the disk on a node as outdated when the communication link goes down.

Node-level fencing ensures that a node does not run any resources. This is done by resetting the node, and the Pacemaker implementation of it is called STONITH (which stands for "shoot the other node in the head"). Pacemaker supports a great variety of fencing devices, such as an uninterruptible power supply or management interface cards for servers.

For more details, see [Pacemaker Clusters from Scratch](http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ch05.html), [Fencing and Stonith](http://clusterlabs.org/doc/crm_fencing.html), [Red Hat High Availability Add-On with Pacemaker: Fencing](http://access.redhat.com/documentation/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/ch-fencing-HAAR.html) and [Fencing in a Red Hat High Availability Cluster](https://access.redhat.com/solutions/15575).

## Other considerations

* Disabling STONITH is for testing purposes only. If you plan to use Pacemaker in a production environment, plan a STONITH implementation that suits your environment and keep it enabled.
* The type of fencing depends on whether the machine is bare metal or a VM, and on the VM type.
* RHEL does not provide fencing agents for any cloud environments (including Azure) or Hyper-V. Consequently, the cluster vendor does not offer support for running production clusters in these environments.
* All fencing agents are scripts in /usr/sbin, so you can read through them to see what they do.
* These scripts often refer to /usr/share/fence, which contains a few Python scripts.
* Test fencing from the command line before you create a STONITH device with pcs.

* On bare metal, the commonly recommended agents appear to be the iLO agents (ilo/ilo2/ilo3/ilo4, which go over IPMI); the second option is the SSH equivalents (ilo3_ssh/ilo4_ssh).
* Find out which iLO version the machine supports.
* For the _ssh agents, add a public key to iLO under Administration > Security to enable passwordless authentication.


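Since the agents are scripts, one way to start reading them is to check which interpreter each one uses. The following is a sketch only: the helper name `script_interpreter` is hypothetical, and the `/usr/sbin/fence_*` path is taken from the notes above and may differ on other distributions.

```bash
# Hypothetical helper: print the interpreter named on a script's first
# line (its shebang). Prints nothing if the file has no shebang.
script_interpreter() {
  head -n 1 "$1" | sed -n 's/^#! *//p'
}

# List every installed fencing agent with its interpreter.
for agent in /usr/sbin/fence_*; do
  [ -f "$agent" ] || continue   # skip when no agents are installed
  printf '%s: %s\n' "$agent" "$(script_interpreter "$agent")"
done
```

An agent can then be opened in a pager or editor to see which options it accepts before it is tested from the command line.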
## Other fencing configurations

[How do I configure a stonith device using agent fence_vmware_soap in a RHEL 6 or 7 High Availability cluster with pacemaker](https://access.redhat.com/solutions/917813)

[What are the requirements for using the fence agent fence_vmware_soap](https://access.redhat.com/solutions/306233)

[How can I diagnose fence_vmware_soap failures in RHEL 5, 6, or 7?](https://access.redhat.com/solutions/473603)

[How to configure stonith agent fence_xvm in pacemaker cluster when cluster nodes are KVM guests and are on different KVM hosts](https://access.redhat.com/solutions/2386421)

[How to configure fence agent fence_xvm in RHEL cluster](https://access.redhat.com/solutions/917833)
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
# Cluster Configuration Samples for SQL Server on Linux HA Solutions
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
# Samples for High Availability solutions for SQL Server on Linux
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# Samples for High Availability solutions for SQL Server

Go to the documentation tutorials to learn more about:

[HADR Solutions on Linux](https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-business-continuity-dr)

[Availability Groups](https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/always-on-availability-groups-sql-server)
