Commit 61cf86a

refactor(galeracheck)
1 parent 40d0092 commit 61cf86a

4 files changed (+384, -100 lines)

galeracheck/README.md

Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
# galeracheck

HTTP health check service for Galera Cluster nodes. Returns HTTP 200 or 503 based on node state for use with load balancers (HAProxy, AWS ELB, etc.).

## Overview

galeracheck is a lightweight HTTP service that monitors the health of a Galera Cluster node and reports its availability status. Load balancers can poll this endpoint to determine whether to route traffic to the node.

## Installation

```bash
go build
sudo cp galeracheck /usr/local/bin/
```

### Systemd Service

```bash
sudo cp galeracheck.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable galeracheck
sudo systemctl start galeracheck
```
24+
25+
## Usage
26+
27+
```bash
28+
galeracheck [options]
29+
```
30+
31+
### Options
32+
33+
```
34+
-config string
35+
MySQL config file to use (default "~/.my.cnf")
36+
37+
-port int
38+
TCP port to listen on (default 8000)
39+
40+
-mysql-socket string
41+
Path to unix socket of monitored MySQL instance (default "/run/mysqld/mysqld.sock")
42+
43+
-mysql-host string
44+
Hostname or IP address of monitored MySQL instance (default: use unix socket)
45+
46+
-mysql-port string
47+
Port of monitored MySQL instance (default "3306")
48+
49+
-available-when-donor
50+
Keep node available in LB when in donor state (state 2)
51+
Useful during RSU (Rolling Schema Upgrade) operations
52+
53+
-disable-when-readonly
54+
Remove node from LB when read_only is set
55+
Useful for gracefully taking a node out of rotation
56+
```
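
The option list above maps naturally onto Go's standard `flag` package. The following is a minimal sketch under that assumption; the `config` struct and the `parseArgs` function are hypothetical names for illustration and are not taken from the galeracheck source.

```go
package main

import (
	"flag"
	"fmt"
)

// config collects the documented options. The struct and its field
// names are hypothetical; only the flag names and defaults come from
// the README.
type config struct {
	ConfigFile          string
	Port                int
	MySQLSocket         string
	MySQLHost           string
	MySQLPort           string
	AvailableWhenDonor  bool
	DisableWhenReadonly bool
}

// parseArgs parses command-line arguments into a config using a
// private FlagSet, so it can be called repeatedly (for example in tests).
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("galeracheck", flag.ContinueOnError)
	fs.StringVar(&c.ConfigFile, "config", "~/.my.cnf", "MySQL config file to use")
	fs.IntVar(&c.Port, "port", 8000, "TCP port to listen on")
	fs.StringVar(&c.MySQLSocket, "mysql-socket", "/run/mysqld/mysqld.sock", "Path to unix socket of monitored MySQL instance")
	fs.StringVar(&c.MySQLHost, "mysql-host", "", "Hostname or IP address of monitored MySQL instance")
	fs.StringVar(&c.MySQLPort, "mysql-port", "3306", "Port of monitored MySQL instance")
	fs.BoolVar(&c.AvailableWhenDonor, "available-when-donor", false, "Keep node available in LB when in donor state")
	fs.BoolVar(&c.DisableWhenReadonly, "disable-when-readonly", false, "Remove node from LB when read_only is set")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseArgs([]string{"-port", "8000", "-available-when-donor"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

An empty `MySQLHost` stands for the documented "use unix socket" default.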
57+
58+
## Configuration
59+
60+
Create a MySQL config file (default `~/.my.cnf`) with credentials:
61+
62+
```ini
63+
[mysql]
64+
user=monitor
65+
password=secret
66+
socket=/run/mysqld/mysqld.sock
67+
```
68+
69+
Or for TCP connections:
70+
71+
```ini
72+
[mysql]
73+
user=monitor
74+
password=secret
75+
host=localhost
76+
port=3306
77+
```
78+
79+
The `[client]` section is also supported if `[mysql]` is not present.
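
The section fallback described above can be sketched in Go as follows. This is an illustration only: `parseSections` and `credentials` are hypothetical helpers, the parser handles just the simple `key=value` form shown in the examples, and the real galeracheck parser may behave differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSections reads INI-style text into section -> key -> value.
// It only understands "[section]" headers and "key=value" lines.
func parseSections(text string) map[string]map[string]string {
	sections := map[string]map[string]string{}
	current := ""
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue // skip blank lines and comments
		}
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			current = strings.TrimSpace(line[1 : len(line)-1])
			if sections[current] == nil {
				sections[current] = map[string]string{}
			}
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found || current == "" {
			continue // ignore lines outside a section or without "="
		}
		sections[current][strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return sections
}

// credentials returns the [mysql] section, or [client] when [mysql]
// is not present, mirroring the fallback described in the README.
func credentials(sections map[string]map[string]string) (map[string]string, bool) {
	for _, name := range []string{"mysql", "client"} {
		if s, ok := sections[name]; ok {
			return s, true
		}
	}
	return nil, false
}

func main() {
	cfg := "[client]\nuser=monitor\npassword=secret\nsocket=/run/mysqld/mysqld.sock\n"
	if c, ok := credentials(parseSections(cfg)); ok {
		fmt.Println("user:", c["user"], "socket:", c["socket"])
	}
}
```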
80+
81+
## MySQL User Privileges
82+
83+
The monitoring user needs minimal privileges:
84+
85+
```sql
86+
CREATE USER 'monitor'@'localhost' IDENTIFIED BY 'secret';
87+
GRANT PROCESS ON *.* TO 'monitor'@'localhost';
88+
FLUSH PRIVILEGES;
89+
```

## Health Check Logic

The service queries the following variables:
- `wsrep_local_state` - Node's current state (Galera status variable)
- `wsrep_cluster_size` - Number of nodes in cluster (Galera status variable)
- `read_only` - Server system variable (queried when `-disable-when-readonly` is used)

### Node States

- **State 4 (Synced)**: Node is synchronized and operational
- **State 2 (Donor/Desynced)**: Node is acting as donor for SST/IST or in RSU mode

### Availability Rules

The service returns **HTTP 200** (available) when:
1. Node is in state 4 (Synced) and, if `-disable-when-readonly` is enabled, `read_only` is OFF
2. OR `-available-when-donor` is enabled AND node is in state 2
3. **OR cluster size is 1 AND state is 4** (failsafe for last remaining node)

Otherwise, it returns **HTTP 503** (unavailable).
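
One possible reading of the rules above, written as a Go function. This is a sketch, not the galeracheck implementation: the `Options` and `NodeStatus` types and the `httpStatus` function are hypothetical, and the sketch assumes that the single-node failsafe takes precedence over the `read_only` check, which the README does not state explicitly.

```go
package main

import (
	"fmt"
	"net/http"
)

// Options mirrors the two behaviour flags (hypothetical type).
type Options struct {
	AvailableWhenDonor  bool
	DisableWhenReadonly bool
}

// NodeStatus holds the values read from the node (hypothetical type).
type NodeStatus struct {
	LocalState  int  // wsrep_local_state
	ClusterSize int  // wsrep_cluster_size
	ReadOnly    bool // read_only
}

const (
	stateDonor  = 2
	stateSynced = 4
)

// httpStatus returns 200 when the node should receive traffic, else 503.
func httpStatus(s NodeStatus, o Options) int {
	synced := s.LocalState == stateSynced
	switch {
	case synced && s.ClusterSize == 1:
		// Failsafe for the last remaining node.
		// ASSUMPTION: this overrides the read_only check.
		return http.StatusOK
	case synced && !(o.DisableWhenReadonly && s.ReadOnly):
		return http.StatusOK
	case s.LocalState == stateDonor && o.AvailableWhenDonor:
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

func main() {
	donor := NodeStatus{LocalState: stateDonor, ClusterSize: 3}
	fmt.Println(httpStatus(donor, Options{}))
	fmt.Println(httpStatus(donor, Options{AvailableWhenDonor: true}))
}
```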

### Single-Node Failsafe

When the cluster degrades to a single node (`wsrep_cluster_size = 1`), that node remains available even without `-available-when-donor` enabled. This prevents complete service outage in degraded cluster scenarios.

## Load Balancer Integration

### HAProxy Example

```
backend galera_cluster
    mode tcp
    balance leastconn
    option httpchk

    server node1 10.0.1.11:3306 check port 8000
    server node2 10.0.1.12:3306 check port 8000
    server node3 10.0.1.13:3306 check port 8000
```

### AWS Application Load Balancer

Configure health check:
- Protocol: HTTP
- Port: 8000
- Path: /
- Success codes: 200

## Use Cases

### Standard Deployment

```bash
galeracheck -port 8000
```

Nodes in state 4 (Synced) are available. Donor nodes (state 2) are removed from LB.

### Rolling Schema Upgrade (RSU)

```bash
galeracheck -port 8000 -available-when-donor
```

Nodes remain available during RSU DDL operations (when in state 2).

### Graceful Node Removal

```bash
galeracheck -port 8000 -disable-when-readonly
```

Set `read_only=ON` on a node to remove it from the LB without triggering desync.

### TCP Connection

```bash
galeracheck -port 8000 -mysql-host 127.0.0.1 -mysql-port 3306
```

Monitor via TCP instead of unix socket.
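
The choice between unix socket and TCP can be sketched as a connection-string builder. The example assumes the DSN syntax of the `go-sql-driver/mysql` driver (`user:password@unix(path)/` and `user:password@tcp(host:port)/`); whether galeracheck uses that driver is an assumption, and `buildDSN` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
)

// buildDSN returns a go-sql-driver/mysql style DSN. An empty host
// selects the unix socket, matching the documented default of
// -mysql-host. No database name is selected.
func buildDSN(user, password, host, port, socket string) string {
	if host != "" {
		// JoinHostPort also brackets IPv6 literals correctly.
		return fmt.Sprintf("%s:%s@tcp(%s)/", user, password, net.JoinHostPort(host, port))
	}
	return fmt.Sprintf("%s:%s@unix(%s)/", user, password, socket)
}

func main() {
	fmt.Println(buildDSN("monitor", "secret", "", "3306", "/run/mysqld/mysqld.sock"))
	fmt.Println(buildDSN("monitor", "secret", "127.0.0.1", "3306", "/run/mysqld/mysqld.sock"))
}
```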

## Response Format

**Available (HTTP 200)**
```
200 Galera Node is synced
```

**Unavailable (HTTP 503)**
```
503 Galera Node is not synced
```

**Connection Error (HTTP 503)**
```
503 No connection
```

**Query Error (HTTP 503)**
```
503 Cannot check cluster state: <error>
```

## Troubleshooting

### Connection Issues

Check MySQL credentials and socket/host configuration:
```bash
mysql --defaults-file="$HOME/.my.cnf" -e "SHOW STATUS LIKE 'wsrep%';"
```

### Permission Issues

Verify the monitoring user has PROCESS privilege:
```sql
SHOW GRANTS FOR 'monitor'@'localhost';
```

### Service Logs

```bash
sudo journalctl -u galeracheck -f
```

## License

See parent project license.

galeracheck/galeracheck.go

Lines changed: 0 additions & 99 deletions
This file was deleted.

galeracheck/galeracheck.service

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ Description=Galera Check Service

 [Service]
 Type=simple
-ExecStart=/usr/local/bin/galeracheck -a -p 8000
+ExecStart=/usr/local/bin/galeracheck -port 8000
 Restart=on-failure

 [Install]
