docs/reference/alert-rules.md (2 additions & 0 deletions)

@@ -46,6 +46,8 @@ This page contains a markdown version of the alert rules described in the `postg
 | Alert | Severity | Notes |
 |------|----------|-------|
 | PatroniPostgresqlDown |![critical]| Patroni PostgreSQL instance is down.<br>Check for errors in the Loki logs. |
+| PatroniMultipleLeaders |![critical]| Patroni cluster has multiple leader nodes.<br>More than one leader node (primary or standby) is detected inside a cluster.<br>This may indicate split-brain; check Patroni/Loki logs and network/quorum state. |
+| PatroniPrimaryAndStandbyLeader |![critical]| Patroni cluster has both primary and standby leaders.<br>A primary leader and a standby leader are simultaneously detected inside a cluster.<br>Check for errors in the Loki logs. |
 | PatroniHasNoLeader |![critical]| Patroni instance has no leader node.<br>A leader node (neither primary nor standby) cannot be found inside a cluster.<br>Check for errors in the Loki logs. |

src/prometheus_alert_rules/patroni_rules.yaml (26 additions & 2 deletions)

@@ -17,14 +17,38 @@ groups:
         Check for errors in the Loki logs.
         LABELS = {{ $labels }}

+  - alert: PatroniMultipleLeaders
+    expr: 'sum by (juju_model,juju_application,juju_model_uuid,scope) (patroni_master) > 1 or sum by (juju_model,juju_application,juju_model_uuid,scope) (patroni_standby_leader) > 1'
+    for: 0m
+    labels:
+      severity: critical
+    annotations:
+      summary: Patroni cluster {{ $labels.scope }} has multiple leader nodes.
+      description: |
+        More than one leader node (primary or standby) is detected inside the cluster {{ $labels.scope }}.
+        Check for errors in the Loki logs.
+        LABELS = {{ $labels }}
+
+  - alert: PatroniPrimaryAndStandbyLeader
+    expr: 'sum by (juju_model,juju_application,juju_model_uuid,scope) (patroni_master) == 1 and sum by (juju_model,juju_application,juju_model_uuid,scope) (patroni_standby_leader) == 1'
+    for: 0m
+    labels:
+      severity: critical
+    annotations:
+      summary: Patroni cluster {{ $labels.scope }} has both primary and standby leaders.
+      description: |
+        A primary leader and a standby leader are simultaneously detected inside the cluster {{ $labels.scope }}.
+        Check for errors in the Loki logs.
+        LABELS = {{ $labels }}
+
   # 2.4.1
   - alert: PatroniHasNoLeader
-    expr: '(max by (scope) (patroni_master) < 1) and (max by (scope) (patroni_standby_leader) < 1)'
+    expr: '(max by (juju_model,juju_application,juju_model_uuid,scope) (patroni_master) < 1) and (max by (juju_model,juju_application,juju_model_uuid,scope) (patroni_standby_leader) < 1)'
     for: 0m
     labels:
       severity: critical
     annotations:
-      summary: Patroni instance {{ $labels.instance }} has no leader node.
+      summary: Patroni instance {{ $labels.instance }} has no leader node.
     description: |
       A leader node (neither primary nor standby) cannot be found inside the cluster {{ $labels.scope }}.
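
To sanity-check how the three leader expressions in this file partition cluster states, the following is a minimal Python sketch that mimics the `sum by (...)` / `max by (...)` grouping over the four labels used in the diff. It is only an approximation of PromQL semantics, not how Prometheus evaluates the rules. The function name `firing_alerts`, the helper `sample`, and all label values are hypothetical and do not come from the PR.

```python
from collections import defaultdict

# Grouping labels taken from the `by (...)` clauses in the rules above.
GROUP_LABELS = ("juju_model", "juju_application", "juju_model_uuid", "scope")


def firing_alerts(samples):
    """Approximate which leader alerts would fire for each label group.

    `samples` is an iterable of (metric_name, labels_dict, value) tuples.
    Returns a dict mapping each group key to a set of alert names.
    """
    values = defaultdict(lambda: defaultdict(list))
    for metric, labels, value in samples:
        key = tuple(labels.get(name, "") for name in GROUP_LABELS)
        values[key][metric].append(value)

    result = {}
    for key, by_metric in values.items():
        masters = by_metric.get("patroni_master")
        standbys = by_metric.get("patroni_standby_leader")
        alerts = set()

        # `A > 1 or B > 1`: either side alone is enough to fire.
        if (masters and sum(masters) > 1) or (standbys and sum(standbys) > 1):
            alerts.add("PatroniMultipleLeaders")

        # `A and B`: both metrics must have series in the same group.
        if masters and standbys:
            if sum(masters) == 1 and sum(standbys) == 1:
                alerts.add("PatroniPrimaryAndStandbyLeader")
            if max(masters) < 1 and max(standbys) < 1:
                alerts.add("PatroniHasNoLeader")

        result[key] = alerts
    return result


def sample(metric, instance, value):
    """Build one sample with hypothetical Juju topology labels."""
    labels = {
        "juju_model": "m",
        "juju_application": "postgresql",
        "juju_model_uuid": "u",
        "scope": "pg",
        "instance": instance,
    }
    return (metric, labels, value)


if __name__ == "__main__":
    two_primaries = [
        sample("patroni_master", "a", 1),
        sample("patroni_master", "b", 1),
        sample("patroni_standby_leader", "a", 0),
        sample("patroni_standby_leader", "b", 0),
    ]
    print(firing_alerts(two_primaries))
```

In this model the three alerts are mutually exclusive for a single group: a group with exactly one primary and no standby leader fires nothing, and the aggregated result carries only the four grouping labels.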