feat: multiple Patroni primaries alert #1282
Deezzir wants to merge 3 commits into canonical:16/edge
Signed-off-by: Deezzir <yurii.kondrakov@canonical.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
❌ Your project check has failed because the head coverage (68.97%) is below the target coverage (70.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##           16/edge    #1282   +/-   ##
========================================
  Coverage    68.97%   68.97%
========================================
  Files           16       16
  Lines         3816     3816
  Branches       575      575
========================================
  Hits          2632     2632
  Misses         982      982
  Partials       202      202
Thanks for the great contribution again, @Deezzir!

@carlcsaposs-canonical @marceloneppel, may I ask you to retrigger the CI so we can merge? Thanks!
Signed-off-by: Deezzir <yurii.kondrakov@canonical.com>
Force-pushed: 9bff5f8 to 54a2493
      LABELS = {{ $labels }}

  - alert: PatroniPrimaryAndStandbyLeader
    expr: 'sum by (scope) (patroni_master) == 1 and sum by (scope) (patroni_standby_leader) == 1'
Suggested change:
-   expr: 'sum by (scope) (patroni_master) == 1 and sum by (scope) (patroni_standby_leader) == 1'
+   expr: 'sum by (scope) (patroni_master) > 0 and sum by (scope) (patroni_standby_leader) > 0'
I think it is okay to leave == 1. The case where we have multiple of each is already covered by another alert, PatroniMultipleLeaders, so it is safe to leave it that way.
Both approaches work for me. The PatroniPrimaryAndStandbyLeader alert is already a safety net for a scenario that shouldn't happen in practice (primary and standby leaders are roles from different cluster types, so they shouldn't coexist within the same scope). The cases where > 0 and == 1 differ are even more unlikely on top of that. I'm fine with keeping == 1 as-is.
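To make the difference between the two proposed expressions concrete, here is a minimal Python sketch that models `sum by (scope) (...)`, the comparison filter, and the `and` match on the remaining `scope` label. It assumes that each cluster member exports `patroni_master` and `patroni_standby_leader` as 0/1 gauges labelled with `scope`; the helper names and the sample data are hypothetical and are not part of this PR.

```python
from collections import defaultdict


def sum_by_scope(samples):
    """Model PromQL `sum by (scope) (metric)`: add up sample values per scope."""
    totals = defaultdict(float)
    for scope, value in samples:
        totals[scope] += value
    return dict(totals)


def firing_scopes(master, standby_leader, predicate):
    """Model `<sum> <cmp> and <sum> <cmp>`.

    Each comparison keeps only the scopes whose sum satisfies `predicate`;
    `and` then keeps the scopes present on both sides.
    """
    m = sum_by_scope(master)
    s = sum_by_scope(standby_leader)
    return sorted(
        scope
        for scope in m.keys() & s.keys()
        if predicate(m[scope]) and predicate(s[scope])
    )


def equals_one(total):
    return total == 1


def greater_than_zero(total):
    return total > 0


# Hypothetical per-member samples as (scope, value) pairs.
# Scope "a": one primary and one standby leader.
# Scope "b": two primaries and one standby leader.
# Scope "c": one primary, no standby leader (healthy).
master = [("a", 1), ("a", 0), ("b", 1), ("b", 1), ("c", 1), ("c", 0)]
standby_leader = [("a", 0), ("a", 1), ("b", 0), ("b", 1), ("c", 0), ("c", 0)]

print(firing_scopes(master, standby_leader, equals_one))
print(firing_scopes(master, standby_leader, greater_than_zero))
```

Under these assumptions the two variants disagree only for a scope like "b", where one side already has more than one leader, which is the case the thread expects PatroniMultipleLeaders to catch.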
Signed-off-by: Deezzir <yurii.kondrakov@canonical.com>
Issue
There is no alert that would check for multiple leaders. This explicit alert will help detect a split-brain occurrence.
Solution
- PatroniMultipleLeaders: fires if multiple leaders are detected.
- PatroniPrimaryAndStandbyLeader: fires if the cluster has both primary and standby leaders at the same time.
Checklist
Closes: canonical/postgresql-operator#1151
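
As a rough illustration of how the two alerts described under Solution complement each other, the sketch below maps per-scope leader counts to the alerts that would fire. The PatroniPrimaryAndStandbyLeader condition follows the reviewed expression (exactly one of each). The PatroniMultipleLeaders condition is an assumption, since its expression is not shown in this conversation: it is modelled here as "more than one leader of either kind". The function name is hypothetical.

```python
def alerts_for_scope(primaries: int, standby_leaders: int) -> list[str]:
    """Return the alerts that would fire for one scope under the stated assumptions."""
    alerts = []
    # Assumed condition (the real rule expression is not shown in this PR page):
    # more than one primary or more than one standby leader in the scope.
    if primaries > 1 or standby_leaders > 1:
        alerts.append("PatroniMultipleLeaders")
    # Condition taken from the reviewed rule: exactly one primary and
    # exactly one standby leader in the same scope.
    if primaries == 1 and standby_leaders == 1:
        alerts.append("PatroniPrimaryAndStandbyLeader")
    return alerts


# Walk through a few (primaries, standby_leaders) combinations.
for counts in [(1, 0), (2, 0), (1, 1), (2, 1)]:
    print(counts, alerts_for_scope(*counts))
```

With this model, every combination involving more than one leader of a kind triggers at least one alert, which matches the reasoning given for keeping == 1 in the second rule.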