Commit 28a3bd7
feat: [DPE-7404] promote to primary on unit scope (#646)
* port promotion to primary to k8s charm
* lib sync from VM
* test wait for failure scenario
* please new linting rules and libs bump
* missing one
* add placeholder function for followup PR
* remove placeholder
* locking capabilities for unit rejoin
* update parameters
* fix dependency build with pinned version
* merge leftover
* git checkout origin/main -- poetry.lock && poetry lock
* Include docs

Co-authored-by: Carl Csaposs <[email protected]>
1 parent 749385a commit 28a3bd7

File tree

12 files changed: +367 -18 lines changed

actions.yaml

Lines changed: 13 additions & 3 deletions

```diff
@@ -27,6 +27,7 @@ set-password:
       type: string
       description: The username, the default value 'root'.
         Possible values - root, serverconfig, clusteradmin.
+      enum: [root, serverconfig, clusteradmin]
     password:
       type: string
       description: The password will be auto-generated if this option is not specified.
@@ -77,15 +78,24 @@ create-replication:
 
 promote-to-primary:
   description: |
-    Promotes this cluster to become the primary in the cluster-set. Used for safe switchover or failover.
-    Can only be run against the charm leader unit of a standby cluster.
+    Promotes the unit or cluster to become the primary in the cluster or cluster-set, depending on
+    the scope (unit or cluster). Used for safe switchover or failover.
+    When in cluster scope, can only be run against the charm leader unit of a standby cluster.
   params:
+    scope:
+      type: string
+      description: Whether to promote a unit or a cluster. Must be set to either `unit` or `cluster`.
+      enum: [unit, cluster]
     force:
       type: boolean
       default: False
       description: |
-        Use force when previous primary is unreachable (failover). Will invalidate previous
+        For cluster scope, use force when previous primary is unreachable (failover). Will invalidate previous
         primary.
+        For unit scope, use force to force quorum from the current unit. Note that this operation is DANGEROUS
+        as it can create a split-brain if incorrectly used and should be considered a last resort. Make
+        absolutely sure that there are no partitions of this group that are still operating somewhere in
+        the network, but not accessible from your location
 
 recreate-cluster:
   description: |
```
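The `scope` parameter makes the action select between four distinct operations. A minimal sketch of how the parameter combinations map to operations — the function and operation names here are illustrative assumptions, not the charm's actual implementation:

```python
# Sketch of scope/force dispatch for promote-to-primary.
# Names are hypothetical; the real charm code differs.
VALID_SCOPES = {"unit", "cluster"}


def route_promote_to_primary(scope: str, force: bool = False) -> str:
    """Return which operation the action parameters select."""
    if scope not in VALID_SCOPES:
        # Mirrors the `enum: [unit, cluster]` constraint in actions.yaml.
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}")
    if scope == "cluster":
        # Cluster scope: safe switchover, or failover when force=True.
        return "cluster-failover" if force else "cluster-switchover"
    # Unit scope: in-cluster primary change, or forced quorum (dangerous).
    return "force-quorum" if force else "unit-switchover"
```

This mirrors the documented semantics: `force` means failover at cluster scope, but forced quorum recovery at unit scope.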

docs/how-to/cross-regional-async-replication/switchover-failover.md

Lines changed: 4 additions & 3 deletions

````diff
@@ -9,7 +9,7 @@ Make sure both `Rome` and `Lisbon` Clusters are deployed using the [Async Deploy
 Assuming `Rome` is currently `Primary` and you want to promote `Lisbon` to be new primary:
 
 ```shell
-juju run -m lisbon db2/leader promote-to-primary
+juju run -m lisbon db2/leader promote-to-primary scope=cluster
 ```
 
 `Rome` will be converted to `StandBy` member.
@@ -25,9 +25,10 @@ It should ONLY be executed if Primary cluster is no longer exist (i.e. it is los
 Assuming `Rome` was a `Primary` (before we lost the cluster `Rome`) and you want to promote `Lisbon` to be the new primary:
 
 ```shell
-juju run -m lisbon db2/leader promote-to-primary force=True
+juju run -m lisbon db2/leader promote-to-primary scope=cluster force=True
 ```
 
 ```{caution}
 `force=True` will cause the old primary to be invalidated.
-```
+```
+
````
docs/how-to/index.md

Lines changed: 3 additions & 1 deletion

````diff
@@ -24,6 +24,7 @@ Scale replicas <scale-replicas>
 Manage passwords <manage-passwords>
 Enable TLS <enable-tls>
 External network access <external-network-access>
+Primary switchover <primary-switchover>
 ```
 
 ## Back up and restore
@@ -79,4 +80,5 @@ Development <development/index>
 :hidden:
 
 Contribute <contribute>
-```
+```
+
````

docs/how-to/primary-switchover.md

Lines changed: 20 additions & 0 deletions

````diff
@@ -0,0 +1,20 @@
+# How to do a primary switchover
+
+A user may want to change the primary in a MySQL cluster to improve
+performance, enable maintenance, recover from failure, or balance load across
+nodes.
+
+On a healthy cluster, the primary can be changed by running the `promote-to-primary` action with
+parameter `scope` set to `unit` on the unit that should become the new primary.
+
+```shell
+juju run-action mysql/1 promote-to-primary scope=unit
+```
+
+In this example, the unit `mysql/1` will become the new primary. The previous primary will become a
+secondary.
+
+```{caution}
+The `promote-to-primary` action can be used in cluster scope, when using async replication.
+Check [Switchover / Failover](cross-regional-async-replication/switchover-failover) for more information.
+```
````

docs/reference/troubleshooting/index.md

Lines changed: 6 additions & 4 deletions

````diff
@@ -10,7 +10,7 @@ See [](/reference/troubleshooting/known-scenarios.md) for specific operational i
 
 ## Check status
 
-The first troubleshooting step is to run `juju status` and check the statuses and messages of all applications and units. 
+The first troubleshooting step is to run `juju status` and check the statuses and messages of all applications and units.
 
 See [](/reference/charm-statuses) for additional recommendations based on status.
 
@@ -47,7 +47,7 @@ See [Juju logs documentation](https://juju.is/docs/juju/log) to learn more about
 
 Check the operator [architecture](/explanation/architecture) first to be familiar with the `charm` and `workload` containers.
 
-Make sure both containers are `Running` and `Ready` to continue troubleshooting inside the charm. 
+Make sure both containers are `Running` and `Ready` to continue troubleshooting inside the charm.
 
 To describe the running pod, use the following command (where `0` is a Juju unit id):
 
@@ -99,6 +99,7 @@ To enter the `workload` container, run:
 ```shell
 juju ssh --container mysql mysql-k8s/0 bash
 ```
+
 You can check the list of running processes and Pebble plan:
 
 ```shell
@@ -114,7 +115,7 @@ mysql 70 0.0 0.0 2888 1884 ? S 21:14 0:00 /bin/sh /usr/
 mysql 366 2.4 7.2 26711784 2394252 ? Sl 21:14 0:10 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --log-error=/var/log/mysql/error.log --pid-file=mysql-k8s-0.pid
 ```
 
-The list of running Pebble services will dependson whether the charm is integrated with [COS](/how-to/monitoring-cos/enable-monitoring) and/or has [backup](/how-to/back-up-and-restore/create-a-backup) functionality. 
+The list of running Pebble services will dependson whether the charm is integrated with [COS](/how-to/monitoring-cos/enable-monitoring) and/or has [backup](/how-to/back-up-and-restore/create-a-backup) functionality.
 
 The Pebble and its service `mysqld_safe` must always be enabled and currently running (the Linux processes `pebble`, `mysqld_safe` and `mysqld`).
 
@@ -159,7 +160,7 @@ Continue troubleshooting your database/SQL related issues from here.
 
 [Contact us](/reference/contacts) if you cannot determinate the source of your issue, or if you'd like to help us improve this document.
 
-## Installing extra software:
+## Installing extra software
 
 **We do not recommend installing any additionally software** as it may affect the stability and produce anomalies which is hard to troubleshoot and fix.
 
@@ -178,4 +179,5 @@ root@mysql-k8s-0:/#
 :titlesonly:
 
 Known scenarios <known-scenarios>
+Recovering from quorum loss <recover-from-quorum-loss>
 ```
````
Lines changed: 101 additions & 0 deletions

````diff
@@ -0,0 +1,101 @@
+# Recovering from quorum loss
+
+Quorum loss in MySQL happens when the majority of nodes (the quorum) required to make decisions and
+maintain consistency is no longer available. This can happen due to network issues, node failures,
+or other disruptions. When this occurs, the cluster may become unavailable or enter a read-only
+state.
+
+Although the charm cannot automatically recover from quorum loss, you can take the following steps
+to manually recover the cluster.
+
+```{warning}
+Recovery from quorum loss should be performed with caution, as it can impact the availability and
+cause loss of data.
+```
+
+## Ensure the cluster is in no-quorum state
+
+A quorum loss will typically look like this in the juju status output:
+
+```
+Model    Controller  Cloud/Region  Version  SLA          Timestamp
+mymodel  localhost   default       3.6.8    unsupported  17:52:19Z
+
+App    Version                  Status   Scale  Charm      Channel   Rev  Address        Exposed  Message
+mysql  8.0.42-0ubuntu0.22.04.2  waiting      3  mysql-k8s  8.0/edge  279  10.152.183.61  no       waiting for units to settle down
+
+Unit      Workload     Agent  Address     Ports  Message
+mysql/0*  maintenance  idle   10.1.2.48          offline
+mysql/1   maintenance  idle   10.1.0.195         offline
+mysql/2   active       idle   10.1.1.81
+```
+
+From an active unit, check the cluster status with:
+
+```shell
+juju run mysql/2 get-cluster-status
+```
+
+Which will output the current status of the cluster.
+
+```
+Running operation 17 with 1 task
+  - task 18 on unit-mysql-2
+
+Waiting for task 18...
+status:
+  clustername: cluster-3eab807dee6797402ecfc52b5a84d15b
+  clusterrole: primary
+  defaultreplicaset:
+    name: default
+    primary: mysql-0.mysql-endpoints.m3.svc.cluster.local.:3306
+    ssl: required
+    status: no_quorum
+    statustext: cluster has no quorum as visible from 'mysql-2.mysql-endpoints.m3.svc.cluster.local.:3306'
+      and cannot process write transactions. 2 members are not active.
+    topology:
+      mysql-0:
+        address: mysql-0.mysql-endpoints.m3.svc.cluster.local.:3306
+        instanceerrors: '[''note: group_replication is stopped.'']'
+        memberrole: primary
+        memberstate: offline
+        mode: n/a
+        role: ha
+        status: unreachable
+        version: 8.0.42
+      mysql-1:
+        address: mysql-1.mysql-endpoints.m3.svc.cluster.local.:3306
+        instanceerrors: '[''note: group_replication is stopped.'']'
+        memberrole: secondary
+        memberstate: offline
+        mode: n/a
+        role: ha
+        status: unreachable
+        version: 8.0.42
+      mysql-2:
+        address: mysql-2.mysql-endpoints.m3.svc.cluster.local.:3306
+        memberrole: secondary
+        mode: r/o
+        replicationlagfromimmediatesource: ""
+        replicationlagfromoriginalsource: ""
+        role: ha
+        status: online
+        version: 8.0.42
+    topologymode: single-primary
+  domainname: cluster-set-3eab807dee6797402ecfc52b5a84d15b
+  groupinformationsourcemember: mysql-2.mysql-endpoints.m3.svc.cluster.local.:3306
+  success: "True"
+```
+
+Note from the output, we can see that the cluster is in a no-quorum state, with `status: no_quorum`.
+
+## Recover the cluster from the active unit
+
+Using the available active unit, run the action:
+
+```shell
+juju run mysql/2 promote-to-primary scope=unit force=true
+```
+
+The unit will become the new primary. Other offline units, if reachable, will rejoin automatically on the follow up `update-status` events.
````
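The manual check in the new how-to — "is the replica set in `no_quorum`, and is there an online member to promote?" — can be expressed mechanically against the `get-cluster-status` output. A sketch, with field names taken from the YAML above (the helper itself is an assumption, not part of the charm):

```python
def needs_forced_promotion(status: dict) -> bool:
    """Return True when the cluster reports no quorum and at least one
    online member exists to run promote-to-primary against.

    `status` mirrors the structure of the get-cluster-status output.
    """
    replicaset = status["defaultreplicaset"]
    if replicaset["status"] != "no_quorum":
        # Healthy (or differently broken) cluster: forced quorum not needed.
        return False
    # An online member is required to host the forced promotion.
    return any(
        member.get("status") == "online"
        for member in replicaset["topology"].values()
    )
```

Feeding it the status shown above would return True, since `mysql-2` is still online while the replica set reports `no_quorum`.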

poetry.lock

Lines changed: 17 additions & 2 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 0 deletions

```diff
@@ -67,6 +67,7 @@ kubernetes = "^27.2.0"
 allure-pytest = "^2.13.2"
 allure-pytest-default-results = "^0.1.2"
 pytest-asyncio = "^0.21.1"
+jubilant = "^1.0.1"
 
 [tool.coverage.run]
 branch = true
```

src/charm.py

Lines changed: 27 additions & 5 deletions

```diff
@@ -42,6 +42,7 @@
     MySQLLockAcquisitionError,
     MySQLNoMemberStateError,
     MySQLRebootFromCompleteOutageError,
+    MySQLRejoinInstanceToClusterError,
     MySQLServiceNotRunningError,
     MySQLSetClusterPrimaryError,
     MySQLUnableToGetMemberStateError,
@@ -318,10 +319,6 @@ def text_logs(self) -> list:
 
         return text_logs
 
-    def update_endpoints(self) -> None:
-        """Temp placeholder."""
-        pass
-
     def unit_initialized(self, raise_exceptions: bool = False) -> bool:
         """Return whether a unit is started.
 
@@ -948,7 +945,7 @@ def _execute_manual_rejoin(self) -> None:
         It is supposed to be called when the MySQL 8.0.21+ auto-rejoin attempts have been exhausted,
         on an OFFLINE replica that still belongs to the cluster
         """
-        if not self._mysql.is_instance_in_cluster(self.unit_label):
+        if not self._mysql.instance_belongs_to_cluster(self.unit_label):
             logger.warning("Instance does not belong to the cluster. Cannot perform manual rejoin")
             return
 
@@ -957,15 +954,38 @@ def _execute_manual_rejoin(self) -> None:
             logger.warning("Instance does not have ONLINE peers. Cannot perform manual rejoin")
             return
 
+        # add random delay to mitigate collisions when multiple units are rejoining
+        # due the difference between the time we test for locks and acquire them
+        # Not used for cryptographic purpose
+        sleep(random.uniform(0, 1.5))  # noqa: S311
+
+        if self._mysql.are_locks_acquired(from_instance=cluster_primary):
+            logger.info("waiting: cluster lock is held")
+            return
+        try:
+            self._mysql.rejoin_instance_to_cluster(
+                unit_address=self.unit_address,
+                unit_label=self.unit_label,
+                from_instance=cluster_primary,
+            )
+        except MySQLRejoinInstanceToClusterError:
+            logger.warning("Can't rejoin instance to cluster. Falling back to remove and add.")
+
         self._mysql.remove_instance(
             unit_label=self.unit_label,
+            auto_dissolve=False,
         )
         self._mysql.add_instance_to_cluster(
             instance_address=self.unit_address,
             instance_unit_label=self.unit_label,
             from_instance=cluster_primary,
         )
 
+    def update_endpoints(self) -> None:
+        """Update the endpoints for the database relation."""
+        self.database_relation._configure_endpoints(None)
+        self._on_update_status(None)
+
     def _is_cluster_blocked(self) -> bool:
         """Performs cluster state checks for the update-status handler.
 
@@ -1031,6 +1051,8 @@ def _set_app_status(self) -> None:
             return
 
         if not primary_address:
+            logger.error("Cluster has no primary. Check cluster status on online units.")
+            self.app.status = MaintenanceStatus("Cluster has no primary.")
             return
 
         if "s3-block-message" in self.app_peer_data:
```
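The random pre-check delay added to `_execute_manual_rejoin` is a standard jitter technique: units that wake on the same event would otherwise test the cluster lock at the same instant and collide in the window between testing and acquiring it. In isolation the pattern looks like this — a sketch only, with stand-in callables rather than the charm's real lock API:

```python
import random
import time


def attempt_with_jitter(locks_held, do_rejoin, max_jitter: float = 1.5) -> bool:
    """Sleep a random interval, then back off if the cluster lock is held.

    Jitter de-synchronises units that woke simultaneously, shrinking the
    race window between the lock test and lock acquisition.  `locks_held`
    and `do_rejoin` are illustrative stand-ins for the charm's helpers.
    Randomness is not used for cryptographic purposes.
    """
    time.sleep(random.uniform(0, max_jitter))
    if locks_held():
        return False  # another unit holds the lock; retry on a later event
    do_rejoin()
    return True
```

A unit that finds the lock held simply gives up and relies on the next dispatched event to retry, which is cheaper and safer than blocking inside the hook.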
