Get MYSQL_PWD using an on-demand cluster query (PR 1 of 6) #342

zzzeek · 2025-07-01T13:01:40Z

In order to facilitate an in-place change to the name of the Secret that a Galera instance references for the mysql root password, rework the approach used by pods and shell scripts so that they no longer require the root secret name and/or password to be passed by environment variable, and instead use a pod-level cluster query to retrieve the current root password. The logic to retrieve this password is encapsulated in a single shell script that is present as a volume mount on running containers.
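
As a rough illustration of the decoding half of such a query (not the script from this PR), the sketch below pulls one value out of a Secret's JSON representation with Python, which is what the script falls back on since jq is not installed in the galera image. The key name `DbRootPassword` and the function name are assumptions made for the example; the only thing taken as given is that Kubernetes stores Secret values base64-encoded under `data`.

```python
import base64
import json


def root_password_from_secret(secret_json: str, key: str = "DbRootPassword") -> str:
    """Return the decoded value stored under `key` in a Secret's JSON.

    `key` defaults to an assumed name; pass whatever key the Secret really uses.
    """
    secret = json.loads(secret_json)
    # Kubernetes keeps Secret values base64-encoded under the "data" mapping.
    encoded = secret["data"][key]
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hand-built stand-in for an API response, for demonstration only.
    sample = json.dumps(
        {
            "kind": "Secret",
            "data": {"DbRootPassword": base64.b64encode(b"s3cret").decode("ascii")},
        }
    )
    print(root_password_from_secret(sample))
```

How the JSON is fetched (service account token, API endpoint, namespace) is left out here on purpose, since those details are specific to the real script.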

This allows Job objects to be created with hashes that do not depend on a specific Secret name, and StatefulSet objects to be created without referring to this name. When the Secret name changes on a Galera instance for an in-place root password change, the hashes / CRs for these objects will remain unchanged.
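
To make the hash argument concrete, here is a toy sketch; it is not the operator's hashing code, and the spec fields, the `fromSecret` key, and the function names are all made up for illustration. It hashes a spec via canonical JSON and compares the result before and after a Secret rename, once with the Secret name embedded in the spec and once without.

```python
import hashlib
import json


def object_hash(spec: dict) -> str:
    """Hash a spec deterministically by serializing it as canonical JSON."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def job_spec(secret_name: str, embed_secret: bool) -> dict:
    """Build a toy Job spec; the field names are illustrative only."""
    spec = {"command": ["/bin/sh", "-c", "run-db-task.sh"], "env": []}
    if embed_secret:
        # Old style: the Secret name leaks into the spec, and so into the hash.
        spec["env"].append({"name": "MYSQL_PWD", "fromSecret": secret_name})
    # New style: the script queries the cluster at run time, so the spec
    # carries no Secret name at all.
    return spec


if __name__ == "__main__":
    for embed in (True, False):
        before = object_hash(job_spec("old-root-secret", embed))
        after = object_hash(job_spec("new-root-secret", embed))
        print(f"embed_secret={embed}: hash changed = {before != after}")
```

With the name embedded, renaming the Secret changes the hash; without it, the hash is the same for both names, which is the property the paragraph above relies on.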

A subsequent change to the mariadb operator will add the ability to change the mysql root password of a Galera cluster using a dual-reference architecture, where the "current" root secret will be part of <CR>/Status, while the secret referenced in <CR>/Spec will be the "new" root secret. When these two names differ, that indicates that an in-place password change should take place; it also keeps the pre-existing root password available at the same time as the new one, which is needed to perform the change. The same architecture will be applied to a new class of "system" MariaDBAccount objects that are for use only by the Galera instance itself and do not link to any MariaDBDatabase CR. The Galera CR itself will no longer use osp-secret for the mysql root password, nor will the secret be directly referenced from the Galera CR; it will instead be referenced by a "system" MariaDBAccount CR which the Galera operator itself will create.
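
The dual-reference comparison can be sketched roughly as follows. This is only a model of the idea described above, written in Python for brevity; the type and function names are hypothetical and the follow-up PRs may structure this differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RootSecretRefs:
    """Hypothetical holder for the two secret names tracked on the CR."""

    spec_secret: str    # "new" root secret, named in the CR's Spec
    status_secret: str  # "current" root secret, recorded in the CR's Status


def pending_password_change(refs: RootSecretRefs) -> Optional[Tuple[str, str]]:
    """Return (current, new) secret names if an in-place change is due, else None."""
    if refs.status_secret == refs.spec_secret:
        # Names agree: nothing to do.
        return None
    # Names differ: both secrets are needed, the current one to log in
    # and the new one to supply the replacement password.
    return (refs.status_secret, refs.spec_secret)


if __name__ == "__main__":
    print(pending_password_change(RootSecretRefs("root-v2", "root-v1")))
    print(pending_password_change(RootSecretRefs("root-v2", "root-v2")))
```

Returning both names together reflects the point that the old password has to stay reachable while the new one is being applied.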

zzzeek · 2025-07-03T22:11:40Z

/retest

zzzeek · 2025-07-03T22:12:07Z

/recheck

zzzeek · 2025-07-03T22:12:39Z

/test

openshift-ci · 2025-07-03T22:12:43Z

@zzzeek: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

/test functional

/test images

/test mariadb-operator-build-deploy-chainsaw

/test mariadb-operator-build-deploy-kuttl

/test precommit-check

The following commands are available to trigger optional jobs:

/test mariadb-operator-build-deploy

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openstack-k8s-operators-mariadb-operator-main-functional

pull-ci-openstack-k8s-operators-mariadb-operator-main-images

pull-ci-openstack-k8s-operators-mariadb-operator-main-mariadb-operator-build-deploy-chainsaw

pull-ci-openstack-k8s-operators-mariadb-operator-main-mariadb-operator-build-deploy-kuttl

pull-ci-openstack-k8s-operators-mariadb-operator-main-precommit-check

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

zzzeek · 2025-07-03T22:19:00Z

/test mariadb-operator-build-deploy-kuttl

zzzeek · 2025-07-04T13:21:03Z

That last run failed in the openstack deploy step. Everything succeeded except neutron, which got stuck in its db upgrade and failed with: "sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1054, "Unknown column 'quotas.project_id' in 'field list'")"

not really sure how that can happen

zzzeek · 2025-07-04T13:21:28Z

/retest

tests/chainsaw/common/tls-certificate.yaml

dciabrin

Thanks for that.
The only problem I have with this review is that we currently use the db root user for probes, because we don't yet have a clean way of creating alternative users (part of that will be resolved by your next review, I think). So if we were to merge it as is, we'd end up doing two API calls for each probe sent to the galera pods, which is way too many.

I think we should discuss how to best cache that and only call the API when needed. I was thinking something along the lines of:

1. We keep using root_auth any time we want to get root creds.
2. Internally, root_auth checks whether /root/.my.cnf exists. If it doesn't, it calls the API and generates that file with the current password.
3. If it exists, it checks whether the creds are still valid (e.g. by doing a mysqladmin ping). If not, do 2.
4. If mysql is not running (i.e. when the pod starts), do not try to check creds validity; assume .my.cnf is valid and wait for the next probe to update it if needed.
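
A minimal sketch of that caching flow, to anchor the discussion. It is written in Python with the API call, the "is mysql running" check and the credential check injected as callables, so every name here is a placeholder rather than anything in the PR; the real helper would be shell and would use something like mysqladmin ping for the validity check.

```python
import os
import tempfile
from typing import Callable


def root_auth(
    cnf_path: str,
    fetch_password: Callable[[], str],
    server_running: Callable[[], bool],
    creds_valid: Callable[[str], bool],
) -> str:
    """Ensure a client config with root creds exists; return its path."""

    def refresh() -> None:
        password = fetch_password()  # the only place the API gets called
        # Create the file readable by its owner only.
        fd = os.open(cnf_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as f:
            # Note: passwords with special characters would need quoting here.
            f.write("[client]\nuser=root\npassword=%s\n" % password)

    if not os.path.exists(cnf_path):
        # No cached file yet: query the API and generate it.
        refresh()
    elif server_running() and not creds_valid(cnf_path):
        # Cached creds were rejected by a running server: regenerate.
        refresh()
    # Server not running: trust the cached file and let a later probe fix it.
    return cnf_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my.cnf")
        calls = []

        def fake_fetch() -> str:
            calls.append(1)
            return "pw%d" % len(calls)

        root_auth(path, fake_fetch, lambda: True, lambda p: True)
        root_auth(path, fake_fetch, lambda: True, lambda p: True)
        print("API calls after two probes:", len(calls))
```

In the steady state this costs one local validity check per probe and no API call, which is the point of the proposal.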

controllers/galera_controller.go

pkg/mariadb/volumes.go

templates/galera/bin/detect_last_commit.sh

tests/chainsaw/scripts/mysql-cli.sh

templates/galera/bin/root_auth.sh

dciabrin · 2025-08-20T09:52:01Z

templates/galera/bin/root_auth.sh

+
+GALERA_INSTANCE="{{.galeraInstanceName}}"
+
+# note jq is not installed in the galera image, macgyvering w/ python instead


Ack, I'll track the shortcoming in a separate Jira so we can fix that for good.

templates/galera/bin/mysql_root_auth.sh

dciabrin · 2025-11-03T17:13:13Z

/lgtm

lmiccini · 2025-11-03T17:13:55Z

/approve

openshift-ci · 2025-11-03T17:14:01Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lmiccini, zzzeek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [lmiccini]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


In order to facilitate an in-place change to the name of the Secret that is referenced by a Galera instance for the mysql root password, rework the approach used by pods and shell scripts to no longer require the root secret name and/or password be passed by environment variable, instead using a pod-level cluster query to retrieve the current root password. The logic to retrieve this password is encapsulated into a single shell script that is present as a volume mount on running containers.

This allows Job objects to be created with hashes that do not link to a specific Secret name, as well as to create StatefulSet objects that don't refer to this name. When the Secret name changes on a Galera instance for an in-place root password change, the hashes / CRs for these objects will remain unchanged.

A subsequent change to the mariadb operator will add the ability to change the mysql root password of a Galera cluster using a dual-reference architecture where the "current" root secret will be part of <CR>/Status, while the secret referenced in <CR>/Spec will be the "new" root secret. When these two names differ, that will indicate an in-place password change should take place, as well as allowing the pre-existing root password to be available at the same time as the new one in order to do a root password change. The same architecture will be applied to a new class of "system" MariaDBAccount objects that are for use only by the Galera instance itself and do not have a link to any MariaDBDatabase CR. The Galera CR itself will no longer use osp-secret for the mysql root password nor will the secret be directly referenced from the Galera CR, instead referenced by a "system" MariaDBAccount CR which the Galera operator itself will create.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

dciabrin · 2025-11-03T22:03:24Z

/lgtm

openshift-ci bot requested review from lewisdenny and viroel July 1, 2025 13:01

zzzeek requested a review from dciabrin July 1, 2025 13:01

zzzeek changed the title ~~Get MYSQL_PWD using an on-demand cluster query~~ Get MYSQL_PWD using an on-demand cluster query (PR 1 of 6) Jul 1, 2025

zzzeek mentioned this pull request Jul 1, 2025

DNM: mariadbaccount system accounts #320

Draft

zzzeek force-pushed the OSPRH-14916-pr1 branch 2 times, most recently from f429f65 to a91d023 Compare July 1, 2025 21:27

zzzeek force-pushed the OSPRH-14916-pr1 branch from a91d023 to 638bc6a Compare July 4, 2025 01:00

zzzeek force-pushed the OSPRH-14916-pr1 branch from 638bc6a to a4dbf5a Compare August 18, 2025 15:14

zzzeek commented Aug 18, 2025

tests/chainsaw/common/tls-certificate.yaml

zzzeek force-pushed the OSPRH-14916-pr1 branch from a4dbf5a to ee836ee Compare August 18, 2025 19:20

dciabrin reviewed Aug 20, 2025

zzzeek force-pushed the OSPRH-14916-pr1 branch 4 times, most recently from 2467a94 to 4c0c5f5 Compare October 11, 2025 21:08

zzzeek commented Oct 12, 2025

templates/galera/bin/mysql_root_auth.sh (outdated)

zzzeek force-pushed the OSPRH-14916-pr1 branch 3 times, most recently from a151658 to f87d0bd Compare October 12, 2025 19:47

openshift-ci bot assigned dciabrin Nov 3, 2025

openshift-ci bot added the lgtm label Nov 3, 2025

openshift-ci bot added the approved label Nov 3, 2025

zzzeek force-pushed the OSPRH-14916-pr1 branch from f87d0bd to c282e5d Compare November 3, 2025 19:13

openshift-ci bot removed the lgtm label Nov 3, 2025

zzzeek and others added 2 commits November 3, 2025 14:24

update CA expiration time

5cb294a

In 1bb2318 the galera certs were regenerated with a three-year expiry, but this did not include the CA expiration time, leading to failures again. This change updates that time as well and adds a script that can be used to regenerate the values.

zzzeek force-pushed the OSPRH-14916-pr1 branch from c282e5d to 3527102 Compare November 3, 2025 19:25

openshift-ci bot added the lgtm label Nov 3, 2025

openshift-merge-bot bot merged commit 966e316 into openstack-k8s-operators:main Nov 3, 2025
7 checks passed


3 participants