Skip to content

Comments

Add system user password rotation using MySQL dual password#888

Open
shunki-fujita wants to merge 2 commits intomainfrom
issue-849
Open

Add system user password rotation using MySQL dual password#888
shunki-fujita wants to merge 2 commits intomainfrom
issue-849

Conversation

@shunki-fujita
Copy link
Contributor

@shunki-fujita shunki-fujita commented Feb 18, 2026

Close: #849

Implement password rotation for all 8 MOCO system users via MySQL's RETAIN CURRENT PASSWORD / DISCARD OLD PASSWORD mechanism. Users trigger rotation with kubectl annotations (password-rotate / password-discard), and the controller manages the full lifecycle: generate pending passwords, ALTER USER with RETAIN, distribute new secrets, then DISCARD old passwords.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a two-phase, crash-safe password rotation workflow for MOCO’s 8 fixed “system users” by leveraging MySQL’s dual-password feature (RETAIN CURRENT PASSWORD / DISCARD OLD PASSWORD), triggered via kubectl-moco subcommands (or raw annotations) and persisted via status to safely resume after controller restarts.

Changes:

  • Implement pending-password lifecycle in Secrets (*_PENDING + ROTATION_ID) and rotation state in MySQLCluster.status.systemUserRotation.
  • Add controller reconcile logic for rotate/discard phases, including Secret distribution, status guards, and best-effort annotation cleanup.
  • Update kubectl-moco credential into subcommands (show|rotate|discard) and adjust e2e/docs/CRDs accordingly.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/password/rotation.go Implements pending-password Secret helpers and validation for rotation workflows.
pkg/password/rotation_test.go Unit tests for pending-password Secret helper behaviors.
pkg/dbop/password.go Adds MySQL ALTER USER ... RETAIN/DISCARD operations to the DB operator.
pkg/dbop/operator.go Extends dbop.Operator interface with rotate/discard password methods.
pkg/dbop/nop.go Implements new operator methods for NopOperator.
pkg/constants/meta.go Adds rotation/discard annotation keys.
go.mod Promotes github.com/google/uuid to a direct dependency.
controllers/password_rotation.go Implements controller rotation/discard phase state machine and Secret distribution.
controllers/mysqlcluster_controller.go Wires password-rotation reconcile into main reconcile flow and adds OpFactory dependency.
controllers/mysqlcluster_controller_test.go Adds integration-style tests for rotate→discard flow and stale-annotation handling.
controllers/mock_test.go Adds mock OperatorFactory/Operator for password rotation tests.
cmd/moco-controller/cmd/run.go Injects OpFactory into the reconciler in the controller binary.
cmd/kubectl-moco/cmd/credential.go Restructures credential into subcommands; adds rotate/discard annotation patching.
e2e/replication_test.go Updates kubectl moco credential invocations to credential show.
e2e/lifecycle_test.go Updates kubectl moco credential invocations to credential show.
docs/kubectl-moco.md Documents new kubectl moco credential <subcommand> structure and rotation commands.
docs/designdoc/password_rotation.md Adds detailed design doc for rotation state machine, safety, and recovery.
api/v1beta2/mysqlcluster_types.go Adds SystemUserRotationStatus and RotationPhase types to the API.
api/v1beta2/zz_generated.deepcopy.go Adds deepcopy support for the new status types.
docs/crd_mysqlcluster_v1beta2.md Documents the new status.systemUserRotation fields.
config/crd/bases/moco.cybozu.com_mysqlclusters.yaml CRD schema update for status.systemUserRotation.
config/crd/tests/apiextensions...mysqlclusters...yaml CRD test fixture update for status.systemUserRotation.
charts/moco/templates/generated/crds/moco_crds.yaml Helm-generated CRD update for status.systemUserRotation.
clustering/mock_test.go Updates mock operator to satisfy the extended dbop.Operator interface.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

pkg/dbop/password.go:59

  • Missing unit tests for the password rotation SQL operations. The RotateUserPassword and DiscardOldPassword methods lack direct unit test coverage. While the validateMocoUser function provides SQL injection protection, this critical security-sensitive code should have dedicated unit tests that verify:
  1. The validateMocoUser whitelist correctly blocks invalid user names
  2. The SQL statement construction is correct
  3. Error handling works as expected

Consider adding a pkg/dbop/password_test.go file with unit tests for these functions, using mocks or test doubles for the database connection.

package dbop

import (
	"context"
	"fmt"

	"github.com/cybozu-go/moco/pkg/constants"
)

// validMocoUsers is a set of allowed user names for password rotation SQL operations.
var validMocoUsers map[string]struct{}

func init() {
	validMocoUsers = make(map[string]struct{}, len(constants.MocoUsers))
	for _, u := range constants.MocoUsers {
		validMocoUsers[u] = struct{}{}
	}
}

// validateMocoUser returns an error if user is not one of the fixed system user
// names. This guards against SQL injection since the user name is interpolated
// directly into ALTER USER statements.
func validateMocoUser(user string) error {
	if _, ok := validMocoUsers[user]; !ok {
		return fmt.Errorf("invalid MOCO user name %q: must be one of constants.MocoUsers", user)
	}
	return nil
}

// RotateUserPassword executes ALTER USER ... IDENTIFIED BY ... RETAIN CURRENT PASSWORD.
//
// user must be one of the fixed system user names defined in pkg/constants/users.go.
// The user name is interpolated directly into the SQL statement because MySQL
// does not support placeholders in the user position of ALTER USER.
func (o *operator) RotateUserPassword(ctx context.Context, user, newPassword string) error {
	if err := validateMocoUser(user); err != nil {
		return err
	}
	query := fmt.Sprintf("ALTER USER '%s'@'%%' IDENTIFIED BY ? RETAIN CURRENT PASSWORD", user)
	if _, err := o.db.ExecContext(ctx, query, newPassword); err != nil {
		return fmt.Errorf("failed to rotate password for %s: %w", user, err)
	}
	return nil
}

// DiscardOldPassword executes ALTER USER ... DISCARD OLD PASSWORD.
//
// user must be one of the fixed system user names defined in pkg/constants/users.go.
// See RotateUserPassword for the rationale.
func (o *operator) DiscardOldPassword(ctx context.Context, user string) error {
	if err := validateMocoUser(user); err != nil {
		return err
	}
	query := fmt.Sprintf("ALTER USER '%s'@'%%' DISCARD OLD PASSWORD", user)
	if _, err := o.db.ExecContext(ctx, query); err != nil {
		return fmt.Errorf("failed to discard old password for %s: %w", user, err)
	}
	return nil
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

docs/designdoc/password_rotation.md:196

  • The design doc says the replicas=0 discard refusal returns RequeueAfter: 30s, but the controller implementation uses rotationRequeueInterval (15s) for that case. Please align the documentation with the actual behavior, or adjust the code to match the documented 30s interval.
**Scaled-down clusters (replicas=0):**

When `spec.replicas` is 0, discard is **rejected** with a Warning Event (`DiscardRefused`) and the handler returns `RequeueAfter: 30s`.
Without running Pods, we cannot verify that the new passwords actually work, so discarding old passwords would be unsafe.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shunki-fujita shunki-fujita force-pushed the issue-849 branch 2 times, most recently from ee58742 to 5a02f10 Compare February 19, 2026 08:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (3)

cmd/kubectl-moco/cmd/credential.go:28

  • The deprecated form kubectl moco credential CLUSTER_NAME still calls fetchCredential, but the --mysql-user/--format flags are now registered only on the show subcommand. That breaks previously valid flag usage even though the command is only marked deprecated. Consider moving these flags to credentialCmd as PersistentFlags (or duplicating them) so both forms accept the same options during the deprecation period.
	RunE: func(cmd *cobra.Command, args []string) error {
		if len(args) == 1 {
			fmt.Fprintln(cmd.ErrOrStderr(), "Warning: 'kubectl moco credential CLUSTER_NAME' is deprecated, use 'kubectl moco credential show CLUSTER_NAME' instead.")
			return fetchCredential(cmd.Context(), args[0])

controllers/password_rotation.go:570

  • For replica instances, this disables super_read_only via SetSuperReadOnly before any session has sql_log_bin=0 set. If the goal is to ensure no rotation-related writes are binlogged (like ALTER USER), consider executing the super_read_only toggle under sql_log_bin=0 too (e.g., have SetSuperReadOnly use a dedicated conn that sets sql_log_bin=0 first, or add a helper that runs both statements in one session).
	if isReplica {
		if err := op.SetSuperReadOnly(ctx, false); err != nil {
			return fmt.Errorf("failed to disable super_read_only on instance %d: %w", instanceIndex, err)
		}

controllers/password_rotation.go:614

  • Same concern here: discardInstanceUsers disables super_read_only before any sql_log_bin=0 is set on that session. Consider ensuring the super_read_only toggle is executed with binlogging disabled as well, to keep cross-cluster replication safety guarantees consistent.
	if isReplica {
		if err := op.SetSuperReadOnly(ctx, false); err != nil {
			return fmt.Errorf("failed to disable super_read_only on instance %d for discard: %w", instanceIndex, err)
		}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (3)

cmd/kubectl-moco/cmd/credential.go:191

  • The deprecated path (kubectl moco credential CLUSTER_NAME) no longer accepts the previously supported -u/--mysql-user and --format flags because those flags are defined only on the show subcommand. This makes the deprecation a breaking change for existing scripts. Consider adding these flags as persistent flags on credentialCmd (bound to the same config variables) so the deprecated invocation keeps working during the deprecation period.
func init() {
	fs := credentialShowCmd.Flags()
	fs.StringVarP(&credentialConfig.user, "mysql-user", "u", constants.ReadOnlyUser, "User for login to mysql")
	fs.StringVar(&credentialConfig.format, "format", "plain", "The format of output [`plain` or `mycnf`]")

cmd/kubectl-moco/cmd/credential.go:164

  • If moco.cybozu.com/password-discard annotation is already set but contains the wrong rotationID (e.g., manually set or stale), this command prints "already requested" and exits without fixing it. That leaves the user waiting while the controller will later consume the annotation as mismatched. Consider detecting ann != rotationStatus.RotationID and either overwriting it to the active rotationID or returning a clear error instructing the user to remove/re-run.
	if ann := cluster.Annotations[constants.AnnPasswordDiscard]; ann != "" {
		fmt.Printf("Password discard is already requested (rotationID: %s).\n", ann)
		return nil
	}

pkg/dbop/password.go:39

  • SetSuperReadOnly executes SET GLOBAL super_read_only=... via the pooled DB handle without setting sql_log_bin=0 on that session. For consistency with the rotation design’s binlog-safety goals, consider using a dedicated connection and setting sql_log_bin=0 in the same session (similar to RotateUserPassword/DiscardOldPassword), or document why this statement is guaranteed not to be written to the binlog.
func (o *operator) SetSuperReadOnly(ctx context.Context, on bool) error {
	val := 0
	if on {
		val = 1
	}
	if _, err := o.db.ExecContext(ctx, fmt.Sprintf("SET GLOBAL super_read_only=%d", val)); err != nil {
		return fmt.Errorf("failed to set super_read_only=%d: %w", val, err)
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shunki-fujita shunki-fujita force-pushed the issue-849 branch 2 times, most recently from edcfc66 to 80fcc6f Compare February 20, 2026 08:13
Implement password rotation for all 8 MOCO system users via MySQL's
RETAIN CURRENT PASSWORD / DISCARD OLD PASSWORD mechanism. Users trigger
rotation with kubectl annotations (password-rotate / password-discard),
and the controller manages the full lifecycle: generate pending passwords,
ALTER USER with RETAIN, distribute new secrets, then DISCARD old passwords.

Key safety properties:
- All ALTER USER operations use SET sql_log_bin=0 (via dedicated db.Conn)
  to prevent binlog propagation to cross-cluster replicas
- RETAIN is executed on every instance individually (primary + all replicas)
  since binlog-based within-cluster replication is disabled
- Crash recovery uses HasDualPassword to query MySQL directly for
  additional_password, skipping users that already have RETAIN applied.
  This makes MySQL the source of truth and avoids the race where RETAIN
  succeeds but a Status.Patch fails
- For replicas, super_read_only is temporarily disabled before ALTER USER
  and re-enabled afterwards; clustering loop self-recovers on failure
- Pending passwords stored in source Secret with ROTATION_ID for
  integrity validation
- ConfirmPendingPasswords is idempotent (no-op when pending already cleared)
- Annotation removal is best-effort to avoid Patch conflicts with Status
- Discard is gated on StatefulSet rollout completion to ensure all Pods
  have picked up new passwords via EnvFrom before old passwords are dropped
- Discard is rejected when replicas=0 (scaled-down cluster) because new
  passwords cannot be verified without running Pods

Signed-off-by: shunki-fujita <shunki-fujita@cybozu.co.jp>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (2)

cmd/kubectl-moco/cmd/credential.go:33

  • The deprecated invocation path (kubectl moco credential CLUSTER_NAME) no longer supports the old -u/--mysql-user and --format flags because they are now defined only on the show subcommand. If backwards compatibility is intended during the deprecation window, consider defining these as persistent flags on credentialCmd (or otherwise wiring them so the deprecated path continues to accept the same flags).
var credentialCmd = &cobra.Command{
	Use:   "credential",
	Short: "Manage credentials for MySQLCluster",
	Long:  "The credential command is used to show, rotate, and discard credentials for MySQLCluster.",
	Args:  cobra.MaximumNArgs(1),
	RunE: func(cmd *cobra.Command, args []string) error {
		if len(args) == 1 {
			fmt.Fprintln(cmd.ErrOrStderr(), "Warning: 'kubectl moco credential CLUSTER_NAME' is deprecated, use 'kubectl moco credential show CLUSTER_NAME' instead.")
			return fetchCredential(cmd.Context(), args[0])
		}
		return cmd.Help()
	},
}

controllers/mysqlcluster_controller_test.go:2424

  • This assertion only checks that each user appears at least once in a flat rotatedUsers map. Since rotation must run on every instance, this test can still pass even if the reconciler only rotates users on a single instance. Consider tracking/asserting RotateUserPassword calls per (instance,user) pair (or asserting instanceDual for every instance and every user) so the test would fail if any instance is skipped.
		By("verifying ALTER USER RETAIN was called for all users")
		Eventually(func() int {
			return len(mockOpFactory.getRotatedUsers())
		}).Should(Equal(len(constants.MocoUsers)))


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shunki-fujita shunki-fujita marked this pull request as ready for review February 20, 2026 09:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

controllers/password_rotation.go:233

  • In handlePasswordRotate, pending keys may be generated and written to the controller Secret before verifying the Secret is a valid MOCO password Secret (e.g. correct moco.cybozu.com/secret-version annotation). If the Secret format is broken (manual edit/restore), this can create additional inconsistent *_PENDING/ROTATION_ID state and then fail later when NewMySQLPasswordFromSecret is called. Consider validating the controller Secret format up front (before SetPendingPasswords/Update) and returning an error without mutating the Secret when it’s invalid.
	// Step 1: Ensure pending passwords exist in the source Secret.
	// SetPendingPasswords is idempotent: if pending passwords already exist for the
	// current rotationID, they are reused without regeneration.
	sourceSecret := &corev1.Secret{}
	if err := r.Get(ctx, client.ObjectKey{Namespace: r.SystemNamespace, Name: cluster.ControllerSecretName()}, sourceSecret); err != nil {
		return err
	}

	hasPending, err := password.HasPendingPasswords(sourceSecret, rotation.RotationID)
	if err != nil {
		r.Recorder.Eventf(cluster, corev1.EventTypeWarning, "StaleRotationPending",
			"Failed to verify pending passwords in source secret %s: %v. "+
				"To recover, manually delete *_PENDING and ROTATION_ID keys from the secret: "+
				"kubectl -n %s edit secret %s",
			cluster.ControllerSecretName(), err, r.SystemNamespace, cluster.ControllerSecretName())
		return fmt.Errorf("failed to verify pending passwords in source secret: %w", err)
	}
	if !hasPending {
		if _, err := password.SetPendingPasswords(sourceSecret, rotation.RotationID); err != nil {
			return err
		}
		if err := r.Update(ctx, sourceSecret); err != nil {
			return err
		}
		log.Info("generated pending passwords", "rotationID", rotation.RotationID)

cmd/kubectl-moco/cmd/credential.go:220

  • The deprecated form kubectl moco credential CLUSTER_NAME now runs the parent command, but the --mysql-user/-u and --format flags are registered only on the show subcommand. This breaks previously valid invocations like kubectl moco credential -u moco-admin <cluster> (unknown flag). Consider making these persistent flags on the parent (or duplicating them) so the deprecated syntax remains functional while users migrate.
	fs := credentialShowCmd.Flags()
	fs.StringVarP(&credentialConfig.user, "mysql-user", "u", constants.ReadOnlyUser, "User for login to mysql")
	fs.StringVar(&credentialConfig.format, "format", "plain", "The format of output [`plain` or `mycnf`]")

	_ = credentialShowCmd.RegisterFlagCompletionFunc("mysql-user", func(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
		return []string{constants.ReadOnlyUser, constants.WritableUser, constants.AdminUser}, cobra.ShellCompDirectiveDefault
	})
	_ = credentialShowCmd.RegisterFlagCompletionFunc("format", func(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
		return []string{"plain", "mycnf"}, cobra.ShellCompDirectiveDefault
	})

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shunki-fujita shunki-fujita force-pushed the issue-849 branch 4 times, most recently from f686090 to f07c936 Compare February 21, 2026 11:49
Add MigrateUserAuthPlugin to migrate system users to the target
authentication plugin (determined from @@global.authentication_policy)
during Phase 2. After DISCARD OLD PASSWORD, each user is re-identified
with ALTER USER ... IDENTIFIED WITH <plugin> BY ..., re-hashing the
password under the new plugin. Both steps are idempotent.

Signed-off-by: shunki-fujita <shunki-fujita@cybozu.co.jp>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

System users password management

1 participant