Skip to content

Conversation

@Deydra71
Copy link
Contributor

@Deydra71 Deydra71 commented Apr 16, 2025

Jira: OSPRH-14737

This PR introduces a new ApplicationCredential (AC) controller in the keystone-operator. It watches ApplicationCredential custom resources and performs these actions:

  1. Creates Keystone ApplicationCredentials for each CR (authenticating as that user due to Keystone’s default policy)
  2. Stores the AC’s ID and Secret in a k8s secret
  3. Implements rotation logic based on expirationDays and gracePeriodDays:
    • Reconciles at least once a day, rotating any AC that’s within or past its grace window
    • If an AC is already in the grace period at the next reconcile, it rotates immediately
    • The old ApplicationCredential in Keystone is not revoked on rotation (it naturally expires)

Additionally:

  • The controller waits for a KeystoneAPI resource to be Ready before proceeding with AC operations

Notes:

  • CRD & RBAC for the ApplicationCredential resource are not automatically installed yet. These must be applied manually until openstack-operator integration is complete

To apply rbac permissions run oc edit clusterrole keystone-operator-manager-role and add:

- apiGroups:
  - keystone.openstack.org
  resources:
  - applicationcredentials
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch

- apiGroups:
  - keystone.openstack.org
  resources:
  - applicationcredentials/finalizers
  verbs:
  - patch
  - update

- apiGroups:
  - keystone.openstack.org
  resources:
  - applicationcredentials/status
  verbs:
  - get
  - patch
  - update

Example AC CR for barbican service user:

apiVersion: keystone.openstack.org/v1beta1
kind: ApplicationCredential
metadata:
  name: ac-barbican
  namespace: openstack
spec:
  expirationDays: 365
  gracePeriodDays: 182
  passwordSelector: BarbicanPassword
  roles:
  - service
  secret: osp-secret
  userName: barbican

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 16, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

) (string, error) {

// The name of the Secret containing the service passwords
const ospSecretName = "osp-secret"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed this - but basically we provide the ability for the operators to specify the secret that contains the admin user password. This is osp-secret by default - but it need not be. See https://github.com/openstack-k8s-operators/barbican-operator/blob/main/api/v1beta1/common_types.go#L44 for instance.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we should not hardcode any value like this. It could be part of the deployment YAML spec.

return "", fmt.Errorf("failed to get Secret/%s: %w", ospSecretName, err)
}

key := capitalizeFirst(userName) + "Password"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a default, but its not what is specified. In barbican, for instance we have a PasswordSelectors field - https://github.com/openstack-k8s-operators/barbican-operator/blob/main/api/v1beta1/common_types.go#L49 which identifies the correct key. But, this needn't be the case.

Ultimately, I think you are going to need to have the AC specification include the name of the user, the name of the user secret and the relevant field. You can set these appropriately in openstack-operator.

Comment on lines 279 to 276
// Always assign these roles:
Roles: []applicationcredentials.Role{
{Name: "admin"},
{Name: "service"},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes this controller less generic and therefore potentially less useful. Perhaps this is another parameter - like the access rules that should be passed in as part of the AC spec.

The same can also be said for the Unrestricted field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see there is support for all of Roles, Unrestricted and AccessRules in the gophercloud call - https://pkg.go.dev/github.com/gophercloud/gophercloud/openstack/identity/v3/applicationcredentials#CreateOpts

}

// Otherwise check again in 24 hours
return defaultRequeue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be return rotateAt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rotateAt is a timestamp, we need return a duration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that makes sense. I (mis)understood this function when I first read it.

So if I understand this correctly now, we should be returning a recheck duration of 24 hours, unless we are already in the grace period - in which case we would be returning a 0 to immediately recheck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly right

// default requeue is 24h as minimal grace period is 1 day
defaultRequeue := 24 * time.Hour
if ac.Status.ExpiresAt == nil || ac.Status.ExpiresAt.IsZero() {
return defaultRequeue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious - why would we want to requeue if the application credential does not expire?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this as an ultimate fallback to wake the controller at least once a day. If because of some error the status of the CR is not updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

return "fallback" // placeholder for generating failure
}
s := hex.EncodeToString(b)
return s[:n]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its unlikely, but should we check for collisions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could check if the AC with same suffix already exists, but unless we are creating millions of AC in short period the chance is basically zero. Or we can increase n.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this function be a lib-common utility in the long term?


logger := r.GetLogger(ctx)

// Only if user explicitly does "oc delete" do we revoke the AC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its actually quite useful, I think, to have the delete revoke the application credential. This gives us a nice way to do a revocation.

The problem is, of course, that there will be some time between when the app cred is revoked and the new one is issued.

I wonder if there is a way to trigger a rotation without doing a delete. Could we implement a reconcileUpdate to do this instead? Then, the procedure when trying to revoke the cert would be to -

  1. patch the AC so that we are within the grace period. This triggers the creation of a new AC.
  2. Wait for the new AC to be propagated.
  3. Revoke the old AC by deleting it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just concerned about step 2. - Wait for the new AC to be propagated. Because AC controller would need a feedback that the service deployment successfully rolled out with new credential, which would add another logic to watch the deployment status in AC controller. I’d prefer to leave revocation as a manual step for now.

Copy link
Contributor

@stuggi stuggi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not finished my initial review, but probably won't have time today to do so. will continue tomorrow, and just wanted to add what I have so far.

Comment on lines 144 to 152
// Decide if we need to create (or rotate)
needsRotation := false
if instance.Status.ACID == "" {
needsRotation = true
logger.Info("AC does not exist, creating")
} else if r.shouldRotateNow(instance) {
needsRotation = true
logger.Info("AC is within grace period, rotating")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets split this out into a func

// Check if KeystoneAPI is ready
keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
if err != nil {
logger.Info("KeystoneAPI not found, requeue", "error", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should set KeystoneAPIReadyCondition

return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
}
if !keystoneAPI.IsReady() {
logger.Info("KeystoneAPI not ready, requeue")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// Ensure we have an initial ReadyCondition
condList := condition.CreateList(
condition.UnknownCondition(condition.ReadyCondition, condition.InitReason, condition.ReadyInitMessage),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ReadyCondition gets auto added in Init.

should init KeystoneAPIReadyCondition and set it when checking for the keystoneapi bellow

			condition.UnknownCondition(keystonev1.KeystoneAPIReadyCondition, condition.InitReason, keystonev1.KeystoneAPIReadyInitMessage),

I think we could just use the same condition init as in https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L111-L127 and just init it with what is used in this controller?

}

// Check if KeystoneAPI is ready
keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.Info("KeystoneAPI not ready, requeue")
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I'd move all the above tasks into the main reconcile func as these are also the steps done for handling deletion. then you won't need login to check for the keystoneapi in the cleanup method. like https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L225 where the admin client got checked before

logger := r.GetLogger(ctx)
adminOS, ctrlResult, err := keystonev1.GetAdminServiceClient(ctx, helperObj, keystoneAPI)
if err != nil {
return "", err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil {
logger.Error(err, "Failed to find user ID")
instance.Status.Conditions.Set(condition.FalseCondition(
condition.ReadyCondition,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be not the correct condition. the ReadyCondition is only set in the defer function based on the sub condition status during the reconcile

}
savedConditions := instance.Status.Conditions.DeepCopy()

// Defer patch logic (skips if we are deleting)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
if err == nil && keystoneAPI.IsReady() {
userID, userErr := r.findUserIDAsAdmin(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
if userErr == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try to reduce those amount of nested ifs.

what if there is an userErr, which is not "user not found"? is it ok continue and remove the finalizer? shouldn't we return the actual error if it is not "user not found" ?

can't we just do

userID, userErr := r.findUserIDAsAdmin(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
if userErr != nil {
   return ctrl.Result{}, err
}
userOS, userRes, userErr2 := keystonev1.GetUserServiceClient(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
...

) (string, error) {

// The name of the Secret containing the service passwords
const ospSecretName = "osp-secret"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we should not hardcode any value like this. It could be part of the deployment YAML spec.

return "", fmt.Errorf("failed to get Secret/%s: %w", ospSecretName, err)
}

key := capitalizeFirst(userName) + "Password"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to capitalize the username's first letter? Also, why are you harcoding "Password"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just to filter user password from osp-secret. Now, passwordSelector will be passed to AC CR, so that will be used for filtering

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it. Can you elaborate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code assumed this password scheme in osp-secret:

BarbicanPassword: <secret_data>
CinderPassword: <secret_data>
GlancePassword: <secret_data>
…

So to pick the right field we had to:

  1. Take the userName ("barbican")
  2. Capitalize the first letter → "Barbican"
  3. Append "Password" → "BarbicanPassword"

and then look up that key in the secret. Of course that'd force everyone to use that convention. So now openstack-operator extracts and passes into AC CR these as well:

spec:
  secret: # the Secret name (by default it's osp-secret)
  passwordSelector:  # how we extract this key, e.g. BarbicanPassword
  userName: # e.g. barbican, user can customize this in control plane spec
  …

@Deydra71 Deydra71 force-pushed the underlined-ac-support branch from 38cb512 to 15fcdb9 Compare May 14, 2025 11:50
@Deydra71
Copy link
Contributor Author

The latest update makes the controller take into consideration the custom password secret, custom service user name, and password selectors. Continuing to address other reviews.

@Deydra71 Deydra71 force-pushed the underlined-ac-support branch from 15fcdb9 to 2d8a489 Compare May 16, 2025 13:50
@Deydra71
Copy link
Contributor Author

Deydra71 commented May 16, 2025

Latest update includes corrections based on some reviews (not all, will still continue), and also adds support for the Unrestricted, Roles and AccessRoles fields.

Currently AccessRoles field is not correctly passed to the AC CR by openstack-operator, so it's missing in the AC CR, and for services this field is specified in the openstackcontrolplane spec, ACs are not generated. We are continuing on Slack what should be in AccessRules fields in the first place.

And also automatic revocation is disabled for now for testing, based on what we will agree it will be enabled again.

@Deydra71 Deydra71 force-pushed the underlined-ac-support branch from 2d8a489 to 4962009 Compare May 19, 2025 12:35
scope := &gophercloud.AuthScope{
ProjectName: "service",
DomainName: "Default",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very wary about hard-coding stuff here. I think that;

  1. there may be cases where either the service project or DomainName may be different
  2. we might be interested in obtaining an app cred for a non-service user or for a different project. For example, I know that there is work ongoing upstream to do manila share encryption where the user would use app creds to retrieve a barbican secret.

In any case, I think it makes sense to make this function more generic - maybe call it GetUserClient() and pass in the domain and project name. You could then add these parameters to the app cred spec, and default to "service" and "Default". This will future-proof the interface a bit.

Copy link
Contributor Author

@Deydra71 Deydra71 May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but right now we don't have any spec.ServiceProject or spec.ServiceDomain. So, that's why this serves as a mean to only get scoped token for our service.

Since FR3 is explicitly about wiring up service-account auth, I suggest we leave this helper as is for now, and add a // TODO pointing at the future work to extend the CRD with project/domain fields and refactor into a generic GetUserClient(…, project, domain…)

The domain name is actually hard coded for admin as well - https://github.com/openstack-k8s-operators/keystone-operator/blob/main/api/v1beta1/keystoneapi.go#L157

Copy link
Contributor Author

@Deydra71 Deydra71 May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there may be cases where either the service project or DomainName may be different

Is this still true if we take into account only service users?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that hard coding is not good and we could do it like we did in the GetScopedAdminServiceClient, but can not comment on the need for this service client. there is just really little difference to the func GetScopedAdminServiceClient , maybe we could just make a generic GetScopedClient with parameters:
scope, TenantName, DomainName and Region and could deprecate the GetScopedAdminServiceClient

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about having a generic function, the old one can be deprecated and it might produce a warning, so we would know in the service operators that we need to update to the new function call (e.g. [1]).

[1] https://github.com/openstack-k8s-operators/glance-operator/blob/main/internal/controller/glance_controller.go#L1126

AuthURL: authURL,
Username: userName,
Password: password,
TenantName: "service",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto comment as above.

secretName := fmt.Sprintf("%s-secret", instance.Name)
if err := r.storeACSecret(ctx, helperObj, instance, secretName, newID, newSecret); err != nil {
return ctrl.Result{}, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The keystone-operator is somewhat unique in the operators in that there isn't a lot of documentation of what it does. Just looking at the code , for instance, it wasn't completely clear to me what the name of the ac would be, and the name of the secret etc. I'm sure I could figure it out - but I shouldn't have to work so hard.

It would be great if there were a small doc (maybe in a docs directory like the other operators) which contains a description of what we'd expect this operator to produce. It doesn't have to be fancy, but it will help future reviewers and coders who want to enhance this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the doc dir and a new file describing basic functionality of the AC controller - will add more details soon.

secret := &corev1.Secret{}
key := types.NamespacedName{Namespace: namespace, Name: GetACSecretName(serviceName)}
if err := c.Get(ctx, key, secret); err != nil {
if k8s_errors.IsNotFound(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Deydra71 just wondering if VerifyApplicationCredentialsForService should be embedded here since we pass pretty much the same parameters and it would hide more details from the service operators that right now call both functions.
It would be nice to have a single entrypoint, and delegate keystone to do the verification, that of course can return an error we can handle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it would be that much beneficial for service operators, because VerifyApplicationCredentialsForService just tracks the AC secret and add the hash to configVars, it doesn't read the actual credentials. The GetApplicationCredentialFromSecret then serves only to get the actual data from the secret during generating config at later stage. Separating the change tracking from credential extraction seems logical to me and makes the reconcile flow clearer to understand in my opinion. @stuggi What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comment here https://github.com/openstack-k8s-operators/glance-operator/pull/812/files#r2609517115 I guess it depends on if we can do it in a single place for the service or not. I think we do not need a dedicated hash since the AC is part of the service config which we already hash. so there is probably no need to get a dedicated hash for it.

@Deydra71
Copy link
Contributor Author

Deydra71 commented Jan 7, 2026

@stuggi @fmount hi! Could you please take a look again at the updated PR with the suggested changes? We’d like to merge it so we can proceed with kuttl and other testing in the service operators. I also updated the service operators that are supposed to use AppCred (telemetry still in progres) to use field indexers instead of custom watchers.

scope := &gophercloud.AuthScope{
ProjectName: "service",
DomainName: "Default",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that hard coding is not good and we could do it like we did in the GetScopedAdminServiceClient, but can not comment on the need for this service client. there is just really little difference to the func GetScopedAdminServiceClient , maybe we could just make a generic GetScopedClient with parameters:
scope, TenantName, DomainName and Region and could deprecate the GetScopedAdminServiceClient

passwordSelector,
)
if err != nil {
return "", ctrl.Result{}, fmt.Errorf("failed to get %q from Secret/%s: %w", passwordSelector, ospSecretName, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this message, the lib-common func already returns an error string depending on the condition it checks ( generic error, secret not there, key not found)

if res != (ctrl.Result{}) {
return "", res, nil
}
if data == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this? is an empty string a valid pwd in keystone? creating the authclient should fail.

Comment on lines 155 to 163
| description | Created by keystone-operator for AC CR openstack/ac-barbican (user... |
| expires_at | 2026-05-29T09:02:28.705012 |
| id | 7b23dbac20bc4f048f937415c84bb329 |
| name | ac-barbican-29090 |
| project_id | 24f05745bc2145c6a625b528ce21d7a3 |
| roles | service |
| system | None |
| unrestricted | False |
| user_id | 2ecd25b38f0d432388ad8b838e46f36d |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ni, table formating

- Create New AC
- Generates a new Keystone AC with a fresh 5-char suffix
- Uses the same roles, unrestricted flag, access rules, and expirationDays
- Does _not_ revoke the old AC, the old credential naturally expires
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when it expired. gets it removed from keystone db automatic, or do we need a cron which cleans them, like we have for the tokens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a goo dpoint... I don't think they are deleted from the database. The expiration only makes the auth to fail. @vakwetu can you please confirm?

keystonev1.KeystoneApplicationCredentialReadyCondition,
condition.RequestedReason,
condition.SeverityInfo,
"Waiting for secret %q to be available",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets add a KeystoneApplicationCredentialWaitingMessage

Comment on lines 451 to 486
data := map[string]string{
keystonev1.ACIDSecretKey: newID,
keystonev1.ACSecretSecretKey: newSecret,
}

// Add application-credentials label
labels := map[string]string{
"application-credentials": "true",
}

tmpl := []util.Template{{
Name: secretName,
Namespace: ac.Namespace,
Type: util.TemplateTypeNone,
CustomData: data,
Labels: labels,
}}
if err := oko_secret.EnsureSecrets(ctx, helperObj, ac, tmpl, nil); err != nil {
return err
}

// Ensure protection finalizer is present on the Secret
secret := &corev1.Secret{}
key := types.NamespacedName{Namespace: ac.Namespace, Name: secretName}
if err := helperObj.GetClient().Get(ctx, key, secret); err != nil {
return err
}
base := secret.DeepCopy()
if controllerutil.AddFinalizer(secret, acSecretFinalizer) {
if err := helperObj.GetClient().Patch(ctx, secret, client.MergeFrom(base)); err != nil {
if k8s_errors.IsConflict(err) {
return fmt.Errorf("%w: requeue: patch conflict on %s", err, secretName)
}
return err
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since EnsureSecrets does not support adding a finalizer, its more efficient if you just use controllerutil.CreateOrPatch() to create/update the secret in a single request. could also enhance CreateOrPatchSecret for adding a finalizer if present in the secret passed in.

instance,
corev1.EventTypeNormal,
"ApplicationCredentialRotated",
fmt.Sprintf("ApplicationCredential '%s' (user: %s) rotated - EDPM nodes may need credential updates. Previous expiration: %s, New expiration: %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! would be also good to set a warning condition https://github.com/openstack-k8s-operators/dev-docs/blob/main/conditions.md#severity on the ctlpane, that the controlplane flips to not ready?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuggi I'm not sure I udnerstand, could you please elaborate where specifically we should set the warning condition? On KeystoneApplicationCredential or else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that was my idea this morning, but thinking of it again, we probably can not, since we do not know when the edpm side got updated to flip it back to ready.

@slagle is there a way to check if a nodedataset (all nodes) got updated with the latest config?

Comment on lines 379 to 384
if err := r.Patch(ctx, instance, client.MergeFrom(base)); err != nil {
if k8s_errors.IsConflict(err) {
return ctrl.Result{Requeue: true}, nil
}
return ctrl.Result{}, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it is on the instance itself and return afterwards, the patch is not required since we will do it in the defer func


// GetLogger returns a logger configured for this controller.
func (r *ApplicationCredentialReconciler) GetLogger(ctx context.Context) logr.Logger {
return ctrlLog.FromContext(ctx).WithName("ApplicationCredentialReconciler")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually use the format:

	return ctrlLog.FromContext(ctx).WithName("Controllers").WithName("KeystoneApplicationCredential")

// Store it in a Secret (create or update)
secretName := fmt.Sprintf("%s-secret", instance.Name)
if err := r.storeACSecret(ctx, helperObj, instance, secretName, newID, newSecret); err != nil {
if strings.HasPrefix(err.Error(), "requeue: ") {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error msg created at L482 in storeACSecret do not start with "requeue"

return fmt.Errorf("%w: requeue: patch conflict on %s", err, secretName)

may use Contains instead or put %w at the end of msg

Comment on lines 126 to 130
// If we're not deleting this and the service object doesn't have our finalizer, add it.
if instance.DeletionTimestamp.IsZero() && controllerutil.AddFinalizer(instance, finalizer) || isNewInstance {
return ctrl.Result{}, nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true and true or true
should it be true and (true or true)

so for first reconcide
not-deleted-yet and (able-to-add-finalizer-just-now or its-new-instance)

because AND has more precedence, for later reconciles it can be read as
not-deleted-yet and able-to-add-finalizer-just-now
true and false || false
which is never true as we cant add same finalizer again (as assumed already here on condition)

So I meant say seems fine for first reconcile may create issue in second and later reconciles because of condition precedent - need to test though.
true and (false or false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, thanks for this , I realized with this comemnt that there are two cases where we could return too early -->

  • if an instance is created and immediately deleted before first reconcile
  • and if for some reason the instance looks new, because of the status wan't persisted yet, because of eg failed first reconcile

I think it will reconcile delete on the next event, but we can't assume that of course... I will add the parentess.

if instance.DeletionTimestamp.IsZero() &&
   (controllerutil.AddFinalizer(instance, finalizer) || isNewInstance) {
    return ctrl.Result{}, nil
}

@Deydra71
Copy link
Contributor Author

/retest

@Deydra71
Copy link
Contributor Author

ATM not sure why the functional test fails, locally the suite passes, and also the failure is connected to the MariaDB part, not our app cred changes https://github.com/openstack-k8s-operators/keystone-operator/blob/main/test/functional/keystoneapi_controller_test.go#L2246

Comment on lines 187 to 189
if !instance.DeletionTimestamp.IsZero() {
return r.reconcileDelete(ctx, instance, helperObj)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the same as above L166 ? if we don't do any cleanup on the keystone side, I guess we can just remove it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, yes I forgot to remove it, it's redundant now

@Deydra71 Deydra71 force-pushed the underlined-ac-support branch from ec8a09f to 3c7806d Compare January 14, 2026 15:29
scope := &gophercloud.AuthScope{
ProjectName: "service",
DomainName: "Default",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about having a generic function, the old one can be deprecated and it might produce a warning, so we would know in the service operators that we need to update to the new function call (e.g. [1]).

[1] https://github.com/openstack-k8s-operators/glance-operator/blob/main/internal/controller/glance_controller.go#L1126


// ComputeSecurityHash computes a hash of security-critical spec fields
// (roles, accessRules, unrestricted). Used to detect changes that require immediate rotation.
func ComputeSecurityHash(spec KeystoneApplicationCredentialSpec) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this function used somewhere, perhaps in service operators?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in service operators, but in internal/controller/keystoneapplicationcredential_controller.go[1] to detect permission changes and trigge app cred rotation.

[1] https://github.com/openstack-k8s-operators/keystone-operator/pull/567/files#diff-bbfa4b5edb0132743cb1ad35d9483082d2cab3d6a16540023740f107ed3e1075R309


// ApplicationCredential already exists and is valid
// Update RotationEligibleAt in case GracePeriodDays changed
if instance.Status.ExpiresAt != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Deydra71 @stuggi quick question here: should we add ObservedGeneration check before reaching the Ready state (we assign it on L124 iirc)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what check you are thinking about?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it makes sense to check that .Status.ObservedGenerations matches w/ instance.ObservedGeneration in case of multiple updated to the same CR but we probably don't need it as there aren't multiple controllers involved.
Let's put this aside for now as I'm not entirely sure it would change anything in the reconciliation (I didn't test it).

@stuggi
Copy link
Contributor

stuggi commented Jan 14, 2026

/test keystone-operator-build-deploy-kuttl

…r functions for service operators.

Signed-off-by: Veronika Fisarova <[email protected]>
@Deydra71 Deydra71 force-pushed the underlined-ac-support branch from 3c7806d to 12e6048 Compare January 15, 2026 04:39
@Deydra71
Copy link
Contributor Author

@stuggi @fmount I refactored generic GetScopedClient + deprecate GetScopedAdminServiceClient based on your comments, can you please recheck it?

instance,
corev1.EventTypeNormal,
"ApplicationCredentialRotated",
fmt.Sprintf("ApplicationCredential '%s' (user: %s) rotated - EDPM nodes may need credential updates. Previous expiration: %s, New expiration: %s",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this helps in events

@Deydra71 Deydra71 requested review from auniyal61 and fmount January 16, 2026 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet