AppCred controller support #567
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: Deydra71. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
46e453b to 040976c
040976c to 38cb512
api/v1beta1/keystoneapi.go
Outdated
    ) (string, error) {

        // The name of the Secret containing the service passwords
        const ospSecretName = "osp-secret"
We've discussed this - but basically we provide the ability for the operators to specify the secret that contains the admin user password. This is osp-secret by default - but it need not be. See https://github.com/openstack-k8s-operators/barbican-operator/blob/main/api/v1beta1/common_types.go#L44 for instance.
Agreed, we should not hardcode any value like this. It could be part of the deployment YAML spec.
api/v1beta1/keystoneapi.go
Outdated
            return "", fmt.Errorf("failed to get Secret/%s: %w", ospSecretName, err)
        }

        key := capitalizeFirst(userName) + "Password"
This may be a default, but it's not what is specified. In barbican, for instance, we have a PasswordSelectors field - https://github.com/openstack-k8s-operators/barbican-operator/blob/main/api/v1beta1/common_types.go#L49 - which identifies the correct key. But this needn't be the case.
Ultimately, I think you are going to need to have the AC specification include the name of the user, the name of the user secret and the relevant field. You can set these appropriately in openstack-operator.
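To make the suggestion above concrete, here is a minimal, dependency-free sketch of a password lookup driven by the spec instead of hard-coded names. The `ACSpec` type, its field names, and `passwordFromSecret` are hypothetical; the `map[string][]byte` stands in for the data of the Kubernetes Secret named in the spec.

```go
package main

import (
	"errors"
	"fmt"
)

// ACSpec is a hypothetical subset of the ApplicationCredential spec.
// The field names are assumptions made for this illustration.
type ACSpec struct {
	UserName         string // service user, e.g. "barbican"
	Secret           string // name of the Secret holding the password
	PasswordSelector string // key inside that Secret
}

// passwordFromSecret returns the value stored under spec.PasswordSelector.
// secretData stands in for the Data map of the Secret named spec.Secret.
func passwordFromSecret(spec ACSpec, secretData map[string][]byte) (string, error) {
	if spec.Secret == "" || spec.PasswordSelector == "" {
		return "", errors.New("spec.secret and spec.passwordSelector must be set")
	}
	val, ok := secretData[spec.PasswordSelector]
	if !ok {
		return "", fmt.Errorf("key %q not found in Secret/%s", spec.PasswordSelector, spec.Secret)
	}
	return string(val), nil
}

func main() {
	spec := ACSpec{UserName: "barbican", Secret: "osp-secret", PasswordSelector: "BarbicanPassword"}
	data := map[string][]byte{"BarbicanPassword": []byte("example-password")}
	pw, err := passwordFromSecret(spec, data)
	fmt.Println(pw, err)
}
```

With this shape, nothing in the controller needs to know the `osp-secret` name or the key naming convention; both arrive through the CR.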
    // Always assign these roles:
    Roles: []applicationcredentials.Role{
        {Name: "admin"},
        {Name: "service"},
    },
This makes this controller less generic and therefore potentially less useful. Perhaps this is another parameter - like the access rules that should be passed in as part of the AC spec.
The same can also be said for the Unrestricted field.
I see there is support for all of Roles, Unrestricted and AccessRules in the gophercloud call - https://pkg.go.dev/github.com/gophercloud/gophercloud/openstack/identity/v3/applicationcredentials#CreateOpts
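A rough sketch of what passing these values through the AC spec could look like. The types below are local stand-ins written for this example, not the gophercloud definitions, and the spec field names are assumptions. Falling back to the `admin` and `service` roles when the spec is empty mirrors the snippet under review and is itself only one possible default.

```go
package main

import "fmt"

// Role and AccessRule are local stand-ins for the gophercloud
// applicationcredentials types; they are NOT the library's definitions.
type Role struct{ Name string }

type AccessRule struct {
	Service string
	Path    string
	Method  string
}

// ACSpec holds hypothetical spec fields describing the credential's privileges.
type ACSpec struct {
	Roles        []string
	Unrestricted bool
	AccessRules  []AccessRule
}

// createOptions mirrors the three inputs discussed in the review:
// Roles, Unrestricted and AccessRules.
type createOptions struct {
	Roles        []Role
	Unrestricted bool
	AccessRules  []AccessRule
}

// defaultRoles reproduces the roles hard-coded in the reviewed snippet.
var defaultRoles = []string{"admin", "service"}

// buildCreateOptions copies the spec into the options, applying the
// default roles only when the spec does not name any.
func buildCreateOptions(spec ACSpec) createOptions {
	names := spec.Roles
	if len(names) == 0 {
		names = defaultRoles
	}
	roles := make([]Role, 0, len(names))
	for _, n := range names {
		roles = append(roles, Role{Name: n})
	}
	return createOptions{
		Roles:        roles,
		Unrestricted: spec.Unrestricted,
		AccessRules:  spec.AccessRules,
	}
}

func main() {
	opts := buildCreateOptions(ACSpec{Roles: []string{"service"}})
	fmt.Printf("%+v\n", opts)
}
```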
    }

    // Otherwise check again in 24 hours
    return defaultRequeue
Shouldn't this be return rotateAt ?
rotateAt is a timestamp; we need to return a duration
OK that makes sense. I (mis)understood this function when I first read it.
So if I understand this correctly now, we should be returning a recheck duration of 24 hours, unless we are already in the grace period - in which case we would be returning a 0 to immediately recheck.
Yes, that's exactly right
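The behaviour agreed on in this thread can be sketched as a small standalone function. This is an illustration, not the PR's code: `nextRequeue` and `mustParse` are hypothetical names, and computing the rotation time as the expiry minus the grace period is an assumption about how `rotateAt` is derived.

```go
package main

import (
	"fmt"
	"time"
)

// defaultRequeue matches the 24h recheck interval discussed in the thread.
const defaultRequeue = 24 * time.Hour

// nextRequeue returns how long to wait before the next reconcile:
// 0 when the credential is already inside its grace period, otherwise 24h.
// A nil or zero expiresAt means no expiry is recorded; the 24h fallback
// still applies so the controller wakes up at least once a day.
func nextRequeue(expiresAt *time.Time, gracePeriod time.Duration, now time.Time) time.Duration {
	if expiresAt == nil || expiresAt.IsZero() {
		return defaultRequeue
	}
	// Assumption: rotation becomes due gracePeriod before expiry.
	rotateAt := expiresAt.Add(-gracePeriod)
	if !now.Before(rotateAt) {
		return 0
	}
	return defaultRequeue
}

// mustParse is a helper for the example; it panics on a malformed timestamp.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2025-01-10T00:00:00Z")
	soon := mustParse("2025-01-10T12:00:00Z")
	later := mustParse("2025-03-01T00:00:00Z")
	fmt.Println(nextRequeue(&soon, 24*time.Hour, now))
	fmt.Println(nextRequeue(&later, 24*time.Hour, now))
}
```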
    // default requeue is 24h as minimal grace period is 1 day
    defaultRequeue := 24 * time.Hour
    if ac.Status.ExpiresAt == nil || ac.Status.ExpiresAt.IsZero() {
        return defaultRequeue
I'm curious - why would we want to requeue if the application credential does not expire?
I was thinking about this as an ultimate fallback to wake the controller at least once a day, in case some error leaves the status of the CR not updated.
OK
        return "fallback" // placeholder for generating failure
    }
    s := hex.EncodeToString(b)
    return s[:n]
It's unlikely, but should we check for collisions?
We could check if an AC with the same suffix already exists, but unless we are creating millions of ACs in a short period the chance is basically zero. Or we can increase n.
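For a feel of the numbers, here is a small sketch that generates a hex suffix the same way as the reviewed snippet (random bytes, hex-encoded, truncated to n characters) and estimates the collision chance with the standard birthday approximation. The function names are hypothetical, the value of n used by the PR is not shown in this thread, and returning an error instead of the `"fallback"` placeholder is a suggestion rather than what the PR does.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"math"
)

// randomSuffix returns n random hex characters. Unlike the placeholder in the
// reviewed snippet, it reports a failure of the random source as an error.
func randomSuffix(n int) (string, error) {
	if n <= 0 {
		return "", fmt.Errorf("suffix length must be positive, got %d", n)
	}
	b := make([]byte, (n+1)/2) // two hex characters per byte
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	return hex.EncodeToString(b)[:n], nil
}

// collisionProbability approximates the chance that at least two of k
// uniformly random n-character hex suffixes are equal (birthday bound):
// p ~ 1 - exp(-k(k-1) / (2 * 16^n)).
func collisionProbability(n, k int) float64 {
	space := math.Pow(16, float64(n))
	pairs := float64(k) * float64(k-1) / 2
	return 1 - math.Exp(-pairs/space)
}

func main() {
	s, err := randomSuffix(5)
	fmt.Println(s, err)
	for _, n := range []int{5, 8} {
		fmt.Printf("n=%d, 1000 ACs: p=%.6f\n", n, collisionProbability(n, 1000))
	}
}
```

The estimate only covers suffixes that coexist under the same name prefix, so the practical risk depends on how many credentials per user are alive at once.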
should this function be a lib-common utility in the long term?
    logger := r.GetLogger(ctx)

    // Only if user explicitly does "oc delete" do we revoke the AC
It's actually quite useful, I think, to have the delete revoke the application credential. This gives us a nice way to do a revocation.
The problem is, of course, that there will be some time between when the app cred is revoked and the new one is issued.
I wonder if there is a way to trigger a rotation without doing a delete. Could we implement a reconcileUpdate to do this instead? Then, the procedure when trying to revoke the cert would be to -
- patch the AC so that we are within the grace period. This triggers the creation of a new AC.
- Wait for the new AC to be propagated.
- Revoke the old AC by deleting it.
I'm just concerned about step 2, waiting for the new AC to be propagated, because the AC controller would need feedback that the service deployment successfully rolled out with the new credential, which would add more logic to the AC controller to watch the deployment status. I'd prefer to leave revocation as a manual step for now.
stuggi left a comment
I have not finished my initial review, but probably won't have time today to do so. will continue tomorrow, and just wanted to add what I have so far.
    // Decide if we need to create (or rotate)
    needsRotation := false
    if instance.Status.ACID == "" {
        needsRotation = true
        logger.Info("AC does not exist, creating")
    } else if r.shouldRotateNow(instance) {
        needsRotation = true
        logger.Info("AC is within grace period, rotating")
    }
let's split this out into a func
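One possible shape for the extracted function, returning both the decision and the message to log. The `acStatus` type, the explicit `now` and grace-period parameters, and `mustParse` are assumptions added to keep the sketch self-contained and testable; the log messages are the ones from the snippet above.

```go
package main

import (
	"fmt"
	"time"
)

// exampleGrace is an illustrative grace period, not a value from the PR.
const exampleGrace = 24 * time.Hour

// acStatus is a hypothetical subset of the ApplicationCredential status.
type acStatus struct {
	ACID      string
	ExpiresAt *time.Time
}

// needsRotation reports whether a credential must be created or rotated.
// The bool drives the reconcile logic; the string is only meant for logging.
func needsRotation(st acStatus, gracePeriod time.Duration, now time.Time) (bool, string) {
	if st.ACID == "" {
		return true, "AC does not exist, creating"
	}
	if st.ExpiresAt != nil && !st.ExpiresAt.IsZero() &&
		!now.Before(st.ExpiresAt.Add(-gracePeriod)) {
		return true, "AC is within grace period, rotating"
	}
	return false, ""
}

// mustParse is a helper for the example; it panics on a malformed timestamp.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2025-01-10T00:00:00Z")
	exp := mustParse("2025-01-10T06:00:00Z")
	rotate, msg := needsRotation(acStatus{ACID: "abc", ExpiresAt: &exp}, exampleGrace, now)
	fmt.Println(rotate, msg)
}
```

Passing `now` in rather than calling `time.Now()` inside keeps the decision deterministic in unit tests.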
    // Check if KeystoneAPI is ready
    keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
    if err != nil {
        logger.Info("KeystoneAPI not found, requeue", "error", err)
should set KeystoneAPIReadyCondition
        return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
    }
    if !keystoneAPI.IsReady() {
        logger.Info("KeystoneAPI not ready, requeue")
    // Ensure we have an initial ReadyCondition
    condList := condition.CreateList(
        condition.UnknownCondition(condition.ReadyCondition, condition.InitReason, condition.ReadyInitMessage),
the ReadyCondition gets auto added in Init.
should init KeystoneAPIReadyCondition and set it when checking for the keystoneapi below
condition.UnknownCondition(keystonev1.KeystoneAPIReadyCondition, condition.InitReason, keystonev1.KeystoneAPIReadyInitMessage),
I think we could just use the same condition init as in https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L111-L127 and just init it with what is used in this controller?
    }

    // Check if KeystoneAPI is ready
    keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L134-L165 is an example which sets the conditions
        logger.Info("KeystoneAPI not ready, requeue")
        return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
    }
mark the KeystoneAPIReadyCondition when we get here, https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L195
also I'd move all the above tasks into the main reconcile func, as these are also the steps done for handling deletion. Then you won't need logic to check for the keystoneapi in the cleanup method, like https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L225 where the admin client got checked before
    logger := r.GetLogger(ctx)
    adminOS, ctrlResult, err := keystonev1.GetAdminServiceClient(ctx, helperObj, keystoneAPI)
    if err != nil {
        return "", err
    if err != nil {
        logger.Error(err, "Failed to find user ID")
        instance.Status.Conditions.Set(condition.FalseCondition(
            condition.ReadyCondition,
this seems to be not the correct condition. the ReadyCondition is only set in the defer function based on the sub condition status during the reconcile
    }
    savedConditions := instance.Status.Conditions.DeepCopy()

    // Defer patch logic (skips if we are deleting)
I think we should keep the defer function, like we use in all the controllers https://github.com/openstack-k8s-operators/keystone-operator/blob/main/controllers/keystoneendpoint_controller.go#L91-L109
    keystoneAPI, err := keystonev1.GetKeystoneAPI(ctx, helperObj, instance.Namespace, nil)
    if err == nil && keystoneAPI.IsReady() {
        userID, userErr := r.findUserIDAsAdmin(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
        if userErr == nil {
I think we should try to reduce the number of nested ifs.
what if there is a userErr which is not "user not found"? is it ok to continue and remove the finalizer? shouldn't we return the actual error if it is not "user not found"?
can't we just do
    userID, userErr := r.findUserIDAsAdmin(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
    if userErr != nil {
        return ctrl.Result{}, userErr
    }
    userOS, userRes, userErr2 := keystonev1.GetUserServiceClient(ctx, helperObj, keystoneAPI, instance.Spec.UserName)
    ...
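One way to keep the cleanup path flat while still telling "user not found" apart from real failures is a sentinel error checked with `errors.Is`. Everything named below (`errUserNotFound`, `errKeystoneDown`, `lookupFunc`, `resolveUserForCleanup`) is hypothetical; how `findUserIDAsAdmin` actually reports a missing user is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only for this illustration.
var (
	errUserNotFound = errors.New("user not found")
	errKeystoneDown = errors.New("keystone unavailable")
)

// lookupFunc stands in for a user-ID lookup such as findUserIDAsAdmin.
type lookupFunc func(userName string) (string, error)

// resolveUserForCleanup decides how deletion should proceed:
//   - user found:     return the ID so the credential can be revoked
//   - user not found: nothing to revoke, the finalizer can be removed
//   - any other error: return it so the reconcile is retried
func resolveUserForCleanup(find lookupFunc, userName string) (userID string, skipRevoke bool, err error) {
	userID, err = find(userName)
	switch {
	case err == nil:
		return userID, false, nil
	case errors.Is(err, errUserNotFound):
		return "", true, nil
	default:
		return "", false, fmt.Errorf("looking up user %q: %w", userName, err)
	}
}

func main() {
	missing := func(string) (string, error) { return "", errUserNotFound }
	broken := func(string) (string, error) { return "", errKeystoneDown }

	_, skip, err := resolveUserForCleanup(missing, "barbican")
	fmt.Println(skip, err)
	_, skip, err = resolveUserForCleanup(broken, "barbican")
	fmt.Println(skip, err)
}
```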
api/v1beta1/keystoneapi.go
Outdated
            return "", fmt.Errorf("failed to get Secret/%s: %w", ospSecretName, err)
        }

        key := capitalizeFirst(userName) + "Password"
Why do you need to capitalize the username's first letter? Also, why are you hardcoding "Password"?
This was just to pick the user's password out of osp-secret. Now passwordSelector will be passed to the AC CR, so that will be used for the lookup instead
I don't get it. Can you elaborate?
The previous code assumed this password scheme in osp-secret:
BarbicanPassword: <secret_data>
CinderPassword: <secret_data>
GlancePassword: <secret_data>
…
So to pick the right field we had to:
- Take the userName ("barbican")
- Capitalize the first letter → "Barbican"
- Append "Password" → "BarbicanPassword"
and then look up that key in the secret. Of course that'd force everyone to use that convention. So now openstack-operator extracts these and passes them into the AC CR as well:
spec:
secret: # the Secret name (by default it's osp-secret)
passwordSelector: # how we extract this key, e.g. BarbicanPassword
userName: # e.g. barbican, user can customize this in control plane spec
…
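For comparison, the old convention described above can be written out as a small function, next to a resolver that prefers an explicit selector. The body of `capitalizeFirst` is a guess at what the helper did, and falling back to the legacy key when no selector is given is purely an assumption for illustration; the thread only says the selector is now passed in.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalizeFirst upper-cases the first rune of s. This is an assumed
// implementation; the PR's helper is not shown in the thread.
func capitalizeFirst(s string) string {
	if s == "" {
		return s
	}
	r, size := utf8.DecodeRuneInString(s)
	return string(unicode.ToUpper(r)) + s[size:]
}

// legacyPasswordKey reproduces the old convention: "barbican" -> "BarbicanPassword".
func legacyPasswordKey(userName string) string {
	return capitalizeFirst(userName) + "Password"
}

// resolvePasswordKey prefers the explicit selector from the CR and only
// falls back to the legacy convention when none is set (an assumption).
func resolvePasswordKey(userName, passwordSelector string) string {
	if passwordSelector != "" {
		return passwordSelector
	}
	return legacyPasswordKey(userName)
}

func main() {
	fmt.Println(resolvePasswordKey("barbican", ""))
	fmt.Println(resolvePasswordKey("barbican", "CustomBarbicanKey"))
}
```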
38cb512 to 15fcdb9
The latest update makes the controller take into consideration the custom password secret, custom service user name, and password selectors. Continuing to address other reviews.
15fcdb9 to 2d8a489
Latest update includes corrections based on some reviews (not all, will still continue), and also adds support for the Currently And also automatic revocation is disabled for now for testing; depending on what we agree, it will be enabled again.
2d8a489 to 4962009
    scope := &gophercloud.AuthScope{
        ProjectName: "service",
        DomainName: "Default",
    }
I'm very wary about hard-coding stuff here. I think that:
- there may be cases where either the service project or DomainName may be different
- we might be interested in obtaining an app cred for a non-service user or for a different project. For example, I know that there is work ongoing upstream to do manila share encryption where the user would use app creds to retrieve a barbican secret.
In any case, I think it makes sense to make this function more generic - maybe call it GetUserClient() and pass in the domain and project name. You could then add these parameters to the app cred spec, and default to "service" and "Default". This will future-proof the interface a bit.
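A sketch of the defaulting this comment proposes: project and domain come from the caller (ultimately the app cred spec) and fall back to "service" and "Default" when left empty. The `authScope` type is a local stand-in for `gophercloud.AuthScope`, and `scopeFor` is a hypothetical helper name.

```go
package main

import "fmt"

// authScope is a local stand-in for gophercloud.AuthScope, reduced to the
// two fields discussed in the review.
type authScope struct {
	ProjectName string
	DomainName  string
}

// Defaults taken from the values hard-coded in the reviewed snippet.
const (
	defaultProject = "service"
	defaultDomain  = "Default"
)

// scopeFor builds the token scope, defaulting empty inputs so existing
// service-user callers keep their current behaviour.
func scopeFor(project, domain string) authScope {
	if project == "" {
		project = defaultProject
	}
	if domain == "" {
		domain = defaultDomain
	}
	return authScope{ProjectName: project, DomainName: domain}
}

func main() {
	fmt.Printf("%+v\n", scopeFor("", ""))
	fmt.Printf("%+v\n", scopeFor("shares", "Tenants"))
}
```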
I understand, but right now we don't have any spec.ServiceProject or spec.ServiceDomain. So, that's why this serves as a means to only get a scoped token for our service.
Since FR3 is explicitly about wiring up service-account auth, I suggest we leave this helper as is for now, and add a // TODO pointing at the future work to extend the CRD with project/domain fields and refactor into a generic GetUserClient(…, project, domain…)
The domain name is actually hard coded for admin as well - https://github.com/openstack-k8s-operators/keystone-operator/blob/main/api/v1beta1/keystoneapi.go#L157
"there may be cases where either the service project or DomainName may be different"
Is this still true if we take into account only service users?
    AuthURL: authURL,
    Username: userName,
    Password: password,
    TenantName: "service",
ditto comment as above.
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
96d9dbb to 0bf30aa
0bf30aa to bf9333e
Note: We have to add kuttl test in a separate PR, after eventual bump in openstack-operator, as the
    if instance.Status.SecretName != "" {
        key := types.NamespacedName{Namespace: instance.Namespace, Name: instance.Status.SecretName}
        secret := &corev1.Secret{}
        if err := r.Get(ctx, key, secret); err == nil && controllerutil.ContainsFinalizer(secret, acSecretFinalizer) {
In theory we do not need to call controllerutil.ContainsFinalizer as the same check implicitly exists in controllerutil.RemoveFinalizer, so you don't need this in the condition
Good point, didn't realize that the "contains" check is redundant.
        if err := r.Get(ctx, key, secret); err == nil && controllerutil.ContainsFinalizer(secret, acSecretFinalizer) {
            base := secret.DeepCopy()
            controllerutil.RemoveFinalizer(secret, acSecretFinalizer)
            _ = r.Patch(ctx, secret, client.MergeFrom(base))
should we check a potential error returned from Patch?
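The two finalizer comments in this thread (no separate "contains" check, and no discarded Patch error) can be illustrated together without any Kubernetes dependency. `object`, `removeFinalizer`, `patchFunc`, `releaseSecret` and `errPatchConflict` are stand-ins invented for this sketch; `removeFinalizer` only imitates the remove-if-present behaviour the reviewers describe for the controllerutil helper.

```go
package main

import (
	"errors"
	"fmt"
)

// errPatchConflict is a hypothetical error used to simulate a failed patch.
var errPatchConflict = errors.New("conflict while patching")

// object is a minimal stand-in for a Kubernetes object with finalizers.
type object struct{ Finalizers []string }

// removeFinalizer removes the finalizer only if it is present and reports
// whether anything changed, so callers need no separate "contains" check.
func removeFinalizer(o *object, finalizer string) bool {
	kept := o.Finalizers[:0] // filter in place
	changed := false
	for _, f := range o.Finalizers {
		if f == finalizer {
			changed = true
			continue
		}
		kept = append(kept, f)
	}
	o.Finalizers = kept
	return changed
}

// patchFunc stands in for the client's Patch call.
type patchFunc func(o *object) error

// releaseSecret drops the finalizer and propagates a patch failure instead
// of discarding it; it skips the patch when nothing changed.
func releaseSecret(o *object, finalizer string, patch patchFunc) error {
	if !removeFinalizer(o, finalizer) {
		return nil
	}
	if err := patch(o); err != nil {
		return fmt.Errorf("removing finalizer %q: %w", finalizer, err)
	}
	return nil
}

func main() {
	secret := &object{Finalizers: []string{"example.org/ac-secret"}}
	failing := func(*object) error { return errPatchConflict }
	fmt.Println(releaseSecret(secret, "example.org/ac-secret", failing))
}
```

Returning the error lets the reconcile loop retry the removal instead of leaving a Secret that silently keeps its finalizer.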
    }

    // Remove finalizer from the AC CR
    if controllerutil.ContainsFinalizer(instance, finalizer) {
same here, you can attempt removing the finalizer via controllerutil.RemoveFinalizer as it removes it only if it is present.
    // createACWithName creates a new AC in Keystone
    func (r *ApplicationCredentialReconciler) createACWithName(
        logger logr.Logger,
should we get logger like:
logger := r.GetLogger(ctx)
and avoid passing it to the function? we can instead pass ctx context.Context.
    if err := helperObj.GetClient().Get(ctx, key, secret); err != nil {
        return err
    }
    if !controllerutil.ContainsFinalizer(secret, acSecretFinalizer) {
same here, you can if controllerutil.AddFinalizer(secret ....) { directly (see [1])
        return defaultRequeue
    }

    // needsRotation returns (shouldRotate, logMessage)
maybe here we can explain better the returned parameters.
In addition I'm not sure you need both bool and string.
The string seems something that feeds the logger, while the boolean is what you need to process for real as a returned value, right? In any case, I'm not asking to change this function, I was just trying to see it in the picture of the whole flow.
I added a better description of the function. And you are right, the boolean is the return value that drives the logic, and the string serves for logging.
    }

    // getUserIDFromToken extracts the user ID from the authenticated token
    func (r *ApplicationCredentialReconciler) getUserIDFromToken(_ logr.Logger, identClient *gophercloud.ServiceClient, username string) (string, error) {
do you need logger here?
docs/applicationcredentials.md
Outdated
    // Fetch AC data directly from the Secret
    acData, err := keystonev1.GetApplicationCredentialFromSecret(
        ctx, client, namespace, "barbican")
should this be, instead of "barbican", secretName that we got on L190 (and therefore ac-barbican-secret)?
bf9333e to c053c03
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
c053c03 to dc4f9c7
5d735e2 to 27792b7
21f9389 to 59c1e57
83054d6 to bcad4d2
Signed-off-by: Veronika Fisarova <[email protected]>
bcad4d2 to 94de270
@Deydra71: The following test failed, say
Jira: OSPRH-14737
This PR introduces a new ApplicationCredential (AC) controller in the keystone-operator. It watches ApplicationCredential custom resources and performs these actions:
expirationDays and gracePeriodDays:
Additionally:
KeystoneAPI resource to be Ready before proceeding with AC operations
Notes:
ApplicationCredential resource are not automatically installed yet. These must be applied manually until openstack-operator integration is complete
To apply rbac permissions run oc edit clusterrole keystone-operator-manager-role and add:
Example AC CR for barbican service user: