
OBC Controller and Reconcile#520

Open
shirady wants to merge 17 commits intored-hat-storage:mainfrom
shirady:obc-controller-implementation

Conversation


@shirady shirady commented Feb 25, 2026

Describe the problem

Part of RHSTOR-6230
Continuation of PR red-hat-storage/ocs-operator#3692: calls Notify when an OBC is created or deleted.

Explain the changes

  1. Register the OBC and OB types.
    Note: to avoid a direct import of lib-bucket-provisioner, I had to add them manually.
    If we decide to import it directly, it can be done in the same style as the others.
  2. OBCs can be created in any namespace, so add the changes needed to support that.
  3. Create the OBC controller, which runs on the client cluster (the one that has neither NooBaa nor the OBC controller from lib-bucket-provisioner installed). The controller uses a predicate so the reconcile loop is triggered only on specific changes: creation, deletion, and updates that set the deletion timestamp.
  4. On creation, add a finalizer and call Notify.
    Notes:
  • On failures, change the status to Failed.
  • Decided at this point not to change it to Pending.
  • To identify the storage client (as there can be several storage clients in the client cluster), we use the ownerReferences from the storage class.
  5. On deletion, call Notify and remove the finalizer.
  6. Add permissions to the controller for the above actions.
  7. Changes to RBAC after running make manifests and make bundle.
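
The trigger conditions in item 3 can be sketched as a plain predicate function. This is illustrative Go using hypothetical stand-in types, not the actual controller-runtime predicate from this PR:

```go
package main

import "fmt"

// objMeta is a hypothetical stand-in for the object metadata the predicate
// inspects; in the real controller this would be metav1.ObjectMeta, where an
// unset deletion timestamp is a nil pointer (here: an empty string).
type objMeta struct {
	deletionTimestamp string
}

// reconcileOnUpdate mirrors the predicate described in item 3: among update
// events, only the transition that newly sets the deletion timestamp should
// trigger the reconcile loop (create and delete events always trigger it).
func reconcileOnUpdate(oldObj, newObj objMeta) bool {
	return oldObj.deletionTimestamp == "" && newObj.deletionTimestamp != ""
}

func main() {
	fmt.Println(reconcileOnUpdate(objMeta{}, objMeta{deletionTimestamp: "2026-02-25T13:58:00Z"})) // deletion started: true
	fmt.Println(reconcileOnUpdate(objMeta{}, objMeta{}))                                          // spec-only update: false
}
```

In the real controller this logic would live in a predicate.Funcs UpdateFunc, with CreateFunc and DeleteFunc returning true unconditionally.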

Testing Instructions:

Manual Test:

  1. Set up two clusters, provider and client, after onboarding. The client cluster runs this PR's code changes, and the provider cluster runs the code changes from Implement Notify - OBC Created and Deleted ocs-operator#3692 (the Notify implementation).
  2. On the client cluster, install the OBC CRD manually from noobaa-operator repo: kubectl apply -f ./noobaa-operator/deploy/obc/objectbucket.io_objectbucketclaims_crd.yaml
  3. Set the permission manually:
kubectl create clusterrole ocs-client-operator-obc-role \
  --verb=get,list,watch,patch,update \
  --resource=objectbucketclaims.objectbucket.io

# Add the status subresource rule
kubectl patch clusterrole ocs-client-operator-obc-role --type='json' -p='[
  {"op": "add", "path": "/rules/-", "value": {
    "apiGroups": ["objectbucket.io"],
    "resources": ["objectbucketclaims/status"],
    "verbs": ["get", "patch", "update"]
  }}
]'

kubectl create clusterrolebinding ocs-client-operator-obc-rolebinding \
  --clusterrole=ocs-client-operator-obc-role \
  --serviceaccount=openshift-storage-client:ocs-client-operator-controller-manager
  4. Manually add a storage class with the ownerReferences of the storage client (you can copy them from another storage class on the cluster) and a temporary finalizer (otherwise it is deleted).
    Example storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openshift-storage.noobaa.io
  finalizers:
  - ocs-client-operator.ocs.openshift.io/temp
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: StorageClient
    name: storage-client-2202
    uid: 841f7ac8-17cb-4d34-8ee2-7c6ebaf82103
parameters:
  bucketclass: noobaa-default-bucket-class
provisioner: openshift-storage.noobaa.io/obc
reclaimPolicy: Delete
volumeBindingMode: Immediate
  5. Creation:
  • Client cluster: create an OBC on the client cluster; it should be created without a status change.
    Example of OBC:
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: shira-obc-2302
  namespace: openshift-storage-client
spec:
  storageClassName: openshift-storage.noobaa.io
  generateBucketName: shira-obc-2302
  additionalConfig: {}

Note: an OBC can be created in any namespace. If you want to test in another namespace, create it (kubectl create ns <namespacename>) and change the namespace value in the YAML.

  • Provider cluster: check that the OBC on the provider cluster is created and in the Bound phase.
  6. Deletion:
    Since we do not have the copy mechanism implemented yet, we test only OBC deletion (without the additional resources):
  • Manual steps:
    a. Install the OB CRD (kubectl apply -f noobaa-operator/deploy/obc/objectbucket.io_objectbuckets_crd.yaml)
    b. Change the ClusterRole (use the AI-generated file from the kubebuilder markers).
  • Client cluster: delete the OBC on the client cluster; it should be deleted.
  • Provider cluster: check that the OBC was deleted (with all related resources).
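
The storage-client lookup behind the ownerReferences step above can be sketched as follows. The types are simplified stand-ins for metav1.OwnerReference, not the operator's actual code:

```go
package main

import "fmt"

// ownerRef is a minimal stand-in for metav1.OwnerReference.
type ownerRef struct {
	Kind string
	Name string
}

// storageClientOwner scans the storage class's ownerReferences for the owning
// StorageClient, which tells the controller which of possibly several storage
// clients on the cluster the OBC belongs to.
func storageClientOwner(refs []ownerRef) (string, error) {
	for _, r := range refs {
		if r.Kind == "StorageClient" {
			return r.Name, nil
		}
	}
	return "", fmt.Errorf("storage class has no StorageClient ownerReference")
}

func main() {
	refs := []ownerRef{{Kind: "StorageClient", Name: "storage-client-2202"}}
	name, err := storageClientOwner(refs)
	fmt.Println(name, err) // storage-client-2202 <nil>
}
```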

@shirady shirady marked this pull request as draft February 25, 2026 13:58

openshift-ci bot commented Feb 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shirady
Once this PR has been reviewed and has the lgtm label, please assign nb-ohad for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


leelavg commented Feb 26, 2026

install the OBC CRD manually from noobaa-operator repo

  • unless we finalize how we are bringing in these CRDs, we can't have this PR in its current format. Can there be a dynamic caching mechanism for this controller, so that when the PR is merged and client-op is deployed standalone, even without the OBC CRD, client-op will not error out and will keep serving existing cases?

manually with a temp finalizer (else it is deleted).

  • why? it gets deleted if the ownerref is wrong or the server doesn't send this resource? I suppose it's the latter and may need a fix, good find.

@leelavg leelavg left a comment

Didn't review the controller yet as the PR is in draft.

Comment on lines +235 to +237
&nbv1.ObjectBucketClaim{}: {
Namespaces: map[string]cache.Config{corev1.NamespaceAll: {}},
},
Contributor

Even the returned OB/Secret/ConfigMap should be watched in other namespaces, shouldn't they?

Contributor Author

The Secret and ConfigMap should be watched in other namespaces as well, yes - I added the change.
OB is cluster-scoped, so I didn't change it.
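
The resulting cache configuration might then look roughly like the fragment below. This is a sketch assuming controller-runtime's cache.Options/ByObject API and the aliases from the quoted diff; since OB is cluster-scoped, it needs no per-namespace entry:

```go
// Sketch only: extend the quoted ByObject map so the Secret and ConfigMap
// returned for an OBC are cached across all namespaces, like the OBC itself.
ByObject: map[client.Object]cache.ByObject{
	&nbv1.ObjectBucketClaim{}: {Namespaces: map[string]cache.Config{corev1.NamespaceAll: {}}},
	&corev1.Secret{}:          {Namespaces: map[string]cache.Config{corev1.NamespaceAll: {}}},
	&corev1.ConfigMap{}:       {Namespaces: map[string]cache.Config{corev1.NamespaceAll: {}}},
},
```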

@shirady shirady force-pushed the obc-controller-implementation branch from a8fa7f2 to d7c518a Compare February 26, 2026 11:59

shirady commented Feb 26, 2026

install the OBC CRD manually from noobaa-operator repo

  • unless we finalize how we are bringing in these CRDs, we can't have this PR in its current format. Can there be a dynamic caching mechanism for this controller, so that when the PR is merged and client-op is deployed standalone, even without the OBC CRD, client-op will not error out and will keep serving existing cases?

From what I understand from @nb-ohad, for testing we can install the CRD that we need.
If you think we need to add something specific for the general case, we can add it.
The quote was taken from the manual testing that was done.

manually with a temp finalizer (else it is deleted).

  • why? it gets deleted if the ownerref is wrong or the server doesn't send this resource? I suppose it's the latter and may need a fix, good find.

Yes, you are correct. We would need to make changes. As we didn't want to be blocked at this point, it was suggested that we add the finalizer for testing purposes.

@shirady shirady force-pushed the obc-controller-implementation branch from 159612a to 622dfff Compare March 4, 2026 08:25
@shirady shirady force-pushed the obc-controller-implementation branch 3 times, most recently from e1d0de3 to 2f407e5 Compare March 4, 2026 09:05
@shirady shirady marked this pull request as ready for review March 4, 2026 10:37

shirady commented Mar 4, 2026

@leelavg @rewantsoni, Please take a look

if err := r.Get(r.ctx, types.NamespacedName{Name: ownerRef.Name}, storageClient); err != nil {
return nil, fmt.Errorf("get StorageClient %q (owner of StorageClass %q): %w", ownerRef.Name, storageClassName, err)
}
if storageClient.Status.ConsumerID == "" || storageClient.Spec.StorageProviderEndpoint == "" {
Member

Maybe check if storageClient is onboarded or not. If onboarding succeeded, we should have the consumerUID and the storageProviderEndpoint as well

Contributor Author

Do you mean to check whether the phase is StorageClientConnected?
But that is not the same as checking those fields, from what I see.
For example, when there was an issue in the cluster:

  • phase: Initializing
  • oc get storageclient storage-client-2202 -o json | jq .spec.storageProviderEndpoint (outputs a value)
  • oc get storageclient storage-client-2202 -o json | jq .status.consumerID (outputs null)

The storage client was successfully onboarded in the past.
If I check the phase, OBCs could not be created in this state...

Member

If the storageClient is not onboarded, we shouldn't be performing any calls.

Contributor Author

The storageClient is onboarded.
In this example, an unrelated issue caused the phase to change to Initializing.
@rewantsoni Would you like me to remove this condition, or do you have another suggestion?

Member

And I don't think we should be having this check here; this should only return the Storage Client

Member

I am trying to think of a case where the Client will be Connected, but the consumer UID will be empty 🤔

Member

Probably would not happen, it is set as the last step in onboarding.

shirady added 4 commits March 4, 2026 17:19
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
@shirady shirady force-pushed the obc-controller-implementation branch from 2f407e5 to d45478a Compare March 4, 2026 16:06
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
@shirady shirady force-pushed the obc-controller-implementation branch from d45478a to 5d62d88 Compare March 5, 2026 07:31
shirady added 4 commits March 5, 2026 10:31
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
shirady added 2 commits March 5, 2026 13:22
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
@shirady shirady changed the title RHSTOR-6230 | OBC Controller and Reconcile OBC Controller and Reconcile Mar 8, 2026
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
"sigs.k8s.io/controller-runtime/pkg/predicate"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
Member

Please format the imports as

standard
pkg
others

Contributor Author

Thanks, I'll look again.
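
The requested grouping would look roughly like this; the package paths in the second group are illustrative, not taken from the PR:

```go
import (
	// standard
	"context"
	"fmt"

	// pkg (this repository; path is illustrative)
	"github.com/red-hat-storage/ocs-client-operator/pkg/utils"

	// others
	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)
```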

Comment on lines +69 to +70
err := r.Get(r.ctx, req.NamespacedName, obc)
if err != nil {
Member

Could you use inline errors?

Contributor Author

Yes, thanks.

if err != nil {
r.log.Error(err, "failed to get StorageClient for OBC create")
obc.Status.Phase = ObjectBucketClaimStatusPhaseFailed
if statusErr := r.Client.Status().Update(r.ctx, obc); statusErr != nil {
Member

It would be good to have status updates in a single place. Can you split reconcile into reconcile and reconcilePhases, and update the status towards the end of the reconcile, similar to what we are doing for the storageclient controller?

Contributor Author

I tried to refactor it so we have the reconcile function, a reconcilePhases function (with more functions called from it), and the status update toward the end of the reconcile.

My only concern with this change is a race condition between the status update and the "Bound" phase that is patched from outside.
Therefore, I save the incoming status and call the update only if it changed (and it changes to Failed only in the creation/update flow).

if err := r.Get(r.ctx, types.NamespacedName{Name: ownerRef.Name}, storageClient); err != nil {
return nil, fmt.Errorf("get StorageClient %q (owner of StorageClass %q): %w", ownerRef.Name, storageClassName, err)
}
if storageClient.Status.ConsumerID == "" || storageClient.Spec.StorageProviderEndpoint == "" {
Member

And I don't think we should be having this check here; this should only return the Storage Client

shirady added 6 commits March 9, 2026 08:16
…rEndpoint

Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
like in storage client controller

Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>