-
Notifications
You must be signed in to change notification settings - Fork 218
OCPBUGS-54745: status: Conditionally add CRDs to relatedObjects #1217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-54745: status: Conditionally add CRDs to relatedObjects #1217
Conversation
|
@Miciah: This pull request references Jira Issue OCPBUGS-54745, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@Miciah: This pull request references Jira Issue OCPBUGS-54745, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/assign @alebedev87 |
| ) | ||
|
|
||
| if err := r.cache.Get(ctx, gatewaysResourceNamespacedName, &crd); err != nil { | ||
| if !errors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it matter much if there was a prior resource, then it was deleted? E.g. someone removed the Istio or other CRDs we're checking here - do we then need to remove these from relatedObjects? Or is it okay because this is reconciled periodically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The watch on CRDs should trigger reconciliation if the istios CRD is deleted:
cluster-ingress-operator/pkg/operator/controller/status/controller.go
Lines 135 to 143 in 0fa1281
| if err := c.Watch(source.Kind[client.Object](operatorCache, &apiextensionsv1.CustomResourceDefinition{}, handler.EnqueueRequestsFromMapFunc(toDefaultIngressController), predicate.Funcs{ | |
| CreateFunc: func(e event.CreateEvent) bool { | |
| return relatedObjectsCRDs.Has(e.Object.GetName()) | |
| }, | |
| UpdateFunc: func(e event.UpdateEvent) bool { | |
| return false | |
| }, | |
| DeleteFunc: func(e event.DeleteEvent) bool { | |
| return relatedObjectsCRDs.Has(e.Object.GetName()) |
getOperatorState intializes operatorState leaving haveIstiosResource false and only sets it to true if a Get on the istios CRD returns a nil error value, meaning the CRD is present:
cluster-ingress-operator/pkg/operator/controller/status/controller.go
Lines 354 to 355 in 0fa1281
| func (r *reconciler) getOperatorState(ctx context.Context, ingressNamespace, canaryNamespace string) (operatorState, error) { | |
| state := operatorState{} |
cluster-ingress-operator/pkg/operator/controller/status/controller.go
Lines 415 to 421 in 0fa1281
| if err := r.cache.Get(ctx, istiosResourceNamespacedName, &crd); err != nil { | |
| if !errors.IsNotFound(err) { | |
| return state, fmt.Errorf("failed to get CRD %q: %v", istiosResourceName, err) | |
| } | |
| } else { | |
| state.haveIstiosResource = true | |
| } |
Then Reconcile only adds istios to relatedObjects if haveIstiosResource is true:
cluster-ingress-operator/pkg/operator/controller/status/controller.go
Lines 255 to 260 in 0fa1281
| if state.haveIstiosResource { | |
| related = append(related, configv1.ObjectReference{ | |
| Group: sailv1.GroupVersion.Group, | |
| Resource: "istios", | |
| }) | |
| } |
So we should be fine.
|
Cluster install failure |
|
/test all |
|
/retest-required |
|
E2E Test failure: found [{Group: Resource:namespaces Namespace: Name:openshift-ingress-operator} |
0fa1281 to
c7db35d
Compare
|
https://github.com/openshift/cluster-ingress-operator/compare/0fa1281ffd86f48464e2f09c302f47f41bc4a8d6..c7db35d125f750f799fe98bf48a850089b1b09e2 makes two changes:
|
|
/test e2e-aws-gatewayapi |
|
/retest |
|
/retest-required |
test/e2e/operator_test.go
Outdated
| expected = append(expected, configv1.ObjectReference{ | ||
| Group: "gateway.networking.k8s.io", | ||
| Resource: "gateways", | ||
| Namespace: "openshift-ingress", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the namespace be blank here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly. My thinking was that we were telling people to create gateways in the "openshift-ingress" namespace, so the platform-managed ones should be there, we're allowing gateways in other namespaces as well (we just won't manage DNS for them). Based on that, I suppose it makes sense to leave the namespace blank.
| }) | ||
| // This test runs before TestGatewayAPI, so we do *not* expect | ||
| // to see subscriptions, istios, or gateways in relatedObjects. | ||
| if gatewayAPIControllerEnabled, err := isFeatureGateEnabled(features.FeatureGateGatewayAPIController); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the test output, gateways is never appended to expected. Does anything enable FeatureGateGatewayAPIController before this test is run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The featuregate should be enabled in the techpreview jobs and not currently enabled in any other jobs.
07e8fef to
4ca2f3e
Compare
|
https://github.com/openshift/cluster-ingress-operator/compare/c7db35d125f750f799fe98bf48a850089b1b09e2..4ca2f3ee05bcafa57afef7ab12e18e29f934db4b makes these changes:
|
4ca2f3e to
570e600
Compare
|
@rhamini3: The specified target(s) for The following commands are available to trigger optional jobs: Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Valid related objects test is now passing with the proposed changes in AWS Tech Preview |
|
@Miciah: This pull request references Jira Issue OCPBUGS-54745, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Omit the namespace for the istios resource in relatedObjects as the istios resource is cluster-scoped. * pkg/operator/controller/status/controller.go (Reconcile): Remove the namespace for istios.
Omit the namespace for the gateways resource in relatedObjects as Istio manages gateways in all namespaces. * pkg/operator/controller/status/controller.go (Reconcile): Remove the namespace for gateways.
* pkg/operator/controller/status/controller.go (Reconcile): Pass ctx to getOperatorState. * pkg/operator/controller/status/controller.go (getOperatorState): Add a parameter for ctx, and use it instead of context.TODO().
1c68435 to
98b6246
Compare
|
@Miciah: This pull request references Jira Issue OCPBUGS-54745, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
98b6246 to
f304027
Compare
Check whether the gatewayclasses, gateways, and istios CRDs actually exist before adding them to relatedObjects. Watch customresourcedefinitions in the status controller so that it updates relatedObjects as these CRDs are created. Check the "GatewayAPIController" featuregate to determine whether to add the gatewayclasses, gateways, istios, and subscriptions resources to relatedObjects, in addition to checking the "GatewayAPI" featuregate. Before this change, the operator could add istios to relatedObjects even if the OSSM subscription failed to install. By convention, an operator should only add resources to relatedObjects if those resources exist. This commit fixes OCPBUGS-54745. https://issues.redhat.com/browse/OCPBUGS-54745 * pkg/operator/controller/status/controller.go (gatewaysResourceName, gatewayclassesResourceName, istiosResourceName): New consts for the CRD names. (relatedObjectsCRDs): New var for a string set that contains gatewaysResourceName, gatewayclassesResourceName, and istiosResourceName. (New): Check the GatewayAPIControllerEnabled field in the controller config in addition to checking GatewayAPIEnabled to determine whether to watch subscriptions and customresourcedefinitions. Add a watch on customresourcedefinitions, with a predicate for CRDs with names that are in relatedObjectsCRDs. (Config): Add GatewayAPIControllerEnabled. (Reconcile): Check the GatewayAPIControllerEnabled field in the controller config as well as the haveIstiosResource, haveGatewayclassesResource, and haveGatewaysResource fields in the operatorState object, and conditionally add the corresponding resources to relatedObjects. (operatorState): Add haveIstiosResource, haveGatewaysResource, and haveGatewayclassesResource fields. (getOperatorState): Check GatewayAPIControllerEnabled in addition to GatewayAPIEnabled before checking for the OSSM subscription. Set haveGatewaysResource, haveGatewayclassesResource, and haveIstiosResource. * pkg/operator/operator.go (New): Specify GatewayAPIControllerEnabled in the status controller config. * test/e2e/operator_test.go (TestClusterOperatorStatusRelatedObjects): Expect to see "gateways" and "gatewayclasses" in relatedObjects if the "GatewayAPI" and "GatewayAPIController" featuregates are enabled.
f304027 to
6f42089
Compare
|
/hold cancel |
|
@Miciah: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| if err := r.cache.Get(ctx, gatewayclassesResourceNamespacedName, &crd); err != nil { | ||
| if !errors.IsNotFound(err) { | ||
| return state, fmt.Errorf("failed to get CRD %q: %v", gatewayclassesResourceName, err) | ||
| } | ||
| } else { | ||
| state.haveGatewayclassesResource = true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GatewayClasses and Gateways can be created even if OLM capabilities are not present. Should we move them up? Out of if r.config.GatewayAPIControllerEnabled && r.config.MarketplaceEnabled && r.config.OperatorLifecycleManagerEnabled condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone back and forth on this matter, and I think there are reasonable arguments either way. In my most recent update, I have conditionalized adding gatewayclasses and gateways to relatedObjects with the condition r.config.GatewayAPIControllerEnabled && r.config.MarketplaceEnabled && r.config.OperatorLifecycleManagerEnabled on the grounds that we are only adding these resources to relatedObjects as part of, and in support of, the controller feature. That is, if we didn't provide our own controller and were only providing the CRDs, then we wouldn't bother gathering gatewayclasses or gateways; we only need these resources in order to diagnose issues with the gateway controller.
If the operator could specify gateways and gatewayclasses by controller name in the relatedObjects specification, I would consider doing that. As it is, the operator would need to check the controller name on each gatewayclass and then check the gatewayclass name on each gateway in order to add each individual gatewayclass or gateway that our controller managed, and I don't think that that would be appropriate or worth the effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, if we didn't provide our own controller and were only providing the CRDs, then we wouldn't bother gathering gatewayclasses or gateways; we only need these resources in order to diagnose issues with the gateway controller.
Ack.
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alebedev87 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/jira refresh |
|
@Miciah: This pull request references Jira Issue OCPBUGS-54745, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@Miciah: Jira Issue OCPBUGS-54745: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-54745 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[ART PR BUILD NOTIFIER] Distgit: ose-cluster-ingress-operator |
|
@Miciah: Jira Issue OCPBUGS-54745 is in an unrecognized state (Verified) and will not be moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
status: Omit istios namespace in
relatedObjectsOmit the namespace for the istios resource in
relatedObjectsas the istios resource is cluster-scoped.status: Omit gateways namespace in
relatedObjectsOmit the namespace for the gateways resource in
relatedObjectsas Istio manages gateways in all namespaces.getOperatorState: AddcontextparameterPass
ctxfrom the controller'sReconcilemethod togetOperatorState, and use it instead ofcontext.TODO().status: Conditionally add CRDs to
relatedObjectsCheck whether the gatewayclasses, gateways, and istios CRDs actually exist before adding them to
relatedObjects.Watch customresourcedefinitions in the status controller so that it updates
relatedObjectsas these CRDs are created.Check the "GatewayAPIController" featuregate to determine whether to add the gatewayclasses, gateways, istios, and subscriptions resources to relatedObjects, in addition to checking the "GatewayAPI" featuregate.
Before this change, the operator could add istios to
relatedObjectseven if the OSSM subscription failed to install. By convention, an operator should only add resources torelatedObjectsif those resources exist.