-
Notifications
You must be signed in to change notification settings - Fork 218
NE-1934: Bump to OSSM 3.0 for Gateway API support #1152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NE-1934: Bump to OSSM 3.0 for Gateway API support #1152
Conversation
|
Skipping CI for Draft Pull Request. |
|
/test e2e-aws-gatewayapi |
9af78a0 to
1ceec30
Compare
|
/test e2e-aws-gatewayapi |
1ceec30 to
53ec66e
Compare
|
/test e2e-aws-gatewayapi |
|
@Miciah Just fyi - this PR doesn't update the CRDs so this is still running v1beta1 Gateway API CRDs. |
|
/assign @candita, @alebedev87 |
|
/assign @candita @alebedev87 |
53ec66e to
fbba0d2
Compare
|
/test e2e-aws-gatewayapi |
| Channel: "stable", | ||
| Channel: "1.1-nightly", //"stable", | ||
| InstallPlanApproval: operatorsv1alpha1.ApprovalAutomatic, | ||
| Package: "servicemeshoperator", | ||
| CatalogSource: "redhat-operators", | ||
| Package: "sailoperator", //"servicemeshoperator3", | ||
| CatalogSource: "community-operators", //"redhat-operators", | ||
| CatalogSourceNamespace: "openshift-marketplace", | ||
| StartingCSV: "sailoperator.v1.1.0-nightly-2025-03-05", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OSSM 3.0.0TP2 still has the Istio v1alpha1 CRD, so I am using a Sail Operator nightly to get the Istio v1 CRD. I will update this subscription once OSSM 3.0 is GA.
fbba0d2 to
3e738c1
Compare
|
I forgot to /test e2e-aws-gatewayapi |
3e738c1 to
4e54863
Compare
|
/test e2e-aws-gatewayapi |
|
I believe that the tests are failing because automated deployment/sidecar injection is not working with Sail Operator. |
|
/test e2e-aws-gatewayapi |
|
e2e-aws-gatewayapi failed because /test e2e-aws-gatewayapi |
e76ab75 to
f5b8619
Compare
|
You may be able to ditch that |
When reconciling gatewayclasses in order to ensure the Istio CR, make sure to use the oldest gatewayclass that has our controller name. Before this commit, the controller behaved inconsistently: - The controller's watch on gatewayclasses reconciled the specific gatewayclass that triggered the watch. - The controller's watches on subscriptions, installplans, and istios reconciled the "openshift-default" gatewayclass. This inconsistency caused several problems: - If a gatewayclass existed with our controller name but none existed with the name "openshift-default", the controller logged spurious reconcile requests for, and failures to get, the "openshift-default" gatewayclass. - If multiple gatewayclasses existed with our controller name, the controller tried to update the owner reference on the Istio CR every time the controller reconciled a gatewayclass other than the one in the owner reference. Using the oldest gatewayclass solves these problems. * pkg/operator/controller/gatewayclass/controller.go (enqueueRequestForDefaultGatewayClassController): Rename... (enqueueRequestForSomeGatewayClass): ...to this. Delete the unused namespace parameter. Look up the oldest gatewayclass with our controller name, and return a reconcile request for that gatewayclass. (NewUnmanaged, Reconcile): Use enqueueRequestForSomeGatewayClass for all watches.
Shorten the name of the gatewayclass that the conformance tests use in order to
avoid errors creating the service, which is named based on the gateway's and
gatewayclasses's names:
Invalid value: "http-listener-isolation-with-hostname-intersection-gateway-conformance": must be no more than 63 characters
This failure is from the GatewayHTTPListenerIsolation test. We do not currently
run this test by default. This commit shortens the gatewayclass name in
preparation for enabling the test in the future.
* hack/gatewayapi-conformance.sh: Shorten the gatewayclass name.
8055a97 to
25dde98
Compare
|
https://github.com/openshift/cluster-ingress-operator/compare/8055a976678a3377646399d0110f8cbe4d0e9144..25dde982b0f91fe5bed88fc44da904d920f03d2a addresses a couple of review comments that I missed before:
/test e2e-aws-gatewayapi-conformance |
shaneutt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the follow-ups that were already created, I think things are looking good here.
| git clone --branch "${BRANCH}" https://github.com/kubernetes-sigs/gateway-api | ||
| cd gateway-api | ||
|
|
||
| if [[ "$BUNDLE_VERSION" = "v1.2.1" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically you're going to want to cover anything in v1.2.x. Hopefully we don't have any more patches in v1.2, but you never know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I hadn't seen that the proposed backport at kubernetes-sigs/gateway-api#3688 had been closed.
However, I'm ambivalent about changing this, knowing that there's still a chance that there could be a v1.2.2 with the patch, or a v1.2.2 with some other change that causes the cherry-pick to fail.
If we bump to v1.2.2+ and find that the cherry-pick is still needed, we can update this test then with the knowledge that it remains applicable and necessary.
| cd gateway-api | ||
|
|
||
| if [[ "$BUNDLE_VERSION" = "v1.2.1" ]]; then | ||
| echo "Cherry-picking fix for CoreDNS deployment issue in Gateway API v1.2.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| echo "Cherry-picking fix for CoreDNS deployment issue in Gateway API v1.2.1" | |
| // This is a temporary workaround until Gateway API v1.3.x. | |
| // See: https://github.com/kubernetes-sigs/gateway-api/pull/3389 | |
| echo "Cherry-picking fix for CoreDNS deployment issue in Gateway API v1.2.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I push another change, I'll include this comment. Otherwise, I'd rather not burn the CI cycles. It's easy enough to look up the PR from the commit.
| go test ./conformance -v -timeout 10m -run TestConformance -args --supported-features=${SUPPORTED_FEATURES} | ||
| go test ./conformance -v -timeout 10m -run TestConformance -args "--supported-features=${SUPPORTED_FEATURES}" "--gateway-class=${GATEWAYCLASS_NAME}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THERE we go 👍
| for i := range gateways.Items { | ||
| if gateways.Items[i].Labels[key] == value { | ||
| continue | ||
| } | ||
| if string(gateways.Items[i].Spec.GatewayClassName) != o.GetName() { | ||
| continue | ||
| } | ||
| request := reconcile.Request{ | ||
| NamespacedName: types.NamespacedName{ | ||
| Namespace: gateways.Items[i].Namespace, | ||
| Name: gateways.Items[i].Name, | ||
| }, | ||
| } | ||
| requests = append(requests, request) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| for i := range gateways.Items { | |
| if gateways.Items[i].Labels[key] == value { | |
| continue | |
| } | |
| if string(gateways.Items[i].Spec.GatewayClassName) != o.GetName() { | |
| continue | |
| } | |
| request := reconcile.Request{ | |
| NamespacedName: types.NamespacedName{ | |
| Namespace: gateways.Items[i].Namespace, | |
| Name: gateways.Items[i].Name, | |
| }, | |
| } | |
| requests = append(requests, request) | |
| } | |
| for _, gateway := range gateways.Items { | |
| if gateway.Labels[key] == value { | |
| continue | |
| } | |
| if string(gateway.Spec.GatewayClassName) != o.GetName() { | |
| continue | |
| } | |
| request := reconcile.Request{ | |
| NamespacedName: types.NamespacedName{ | |
| Namespace: gateway.Namespace, | |
| Name: gateway.Name, | |
| }, | |
| } | |
| requests = append(requests, request) | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been tending towards using the index to avoid copying large values in the case of struct types, especially in hot loops or with large slices. For example, using the index can yield a significant performance improvement when iterating over thousands of routes. For the code in question, this might not be a particularly hot loop, and there probably aren't thousands or even hundreds of gateways, but I still prefer not to copy the gateway if we only need a few of its fields.
|
Hopefully this was a transient error.
/test e2e-aws-gatewayapi |
|
/retest |
|
/retest-required |
|
/retest |
|
/retest-required |
1 similar comment
|
/retest-required |
|
An infrastructure issue (error creating role assignments because we hit a limit) is causing at least the hypershift-e2e-aks failures. I'm not sure about e2e-hypershift. The issue is being mitigated (by deleting some old role assignments). /retest-required |
|
I believe the e2e-hypershift test failures were caused by OCPBUGS-53462. The hypershift-e2e-aks job failed again. This time I don't see errors about role assignments, but /test hypershift-e2e-aks |
|
/approve |
|
/hold |
|
Sorry, I'm seeing things. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: candita The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
|
/label qe-approved |
|
@Miciah: This pull request references NE-1934 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/label acknowledge-critical-fixes-only Per TRT, "Feature gated code is free to use this label." |
|
@Miciah: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
[ART PR BUILD NOTIFIER] Distgit: ose-cluster-ingress-operator |
With OSSM 3, the Maistra Istio Operator is replaced with new operator based on the upstream Sail Operator, and the ServiceMeshControlPlane CRD is replaced by the Istio CRD. Vendor the sail-operator API:
Note that vendoring sail-operator requires the mergo override.
OSSM 3.0 is based on Istio 1.24, which supports Gateway API v1.2.1. Copy in the updated Gateway API CRDs:
Update the conformance tests to reflect the features that Istio 1.24 supports.
Update the gatewayclass controller to create a subscription for the OSSM 3 Operator (which is in a separate channel from the OSSM 2 operator) and to create an Istio CR instead of a ServiceMeshControlPlane CR.
Currently, OSSM 3.0 is Tech Preview, so the subscription is actually for the latest Sail Operator nightly build.
Update client initialization, RBAC, and tests that used the Maistra APIs from OSSM 2 to use the sail-operator API for OSSM 3.