-
Notifications
You must be signed in to change notification settings - Fork 218
NE-1994: Add E2E test for Istio manual deployment #1204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NE-1994: Add E2E test for Istio manual deployment #1204
Conversation
|
@Miciah: This pull request references NE-1994 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
567bd5e to
1f0444b
Compare
|
/label acknowledge-critical-fixes-only Per TRT, "Feature gated code is free to use this label." |
|
@Miciah: This pull request references NE-1994 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/assign |
test/e2e/gateway_api_test.go
Outdated
| // enabled. When manual deployment is enabled, then Istio allows a gateway use | ||
| // to another gateway's service by specifying that gateway's service in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| // enabled. When manual deployment is enabled, then Istio allows a gateway use | |
| // to another gateway's service by specifying that gateway's service in | |
| // enabled. When manual deployment is enabled, then Istio allows a gateway to use | |
| // another gateway's service by specifying that gateway's service in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/e2e/gateway_api_test.go
Outdated
| // Istio to provision a service for this specific gateway, even if it specifies | ||
| // spec.addresses. | ||
| func testGatewayAPIManualDeployment(t *testing.T) { | ||
| gatewayClass, err := createGatewayClass("openshift-default", "openshift.io/gateway-controller") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| gatewayClass, err := createGatewayClass("openshift-default", "openshift.io/gateway-controller") | |
| gatewayClass, err := createGatewayClass("openshift-default", "openshift.io/gateway-controller/v1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very odd - I don't understand how the test works without the correct controllerName.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't rerun the tests since #1202 merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/e2e/gateway_api_test.go
Outdated
| t.Logf("Polling for up to %v to verify that the gateway is accepted...", timeout) | ||
| if err := wait.PollUntilContextTimeout(context.Background(), interval, timeout, false, func(context context.Context) (bool, error) { | ||
| if err := kclient.Get(context, gatewayName, &gateway); err != nil { | ||
| t.Logf("Failed to get gateway %s: %v", gatewayName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, it can be helpful to see "retrying..." in the log:
| t.Logf("Failed to get gateway %s: %v", gatewayName, err) | |
| t.Logf("Failed to get gateway %s: %v, retrying...", gatewayName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems redundant (and possibly inaccurate for the last iteration when we reach the timeout), but I can add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| t.Logf("Gateway %s not yet Accepted, retrying...", gatewayname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/e2e/gateway_api_test.go
Outdated
|
|
||
| return false, nil | ||
| }); err != nil { | ||
| t.Errorf("Failed to observe the expected condition: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| t.Errorf("Failed to observe the expected condition: %v", err) | |
| t.Errorf("Failed to find gateway %s at the expected condition: %v", gatewayname, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems redundant; do we need to include the gateway name in every log message in the test? I can add it though.
test/e2e/gateway_api_test.go
Outdated
| t.Logf("Polling for up to %v to verify that service %q is created...", timeout, serviceName) | ||
| if err := wait.PollUntilContextTimeout(context.Background(), interval, timeout, false, func(context context.Context) (bool, error) { | ||
| if err := kclient.Get(context, serviceName, &service); err != nil { | ||
| t.Logf("Failed to get service %s: %v", serviceName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| t.Logf("Failed to get service %s: %v", serviceName, err) | |
| t.Logf("Failed to get service %s: %v, retrying...", serviceName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/e2e/gateway_api_test.go
Outdated
|
|
||
| return true, nil | ||
| }); err != nil { | ||
| t.Errorf("Failed to observe the expected condition: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| t.Errorf("Failed to observe the expected condition: %v", err) | |
| t.Errorf("Istio failed to automatically provision service %q: %v", serviceName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this extrapolating, and does it add any useful information? I think it's better to keep the error message to what we actually observed and know. The preceding log lines will provide the necessary context to understand why the expected condition was not observed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the service name to the message in https://github.com/openshift/cluster-ingress-operator/compare/1f0444b996dfff19d6118aadb7309bc8e1293fa8..6734479ebfc5bbb83897bc51f7d7ca580f783e2b.
1f0444b to
6734479
Compare
|
https://github.com/openshift/cluster-ingress-operator/compare/1f0444b996dfff19d6118aadb7309bc8e1293fa8..6734479ebfc5bbb83897bc51f7d7ca580f783e2b rebases, updates the controller name per #1202, and updates some comments and strings to address review comments. |
test/e2e/gateway_api_test.go
Outdated
| } | ||
| } | ||
|
|
||
| // testGatewayAPIManualDeployment verifies that Istio's manual deployment is not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I'm struggling to understand what "Istio manual deployment" means. Is it an installation of servicemeshoperator3 OLM operator made by a third party (not CIO)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you mention it, I think this test might not be complete.
I think of manual deployment as the opposite of auto-deployment. Auto-deployment means that Istio must create a service (for a Gateway?). Manual deployment means it must NOT create the service automatically. Manual deployment is apparently configured by adding a service's address in the Gateway's spec.Addresses. We don't want manual deployment, so here we're checking that it created the service automatically even though a service is specified in the spec.Addresses.
I think to be complete, the test needs to check that the created service is not the same service specified in the spec.Addresses. But also, maybe the service that is added to spec.Addresses needs to really exist, and the test needs to make sure the service that is created is not the service added in spec.Addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of manual deployment as the opposite of auto-deployment. Auto-deployment means that Istio must create a service (for a Gateway?). Manual deployment means it must NOT create the service automatically. Manual deployment is apparently configured by adding a service's address in the Gateway's spec.Addresses. We don't want manual deployment, so here we're checking that it created the service automatically even though a service is specified in the spec.Addresses.
Got it, thank you! The test starts to make sense for me now.
I think to be complete, the test needs to check that the created service is not the same service specified in the spec.Addresses. But also, maybe the service that is added to spec.Addresses needs to really exist, and the test needs to make sure the service that is created is not the service added in spec.Addresses.
Yes, that may complete the test indeed. We can create a service and set its FQDN as Gateway.Spec.Address (type Hostname): Istio doc's example using service FQDN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of manual deployment as the opposite of auto-deployment. Auto-deployment means that Istio must create a service (for a Gateway?). Manual deployment means it must NOT create the service automatically. Manual deployment is apparently configured by adding a service's address in the Gateway's spec.Addresses. We don't want manual deployment, so here we're checking that it created the service automatically even though a service is specified in the spec.Addresses.
Right. If it helps, I can add a link to https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#manual-deployment in the code comment.
I think to be complete, the test needs to check that the created service is not the same service specified in the spec.Addresses. But also, maybe the service that is added to spec.Addresses needs to really exist, and the test needs to make sure the service that is created is not the service added in spec.Addresses.
I'm not sure I understand. If a service got created, then that implies that Istio did automated deployment. We know what service the test specifies, and it is not a .svc.cluster.local host name:
cluster-ingress-operator/test/e2e/gateway_api_test.go
Lines 222 to 225 in 6734479
| Addresses: []gatewayapiv1.GatewayAddress{{ | |
| Type: ptr.To(gatewayapiv1.HostnameAddressType), | |
| Value: "lb.example.com", | |
| }}, |
Yes, that may complete the test indeed. We can create a service and set its FQDN as
Gateway.Spec.Address(typeHostname): Istio doc's example using service FQDN.
I could change the test to specify spec.addresses[].value: router-internal-default.openshift-ingress.svc.cluster.local. I had that during one iteration of the PR, thought it was unnecessary and potentially confusing, and changed the value to "lb.example.com". I can change it back if that would be less confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References to "manual deployment", "automated deployment", and gateway listener "merging" added here:
cluster-ingress-operator/test/e2e/gateway_api_test.go
Lines 198 to 217 in e2eb4f0
| // testGatewayAPIManualDeployment verifies that Istio's "manual deployment" | |
| // feature is not enabled (see | |
| // <https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#manual-deployment>). | |
| // We only want Istio to allow "automated deployment" (see | |
| // <https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#automated-deployment>). | |
| // | |
| // When manual deployment is enabled, then Istio allows a gateway to use an | |
| // existing service (for example, another gateway's service) by specifying that | |
| // service in spec.addresses. When a gateway using manual deployment specifies | |
| // another gateway's service, the resulting behavior is effectively the same | |
| // behavior as Gateway API's concept of gateway listener "merging" (see | |
| // <https://github.com/kubernetes-sigs/gateway-api/blob/v1.2.1/apis/v1/gateway_types.go#L181-L182>). | |
| // | |
| // Gateway listener merging is underspecified in Gateway API and is not | |
| // consistently implemented among Gateway API implementations, and so we do not | |
| // want to allow it or any similar behavior (such as Istio's "manual | |
| // deployment") until such a time as it is well defined, standard behavior. | |
| // Instead, for the time being, we expect Istio to provision a service for a | |
| // gateway ("automated deployment"), even if the gateway specifies some existing | |
| // service in spec.addresses. |
Changed to use an existing service here:
cluster-ingress-operator/test/e2e/gateway_api_test.go
Lines 228 to 234 in e2eb4f0
| // Use the router's internal service in order to ensure that the | |
| // referent exists. Using an existing service isn't strictly necessary | |
| // in order to verify that Istio does not use manual deployment; if | |
| // manual deployment *is* enabled, Istio rejects the gateway if it | |
| // points to a non-existent referent. However, using an existing | |
| // service more closely reflects the way that manual deployment *would* | |
| // be used if it were allowed. |
I struggled with describing why the test needs to use an existing service. I think what I wrote will make sense to future me. Let me know what you think!
test/e2e/gateway_api_test.go
Outdated
| GatewayClassName: gatewayapiv1.ObjectName(gatewayClass.Name), | ||
| Addresses: []gatewayapiv1.GatewayAddress{{ | ||
| Type: ptr.To(gatewayapiv1.HostnameAddressType), | ||
| Value: "lb.example.com", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on #1204 (comment), shouldn't the spec.Address point to an existing service? Otherwise, how do we know that the manual deployment didn't just fail because the service didn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on #1204 (comment), shouldn't the spec.Address point to an existing service? Otherwise, how do we know that the manual deployment didn't just fail because the service didn't exist?
We know that because the gateway was accepted and Istio created a service for it.
| var service corev1.Service | ||
| t.Logf("Polling for up to %v to verify that service %q is created...", timeout, serviceName) | ||
| if err := wait.PollUntilContextTimeout(context.Background(), interval, timeout, false, func(context context.Context) (bool, error) { | ||
| if err := kclient.Get(context, serviceName, &service); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, how do we know it didn't create a service of another name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we could list all services and then look for unexpected services, but is there a reason why this test should do that?
This commit resolves NE-1994. https://issues.redhat.com/browse/NE-1994 * test/e2e/gateway_api_test.go (TestGatewayAPI): Run the new test, testGatewayAPIManualDeployment. Update a log message. (testGatewayAPIManualDeployment): New test. Verify that Istio is configured not to allow manual deployment.
6734479 to
e2eb4f0
Compare
|
https://github.com/openshift/cluster-ingress-operator/compare/6734479ebfc5bbb83897bc51f7d7ca580f783e2b..e2eb4f098027f87d7761bea11198f9aacfae41e1 elaborates on some comments to explain the concepts of Istio's "manual deployment" and "automated deployment" as well as Gateway API's gateway listener "merging". It also changes the test gateway to point to the router's internal service so that the gateway's referent is an existing service, which more closely resembles how someone would use manual deployment in practice. |
|
Cluster install failure:
/test e2e-aws-operator |
|
Multus CNI error for single node:
/test e2e-aws-ovn-single-node |
|
@Miciah: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: candita The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
[ART PR BUILD NOTIFIER] Distgit: ose-cluster-ingress-operator |
Add a new test to verify that Istio is configured not to allow manual deployment.