-
Notifications
You must be signed in to change notification settings - Fork 52
OCPCLOUD-2992: machineset migration e2e #287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@sunzhaohua2: This pull request references OCPCLOUD-2555 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/hold |
|
/hold cancel |
|
/retest |
|
@damdo can you help to take a look when you got time, thanks! |
|
Heya @sunzhaohua2 :) What we have here is useful, but if we look at the CI run we see it's passing in 0-1s. I think a lot of this is already covered by unit testing, so we might want to focus our efforts a little differently. IMO, here for e2es we want to test the behaviour of the MAPI controllers (machine, machineset, etc), CAPI controllers (machine, machineset, etc), and our migration controllers all interacting together - which we can't do with unit tests :) This is because in our unit tests (e.g for the MachineSync controllers) we're not running any of the CAPI or MAPI controllers. So the tests we write here want to check things like deletion logic, showing machines templates and so on end up running / created without status errors (they're healthy - not that they merely exist and are accepted by the api server) and so on. OCPBUGS-56897 this is a good example of a bug that could nicely be caught by an e2e test :) We would want to test the case where the MAPI machine set goes away, and we assert the CAPI machines do not change (e.g the UUID remains the same consistently, for example). I think we want to work out a base set of e2e tests that give us broad coverage, and start with adding only those, and then increment :) I'll try and give this some thought! one example may be something like: mapi auth , create machineset, observe capi machine + machineset mirror come up healthy, switch auth see other controllers take over, scale up / down see machines work and delete |
Thanks for your suggestion, good advice! I saw Joel said there will be an e2e doc, I will refer to the doc to make adjustments. |
JoelSpeed
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having skimmed through the test cases, I think even though a lot of these are duplicating unit testing, there is still value in having these within the E2Es as well. They are quick and don't cost us much, and, if they were to regress somehow (e.g. through something else changing in the cluster), would mean that periodics could be configured to pick up these changes, which we would otherwise only find when running a PR against the cluster.
The bigger e2e cases that Theo mentioned will provide the majority of our E2E value I think, but this is also worthwhile continuing to work on IMO
e2e/framework/util.go
Outdated
| return apiUrl.Hostname(), int32(port), nil | ||
| } | ||
|
|
||
| // IsTechPreviewNoUpgrade checks if a cluster is a TechPreviewNoUpgrade cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should check for the presence of a particular feature gate being enabled in the feature gate status rather than checking tech preview directly.
If we can use the openshift tests extension (I think we got that?), then naming tests based on the feature gate name will be sufficient and the test extension will skip any tests that aren't enabled for us
I realise we don't have a great feature gate for this feature right now, I'll take an action item to create that so that we can start moving on with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated to check feature gate.
Yes, we got openshift tests extension, raised a pr before, will check that again.
86cea52 to
3b5809a
Compare
|
/retest |
|
@JoelSpeed @theobarberbany thanks for helping review, for the dup with unit I updated with capi machineset with the same name does not exist
create a capi machineset with the same name, then create a mapi machineset, which should be rejected In the E2E plan Dam wrote |
I'm not sure if that's true. I'd agree that creating a MAPI MachineSet with specAPI MAPI shouldn't be allowed when there is already a CAPI MachineSet. But in the case outlined here, I would expect the MachineSet to be created, and then a CAPI MachineSet to be created as a mirror of the MAPI MachineSet. The mirror CAPI MachineSet would do the work, and the MAPI MachineSet would always be paused. In a lot of the actions/scenarios, we care specifically about creating MachineSets, and less about the updates. |
|
I suspect that creating a suite of tests that specifically check the creation mechanics and what happens, and then a separate suite of tests that check what happens once you already have the existing MachineSets, would make the most sense |
|
@sunzhaohua2 , can we update the title to use - OCPCLOUD-2992 , Please review if that would be correct . |
|
@sunzhaohua2: This pull request references OCPCLOUD-2992 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
4bd3719 to
5065f59
Compare
|
Hi @JoelSpeed @theobarberbany can you help to review this? |
|
@sunzhaohua2: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
JoelSpeed
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When writing the CPMS E2Es, we used https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/envtest/komega to wrap a lot of the fetching of resources, in particular the things like checking the spec/status of a certain field become
Eventually(komega.Object(machine)).Should(HaveField("Status.AuthoritativeAPI", Equal("ClusterAPI"))
Updating objects is also fairly simple, and uses retry logic to make sure we are not susceptible to flakes
Eventually(komega.Update(machine, func() {
machine.Spec.AuthoritativeAPI = "ClusterAPI"
}).Should(Succeed())
}
I wonder if we could avoid a lot of boiler plate if we adopted the same library here as well?
This PR is also huge! I wonder if we could break it down into stages, start by merging the framework and util bits and then different sections of the tests suites. Might make it a bit easier to review/iterate on.
Also, where I've made suggestions, please see if those might apply in other places. A lot of transforms here could be replaced simply with HaveField. A lot of the fetching of objects could use komega combined with HaveField to extract the fields we care about
| func DeleteMachines(cl client.Client, machines ...*clusterv1.Machine) error { | ||
| return wait.PollUntilContextTimeout(ctx, RetryShort, time.Minute, true, func(ctx context.Context) (bool, error) { | ||
| for _, machine := range machines { | ||
| if err := cl.Delete(ctx, machine); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the object isn't found? Would this loop forever?
| _, mapiDefaultProviderSpec := getDefaultAWSMAPIProviderSpec(cl) | ||
| createAWSClient(mapiDefaultProviderSpec.Placement.Region) | ||
| awsMachineTemplate = newAWSMachineTemplate(mapiDefaultProviderSpec) | ||
| if err := cl.Create(ctx, awsMachineTemplate); err != nil && !apierrors.IsAlreadyExists(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would it already exist?
| It("should MAPI MachineSet is authoritative and create the CAPI MachineSet mirror", func() { | ||
| verifyMachineSetAuthoritative(ctx, cl, machineSetNameMAPI, machinev1beta1.MachineAuthorityMachineAPI) | ||
| verifySynchronizedCondition(ctx, cl, machineSetNameMAPI, machinev1beta1.MachineAuthorityMachineAPI) | ||
| verifyMAPIPausedCondition(ctx, cl, machineSetNameMAPI, machinev1beta1.MachineAuthorityMachineAPI) | ||
| verifyMAPIHasCAPIMirror(cl, machineSetNameMAPI) | ||
| verifyCAPIPausedCondition(cl, machineSetNameMAPI, machinev1beta1.MachineAuthorityMachineAPI) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each of these could be separate It statements, as that would allow us clearer signal on which of the expectations is failing.
| }) | ||
| }) | ||
| /* | ||
| Context("With specAPI: CAPI and EXISTING CAPI MSet with that name", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the CAPI MS be created in a BeforeAll inside this block?
| //By("Creating a same name MAPI MachineSet") | ||
| mapiMachineSetCAPI, err := createMAPIMachineSetWithAuthoritativeAPI(ctx, cl, 0, capiMachineSetNameCAPI, machinev1beta1.MachineAuthorityMachineAPI, machinev1beta1.MachineAuthorityMachineAPI) | ||
| Expect(err).ToNot(HaveOccurred(), "failed to create mapiMachineSet %s", mapiMachineSetCAPI) | ||
| By("Verify the MAPI MachineSet is Paused") | ||
| verifyMAPIPausedCondition(ctx, cl, capiMachineSetNameCAPI, machinev1beta1.MachineAuthorityClusterAPI) | ||
| By("Verify the MAPI Spec be updated to reflect existing CAPI mirror") | ||
| mapiMachineSetCAPI, err = mapiframework.GetMachineSet(ctx, cl, machineSetNameCAPI) | ||
| Expect(err).ToNot(HaveOccurred(), "failed to get mapiMachineSet %s", machineSetNameCAPI) | ||
| providerSpec := mapiMachineSetCAPI.Spec.Template.Spec.ProviderSpec | ||
| Expect(providerSpec.Value).NotTo(BeNil()) | ||
| Expect(providerSpec.Value.Raw).NotTo(BeEmpty()) | ||
| var awsConfig machinev1beta1.AWSMachineProviderConfig | ||
| err = json.Unmarshal(providerSpec.Value.Raw, &awsConfig) | ||
| Expect(err).NotTo(HaveOccurred(), "Failed to unmarshal ProviderSpec.Value") | ||
| Expect(awsConfig.InstanceType).To( | ||
| SatisfyAny( | ||
| BeEmpty(), | ||
| Equal("m5.large"), | ||
| ), | ||
| "Unexpected instanceType: %s", | ||
| awsConfig.InstanceType, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The different expectations here could be broken into separate ordered It expectations
| return *mapiMachineSet.Spec.Replicas | ||
| }, framework.WaitShort, framework.RetryShort).Should(Equal(int32(2)), "replicas should change to 2") | ||
|
|
||
| By("Verify there is a non-authoritative, paused MAPI Machine mirror for the new CAPI Machine") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure the original machine is MAPI authoritative at some point in the setup
| mapiMachines, err := mapiframework.GetMachinesFromMachineSet(ctx, cl, mapiMachineSet) | ||
| Expect(err).ToNot(HaveOccurred(), "Failed to get mapiMachineSet %s", machineSetNameMAPI) | ||
| Expect(mapiMachines).NotTo(BeEmpty(), "Machines should remain") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MAPI MachineSet should not exist anymore? So I think instead we check that MAPI machines still exist in the MAPI namespace with the same names as the CAPI machines, WDYT?
| It("should be rejected when scaling CAPI mirror", func() { | ||
| By("Scaling up CAPI MachineSet to 1") | ||
| framework.ScaleMachineSet(machineSetNameMAPI, 1) | ||
| capiMachineSet, err := framework.GetMachineSet(cl, machineSetNameMAPI) | ||
| Expect(err).ToNot(HaveOccurred(), "Failed to get capiMachineSet %s", capiMachineSet) | ||
|
|
||
| Eventually(func() int32 { | ||
| capiMachineSet, err := framework.GetMachineSet(cl, machineSetNameMAPI) | ||
| Expect(err).ToNot(HaveOccurred(), "Failed to get capiMachineSet %s", machineSetNameMAPI) | ||
| return *capiMachineSet.Spec.Replicas | ||
| }, framework.WaitShort, framework.RetryShort).Should(Equal(int32(0)), "replicas should eventually revert to 0") | ||
| mapiMachineSet, err := mapiframework.GetMachineSet(ctx, cl, machineSetNameMAPI) | ||
| Expect(err).ToNot(HaveOccurred(), "Failed to get mapiMachineSet %s", machineSetNameMAPI) | ||
| Expect(*mapiMachineSet.Spec.Replicas).To(Equal(int32(0)), "replicas should remain 0") | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should get rejected by a VAP eventually
| Expect(*mapiMachineSet.Spec.Replicas).To(Equal(int32(0)), "replicas should remain 0") | ||
| }) | ||
|
|
||
| It("should be rejected when updating CAPI mirror spec", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be rejected by VAP eventually
| // sleep for 30s to make sure mirror machineset be created | ||
| time.Sleep(30 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use an Eventually to check for it to be created?
Thank you for all the suggestions, I raised a new pr for the utils #323 please help to take a look. I will modify the tests suites and submit them one by one later. |
No description provided.