Attempt at fixing flakey tests by hlipsig · Pull Request #4700 · Azure/ARO-RP

hlipsig · 2026-03-19T22:10:08Z

Which issue this PR addresses:

ARO-25407: timeout flakes in the MUO test.

What this PR does / why we need it:

Improves test resilience by raising the MUO test's SpecTimeout from 2 minutes to 3 minutes.

Test plan for issue:

A local run looked good. A green e2e signal is still needed.

Is there any documentation that needs to be updated for this PR?

How do you know this will function as expected in production?

Only fixes tests.

Copilot

Pull request overview

Improves e2e test resilience by waiting for Deployments to be ready (not just present) and by skipping MUO tests when MUO isn’t enabled/managed, addressing flakes caused by timeout/race conditions around reconciliation readiness.

Changes:

Add Eventually-based helpers to wait for MUO/Gatekeeper Deployments to become ready.
Update MUO restore test to skip when MUO isn’t enabled/managed and to wait for readiness after reconciliation.
Update Guardrails Gatekeeper restore tests to wait for readiness after reconciliation.
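The distinction between "present" and "ready" in the overview can be made concrete with a small sketch. The `deploymentStatus` type and `isReady` function below are hypothetical stand-ins written for illustration; they mirror fields found on a Kubernetes Deployment, but this is one plausible readiness rule, not the body of `ready.DeploymentIsReady`.

```go
package main

// deploymentStatus is a hypothetical, trimmed-down stand-in for the
// Deployment fields that typically matter when judging readiness.
type deploymentStatus struct {
	Generation         int64  // generation of the desired spec
	ObservedGeneration int64  // generation the controller has acted on
	SpecReplicas       *int32 // nil is treated as the default of 1
	AvailableReplicas  int32
	UpdatedReplicas    int32
}

// isReady reports whether the controller has observed the latest spec and
// every desired replica is both updated and available. A Deployment that
// merely exists, with zero available replicas, fails this check.
func isReady(d deploymentStatus) bool {
	want := int32(1)
	if d.SpecReplicas != nil {
		want = *d.SpecReplicas
	}
	return d.ObservedGeneration == d.Generation &&
		d.UpdatedReplicas == want &&
		d.AvailableReplicas == want
}
```

Under this assumed rule, a Deployment that was just recreated by reconciliation is found by a plain Get but is not ready until its replicas come up, which is the race the overview describes.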

test/e2e/operator.go

Copilot · 2026-03-19T22:16:56Z

test/e2e/operator.go

+	waitForMUODeploymentReady := func(ctx context.Context) {
+		Eventually(func(g Gomega, ctx context.Context) {
+			d, err := clients.Kubernetes.AppsV1().Deployments(managedUpgradeOperatorNamespace).Get(ctx, managedUpgradeOperatorDeployment, metav1.GetOptions{})
+			g.Expect(err).NotTo(HaveOccurred())
+			g.Expect(ready.DeploymentIsReady(d)).To(BeTrue(), "expected MUO deployment to be ready")
+		}).WithContext(ctx).WithTimeout(DefaultEventuallyTimeout).Should(Succeed())
+	}


This helper function’s logic is duplicated later for Gatekeeper (same pattern: Get + DeploymentIsReady inside Eventually). Consider extracting a single reusable helper (e.g., waitForDeploymentReady(ctx, namespace, name, timeout)) to reduce duplication and keep readiness semantics consistent across tests.

test/e2e/operator.go

Copilot · 2026-03-20T00:16:07Z

@hlipsig I've opened a new pull request, #4701, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: hlipsig <8000786+hlipsig@users.noreply.github.com>

Revert flakey test changes and increase MUO SpecTimeout to 3 minutes

cadenmarchese · 2026-03-20T12:51:04Z

test/e2e/operator.go

 		By("waiting for the MUO deployment to be reconciled")
 		GetK8sObjectWithRetry(ctx, getFunc, managedUpgradeOperatorDeployment, metav1.GetOptions{})
-	}, SpecTimeout(2*time.Minute))
+	}, SpecTimeout(3*time.Minute))


Rather than specify a timeout for this specific test, can we leverage the Eventually() pattern like other tests, using the DefaultEventuallyTimeout at test/e2e/helpers.go, which is 5 minutes?

cadenmarchese · 2026-03-20T12:52:17Z

.bingo/go.mod

If the .bingo change is required, can we leave it to a new PR since this one is meant to fix the e2e flake?

Attempt at fixing flakey tests

deccbff

Copilot AI review requested due to automatic review settings March 19, 2026 22:10

hlipsig requested review from alcasim, bennerv, cadenmarchese, hawkowl, jharrington22, kimorris27, mociarain, mrWinston, rogbas, sankur-codes, tiguelu, tsatam, tuxerrante, ventifus, wanghaoran1988 and yjst2012 as code owners March 19, 2026 22:10

Copilot AI reviewed Mar 19, 2026

Copilot started reviewing on behalf of hlipsig March 19, 2026 22:19

Initial plan

37bacd3

Copilot AI mentioned this pull request Mar 20, 2026

Revert flakey test changes and increase MUO SpecTimeout to 3 minutes #4701

Merged

Copilot AI and others added 3 commits March 20, 2026 00:25

Revert flakey test changes and increase SpecTimeout to 3 minutes

7543a0a

Co-authored-by: hlipsig <8000786+hlipsig@users.noreply.github.com>

Update .bingo/go.mod

cf97afa

Merge pull request #4701 from Azure/copilot/sub-pr-4700

b8451e7

Revert flakey test changes and increase MUO SpecTimeout to 3 minutes

cadenmarchese reviewed Mar 20, 2026

hlipsig wants to merge 5 commits into master from hlipsig/ARO-25407/fix-muo-gatekeeper-e2e-flake


4 participants
