Improve test flakiness #524

Merged
bfeshti merged 5 commits into dev from improve-test-flakiness on Mar 12, 2026

bfeshti commented Mar 11, 2026

Fix flaky integration tests: scoped cleanup, timeouts and error handling

bfeshti added 4 commits March 11, 2026 22:06
…lling

- Replace `kubectl delete pv --all` with targeted PV deletion by name
  to prevent parallel tests from interfering with each other's PVs,
  which was causing 10-minute timeouts
- Fix variable shadowing bug where statefulset existence check result
  was ignored (always used the namespace check error instead)
- Replace fixed time.Sleep calls with waitForPodsTerminated polling
- Add runWithRetry and waitForPodsTerminated helper functions

Made-with: Cursor
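
The variable shadowing bug mentioned in the commit message above is a common Go pitfall. The actual test code is not shown on this page, so this is a hypothetical reconstruction of the pattern: `checkNamespace`, `checkStatefulSet`, and both wrapper functions are invented stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins: the namespace exists, the StatefulSet does not.
func checkNamespace() error   { return nil }
func checkStatefulSet() error { return errors.New("statefulset not found") }

// statefulSetExistsBuggy shadows err inside the inner if statement, so the
// value tested at the end still reflects only the namespace check.
func statefulSetExistsBuggy() bool {
	err := checkNamespace()
	if err == nil {
		if err := checkStatefulSet(); err != nil {
			// This err is a new variable scoped to the inner if.
			fmt.Println("statefulset check:", err)
		}
	}
	return err == nil // outer err: the namespace result
}

// statefulSetExistsFixed assigns to the existing err instead of declaring
// a new one, so the StatefulSet result is the one that gets returned.
func statefulSetExistsFixed() bool {
	err := checkNamespace()
	if err == nil {
		err = checkStatefulSet()
	}
	return err == nil
}

func main() {
	fmt.Println("buggy reports exists:", statefulSetExistsBuggy())
	fmt.Println("fixed reports exists:", statefulSetExistsFixed())
}
```

In the buggy variant the function reports that the StatefulSet exists even though its own check failed, which matches the symptom described in the commit message.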
- Add 5-minute context timeout to gcloud run() which previously had
  no timeout and could hang indefinitely
- Fix deleteDisk retry loop that used select/default pattern causing
  a CPU-burning busy-loop hammering the GCP API with no delay between
  retries; now sleeps 5 seconds between attempts

Made-with: Cursor
The error returned by createPersistentVolume was silently discarded,
allowing tests to proceed with a failed PV/PVC setup. Now the error
is checked and propagated properly.
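
As a rough illustration of the fix described above, the sketch below checks and wraps the error instead of dropping it. `createPersistentVolume` here is a stand-in that always fails, and `setupStorage` is a hypothetical caller; neither reflects the real code.

```go
package main

import (
	"errors"
	"fmt"
)

// createPersistentVolume is a stand-in that always fails, so the error
// path can be demonstrated without a cluster.
func createPersistentVolume(name string) error {
	return errors.New("apply failed")
}

// setupStorage is a hypothetical caller. Before the fix, the equivalent
// call discarded the returned error and the test carried on regardless.
func setupStorage(name string) error {
	if err := createPersistentVolume(name); err != nil {
		return fmt.Errorf("create PV/PVC %q: %w", name, err)
	}
	return nil
}

func main() {
	if err := setupStorage("pv-test"); err != nil {
		fmt.Println("setup failed:", err)
	}
}
```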

Made-with: Cursor
Add 50-minute job-level timeout to all four test jobs as a safety net.
Without this, the GitHub Actions default of 6 hours applies, meaning
a hung process (e.g. kubectl port-forward with no timeout) could burn
CI runner time for hours.

Made-with: Cursor
bfeshti requested a review from riggi-alekaj as a code owner March 11, 2026 21:09
kubectl logs on completed/terminated pods can fail transiently.
Add a kubectlLogs helper that retries up to 3 times with 10-second
delays, and replace all raw exec.Command("kubectl", "logs", ...)
calls across standalone.go and cluster.go to use it.

Made-with: Cursor
bfeshti merged commit 1b24f1d into dev Mar 12, 2026
18 checks passed
bfeshti deleted the improve-test-flakiness branch March 12, 2026 08:13