Correct Github Actions CI instability for iOS. #2337
Thank you very much for debugging this @freakboy3742 ! And for the great writeup! Debugging flaky tests requires a lot of patience. I hope this does the trick!
A little surprised there's not some feature in xdist that would help with this, but didn't see one. Thanks!
No problems - and apologies for the inconvenience while the suite was flaky.
Likewise - I looked for the same thing, and came up empty. It definitely seems like it could be a useful feature.
Is the pytest-xdist feature
It's not just that the iOS tests need to run serially - they need to run when there is nothing else running at the same time. The issue is that iOS builds (and the iOS simulator) are already heavily multi-threaded operations.
The iOS test suite added by #2286 has proven to be unstable in GitHub Actions CI, failing ~50% of the time. On deeper investigation, the issue appears to be CPU overloading on the CI machine.
The integration tests run multithreaded using xdist; the ARM64 test machine reports 3 CPUs, and so 3 test threads are started. The iOS tests are placed in a load group, which confines them to a single worker; but at the same time, the other two CPUs are allocated other macOS tests to run. The iOS test is itself a multi-process, multi-threaded build, calling Xcode, the iOS Simulator, and the macOS logging infrastructure; when the 2 remaining processes are at 100% utilisation building other tests, it's possible for the process of compiling the iOS project and booting a simulator clone to take a long time. The iOS testbed currently has a hard-coded 5 minute timeout waiting for Xcode to compile the project and boot a simulator; this is the timeout that is causing test failures.
This PR makes two changes to fix this.
Firstly, it splits the integration testbed into 2 phases. Instead of using an iOS load group, a pytest marker is used to identify the iOS tests as "serial" tests. The integration test suite then runs in two parts: all the serial tests are executed in a single process; and the non-serial tests are then executed multi-process. This effectively means that all the iOS tests run sequentially with nothing else running alongside them, then the rest of the test suite runs in parallel (as it always has done).
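As a rough illustration of the two-phase approach described above, the sketch below builds and runs two pytest invocations: one selecting the serial tests without xdist, and one running everything else across workers. It is not the configuration used in this PR. The marker name `serial`, the helper names `build_phase_commands` and `run_phases`, and the test path are assumptions for illustration; `-m` is pytest's marker expression option and `-n` is the pytest-xdist worker count option.

```python
import subprocess
import sys

# Assumed marker name. A test would opt in with something like:
#
#     @pytest.mark.serial
#     def test_ios_platforms(): ...
#
# and the marker would be registered in the pytest configuration.
SERIAL_MARKER = "serial"


def build_phase_commands(test_path, marker=SERIAL_MARKER, workers="auto"):
    """Return the two pytest command lines, serial phase first."""
    base = [sys.executable, "-m", "pytest", test_path]
    # Phase 1: only the marked tests, in a single process (no -n option).
    serial_phase = base + ["-m", marker]
    # Phase 2: everything else, distributed across xdist workers.
    parallel_phase = base + ["-m", f"not {marker}", "-n", str(workers)]
    return [serial_phase, parallel_phase]


def run_phases(test_path):
    """Run both phases in order, stopping at the first failing phase."""
    for cmd in build_phase_commands(test_path):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_phases("tests/integration")` (a hypothetical path) would run the serial tests first and only start the parallel phase once they have finished. Note that pytest exits with a non-zero code when a phase collects no tests, so a real runner may need to treat that case separately.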
Secondly, the macOS ARM64 builds have been updated to use the macOS-15 runner. The macOS-15 runner is currently listed as "beta"; but it has been available for almost 6 months, and if history is any indication, will become the default runner in the very near future. The macOS-15 runner has two notable improvements:
1. It is faster. test_ios.py::test_ios_platforms can run in as little as 94 seconds, with the complete macOS test suite completing in 22 minutes - down from 36 minutes on the macOS-14 runner.
2. It provides Xcode 16. Apple "best practice" strongly encourages developers to keep on the "stable bleeding edge" of Xcode tooling, so upgrading to use Xcode 16 is generally a good idea anyway. It is possible to use Xcode 16 on a macOS-14 base image - but due to (1), the performance is still much worse than the macOS-15 runner (the worst iOS test execution time I've seen is 392s). Switching to macOS-15 also means that we don't need to explicitly maintain the version of Xcode, as the macOS-15 runner will always use the latest version of Xcode 16 that has been published.
Fixes #2335 - It's obviously difficult to categorically prove this, but I've run 4 builds on the macOS-15 runner with serial test isolation, with test execution times ranging from 94-215s [1]. That time includes installing the compiled app on the simulator and running the test - so it's finishing well under the 5 minute/300s timeout that was causing test failures. I've also run 3 successful builds on macOS-14 with Xcode 16.2; these have much worse build times (332-392s), which is more than 5 minutes - but again, that includes the time to install and run the test suite, which is easily 1/3-1/2 of the overall test time. Those tests have been run during US PST business hours and during AU AWST business hours, which limits the possible influence of "time of day" and overall system load on the problem.
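To make the timing argument concrete, the following sketch estimates how much of each observed test time falls inside the compile-and-boot window that the 300s timeout covers. The observed totals and the 1/3-1/2 install-and-run share are taken from the description above; applying that share uniformly to every run, and the helper name `build_window`, are simplifying assumptions for illustration.

```python
TIMEOUT_S = 300  # the hard-coded 5 minute timeout

# Observed total test times in seconds (fastest, slowest), from the PR text.
OBSERVED = {
    "macOS-15": (94, 215),
    "macOS-14 + Xcode 16.2": (332, 392),
}

# Assumed share of the total spent installing and running the test suite.
INSTALL_RUN_SHARE = (1 / 3, 1 / 2)


def build_window(total_s, install_run_share):
    """Estimate the seconds spent compiling and booting the simulator."""
    return total_s * (1 - install_run_share)


for runner, (fastest, slowest) in OBSERVED.items():
    # Worst case for the timeout: slowest run, smallest install/run share.
    worst = build_window(slowest, min(INSTALL_RUN_SHARE))
    # Best case: fastest run, largest install/run share.
    best = build_window(fastest, max(INSTALL_RUN_SHARE))
    verdict = "under" if worst < TIMEOUT_S else "over"
    print(f"{runner}: ~{best:.0f}-{worst:.0f}s build window, {verdict} {TIMEOUT_S}s")
```

Under these assumptions, even the slowest macOS-14 run keeps its estimated compile-and-boot portion below the timeout, which is consistent with those builds succeeding despite total times above 300s.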
Footnotes
1. If you want to audit the test results, the CI runs of interest are the last 7 commits attached to [DO NOT MERGE] Evaluate iOS CI reliability issues. #2336. The full CI runs report as fails because the CircleCI configuration fails - but that's a false positive because of a bad configuration disabling the use of CircleCI. The GitHub macOS-13 and macOS-14/15 runners are the only results of significance.