Correct Github Actions CI instability for iOS. #2337

freakboy3742 · 2025-03-27T01:29:06Z

The iOS test suite added by #2286 has proven to be unstable in GitHub Actions CI, failing ~50% of the time. On deeper investigation, the issue appears to be CPU overloading on the CI machine.

The integration tests run in parallel using xdist; the ARM64 test machine reports 3 CPUs, and so 3 test workers are started. The iOS tests are placed in a load group, so they all run on a single worker; but at the same time, the other two workers are allocated other macOS tests to run. The iOS test is itself a multi-process, multi-threaded build, calling Xcode, the iOS Simulator, and the macOS logging infrastructure; when the 2 remaining workers are at 100% utilisation building other tests, compiling the iOS project and booting a simulator clone can take a long time. The iOS testbed currently has a hard-coded 5 minute timeout waiting for Xcode to compile the project and boot a simulator; this is the timeout that is causing test failures.
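To make the contention concrete, here is a toy model of the load-group behaviour described above. It is not xdist's actual scheduler; the function name and test names are hypothetical, and the only property it tries to capture is that tests sharing a group land on one worker while the remaining workers are still handed other tests.

```python
from collections import defaultdict


def assign_to_workers(tests, num_workers):
    """Toy model only: tests in the same group share a worker.

    `tests` is a list of (test_name, group_or_None) pairs. Ungrouped
    tests are handed out round-robin. Real xdist scheduling is dynamic;
    this just illustrates which workers end up with work.
    """
    workers = defaultdict(list)
    group_home = {}  # group name -> worker index that owns the group
    next_worker = 0
    for name, group in tests:
        if group is not None and group in group_home:
            worker = group_home[group]
        else:
            worker = next_worker % num_workers
            next_worker += 1
            if group is not None:
                group_home[group] = worker
        workers[worker].append(name)
    return dict(workers)


# Hypothetical test names; 3 workers to match the 3 reported CPUs.
example = assign_to_workers(
    [
        ("test_ios_a", "ios"),
        ("test_macos_1", None),
        ("test_macos_2", None),
        ("test_ios_b", "ios"),
    ],
    num_workers=3,
)
```

In this model the iOS tests never overlap with each other, but workers 1 and 2 are busy the whole time, so the Xcode build and simulator boot still have to compete for the same 3 CPUs.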

This PR makes two changes to fix this.

Firstly, it splits the integration testbed into 2 phases. Instead of using an iOS load group, a pytest marker is used to identify the iOS tests as "serial" tests. The test suite then runs the integration test suite in two parts: all the serial tests are executed single process; and the non-serial tests are then executed multi-process. This effectively means that all the iOS tests run sequentially, then the rest of the test suite runs in parallel (as it always has done).
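As a rough sketch of the two-phase idea (not the exact invocation used in this PR), the split can be expressed as two pytest command lines. The marker name `serial` follows the description above; the helper names and the `test_dir` argument are hypothetical. `-m` is pytest's marker selection option and `-n` is the pytest-xdist worker count option.

```python
import subprocess
import sys


def two_phase_commands(test_dir, workers="auto"):
    """Build the two pytest invocations: serial tests first, then the rest."""
    base = [sys.executable, "-m", "pytest", test_dir]
    # Phase 1: only tests marked "serial", in a single process (no -n).
    serial = base + ["-m", "serial"]
    # Phase 2: everything else, distributed across xdist workers.
    parallel = base + ["-m", "not serial", "-n", workers]
    return [serial, parallel]


def run_two_phase(test_dir):
    """Run both phases in order, stopping at the first failing phase.

    Note that pytest exits with code 5 when no tests are collected, so a
    phase that selects nothing counts as a failure in this sketch.
    """
    for cmd in two_phase_commands(test_dir):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

A test opts in to the first phase by carrying the marker, e.g. decorating it with `@pytest.mark.serial`; the marker would also need to be registered in the pytest configuration to avoid unknown-marker warnings.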

Secondly, the macOS ARM64 builds have been updated to use the macOS-15 runner. The macOS-15 runner is currently listed as "beta"; but it has been available for almost 6 months, and if history is any indication, will become the default runner in the very near future. The macOS-15 runner has two notable improvements:

1. The build machines are significantly faster. I've seen a single test_ios.py::test_ios_platforms run in as little as 94 seconds, with the complete macOS test suite completing in 22 minutes - down from 36 minutes on the macOS-14 runner.
2. The macOS-15 runner defaults to using Xcode 16, whereas the macOS-14 runner defaults to Xcode 15. Xcode 15 had a number of issues with slow simulator startup; the 15.0, 15.1 and 15.2 releases were unusably slow. The current 15.4 image is better, but not as good as Xcode 16.

Apple "best practice" strongly encourages developers to keep on the "stable bleeding edge" of Xcode tooling, so upgrading to use Xcode 16 is generally a good idea anyway. It is possible to use Xcode 16 on a macOS-14 base image - but due to (1), the performance is still much worse than the macOS-15 runner (the worst iOS test execution time I've seen is 392s). Switching to macOS-15 also means that we don't need to explicitly maintain the version of Xcode, as the macOS-15 runner will always use the latest version of Xcode 16 that has been published.

Fixes #2335 - It's obviously difficult to categorically prove this, but I've run 4 builds on the macOS-15 runner with serial test isolation, with test execution times ranging from 94-215s ¹. That time includes installing the compiled app on the simulator and running the test - so it's finishing well under the 5 minute/300s timeout that was causing test failures. I've also run 3 successful builds on macOS-14 with Xcode 16.2; these have much worse build times (332-392s), which is more than 5 minutes - but again, includes the time to install and run the test suite, which is easily 1/3-1/2 of the overall test time. Those tests have been run during US PST business hours and during AU AWST business hours, which limits the possible influence of "time of day" and overall system load on the problem.
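For anyone checking the arithmetic on the macOS-14 numbers: taking the "1/3-1/2 of the overall test time" estimate at face value, the portion of each run that counts against the timeout can be bounded as below. This is a back-of-the-envelope sketch; the function name is hypothetical and the shares are the rough estimate from the paragraph above, not measurements.

```python
TIMEOUT_S = 300  # the 5 minute build-and-boot timeout


def build_and_boot_range(total_s, install_run_share=(1 / 3, 1 / 2)):
    """Estimate the (low, high) seconds spent before install/run begins.

    `install_run_share` is the assumed fraction of the total time spent
    installing the app and running the test suite.
    """
    low_share, high_share = install_run_share
    return (total_s * (1 - high_share), total_s * (1 - low_share))


# Best and worst totals observed on macOS-14 with Xcode 16.2.
for total in (332, 392):
    low, high = build_and_boot_range(total)
    print(f"{total}s total: roughly {low:.0f}-{high:.0f}s of build and boot")
```

Under that assumption, even the 392s worst case leaves the build-and-boot portion below the 300s timeout, although with much less headroom than the macOS-15 runs.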

¹ If you want to audit the test results, the CI runs of interest are the last 7 commits attached to [DO NOT MERGE] Evaluate iOS CI reliability issues. #2336. The full CI runs report as fails because the CircleCI configuration fails - but that's a false positive because of a bad configuration disabling the use of CircleCI. The Github macOS-13 and macOS-14/15 runners are the only results of significance.

joerick

Thank you very much for debugging this @freakboy3742 ! And for the great writeup! Debugging flaky tests requires a lot of patience. I hope this does the trick!

henryiii

A little surprised there's not some feature in xdist that would help with this, but didn't see one. Thanks!

freakboy3742 · 2025-03-29T03:38:25Z

> Thank you very much for debugging this @freakboy3742 ! And for the great writeup! Debugging flaky tests requires a lot of patience. I hope this does the trick!

No problems - and apologies for the inconvenience while the suite was flaky.

> A little surprised there's not some feature in xdist that would help with this, but didn't see one. Thanks!

Likewise - I looked for the same thing, and came up empty. It definitely seems like it could be a useful feature.

henryiii · 2025-05-03T04:26:26Z

Is the pytest-xdist feature pytest.mark.xdist_group(name="ios")? That guarantees all tests from that group run in one worker, and since there's no threaded parallelism, that means they run in serial.

freakboy3742 · 2025-05-03T04:33:24Z

It's not just that the iOS tests need to run serially - they need to run when there is nothing else running at the same time. The issue is that iOS builds (and the iOS simulator) are already heavily multi-threaded operations. xdist-groups (which is what the old implementation used) will guarantee all the tests go to the same worker, but there will still be as many workers as there are CPUs; so when there's only 3 or 4 CPU cores, and 3 of them are being consumed by other tests, the simulator literally doesn't have the resources to start up in a timely fashion on the 1 remaining core running the iOS tests - hence the failures in CI. This is less common on local runs because there are 8-16 cores (depending on laptop model), which seems to be enough to carry the load.

Correct Github Actions CI instability for iOS.

4cd8c0d

freakboy3742 mentioned this pull request Mar 27, 2025

[DO NOT MERGE] Evaluate iOS CI reliability issues. #2336

Closed

Add a dummy serial test so CircleCI passes serial test discovery.

e5d692d

freakboy3742 mentioned this pull request Mar 27, 2025

Instability in iOS tests under CI conditions #2335

Closed

joerick approved these changes Mar 28, 2025

View reviewed changes

henryiii approved these changes Mar 28, 2025

View reviewed changes

henryiii merged commit 4ef7b37 into pypa:main Mar 28, 2025
24 checks passed

freakboy3742 deleted the ios-timeout-fix branch March 29, 2025 03:35
