Make CI workflow updates, enhancements, & performance tweaks#30
mhucka merged 39 commits into quantumlib:main
Conversation
The `cmake --build` command accepts a `-j` option for using multiple jobs. We may as well use it to speed up CI processes where we can.
This updates the GHAs used in the various jobs to their latest versions, and uses SHAs, per security best-practices recommendations.
ci.yml
Standard runners on GitHub have 4 vCPUs today, and managed runners and self-hosted runners may have more. Rather than hardwire the `-j` value given to `make`, we can set the job number based on the number of processors available. The change in this commit makes it use 1 less than the number of processors on the host. (This is currently 3 on the standard Ubuntu GitHub runners.)
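The calculation can be sketched in shell, assuming GNU `nproc` is available (a floor of 1 is included here so a single-core host never produces the invalid `make -j 0`):

```shell
# Sketch only: derive make's -j value from the processor count.
# Uses one fewer than the CPU count, with a floor of 1 so a
# single-core host never produces the invalid "make -j 0".
NPROC=$(nproc)
JOBS=$(( NPROC > 1 ? NPROC - 1 : 1 ))
echo "building with $JOBS jobs"
# e.g., on a 4-vCPU runner, this would be used as: make -j "$JOBS"
```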
Enabling the pip cache option to `actions/setup-python` should hopefully provide a little speed boost.
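For reference, a minimal sketch of the caching option (the Python version and the `@v5` tag here are illustrative, not taken from the actual workflow, which pins actions to SHAs):

```yaml
- uses: actions/setup-python@v5   # in the real workflow, pinned to a SHA
  with:
    python-version: '3.11'
    cache: 'pip'                  # cache pip downloads between runs
```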
Per security best practices, this commit adds a top-level `permissions` definition.
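A minimal top-level `permissions` block following the least-privilege recommendation (the exact scopes in the real workflow may differ):

```yaml
# Grant the whole workflow read-only access by default;
# individual jobs can widen this as needed.
permissions:
  contents: read
```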
This adds the ability to run the workflow manually from GitHub GUI.
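Manual triggering is enabled by adding `workflow_dispatch` to the workflow's triggers; a sketch (the other triggers shown are assumptions, not the workflow's actual ones):

```yaml
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:   # allows manual runs from the Actions tab
```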
This is to follow recommended best practices.
Missed the `with:`.
Only ninja is needed, so we can shorten installation time by skipping the larger cibuildwheel.
This adds pytest-xdist to places where pytest is installed, and the option `-n auto` to places where pytest is invoked, so that it uses parallel jobs when the host has enough cores.
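A hypothetical CI step illustrating the combination (the step name and the bare package list are illustrative):

```yaml
- name: Run tests in parallel
  run: |
    pip install pytest pytest-xdist
    # -n auto lets pytest-xdist pick a worker count based on available CPUs
    pytest -n auto
```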
In the place where the third-party action `egor/setup-clang` was used, we can just install clang using `apt`. This is slightly better because we know the operating system being used (so the portability features of that third-party action are irrelevant) and it's more future-proof (because we can't be assured how long the third-party action will be supported).
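The replacement can be as simple as the following step (package names are the stock Ubuntu ones; the version is left to the distribution):

```yaml
- name: Install clang
  run: |
    sudo apt-get update
    sudo apt-get install -y clang
```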
Passing the `-c advice.detachedHead=false` option to `git clone` avoids a message about cloning the repo in a "detached HEAD" state. This is helpful because, when debugging problems, seeing that message in the logs can waste developers' time trying to determine whether it's expected or whether something has gone wrong.
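For illustration, the option can be used like this (the repository URL and tag below are placeholders, not the actual ones used by the workflow):

```yaml
- name: Clone googletest quietly
  run: |
    # -c advice.detachedHead=false suppresses the detached-HEAD notice
    git clone -c advice.detachedHead=false --depth 1 --branch v1.14.0 \
        https://github.com/google/googletest.git
```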
The `check_sdist_installs` action didn't do much besides build the sdist and try to install it, and since `build_sdist` has to do the build too, we can simply include the testing in `build_sdist`. Doing it this way is better not just for efficiency: if the test fails, we wouldn't want to upload the sdist artifact anyway, so this adds a sanity check that was missing before.
Due to changes in v4 of `actions/upload-artifact`, it is no longer necessary to use a separate merge action. Instead, the invocation of `actions/download-artifact` can be given arguments that have the same effect.
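A sketch of the v4-style invocation (the artifact name pattern and destination path are assumptions):

```yaml
- uses: actions/download-artifact@v4
  with:
    pattern: dist-*       # fetch all matching per-build artifacts
    merge-multiple: true  # merge into one directory, replacing the old merge job
    path: dist
```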
When debugging, it's useful to see what cmake is doing inside the `pip install` step of `build_sdist`.
Previously it would upload the dev releases after builds finished, without checking that jobs like `test_pybind` succeeded. It seems better to make sure that the copy of chromobius being uploaded to PyPI passes basic tests. (Even if it's a dev release.)
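In workflow terms this means widening the upload job's `needs` list; a sketch with partly assumed job names (`test_pybind` and `build_sdist` are mentioned in this PR, `build_dist` is a placeholder):

```yaml
upload_dev_release_to_pypi:
  # block the upload until tests pass too, not just builds
  needs: [build_dist, build_sdist, test_pybind]
```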
Changes:
* Build for Python 3.14 prerelease
* Build for linux arm64
* Specify Stim version 1.15 to match requirements.txt
* Update cibuildwheel to version 3.2.0
* Use Python 3.11 with actions/setup-python
* Add more debugging flags
* Rename 'os_dist' to 'build' in the matrix, to try to make some of the variable references more readable and shorter.
ci.yml
This prevents the job from running in people's forks.
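A common way to do this is a job-level condition on the repository name; a sketch (the repository name here is inferred from the surrounding discussion, not confirmed):

```yaml
jobs:
  upload_dev_release_to_pypi:
    # skip in forks; only run in the upstream repository
    if: github.repository == 'quantumlib/chromobius'
```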
There are cryptic errors on GitHub, and since no one has asked for this combo, let's just shelve this part for now.
Strilanc
left a comment
looks good other than -1 on the number of jobs
.github/workflows/ci.yml
Outdated
```diff
 mkdir googletest/build && cd googletest/build
 cmake .. -DBUILD_GMOCK=OFF
-make
+make -j $(( $(nproc) - 1 ))
```
Why -1? It seems generally better to occupy the whole machine. The OS can schedule in other processes if needed, but the compilation is the key thing happening so no reason to not give it everything.
(If they give us a single core machine, this asks for 0 cores?)
Re using -1: it's an old rule of thumb, which still seems to be valid because on the VMs we use for development, when I let it use all CPUs, the result is often that the system becomes utterly unresponsive for long stretches of time.
Re 0 cores: I was going by the documentation of GitHub runners at https://docs.github.com/en/actions/reference/runners/github-hosted-runners where all the runners are documented as having 3-4 cores. However, it's true that things might be different for self-hosted runners or if someone runs the workflow locally using act, and in any case, it's a bug to allow the possibility of 0.
I'll change it to use just -j. Since I use act to run workflow tests locally, I can adjust the number of CPUs that it uses; this will mitigate the effect of letting it use everything.
When I build on my machine I use all the cores. It can in fact cause delays on UI interactions, but for a CI run that doesn't seem like a problem? Does the output log actually come out in huge chunks or something?
> When I build on my machine I use all the cores. It can in fact cause delays on UI interactions, but for a CI run that doesn't seem like a problem? Does the output log actually come out in huge chunks or something?
(By "CI run", I assume you mean on GitHub.) What seems to happen is that it stops producing output for a time. Chunks wouldn't be a problem, but the lack of any visible activity leaves me wondering if something has gone wrong, how long I should wait before trying to diagnose it, etc. So, not a critical issue; more of a "I wish it didn't do that" issue.
Per discussion in review, the use of `$(( $(nproc) - 1 ))` may not be worth it, and in any case, is buggy because if `nproc` returns 1, the argument becomes `make -j 0` and that's an error.
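A bare `-j` sidesteps both issues: `make` then imposes no job limit, and `cmake --build -j` without a value falls back to the native build tool's default. A self-contained sketch using a throwaway Makefile:

```shell
# Demonstrate bare -j (no job limit) with a minimal Makefile.
tmp=$(mktemp -d)
printf 'all:\n\t@echo built\n' > "$tmp/Makefile"
# -s silences recipe echoing; --no-print-directory suppresses the
# "Entering directory" notice that -C would otherwise trigger.
make -s --no-print-directory -C "$tmp" -j
```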
Per discussion in review with Craig, I removed the code that disabled message coloring in debug runs.
Changes:
* Include the sdist install test in `build_sdist` and get rid of the separate `check_sdist_installs` job.
* Use the newer `actions/upload-artifact` behavior for merging uploads and get rid of the separate `merge_upload_artifacts` job.
* Make `upload_dev_release_to_pypi` depend on results not only from build jobs but also test jobs, so that if builds succeed but tests fail, we don't upload a broken dev release.
* Enable pip caching in `actions/setup-python`.
* Add the `-j` option to `make` and `cmake --build` commands for using multiple processes if possible.
* Update the version of `cibuildwheel` used.
* Specify the Stim version to match `requirements.txt`.
* Add `workflow_dispatch`.
* Install clang via `apt` in the `build_clang` job instead of using a third-party action.