Commit d699423

Improve ci.yml, enhance testability, and document the testing process (#41)
Changes:

* Move more `cibuildwheel` settings into the `pyproject.toml` file, which both aligns with recommended practices and allows simplifications to `ci.yml`.
* Get rid of `preflight_check_dists` job in `ci.yml` and make the `twine check` step be part of the `upload_to_testpypi` job. This simplifies and streamlines the workflow.
* Get rid of references to `vars.ACT` in `ci.yml`, which turn out to be unnecessary.
* Enhance local testability by adding support for using a local pypi server for testing in the `upload_to_testpypi` job.
* Add an input variable `upload_to_pypi` to `workflow_dispatch` invocations to allow control over whether the upload to `pypi.org` is done during manual invocations.
* Set `skip-existing` to `true` when using `actions/gh-action-pypi-publish` so that partial reruns of the workflow are possible.
* Use explicit versions of runners instead of `-latest`.
* Make miscellaneous small tweaks to `ci.yml`.
* Add a README file to `.github/workflows/` detailing how to do local testing of `ci.yml`, to document the process.

1 parent d98f74b commit d699423

3 files changed, +290 -94 lines changed
.github/workflows/README.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
# Notes about the continuous integration workflow

The continuous integration workflow in [`ci.yml`](ci.yml) triggers automatically
on pushes, pull requests, and merge queue events. It can also be [triggered
manually](https://docs.github.com/en/actions/how-tos/manage-workflow-runs/manually-run-a-workflow)
from either the Actions tab on GitHub or via the GitHub APIs. Manual invocations
like that are useful for limited testing and debugging (particularly when a
problem only seems to show up on GitHub itself), but for more significant
development and testing of the workflow itself, we use the
[`act`](https://github.com/nektos/act) extension for GitHub's CLI program
[`gh`](https://cli.github.com/) to run the workflow on a local computer.

For the benefit of future Chromobius maintainers, this document summarizes how
to set up an environment for working with [`act`](https://github.com/nektos/act)
to run and test the CI workflow.

## Local testing of the CI workflow with `act`

The overall process consists of these steps, which are described in more detail
in the subsections below:

1. Clone the Chromobius repository to a local Linux computer.

2. Install and configure the following programs:

   * The GitHub CLI program [`gh`](https://cli.github.com/)
   * The [`act` extension](https://nektosact.com/installation/gh.html) for `gh`
   * The free and open-source Docker Community Edition (CE) version of
     [Docker Engine](https://docs.docker.com/engine/#licensing) (note: this is not
     the same as Docker Desktop, which is _not_ needed)

3. Create a Docker image that will be used by `gh act` to run the GitHub
   Actions workflow in `ci.yml`.

4. Run `gh act` with specific arguments, observe the results of the run, edit
   the workflow file (if necessary), and repeat until satisfied.

Note that the CI workflow in `ci.yml` contains a build step with a matrix of
Linux, macOS, and Windows operating systems. It is not possible to run all of
them on the same machine because of architectural differences, so when we test
the workflow locally, we tell `gh act` to select a subset of the matrix. This is
explained below.

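
Before working through the steps above, it can help to confirm that the programs from step 2 are on the `PATH`. The following is a minimal sketch and is not part of the Chromobius repository; the function name `missing_tools` is hypothetical. It checks only for `gh` and `docker`, on the assumption that the `act` extension is installed through `gh` and is therefore not a separate program on the `PATH`.

```shell
# Report which of the given programs are not on PATH.
# Prints one missing program name per line; prints nothing if all are found.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the two standalone programs named in step 2.
missing="$(missing_tools gh docker)"
if [ -n "$missing" ]; then
    printf 'Missing required programs:\n%s\n' "$missing" >&2
else
    echo "gh and docker are both installed."
fi
```

A check like this only shows that the programs can be found; it does not verify that Docker is running or that `gh` has been configured.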
<a class="anchor" id="creating-runner-images"></a>
### Creation of Docker images to use as workflow job runners

For `gh act` to run a workflow, it needs to be told what Docker images to use
for job runners. It is usually not possible to use GitHub's actual runner images
(even though they are made freely available by GitHub) due to differences in
hardware assumptions. Thankfully, some approximations to the GitHub images are
available from other sources. For our testing, we create customized versions of
runners that pre-install some software known to be provided on GitHub.

For this project, here is the `Dockerfile` we use for the Linux runner:

```dockerfile
# Start from a base image that is already configured for act.
# The hash below is for the image tagged act-24.04-20251102.
FROM ghcr.io/catthehacker/ubuntu@sha256:8943e69edcada5141b8c1fcc1a84bab15568a49f438387bd858cb3e4df5a436d

# Switch to the root user to have permission to install packages.
USER root

# Add some software that is pre-installed on GitHub Linux runners.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    clang \
    cmake \
    golang-go \
    libclang-dev \
    libclang-rt-dev \
    ninja-build \
    python3 python3-dev cython3 \
    shellcheck \
    yamllint \
    && \
    # Clean up the apt cache to keep the image small.
    rm -rf /var/lib/apt/lists/*
```

Here are the shell commands used to build the image:

```shell
docker build -t ubuntu-act:latest .
docker image prune
```

The Docker image will be named `ubuntu-act`. This name is mapped to the names of
GitHub runners used in `ci.yml` in a way explained in the next subsection.

### Configuration of `act`

`gh act` reads a configuration file that can be used to set some run-time
parameters. This can be used to map the name of the Docker image built in the
step above to the name of the runners used in the workflow. Certain other
parameters are also essential to provide, notably `--pull=false`. Here is an
example of a `~/.actrc` file:

```shell
# The -P flag maps a GitHub runner name (inside the workflow file) to the name
# of a Docker image on the local computer. The following maps the runner named
# "ubuntu-24.04" (used in ci.yml) to the local docker image "ubuntu-act".
-P ubuntu-24.04=ubuntu-act:latest

# If using a local docker image for the job runners, need to use --pull=false
# or else will get the error "Error response from daemon: pull access denied".
--pull=false

# This tells act where to put artifacts saved using `actions/upload-artifact`.
--artifact-server-path /tmp/act-artifacts

# These are some miscellaneous performance improvements.
--use-new-action-cache
--action-offline-mode

# This tells act to remove containers after workflow failures.
--rm
```

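
To double-check which runner names a configuration file maps to local images, the `-P` lines can be pulled out with `sed`. This is an illustrative sketch that assumes one option per line, as in the example above. `list_runner_mappings` is a hypothetical helper name, not an `act` feature, and the demonstration uses a throwaway file rather than the real `~/.actrc`.

```shell
# Print the runner-to-image mappings (-P lines) found in an act config file.
# Usage: list_runner_mappings [path-to-actrc]   (defaults to ~/.actrc)
# (list_runner_mappings is a hypothetical helper, not part of act.)
list_runner_mappings() {
    actrc_path="${1:-$HOME/.actrc}"
    # Keep only lines that start with -P, and drop the flag itself.
    sed -n 's/^-P[[:space:]]*//p' "$actrc_path"
}

# Demonstration on a temporary file, so the real ~/.actrc is not touched.
demo_actrc="$(mktemp)"
cat > "$demo_actrc" <<'EOF'
# Map the runner named in ci.yml to the local image.
-P ubuntu-24.04=ubuntu-act:latest
--pull=false
EOF
list_runner_mappings "$demo_actrc"
rm -f "$demo_actrc"
```

Each line of output has the form `runner=image`, which makes it easy to compare the left-hand side against the runner names used in `ci.yml`.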
### Running `gh act`

The following is an example of a command we use to run the workflow in debug
mode. The command is meant to be executed from the top level of the Chromobius
source directory. Note that this example shows how to select a specific OS from
the matrix in `build_dist` (namely the entries using `ubuntu-24.04` as the
operating system); this matrix selection value would need to be changed when
running this command on a different operating system and hardware architecture.

```shell
gh act workflow_dispatch \
    --matrix os:ubuntu-24.04 \
    --input debug=true \
    --input upload_to_pypi=false \
    --env GITHUB_WORKFLOW_REF=refs/heads/main \
    --no-recurse -W .github/workflows/ci.yml
```

The `--input` options in the command line above are used to set variables that
are used in the workflow to change some behaviors when debugging. The `--env`
option sets the `GITHUB_WORKFLOW_REF` environment variable that is normally set
by GitHub when a workflow is running in that environment.

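
Because the matrix value is the part of this command that changes between machines, one option is to wrap the command in a small function that takes the value as an argument. The sketch below only prints the command so that it can be inspected before anything is run. `print_act_command` is a hypothetical name, and any matrix value other than `ubuntu-24.04` is an assumption that would have to match an entry in `ci.yml`.

```shell
# Print (but do not run) the gh act command for a given matrix OS value.
# Usage: print_act_command [matrix-os]   (defaults to ubuntu-24.04)
# (print_act_command is a hypothetical helper used for illustration.)
print_act_command() {
    matrix_os="${1:-ubuntu-24.04}"
    # Each printf emits one line of the command; all but the last end
    # with a backslash so the output can be pasted back into a shell.
    printf 'gh act workflow_dispatch \\\n'
    printf '    --matrix os:%s \\\n' "$matrix_os"
    printf '    --input debug=true \\\n'
    printf '    --input upload_to_pypi=false \\\n'
    printf '    --env GITHUB_WORKFLOW_REF=refs/heads/main \\\n'
    printf '    --no-recurse -W .github/workflows/ci.yml\n'
}

print_act_command
```

After checking the output, the printed command can be pasted into a shell at the top level of the source directory.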
### Miscellaneous tips

Sometimes it's useful to add the `--verbose` option to the `gh act` command
above to get more information about what is happening.

If the workflow running inside `act` inexplicably starts producing inconsistent
errors, such as a program like `bazel` not being found on one run when it was
found on the previous run, or well-known actions (e.g., `actions/setup-python`)
generating internal errors, the first thing to suspect is problems with caching.
Here are some things to try:

1. A possible cause of random workflow errors is corruption in the `act` cache.
   (This can happen when runs are terminated using, e.g.,
   <kbd>control</kbd><kbd>c</kbd>.) To resolve this, delete the cache contents:

   1. Delete all artifacts in the `act` artifact directory. Assuming you are
      using `/tmp/act-artifacts` for the artifact directory, do this:

      ```shell
      rm -rf /tmp/act-artifacts/*
      ```

   2. Delete everything in the `act` cache (which is located in
      `$HOME/.cache/act/` by default):

      ```shell
      rm -rf ~/.cache/act/*
      rm -rf ~/.cache/actcache/*
      ```

2. If clearing the caches and artifacts as described above does not stop
   random flaky behavior, the next thing to try is to add the option
   `--no-cache-server` to the `gh act` command. If the random errors stop, then
   the cause has been narrowed down. You can then experiment with trying to get
   some parallelism back by replacing `--no-cache-server` with the
   `--concurrent-jobs` option and a low number like 4 or 2. Reducing the
   maximum concurrent jobs will reduce performance, but that may be the price
   for avoiding random errors. If random errors resurface, then it may be
   necessary to use `--no-cache-server` all the time on your system.

3. If the steps above did not stop random errors, the last thing to try is to
   delete the unused Docker containers, images, and volumes:

   ```shell
   docker system prune --all
   docker volume prune --all
   ```

   This is a bit of a sledgehammer, unfortunately, and if you created a Docker
   image as described in the [section above](#creating-runner-images), then
   running the pruning commands will remove it and you will need to recreate the
   image.
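
The cache-clearing commands from step 1 can be collected into one helper so that they are easy to rerun. The following is a minimal sketch under the assumption that the artifact and cache directories are the ones named above; `clear_act_caches` is a hypothetical function name. Like the `rm -rf .../*` commands it is based on, it leaves the directories themselves in place and does not remove hidden files.

```shell
# Empty the act artifact and cache directories described in step 1.
# Arguments are the directories to empty; with no arguments, the paths
# used earlier in this document are assumed.
# (clear_act_caches is a hypothetical helper used for illustration.)
clear_act_caches() {
    if [ "$#" -eq 0 ]; then
        set -- /tmp/act-artifacts "$HOME/.cache/act" "$HOME/.cache/actcache"
    fi
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # ${dir:?} aborts if the variable is somehow empty, so this
            # can never expand to "rm -rf /*".
            rm -rf -- "${dir:?}"/*
            echo "Emptied $dir"
        else
            echo "Skipped $dir (not a directory)"
        fi
    done
}

# Demonstration on a temporary directory; the real caches are not touched.
demo_dir="$(mktemp -d)"
touch "$demo_dir/stale-artifact"
clear_act_caches "$demo_dir"
rmdir "$demo_dir"
```

Calling `clear_act_caches` with no arguments would empty the three default directories, so it is worth making sure no `gh act` run is in progress first.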
