Commit 7b1d98a

docker-build-parallel (#508)
Summary:

1. Due to a shortfall in `docker` build tooling, it is not possible to build and store a multi-architecture image, then test it, and only then push it.
2. Because (1) is exactly what we require, the prior push step had to halt and wait for the `arm64` build, which wasted time.
3. This change mitigates that by pushing each architecture's image by digest only and then, after testing, adding *nominally* immutable multi-architecture tags.

**Note**: unfortunately, `linux/arm64` GHA runners are not yet first-class citizens. Once they are, build time could improve significantly compared to the current QEMU pattern.
1 parent a891a31 commit 7b1d98a

2 files changed

+249
-53
lines changed


.github/workflows/build.yml

Lines changed: 236 additions & 50 deletions
@@ -1069,12 +1069,39 @@ jobs:
10691069
with:
10701070
name: stackql_darwin_arm64
10711071
path: build/stackql
1072-
1072+
1073+
## Docker Build and Push Jobs
1074+
## based loosely on patterns described in:
1075+
## - https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners
1076+
## - https://docs.docker.com/build/ci/github-actions/share-image-jobs/
1077+
##
1078+
## NOTE: The QEMU build for linux/arm64 is very slow. On the order of 30 minutes. This is currently unavoidable.
1079+
##
1080+
## TODO: Migrate linux/arm64 docker build to native once GHA supports this platform as a first class citizen.
1081+
##
10731082
dockerbuild:
10741083
name: Docker Build
10751084
runs-on: ubuntu-latest-m
10761085
timeout-minutes: ${{ vars.DEFAULT_JOB_TIMEOUT_MIN == '' && 120 || vars.DEFAULT_JOB_TIMEOUT_MIN }}
1086+
strategy:
1087+
fail-fast: false
1088+
matrix:
1089+
platform:
1090+
- linux/amd64
1091+
- linux/arm64
1092+
10771093
steps:
1094+
- name: Prepare
1095+
run: |
1096+
platform=${{ matrix.platform }}
1097+
echo "PLATFORM_PAIR=${platform//\//-}" >> "${GITHUB_ENV}"
1098+
1099+
- name: Docker meta
1100+
id: meta
1101+
uses: docker/metadata-action@v5
1102+
with:
1103+
images: |
1104+
${{ env.STACKQL_IMAGE_NAME }}
10781105
10791106
- name: Check out code into the Go module directory
10801107
uses: actions/[email protected]
@@ -1100,17 +1127,40 @@ jobs:
11001127
echo "SOURCE_TAG=${GITHUB_REF#refs/tags/}"
11011128
} >> "${GITHUB_STATE}"
11021129
1103-
- name: Install psql
1130+
- name: Image env sanitize
11041131
run: |
1105-
sudo apt-get update
1106-
sudo apt-get install --yes --no-install-recommends \
1107-
postgresql-client \
1108-
ca-certificates \
1109-
openssl
1110-
1111-
- name: Install Python dependencies
1112-
run: |
1113-
pip3 install -r cicd/requirements.txt
1132+
BUILD_IMAGE_REQUIRED="true"
1133+
PUSH_IMAGE_REQUIRED="false"
1134+
if [ "$( grep '^build-elide.*' <<< '${{ github.ref_name }}' )" != "" ]; then
1135+
BUILD_IMAGE_REQUIRED="false"
1136+
fi
1137+
# shellcheck disable=SC2235
1138+
if ( \
1139+
[ "${{ github.repository }}" = "stackql/stackql" ] \
1140+
|| [ "${{ github.repository }}" = "stackql/stackql-devel" ] \
1141+
) \
1142+
&& [ "${{ vars.CI_SKIP_DOCKER_PUSH }}" != "true" ] \
1143+
&& [ "$( grep '^build-elide.*' <<< '${{ github.ref_name }}' )" = "" ] \
1144+
&& ( \
1145+
[ "${{ github.ref_type }}" = "branch" ] \
1146+
&& [ "${{ github.ref_name }}" = "main" ] \
1147+
&& [ "${{ github.event_name }}" = "push" ] \
1148+
) \
1149+
|| ( \
1150+
[ "${{ github.ref_type }}" = "tag" ] \
1151+
&& [ "$( grep '^build-release.*' <<< '${{ github.ref_name }}' )" != "" ] \
1152+
); \
1153+
then
1154+
PUSH_IMAGE_REQUIRED="true"
1155+
fi
1156+
if [ "${{ matrix.platform }}" == "linux/arm64" ] && [ "${PUSH_IMAGE_REQUIRED}" = "false" ]; then
1157+
BUILD_IMAGE_REQUIRED="false"
1158+
fi
1159+
{
1160+
echo "IMAGE_PLATFORM_SAN=$( sed 's/\//_/g' <<< '${{ matrix.platform }}' )";
1161+
echo "PUSH_IMAGE_REQUIRED=${PUSH_IMAGE_REQUIRED}";
1162+
echo "BUILD_IMAGE_REQUIRED=${BUILD_IMAGE_REQUIRED}";
1163+
} | tee -a "${GITHUB_ENV}"
11141164
11151165
- name: Extract Build Info and Persist
11161166
env:
@@ -1146,20 +1196,38 @@ jobs:
11461196
echo "GID=${GID}"
11471197
} >> "${GITHUB_ENV}"
11481198
1199+
- name: Install psql
1200+
if: env.BUILD_IMAGE_REQUIRED == 'true'
1201+
run: |
1202+
sudo apt-get update
1203+
sudo apt-get install --yes --no-install-recommends \
1204+
postgresql-client \
1205+
ca-certificates \
1206+
openssl
1207+
1208+
# For some reason, gating this step on env.BUILD_IMAGE_REQUIRED == 'true' breaks the python cleanup, which then cannot find the pip cache; so this step always runs.
1209+
- name: Install Python dependencies
1210+
run: |
1211+
pip3 install -r cicd/requirements.txt
1212+
11491213
- name: Generate rewritten registry for simulations
1214+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11501215
run: |
11511216
python3 test/python/registry-rewrite.py --replacement-host=host.docker.internal
11521217
11531218
- name: Pull Docker base images for cache purposes
1219+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11541220
run: |
1155-
docker pull golang:1.18.4-bullseye
1156-
docker pull ubuntu:22.04
1221+
docker pull --platform ${{ matrix.platform }} golang:1.18.4-bullseye || echo 'could not pull image for cache purposes'
1222+
docker pull --platform ${{ matrix.platform }} ubuntu:22.04 || echo 'could not pull image for cache purposes'
11571223
11581224
- name: Pull Docker image for cache purposes
1225+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11591226
run: |
1160-
docker pull stackql/stackql:latest || echo 'could not pull image for cache purposes'
1227+
docker pull --platform ${{ matrix.platform }} stackql/stackql:latest || echo 'could not pull image for cache purposes'
11611228
11621229
- name: Create certificates for robot tests
1230+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11631231
run: |
11641232
openssl req -x509 -keyout test/server/mtls/credentials/pg_server_key.pem -out test/server/mtls/credentials/pg_server_cert.pem -config test/server/mtls/openssl.cnf -days 365
11651233
openssl req -x509 -keyout test/server/mtls/credentials/pg_client_key.pem -out test/server/mtls/credentials/pg_client_cert.pem -config test/server/mtls/openssl.cnf -days 365
@@ -1168,26 +1236,56 @@ jobs:
11681236
openssl req -x509 -keyout cicd/vol/srv/credentials/pg_client_key.pem -out cicd/vol/srv/credentials/pg_client_cert.pem -config test/server/mtls/openssl.cnf -days 365
11691237
openssl req -x509 -keyout cicd/vol/srv/credentials/pg_rubbish_key.pem -out cicd/vol/srv/credentials/pg_rubbish_cert.pem -config test/server/mtls/openssl.cnf -days 365
11701238
1171-
- name: Build image
1239+
- name: Build image precursors
1240+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11721241
run: |
11731242
docker compose -f docker-compose-credentials.yml build credentialsgen
11741243
docker compose build mockserver
1244+
1245+
- name: Login to Docker Hub
1246+
uses: docker/login-action@v3
1247+
if: env.BUILD_IMAGE_REQUIRED == 'true'
1248+
with:
1249+
username: ${{ secrets.DOCKERHUB_USERNAME }}
1250+
password: ${{ secrets.DOCKERHUB_TOKEN }}
11751251

11761252
- name: Build Stackql image with buildx
1177-
uses: docker/build-push-action@v5
1253+
uses: docker/build-push-action@v6
1254+
id: img_build
1255+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11781256
with:
11791257
context: .
11801258
build-args: |
11811259
BUILDMAJORVERSION=${{env.BUILDMAJORVERSION}}
11821260
BUILDMINORVERSION=${{env.BUILDMINORVERSION}}
11831261
BUILDPATCHVERSION=${{env.BUILDPATCHVERSION}}
1184-
push: false
1262+
platforms: ${{ matrix.platform }}
11851263
target: app
1186-
no-cache: ${{ vars.CI_DOCKER_BUILD_NO_CACHE == 'true' && true || false }}
1187-
load: true
1188-
tags: ${{ env.STACKQL_IMAGE_NAME }}:${{github.sha}},${{ env.STACKQL_IMAGE_NAME }}:v${{env.BUILDMAJORVERSION}}.${{env.BUILDMINORVERSION}}.${{env.BUILDPATCHVERSION}},${{ env.STACKQL_IMAGE_NAME }}:latest
1264+
labels: ${{ steps.meta.outputs.labels }}
1265+
outputs: type=image,"name=${{ env.STACKQL_IMAGE_NAME }}",push-by-digest=true,name-canonical=true,push=true
1266+
1267+
- name: Export digest
1268+
if: env.BUILD_IMAGE_REQUIRED == 'true'
1269+
run: |
1270+
mkdir -p ${{ runner.temp }}/digests
1271+
digest="${{ steps.img_build.outputs.digest }}"
1272+
touch "${{ runner.temp }}/digests/${digest#sha256:}"
1273+
1274+
- name: Upload digest
1275+
if: env.BUILD_IMAGE_REQUIRED == 'true'
1276+
uses: actions/upload-artifact@v4
1277+
with:
1278+
name: digests-${{ env.PLATFORM_PAIR }}
1279+
path: ${{ runner.temp }}/digests/*
1280+
if-no-files-found: error
1281+
1282+
- name: Pull by digest
1283+
if: env.BUILD_IMAGE_REQUIRED == 'true'
1284+
run: |
1285+
docker pull --platform ${{ matrix.platform }} ${{ env.STACKQL_IMAGE_NAME }}@${{ steps.img_build.outputs.digest }}
11891286
11901287
- name: Debug info
1288+
if: env.BUILD_IMAGE_REQUIRED == 'true'
11911289
run: |
11921290
echo "psql version info: $(psql --version)"
11931291
echo ""
@@ -1223,14 +1321,14 @@ jobs:
12231321
echo ""
12241322
12251323
- name: Run robot mocked functional tests
1226-
if: success() && env.CI_IS_EXPRESS != 'true'
1324+
if: success() && env.CI_IS_EXPRESS != 'true' && matrix.platform == 'linux/amd64' && env.BUILD_IMAGE_REQUIRED == 'true'
12271325
timeout-minutes: ${{ vars.DEFAULT_STEP_TIMEOUT_MIN == '' && 20 || vars.DEFAULT_STEP_TIMEOUT_MIN }}
12281326
run: |
12291327
python cicd/python/build.py --robot-test --config='{ "variables": { "EXECUTION_PLATFORM": "docker" } }'
12301328
12311329
- name: Run POSTGRES BACKEND robot mocked functional tests
1232-
if: success() && env.CI_IS_EXPRESS != 'true'
1233-
timeout-minutes: ${{ vars.DEFAULT_STEP_TIMEOUT_MIN == '' && 20 || vars.DEFAULT_STEP_TIMEOUT_MIN }}
1330+
if: success() && env.CI_IS_EXPRESS != 'true' && matrix.platform == 'linux/amd64' && env.BUILD_IMAGE_REQUIRED == 'true'
1331+
timeout-minutes: ${{ vars.DEFAULT_LONG_STEP_TIMEOUT_MIN == '' && 40 || vars.DEFAULT_LONG_STEP_TIMEOUT_MIN }}
12341332
run: |
12351333
echo "## Stray flask apps to be killed before robot tests ##"
12361334
pgrep -f flask | xargs kill -9
@@ -1249,12 +1347,12 @@ jobs:
12491347
python cicd/python/build.py --robot-test --config='{ "variables": { "EXECUTION_PLATFORM": "docker", "SHOULD_RUN_DOCKER_EXTERNAL_TESTS": true, "SQL_BACKEND": "postgres_tcp" } }'
12501348
12511349
- name: Output from mocked functional tests
1252-
if: always() && env.CI_IS_EXPRESS != 'true'
1350+
if: always() && env.CI_IS_EXPRESS != 'true' && matrix.platform == 'linux/amd64' && env.BUILD_IMAGE_REQUIRED == 'true'
12531351
run: |
12541352
cat ./test/robot/reports/output.xml
12551353
12561354
- name: Run robot integration tests
1257-
if: env.AZURE_CLIENT_SECRET != '' && startsWith(env.STATE_SOURCE_TAG, 'build-release')
1355+
if: env.AZURE_CLIENT_SECRET != '' && startsWith(env.STATE_SOURCE_TAG, 'build-release') && matrix.platform == 'linux/amd64' && env.BUILD_IMAGE_REQUIRED == 'true'
12581356
env:
12591357
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
12601358
AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
@@ -1266,30 +1364,118 @@ jobs:
12661364
echo "## End ##"
12671365
python cicd/python/build.py --robot-test-integration --config='{ "variables": { "EXECUTION_PLATFORM": "docker" } }'
12681366
1269-
- name: Login to Docker Hub
1270-
if: ${{ ( success() && github.ref_type == 'branch' && github.ref_name == 'main' && github.repository == 'stackql/stackql' && github.event_name == 'push' ) || ( success() && github.ref_type == 'tag' && startsWith(github.ref_name, 'build-release') ) }}
1271-
uses: docker/login-action@v2
1272-
with:
1273-
username: ${{ secrets.DOCKERHUB_USERNAME }}
1274-
password: ${{ secrets.DOCKERHUB_TOKEN }}
1275-
1276-
- name: Hack to avoid docker buildx failures
1277-
run: |
1278-
sudo rm -rf cicd/vol/postgres/persist
1367+
dockermerge:
1368+
runs-on: ubuntu-latest
1369+
needs:
1370+
- dockerbuild
1371+
steps:
12791372

1280-
- name: Push stackql image to Docker Hub
1281-
if: ${{ (github.repository == 'stackql/stackql' || github.repository == 'stackql/stackql-devel') && vars.CI_SKIP_DOCKER_PUSH != 'true' && ( success() && github.ref_type == 'branch' && github.ref_name == 'main' && github.event_name == 'push' ) || ( success() && github.ref_type == 'tag' && startsWith(github.ref_name, 'build-release') ) }}
1282-
uses: docker/build-push-action@v5
1283-
with:
1284-
context: .
1285-
no-cache: ${{ vars.CI_DOCKER_BUILD_NO_CACHE == 'true' && true || false }}
1286-
platforms: linux/arm64,linux/amd64
1287-
build-args: |
1288-
BUILDMAJORVERSION=${{env.BUILDMAJORVERSION}}
1289-
BUILDMINORVERSION=${{env.BUILDMINORVERSION}}
1290-
BUILDPATCHVERSION=${{env.BUILDPATCHVERSION}}
1291-
RUN_INTEGRATION_TESTS=0
1292-
push: true
1293-
target: app
1294-
tags: ${{ env.STACKQL_IMAGE_NAME }}:${{github.sha}},${{ env.STACKQL_IMAGE_NAME }}:v${{env.BUILDMAJORVERSION}}.${{env.BUILDMINORVERSION}}.${{env.BUILDPATCHVERSION}},${{ env.STACKQL_IMAGE_NAME }}:latest
1373+
- name: Check out code into the Go module directory
1374+
uses: actions/[email protected]
1375+
1376+
- name: Image env sanitize
1377+
run: |
1378+
PUSH_IMAGE_REQUIRED="false"
1379+
# shellcheck disable=SC2235
1380+
if ( \
1381+
[ "${{ github.repository }}" = "stackql/stackql" ] \
1382+
|| [ "${{ github.repository }}" = "stackql/stackql-devel" ] \
1383+
) \
1384+
&& [ "${{ vars.CI_SKIP_DOCKER_PUSH }}" != "true" ] \
1385+
&& [ "$( grep '^build-elide.*' <<< '${{ github.ref_name }}' )" = "" ] \
1386+
&& ( \
1387+
[ "${{ github.ref_type }}" = "branch" ] \
1388+
&& [ "${{ github.ref_name }}" = "main" ] \
1389+
&& [ "${{ github.event_name }}" = "push" ] \
1390+
) \
1391+
|| ( \
1392+
[ "${{ github.ref_type }}" = "tag" ] \
1393+
&& [ "$( grep '^build-release.*' <<< '${{ github.ref_name }}' )" != "" ] \
1394+
); \
1395+
then
1396+
PUSH_IMAGE_REQUIRED="true"
1397+
fi
1398+
{
1399+
echo "PUSH_IMAGE_REQUIRED=${PUSH_IMAGE_REQUIRED}";
1400+
} | tee -a "${GITHUB_ENV}"
1401+
1402+
- name: Download digests
1403+
uses: actions/download-artifact@v4
1404+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1405+
with:
1406+
path: ${{ runner.temp }}/digests
1407+
pattern: digests-*
1408+
merge-multiple: true
1409+
1410+
- name: Login to Docker Hub
1411+
uses: docker/login-action@v3
1412+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1413+
with:
1414+
username: ${{ secrets.DOCKERHUB_USERNAME }}
1415+
password: ${{ secrets.DOCKERHUB_TOKEN }}
1416+
1417+
- name: Set up Docker Buildx
1418+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1419+
uses: docker/setup-buildx-action@v3
1420+
1421+
- name: Extract Build Info and Persist
1422+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1423+
env:
1424+
BUILDCOMMITSHA: ${{github.sha}}
1425+
BUILDBRANCH: ${{github.ref}}
1426+
BUILDPLATFORM: ${{runner.os}}
1427+
BUILDPATCHVERSION: ${{github.run_number}}
1428+
run: |
1429+
source cicd/version.txt
1430+
BUILDMAJORVERSION=${MajorVersion}
1431+
BUILDMINORVERSION=${MinorVersion}
1432+
if [[ ! "$BUILDBRANCH" == "*develop" ]]; then
1433+
# shellcheck disable=2269
1434+
BUILDPATCHVERSION="${BUILDPATCHVERSION}"
1435+
fi
1436+
BUILDSHORTCOMMITSHA="$(echo "${BUILDCOMMITSHA}" | cut -c 1-7)"
1437+
BUILDDATE="$(date)"
1438+
export BUILDDATE
1439+
echo "BUILDMAJORVERSION: ${BUILDMAJORVERSION}"
1440+
echo "BUILDMINORVERSION: ${BUILDMINORVERSION}"
1441+
echo "BUILDPATCHVERSION: ${BUILDPATCHVERSION}"
1442+
echo "BUILDBRANCH: ${BUILDBRANCH}"
1443+
echo "BUILDCOMMITSHA: ${BUILDCOMMITSHA}"
1444+
echo "BUILDSHORTCOMMITSHA: ${BUILDSHORTCOMMITSHA}"
1445+
echo "BUILDDATE: ${BUILDDATE}"
1446+
echo "BUILDPLATFORM: ${BUILDPLATFORM}"
1447+
{
1448+
echo "BUILDMAJORVERSION=$BUILDMAJORVERSION"
1449+
echo "BUILDMINORVERSION=$BUILDMINORVERSION"
1450+
echo "BUILDPATCHVERSION=$BUILDPATCHVERSION"
1451+
echo "UID=${UID}"
1452+
echo "GID=${GID}"
1453+
} >> "${GITHUB_ENV}"
1454+
1455+
- name: Docker meta
1456+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1457+
id: meta
1458+
uses: docker/metadata-action@v5
1459+
with:
1460+
images: |
1461+
${{ env.STACKQL_IMAGE_NAME }}
1462+
tags: |
1463+
type=ref,event=branch
1464+
type=ref,event=pr
1465+
type=raw,value=latest
1466+
type=raw,value=v${{env.BUILDMAJORVERSION}}.${{env.BUILDMINORVERSION}}.${{env.BUILDPATCHVERSION}}
1467+
type=raw,value=${{ github.sha }}
1468+
1469+
- name: Create manifest list and push
1470+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1471+
working-directory: ${{ runner.temp }}/digests
1472+
run: |
1473+
# shellcheck disable=SC2046
1474+
docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
1475+
$(printf '${{ env.STACKQL_IMAGE_NAME }}@sha256:%s ' *)
1476+
1477+
- name: Inspect image
1478+
if: env.PUSH_IMAGE_REQUIRED == 'true'
1479+
run: |
1480+
docker buildx imagetools inspect ${{ env.STACKQL_IMAGE_NAME }}:${{ steps.meta.outputs.version }}
12951481
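The `Image env sanitize` steps above chain `&&` and `||` in one shell list. In shell, those two operators have equal precedence and associate left to right, so as written the `build-release` tag clause is an alternative to everything before it, including the repository and `CI_SKIP_DOCKER_PUSH` checks. A minimal sketch of the same decision with each guard made explicit follows. It assumes (this is not confirmed by the commit) that the repository and skip guards are meant to apply to both the branch and the tag case, and that the two allowed repositories are `stackql/stackql` and `stackql/stackql-devel`, as in the removed `if:` expression. The function name `push_required` and its positional arguments are hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) when an image push is warranted.
# Arguments: repository, CI_SKIP_DOCKER_PUSH value, ref type, ref name, event name.
push_required() {
  repo="$1"
  skip_push="$2"
  ref_type="$3"
  ref_name="$4"
  event="$5"

  # Guard 1 (assumed intent): only the two known repositories may push.
  case "${repo}" in
    stackql/stackql|stackql/stackql-devel) ;;
    *) return 1 ;;
  esac

  # Guard 2: an explicit opt-out always wins.
  [ "${skip_push}" != "true" ] || return 1

  # Guard 3: refs starting with build-elide never push.
  case "${ref_name}" in
    build-elide*) return 1 ;;
  esac

  # Case A: a push event on the main branch.
  if [ "${ref_type}" = "branch" ] && [ "${ref_name}" = "main" ] && [ "${event}" = "push" ]; then
    return 0
  fi

  # Case B: a tag starting with build-release.
  if [ "${ref_type}" = "tag" ]; then
    case "${ref_name}" in
      build-release*) return 0 ;;
    esac
  fi

  return 1
}
```

Early returns keep each guard on its own line, which avoids relying on operator precedence at all. In a workflow step the result could be captured with something like `if push_required ...; then PUSH_IMAGE_REQUIRED="true"; fi`.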

docs/CICD.md

Lines changed: 13 additions & 3 deletions
@@ -4,10 +4,20 @@
44

55
Summary:
66

7-
- At present, PR checks, build and test are all performed through [.github/workflows/go.yml](/.github/workflows/go.yml).
7+
- At present, PR checks, build and test are all performed through [.github/workflows/build.yml](/.github/workflows/build.yml).
88
- Releasing over various channels (website, homebrew, chocolatey...) is performed manually.
9-
- The strategic state is to split the functions: PR checks, build and test; into separate files, and migrate to use [goreleaser](https://goreleaser.com/).
10-
- Should take the hint from docker to [speed up multi-platform builds using multiple runners](https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners).
9+
- ~~The strategic state is to split the functions: PR checks, build and test; into separate files, and migrate to use [goreleaser](https://goreleaser.com/).~~
10+
- Docker Build and Push Jobs have scope for improvement.
11+
- These are currently based loosely on patterns described in:
12+
- https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners
13+
- https://docs.docker.com/build/ci/github-actions/share-image-jobs/
14+
- This pattern does the below:
15+
- (a) Build and push by digest.
16+
- (b) Leverage [`docker buildx imagetools`](https://docs.docker.com/reference/cli/docker/buildx/imagetools/) to write desired tags.
17+
- This pattern is only required because, if tag pushes are done concurrently, identical multi-architecture tags clobber one another in a race: whichever push finishes last overwrites the others.
18+
- **NOTE**: The QEMU build for linux/arm64 is **very slow**, on the order of 30 minutes. This is currently unavoidable.
19+
- **TODO**: Migrate linux/arm64 docker build to native once GHA supports this platform as a first class citizen.
20+
- ~~**DANGER**: New pattern depends entirely on [docker manifest](https://docs.docker.com/reference/cli/docker/manifest/), which is marked "experimental" by the vendor. Per [this stackoverflow answer](https://stackoverflow.com/a/66337328), in spite of fundamental instability, this is still the best option.~~
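
In the workflow, steps (a) and (b) hand over through a directory of empty files, each named after one pushed image digest (without the `sha256:` prefix). A minimal sketch of turning that directory into the image references passed to `docker buildx imagetools create` follows. The function name `digest_refs` and the image name `example/img` are hypothetical; the real workflow does this inline with `printf`.

```shell
# Hypothetical helper: print one <image>@sha256:<digest> reference per file
# found in the digests directory. Prints nothing if the directory is empty.
digest_refs() {
  image="$1"
  dir="$2"
  for f in "${dir}"/*; do
    # An unmatched glob stays literal; skip it so an empty directory yields no output.
    [ -e "${f}" ] || continue
    printf '%s@sha256:%s\n' "${image}" "$(basename "${f}")"
  done
}

# Possible usage once both per-platform digests have been downloaded
# (left as a comment because it requires docker and registry credentials):
#   docker buildx imagetools create -t "example/img:latest" \
#     $(digest_refs "example/img" "${RUNNER_TEMP}/digests")
```

Because each platform job pushes by digest only, nothing is tagged until this single merge step runs, so there is no concurrent tag write to race against.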
1121

1222

1323
## Secrets
