CI/CD status #1720
Replies: 23 comments 35 replies
-
Thursday 3 Apr BST 23:00 (UTC+1)
Noticed errors with our container builds due to installation problems. Identified issues with the
-
Tuesday 8 Apr BST 11:00 (UTC+1)
Noticed errors with our container builds due to installation problems. Identified issues with the
-
Thursday 10 Apr BST 10:20 (UTC+1)
Noticed timeouts in the upload base images workflow with the macos-hosted agent. Suspected the issue to be disk space, so re-ran the workflow after doing
-
Wednesday 16 Apr BST 11:00 (UTC+1)
Power is down, so the self-hosted agents are unavailable. (Power should be back within six hours. The good news is that we have an EcoFlow 1 kWh battery backup. The bad news is that I am travelling in the opposite direction, so I can't physically deliver the battery to the co-location centre. Note to self: we need to distribute compute units across locations.)
-
Thursday 17 Apr BST 17:00 (UTC+1)
Noticed that the "Upload base images" workflow had errors. Apparently, this is a known issue with colima, which requires a workaround after every macOS update.
brew uninstall colima qemu lima
rm -rf ~/.colima
brew install colima
brew services restart colima
-
FYI: keychain can be interacted with from CLI. Not sure what you need, but
I use it in CI all the time!
On Thu, Apr 17, 2025, 18:41 prabhu wrote:
As a reminder, we cannot use Docker Desktop for Mac since it requires
access to the keychain and doesn't really work well in a headless
environment. Things like Rancher Desktop and nerdctl require rework in the
workflow to replace Docker-based steps with custom scripts.
-
I believe the correct command would be `security unlock-keychain`. It needs
credentials, but these can be given on the CLI.
On Thu, Apr 17, 2025, 18:59 prabhu wrote:
Interesting. I don't have the exact error with me, but it only worked when
run from a screen-sharing session, since macOS wanted to show a password
prompt. After a restart, nothing Docker-related works until the keychain is
unlocked. In contrast, colima works in clear-text password mode, but does
have other issues.
-
Saturday 26 Apr BST 17:00 (UTC+1)
-
Thursday 1 May BST 15:00 (UTC+1)
Happy Labour Day (International Workers' Day)! Today a brand new server
I would love to share an HBOM and OBOM. Sadly, there are no HBOM tools I could find (cc: @malice00). So, just some output from a couple of commands and a decent OBOM.
Output from lscpu
OBOM
Bugs identified
-
Thursday 8 May BST 00:00 (UTC+1)
Self-hosted agents ran out of disk space. It turned out that, due to some known bugs in cdxgen, the /tmp folder was full of
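A minimal sketch of how stale temp directories could be pruned on an agent. This is not the cleanup the project actually runs: the helper name `prune_stale_tmp`, the `example-prefix-*` pattern, and the age threshold are all hypothetical, and the real names of the leftover cdxgen entries are cut off in the post above, so substitute whatever is actually filling /tmp.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: delete direct children of a directory that match a
# name pattern and were last modified more than <max-age-minutes> ago.
# Usage: prune_stale_tmp <dir> <name-pattern> <max-age-minutes>
prune_stale_tmp() {
  local dir="$1" pattern="$2" max_age_min="$3"
  # -mindepth/-maxdepth 1 restricts the scan to top-level entries of "$dir",
  # so nothing outside the matched entries is touched.
  find "$dir" -mindepth 1 -maxdepth 1 -name "$pattern" -mmin "+${max_age_min}" \
    -exec rm -rf -- {} +
}

# Example (not executed here): remove matching entries older than 6 hours.
# The pattern is a placeholder, not a real cdxgen naming scheme.
# prune_stale_tmp /tmp 'example-prefix-*' 360
```

Matching on both a name pattern and an age keeps the helper from deleting temp directories that belong to a build that is still running.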
-
Thursday 8 May BST 13:00 (UTC+1)
Despite the new self-hosted machines, our CI usage is going up well beyond the 50K minutes that GitHub offers for Enterprise users. We are now beginning to consume minutes from other OWASP orgs as well (CycloneDX is a sub-org under OWASP). Thanks to the excellent work from @malice00, we have a new matrix-based workflow for building base images. However, the Ruby 3.4.3 build steps still take too much time, so we are at risk of exceeding 50K minutes this month as well. I think the next step is to look into multi-stage builds and layer caching, so that
-
Thursday 15 May BST 12:00 (UTC+1)
Currently fighting the CI due to multiple failures (all of them with github-hosted).
Installing dotnet on Debian isn't reliable
Node 20 builds continue to fail for mysterious reasons
I am also seeing random "corepack command not found" errors
A number of Java builds are failing, despite us installing corepack explicitly!
The rolling image is looking for
openSUSE continues to do openSUSE things
-
Wednesday 21 May BST 10:00 (UTC+1)
Multiple build failures since macos-hosted was out of disk space. Retrying the builds after running the command
Noticed another issue with attaching SBOMs to the nightly images.
node bin/cdxgen.js -t docker -o sbom-oci-base-image.cdx.json ghcr.io/cyclonedx/debian-dotnet8:nightly
node bin/verify.js -i sbom-oci-base-image.cdx.json --public-key contrib/bom-signer/public.key
Possible bug in cdxgen?
-
Thursday 22 May BST 10:00 (UTC+1)
Related: #1606
-
Friday 23 May BST 09:00 (UTC+1)
Happy Friday! mini-dev-1 (the gray Beelink ESR8, since we don't have the HBOM!) required some servicing today. When I set up this machine a while ago, my thinking was to set a large enough UMA Frame Buffer size so that we could invoke cdx1 and other mini ML models with llama.cpp as part of CI pipelines. For some reason, I went with 16GB (out of 32GB RAM). This not only reduced the amount of memory available for CI, but also reduced system performance a bit, as per the AMD support document. Changing this setting to auto required physically rebooting into the BIOS, etc. With
The plan is to run this as a CPU-only build agent for a bit longer, until we reach a point where we need a GPU.
-
Friday 23 May BST 17:00 (UTC+1)
After a lot of testing and research, our
colima start -c 6 -d 250 -m 16 -t qemu -r docker --dns 1.1.1.1 --dns 1.0.0.1 --mount-type sshfs -a x86_64 --profile colima-amd64
colima start -c 6 -d 250 -m 16 -t vz -r docker --dns 1.1.1.1 --dns 1.0.0.1 --mount-type virtiofs -a aarch64 --profile colima-aarch64
docker buildx create --use --name multi-builder colima-amd64
docker buildx create --append --name multi-builder colima-aarch64
There is no need for any binfmt hacks. The workflows that use macos-hosted agents do not need the setup-qemu and setup-buildx steps.
-
Monday 2 June BST 13:00 (UTC+1)
Multiple build failures with the macos-hosted agents. Suspecting the issue could be due to misconfigured buildx.
-
Tuesday 3 June BST 11:00 (UTC+1)
openSUSE reliability issues. These errors often disappear after a manual restart, but all of this takes up time.
-
Tuesday 17 June BST 11:00 (UTC+1)
We have lost mini-dev-1 due to a lack of disk space (and human error). Even though it has a 1 TB disk, existing cdxgen bugs with temp directory cleanup meant we lost this space overnight (after fewer than 10 builds). There is a cron job that does
Plan
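One way to catch this kind of overnight disk exhaustion earlier is a guard that refuses to start a build once usage crosses a threshold. The sketch below is an illustration only and is not taken from the plan in this update: the function names `disk_used_percent` and `require_free_space` and the 90% limit in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the used percentage (an integer, no % sign) of
# the filesystem that holds the given path, parsed from POSIX-format df output.
disk_used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical guard: return 0 only if usage is below the given limit.
# Usage: require_free_space <path> <max-used-percent>
require_free_space() {
  local used
  used="$(disk_used_percent "$1")"
  if [ "$used" -ge "$2" ]; then
    echo "disk usage on $1 is ${used}% (limit $2%)" >&2
    return 1
  fi
}

# Example (not executed here): fail a job early when / is 90% full or more.
# require_free_space / 90 || exit 1
```

Running the guard as the first step of a job turns a slow, confusing mid-build failure into an immediate and clearly labelled one.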
-
Wednesday 18 June BST 12:00 (UTC+1)
mini-dev-1 arrived today (Thanks, Bernie!) for maintenance. When I executed the
Thanks to ChatGPT's help, I managed to delete
sudo lvdisplay
sudo umount /dev/ubuntu-vg/lv-to-delete
# Manually remove from /etc/fstab
sudo lvremove /dev/ubuntu-vg/lv-to-delete
sudo lvextend -l +100%FREE /dev/ubuntu-vg/target-lv
sudo btrfs filesystem resize max /
With tailscale back on,
sudo df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 3.1G 10M 3.1G 1% /run
efivarfs 128K 43K 81K 35% /sys/firmware/efi/efivars
/dev/mapper/ubuntu--vg-ubuntu--lv 929G 296G 630G 32% /
tmpfs 16G 92K 16G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 16G 0 16G 0% /run/qemu
/dev/nvme0n1p2 2.0G 317M 1.5G 18% /boot
/dev/nvme0n1p1 1.1G 6.2M 1.1G 1% /boot/efi
tmpfs 3.1G 16K 3.1G 1% /run/user/1000
overlay 929G 296G 630G 32% /var/lib/docker/rootfs/overlayfs/35fd7f7783db3920697d529dce17daf4f2563262c9bef49c6c5c53bbcf0faa75
overlay 929G 296G 630G 32% /var/lib/docker/rootfs/overlayfs/45a05b792f98bfda809375bde04d5b2425ac35a2fdaf0fc659c7034692baed1e
overlay 929G 296G 630G 32% /var/lib/docker/rootfs/overlayfs/a33f86060ad9ba76964aa1efe8d96041d86e5bbcaeebf95a7f5de37d400f0480
What about OBOM
We generate OBOM for all our servers. While the document has details about the packages, services, mount points, etc., it currently lacks information about
For interested people, our osquery queries are here. We need to add some queries to read the
Bonus points if the annotation section can be enhanced to describe the partitions, networks, and services in plain text. It currently reads as shown:
-
Tuesday 22nd July BST 14:00 (UTC+1)
Thanks to some brilliant work from @malice00 and @bandhan-majumder, we have reached the stage where our in-house compute is fully automated and requires limited maintenance. Roland has also kindly set up a local Nexus server to improve build performance and resilience.
All of our release artefacts and container images (including multi-arch) include a detailed SBOM. The next step is to look into generating the licence notices for all release artefacts.
-
Sunday 3rd August CEST 03:30 (UTC+2)
After changing our image builds to make use of our local Nexus, I finally figured out how to get Docker mirroring to work with buildkit! I created a toml configuration for buildkit:
Then, I removed the existing multi-builder and recreated it using the toml configuration:
I had to use the IP address of our server here, because for some reason the colima instances that are running our docker engines can't consistently resolve names in our network.
Things to remember
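The configuration and commands from this update did not survive in the thread, so the following is only a rough sketch of what a BuildKit registry mirror setup of this kind might look like. Everything specific is an assumption: `192.0.2.10:8082` is a placeholder address from a documentation-only range rather than the real Nexus endpoint, the plain-HTTP settings may not apply, and the builder and context names are copied from the 23 May entry.

```shell
#!/usr/bin/env bash
set -euo pipefail

# ASSUMPTIONS: MIRROR is a placeholder for the Nexus Docker mirror; the real
# address, port, and TLS settings are not given in the thread.
MIRROR="${MIRROR:-192.0.2.10:8082}"
CONFIG="${CONFIG:-./buildkitd.toml}"

# Write a BuildKit daemon config that routes docker.io pulls via the mirror.
cat > "$CONFIG" <<EOF
[registry."docker.io"]
  mirrors = ["${MIRROR}"]

# Only needed if the mirror is served over plain HTTP (an assumption here).
[registry."${MIRROR}"]
  http = true
  insecure = true
EOF

# Recreate the multi-node builder with the config. Defined but not invoked.
# Older buildx releases name the flag --config instead of --buildkitd-config.
recreate_builder() {
  docker buildx rm multi-builder || true
  docker buildx create --use --name multi-builder \
    --buildkitd-config "$CONFIG" colima-amd64
  docker buildx create --append --name multi-builder \
    --buildkitd-config "$CONFIG" colima-aarch64
}
```

Using an IP address for the mirror, as in the placeholder, matches the note above about name resolution being unreliable inside the colima instances.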
-
Discussion to track our build status, downtime etc.
self-hosted agents
We have two Mac Minis (64GB RAM each) and one Mini PC (32 GB RAM) powering cdxgen.
Todo
Learnings
tonistiigi/binfmt, multiarch/qemu-user-static, and moby/buildkit:buildx-stable-1 wrap a very old version of QEMU (8.2.2) with several known bugs. Upgrading the entire stack to use QEMU 10 is not for the faint-hearted. Even with QEMU 10, you are met with regression bugs, with the only solution being to downgrade to QEMU v8.1.5!