Commit 03eb5cf

chore: add rudimentary docs on the QEMU artifact
1 parent 36510ea commit 03eb5cf

3 files changed: +102 -8 lines changed

Makefile

Lines changed: 3 additions & 0 deletions
@@ -14,6 +14,9 @@ disk/focal-raw.img: output-cloudimg/packer-cloudimg
 container-disk-image: output-cloudimg/packer-cloudimg
 	docker build . -t supabase-postgres-test:$(GIT_SHA) -f ./Dockerfile-kubevirt
 
+eks-node-container-disk-image: output-cloudimg/packer-cloudimg
+	sudo nerdctl build . -t supabase-postgres-test:$(GIT_SHA) --namespace k8s.io -f ./Dockerfile-kubevirt
+
 host-disk: disk/focal-raw.img
 	sudo chown 107 -R disk

ebssurrogate/scripts/qemu-bootstrap-nix.sh

Lines changed: 4 additions & 8 deletions
@@ -1,10 +1,4 @@
 #!/usr/bin/env bash
-#
-# This script creates filesystem and setups up chrooted
-# enviroment for further processing. It also runs
-# ansible playbook and finally does system cleanup.
-#
-# Adapted from: https://github.com/jen20/packer-ubuntu-zfs
 
 set -o errexit
 set -o pipefail
@@ -41,9 +35,8 @@ tee /etc/ansible/ansible.cfg <<EOF
 callbacks_enabled = timer, profile_tasks, profile_roles
 EOF
 # Run Ansible playbook
-#export ANSIBLE_LOG_PATH=/tmp/ansible.log && export ANSIBLE_DEBUG=True && export ANSIBLE_REMOTE_TEMP=/mnt/tmp
 export ANSIBLE_LOG_PATH=/tmp/ansible.log && export ANSIBLE_REMOTE_TEMP=/mnt/tmp
-ansible-playbook ./ansible/playbook.yml --extra-vars '{"nixpkg_mode": true, "debpkg_mode": false, "stage2_nix": false}' # $ARGS - I think this is being not passed in correctly
+ansible-playbook ./ansible/playbook.yml --extra-vars '{"nixpkg_mode": true, "debpkg_mode": false, "stage2_nix": false}'
 }
 
 function setup_postgesql_env {
@@ -80,7 +73,10 @@ setup_postgesql_env
 setup_locale
 execute_playbook
 
+####################
 # stage 2 things
+####################
+
 function install_nix() {
 sudo su -c "curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install --no-confirm \
 --extra-conf \"substituters = https://cache.nixos.org https://nix-postgres-artifacts.s3.amazonaws.com\" \

qemu_artifact.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# QEMU artifact

We build a container image that contains a QEMU qcow2 disk image. This container image can be used with KubeVirt's [containerDisk](https://kubevirt.io/user-guide/storage/disks_and_volumes/#containerdisk) functionality to boot VMs off the qcow2 image.

Container images are a convenient mechanism for shipping the disk image to the nodes where it is needed.

Given the size of the image, the first VM using it on a node might take a while to come up while the image is being pulled. The image can be pre-fetched to avoid this; we might also switch to other deployment mechanisms in the future.

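As a rough illustration of how the container image is consumed, the sketch below renders a minimal KubeVirt `VirtualMachineInstance` manifest that boots from a `containerDisk` volume. The function name, the VM name (`pg-vm`), the memory request, and the image reference are placeholders rather than values taken from this repository, and an arm64 node may need further settings.

```shell
#!/usr/bin/env bash
# Sketch: print a minimal VirtualMachineInstance manifest backed by a containerDisk.
# $1 = container image reference (placeholder supplied by the caller).
# The VM name and memory request below are illustrative assumptions.
render_container_disk_vmi() {
  local image="$1"
  cat <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: pg-vm
spec:
  domain:
    devices:
      disks:
        - name: rootdisk
          disk:
            bus: virtio
    resources:
      requests:
        memory: 4Gi
  volumes:
    - name: rootdisk
      containerDisk:
        image: ${image}
EOF
}
```

It could be applied with something like `render_container_disk_vmi <registry>/supabase-postgres-test:<tag> | kubectl apply -f -`.
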
# Building QEMU artifact

## Creating a bare-metal instance

We launch an Ubuntu 22 bare-metal instance; we're using the `c6g.metal` instance type in this case, but any ARM instance type is sufficient for our purposes.

    aws ec2 create-security-group --group-name "launch-wizard-1" --description "launch-wizard-1 created 2024-11-26T00:32:56.039Z" --vpc-id "vpc-0fbfcc428751ce76b"
    aws ec2 authorize-security-group-ingress --group-id "sg-preview-1" --ip-permissions '{"IpProtocol":"tcp","FromPort":22,"ToPort":22,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}'
    aws ec2 run-instances \
      --image-id "ami-0a87daabd88e93b1f" \
      --instance-type "c6g.metal" \
      --key-name "darora-aps1" \
      --block-device-mappings '{"DeviceName":"/dev/sda1","Ebs":{"Encrypted":false,"DeleteOnTermination":true,"Iops":3000,"SnapshotId":"snap-0fe84a34403e3da8b","VolumeSize":200,"VolumeType":"gp3","Throughput":125}}' \
      --network-interfaces '{"AssociatePublicIpAddress":true,"DeviceIndex":0,"Groups":["sg-preview-1"]}' \
      --tag-specifications '{"ResourceType":"instance","Tags":[{"Key":"Name","Value":"darora-pg-image"}]}' \
      --metadata-options '{"HttpEndpoint":"enabled","HttpPutResponseHopLimit":2,"HttpTokens":"required"}' \
      --private-dns-name-options '{"HostnameType":"ip-name","EnableResourceNameDnsARecord":true,"EnableResourceNameDnsAAAARecord":false}' \
      --count "1"

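The `sg-preview-1` value in the commands above looks like a console placeholder rather than a real group ID; `create-security-group` returns the actual ID. A minimal sketch of capturing it, assuming a hypothetical helper name and a caller-supplied VPC ID and group name:

```shell
#!/usr/bin/env bash
# Sketch: create the SSH security group and print its real group ID.
# $1 = VPC ID, $2 = group name (both supplied by the caller).
# The helper name and description text are illustrative assumptions.
create_ssh_security_group() {
  local vpc_id="$1" name="$2" sg_id
  sg_id="$(aws ec2 create-security-group \
    --group-name "$name" \
    --description "SSH access for the image build host" \
    --vpc-id "$vpc_id" \
    --query GroupId --output text)" || return 1
  # Same ingress rule as in the commands above: TCP port 22 from anywhere.
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" \
    --protocol tcp --port 22 --cidr 0.0.0.0/0 >/dev/null || return 1
  printf '%s\n' "$sg_id"
}
```

The printed ID can then replace `sg-preview-1` in the `--group-id` and `--network-interfaces` arguments, for example via `SG_ID="$(create_ssh_security_group <vpc-id> <group-name>)"`.
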
## Install deps

On the instance, install the dependencies we require for producing QEMU artifacts:

    sudo apt-get update
    sudo apt-get install -y qemu-system qemu-system-arm qemu-utils qemu-efi-aarch64 libvirt-clients libvirt-daemon libqcow-utils software-properties-common git make libnbd-bin nbdkit fuse2fs cloud-image-utils awscli
    sudo usermod -aG kvm ubuntu
    curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
    sudo apt-add-repository "deb [arch=arm64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
    sudo apt-get update && sudo apt-get install packer=1.11.2-1
    sudo apt-get install -y docker.io

Some dev deps that might be useful:

    sudo apt-get install -y emacs ripgrep vim-tiny byobu

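Because `usermod -aG kvm` only takes effect for new login sessions, it can be useful to confirm that the current session actually has KVM access before starting a build. A small sketch with hypothetical helper names:

```shell
#!/usr/bin/env bash
# Sketch: check whether group $1 appears in the space-separated list $2.
in_group_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the current session can use KVM.
check_kvm_access() {
  if ! in_group_list kvm "$(id -nG)"; then
    echo "kvm group not active in this session; log out and back in" >&2
    return 1
  fi
  if [ ! -r /dev/kvm ] || [ ! -w /dev/kvm ]; then
    echo "/dev/kvm is missing or not accessible" >&2
    return 1
  fi
  echo "KVM access looks OK"
}
```

Running `check_kvm_access` after logging back in should print the OK message on a bare-metal instance where the group change has been picked up.
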
## Clone repo and build

Log out and back in first to pick up the new group memberships!

    git clone https://github.com/supabase/postgres.git
    cd postgres
    git checkout da/qemu-rebasing # choose appropriate branch here
    make init container-disk-image

### Build process

The current AMI process involves a few steps:

1. The nix package is built and published using GHA (`.github/workflows/nix-build.yml`)
   - this builds Postgres along with the PG extensions we use.
2. "stage1" build (`amazon-arm64-nix.pkr.hcl`, invoked via `.github/workflows/ami-release-nix.yml`)
   - uses an upstream Ubuntu image to initialize the AMI
   - installs and configures the majority of the software that gets shipped as part of the AMI (e.g. gotrue, postgrest, ...)
3. "stage2" build (`stage2-nix-psql.pkr.hcl`, invoked via `.github/workflows/ami-release-nix.yml`)
   - uses the image published from (2)
   - installs and configures the software that is built and published using nix in (1)
   - cleans up build dependencies etc.

The QEMU artifact process collapses (2) and (3):

a. The nix package is built and published using GHA (`.github/workflows/nix-build.yml`)
b. packer build (`qemu-arm64-nix.pkr.hcl`)
   - uses an upstream Ubuntu live image as the base
   - performs the work that was performed as part of the "stage1" and "stage2" builds
   - this work is executed using `ebssurrogate/scripts/qemu-bootstrap-nix.sh`

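Once the packer build has finished, a quick sanity check on its output can catch a failed or partial build early. The sketch below assumes the artifact is the `output-cloudimg/packer-cloudimg` file named as a prerequisite in the Makefile and that it is a qcow2 image, as the introduction describes; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the format that qemu-img reports for the disk image at $1.
# Only the first "file format:" line is used, which describes the image itself.
disk_format() {
  qemu-img info "$1" | sed -n 's/^file format: //p' | head -n 1
}

# Fail unless the image at $1 is reported as qcow2.
require_qcow2() {
  local fmt
  fmt="$(disk_format "$1")"
  if [ "$fmt" != "qcow2" ]; then
    echo "expected qcow2, got '${fmt:-unknown}' for $1" >&2
    return 1
  fi
}
```

For example, `require_qcow2 output-cloudimg/packer-cloudimg` could be run before building the container image.
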
## Publish image for later use

Publish the built image to a registry of your choosing, and use the published image with KubeVirt.

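A minimal sketch of the publish step, assuming the image was built locally as `supabase-postgres-test:<tag>` (the name used in the Makefile). How `GIT_SHA` is computed is not shown here, so the tag is passed in by the caller; the registry prefix and helper names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: build the fully qualified image reference for a registry.
# $1 = registry/repository prefix (placeholder), $2 = image tag.
remote_image_ref() {
  printf '%s/supabase-postgres-test:%s\n' "${1%/}" "$2"
}

# Tag the locally built image for the registry and push it.
publish_image() {
  local registry="$1" tag="$2" remote
  remote="$(remote_image_ref "$registry" "$tag")"
  docker tag "supabase-postgres-test:${tag}" "$remote" || return 1
  docker push "$remote"
}
```

For example, `publish_image <registry-prefix> <tag>`, after logging in to the registry with whatever mechanism it requires.
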
# Iterating on the QEMU artifact

For a tighter iteration loop on the Postgres artifact, we recommend working on an Ubuntu bare-metal node that is part of the EKS cluster you're deploying to.

- Use the `host-disk` make target to build the raw image file on disk (`/path/to/postgres/disk/focal-raw.img`).
- Update the VM spec to use `hostDisk` instead of `containerDisk`. Note that only one VM can use an image at a time, so you can't create multiple VMs backed by the same host disk.
- Enable the `HostDisk` feature flag for KubeVirt.
- Deploy the VM to the node.

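The steps above can be sketched as follows. The helper names, the VM name (`pg-vm-dev`), and the memory request are illustrative assumptions; the node name and image path are supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch: print a VirtualMachineInstance manifest that boots from a hostDisk.
# $1 = node name, $2 = absolute path of the raw image on that node.
# The VM name and memory request are illustrative assumptions.
render_host_disk_vmi() {
  local node="$1" image_path="$2"
  cat <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: pg-vm-dev
spec:
  nodeSelector:
    kubernetes.io/hostname: ${node}
  domain:
    devices:
      disks:
        - name: rootdisk
          disk:
            bus: virtio
    resources:
      requests:
        memory: 4Gi
  volumes:
    - name: rootdisk
      hostDisk:
        path: ${image_path}
        type: Disk
EOF
}

# Print a JSON merge patch that enables the HostDisk feature gate.
# Note: a merge patch replaces the whole featureGates list.
host_disk_feature_gate_patch() {
  printf '%s\n' '{"spec":{"configuration":{"developerConfiguration":{"featureGates":["HostDisk"]}}}}'
}
```

Assuming the KubeVirt custom resource is named `kubevirt` in the `kubevirt` namespace, the gate could be enabled with `kubectl patch kubevirt kubevirt -n kubevirt --type merge -p "$(host_disk_feature_gate_patch)"`, and the VM deployed with `render_host_disk_vmi <node> /path/to/postgres/disk/focal-raw.img | kubectl apply -f -`.
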
Additionally, to iterate on the container image part of things, you can build the image on the bare-metal node (`eks-node-container-disk-image` target) rather than publishing it to ECR or a similar registry. However, this part can take a while, so iterating using host disks remains the fastest dev loop.

## Dependencies note

Installing `docker.io` on an EKS node might interfere with the k8s setup of the node. You can instead install `nerdctl` and `buildkit`:

    curl -L -O https://github.com/containerd/nerdctl/releases/download/v2.0.0/nerdctl-2.0.0-linux-arm64.tar.gz
    tar -xzf nerdctl-2.0.0-linux-arm64.tar.gz
    sudo mv ./nerdctl /usr/local/bin/
    curl -O -L https://github.com/moby/buildkit/releases/download/v0.17.1/buildkit-v0.17.1.linux-arm64.tar.gz
    tar -xzf buildkit-v0.17.1.linux-arm64.tar.gz
    sudo mv bin/* /usr/local/bin/

You'll need to run buildkit: `sudo buildkitd`
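
Since `buildkitd` has to be up before `nerdctl build` can use it, one option is to start it in the background and poll until it responds. The sketch below uses `buildctl debug workers` (shipped in the same buildkit tarball) as the readiness probe; the helper names, the 30-second timeout, and the log path are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: re-run a command once per second until it succeeds,
# giving up after $1 seconds. Remaining arguments are the command.
retry_until() {
  local timeout="$1" waited=0
  shift
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Start buildkitd in the background, then build once it answers requests.
build_on_eks_node() {
  sudo buildkitd >/tmp/buildkitd.log 2>&1 &
  if ! retry_until 30 sudo buildctl debug workers; then
    echo "buildkitd did not become ready; see /tmp/buildkitd.log" >&2
    return 1
  fi
  make eks-node-container-disk-image
}
```

Running `build_on_eks_node` from the repository root would then cover both starting the daemon and invoking the make target.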
