34 changes: 34 additions & 0 deletions .github/workflows/deploy-lnt.llvm.org.yaml
@@ -0,0 +1,34 @@
name: Deploy lnt.llvm.org

on:
push:
tags:
Member Author:
I am deploying on tags at the moment: I don't think we want to re-deploy at every commit since we risk bringing down the instance. Actually, I even wonder whether that should be a manually triggered job. WDYT?
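For illustration, a minimal sketch of the manually triggered variant, assuming the rest of the workflow stays unchanged (workflow_dispatch adds a "Run workflow" button in the Actions tab):

    on:
      workflow_dispatch: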

Reply:

We want to tag images by commit SHA, but explicitly version them in the terraform. That means we get a new image per commit, but only redeploy when we explicitly bump the commit of the images we're running.
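A rough sketch of what tagging images by commit SHA could look like in CI; the registry path here is a placeholder, not something this PR sets up:

    - name: Build and push image
      run: |
        IMAGE=ghcr.io/llvm/lnt              # placeholder registry/repository
        docker build --tag "$IMAGE:$GITHUB_SHA" .
        docker push "$IMAGE:$GITHUB_SHA"

The Terraform config would then reference one specific SHA, so a redeploy only happens when that pinned tag is bumped.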

- 'v*'

permissions:
contents: read

jobs:
deploy:
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8

- name: Setup Terraform
uses: hashicorp/setup-terraform@v3

- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

- name: Initialize Terraform
run: terraform init

- name: Apply Terraform changes
Member Author:
If the instance already exists and then we re-deploy it, what's going to happen? My understanding is that we'd start over from scratch with an empty EC2 instance, which means we would lose all of the existing data stored on the instance. Is that not the case?

Do you understand the mechanism by which the data we store in VOLUMES in the Docker container ends up being persisted across re-deployments of the EC2 instance? I don't.

Contributor:

I'm pretty sure it just calls whatever AWS API calls it needs to update the instance to match your Terraform file; it won't get destroyed. Terraform holds state about this type of stuff.

The volume will just be stored on the root block device since we haven't attached any EBS storage or anything.

Member Author:

> I'm pretty sure it just calls whatever AWS API calls it needs to update the instance to match your Terraform file; it won't get destroyed. Terraform holds state about this type of stuff.

I see. But the Terraform state is not kept across invocations of the GitHub Action, so I don't really understand how Terraform can tell that we even already have an instance.

run: terraform apply -auto-approve
env:
TF_VAR_lnt_db_password: ${{ secrets.LNT_DB_PASSWORD }}
TF_VAR_lnt_auth_token: ${{ secrets.LNT_AUTH_TOKEN }}
17 changes: 17 additions & 0 deletions docker/lnt.llvm.org/ec2-startup.sh.tpl
@@ -0,0 +1,17 @@
#!/bin/bash

#
# This is a template for the startup script that gets run on the EC2
# instance running lnt.llvm.org. This template gets filled in by the
# Terraform configuration file.
#

sudo yum update -y
sudo amazon-linux-extras install docker docker-compose-plugin -y
sudo service docker start
sudo usermod -a -G docker ec2-user
sudo chkconfig docker on

LNT_DB_PASSWORD=${__db_password__}

Comment:

Where are these env variables coming from?

LNT_AUTH_TOKEN=${__auth_token__}
docker compose --file compose.yaml up
Contributor:
I think we need to daemonize this, otherwise cloud-init will never finish.

Suggested change
docker compose --file compose.yaml up
docker compose --file compose.yaml up -d

Contributor:

IIUC these user data scripts are only run when the instance is first initialized, but not when it is e.g. rebooted. So we probably want to change the docker-compose restart policy to unless-stopped so the containers get relaunched on a reboot.
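For illustration, a sketch of the relevant compose.yaml fragment; the service name and image are placeholders rather than the repository's actual compose file:

    services:
      lnt:
        image: lnt:latest          # placeholder image reference
        restart: unless-stopped    # relaunched by the Docker daemon after a reboot

With unless-stopped, Docker restarts the container when the daemon comes back up after a reboot, unless the container was stopped explicitly.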

Reply:

This depends upon how we set it up. I was thinking it might be better to set up the machine to be a clean slate on every boot, and mount a persistent volume that actually contains the DB. That makes it super easy to change system software inside TF.

48 changes: 48 additions & 0 deletions docker/lnt.llvm.org/main.tf
@@ -0,0 +1,48 @@
#
# Terraform file for deploying lnt.llvm.org.
#

provider "aws" {
region = "us-west-2"
}

Comment:

We also need a way to set the terraform state. We use a GCS bucket in the premerge cluster to do this. https://github.com/llvm/llvm-zorg/blob/87d07e600970abf419046d2ab6083b2d64240bce/premerge/main.tf#L31

Otherwise state isn't saved across checkouts, which means things won't work.
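For reference, a minimal sketch of an AWS equivalent, assuming an S3 bucket and DynamoDB lock table that would still have to be created; the names are placeholders:

    terraform {
      backend "s3" {
        bucket         = "llvm-lnt-terraform-state"   # assumed bucket name
        key            = "lnt.llvm.org/terraform.tfstate"
        region         = "us-west-2"
        dynamodb_table = "terraform-state-lock"       # assumed lock table
      }
    }

With a remote backend, every CI run reads and writes the same state file, which is how Terraform can tell that an instance already exists.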

variable "lnt_db_password" {

Comment:

These should probably be data resources that reference secrets stored inside AWS Secrets Manager.

https://github.com/llvm/llvm-zorg/blob/87d07e600970abf419046d2ab6083b2d64240bce/premerge/main.tf#L113 is how we set this up for premerge. Not sure exactly how to do this for AWS.
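A sketch of what that could look like, assuming a secret named lnt-db-password has already been created in Secrets Manager:

    data "aws_secretsmanager_secret_version" "lnt_db_password" {
      secret_id = "lnt-db-password"  # assumed secret name
    }

    # referenced elsewhere as:
    #   data.aws_secretsmanager_secret_version.lnt_db_password.secret_string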

type = string
description = "The database password for the lnt.llvm.org database."
sensitive = true
}

variable "lnt_auth_token" {
type = string
description = "The authentication token to perform destructive operations on lnt.llvm.org."
sensitive = true
}

data "cloudinit_config" "startup_scripts" {
base64_encode = true
part {
filename = "ec2-startup.sh"
content_type = "text/x-shellscript"
content = templatefile("${path.module}/ec2-startup.sh.tpl", {
__db_password__ = var.lnt_db_password,
__auth_token__ = var.lnt_auth_token,
})
}

part {
filename = "compose.yaml"
content_type = "text/cloud-config"
content = file("${path.module}/../compose.yaml")
}
}

resource "aws_instance" "docker_server" {
ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2
Member Author:
@boomanaiden154 Are we OK with hardcoding the AMI? What do you folks usually do?

Reply:

Not familiar with how AWS does things. Hardcoding it doesn't seem like a big deal. But we want to be able to change it, which would probably force instance recreation. I think we should do what I suggested above, where the instance is a clean slate on every boot but mounts a persistent volume that has the DB info.
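If hardcoding ever becomes a problem, one alternative is an aws_ami data source that resolves the AMI at plan time; the name filter below is an assumption about the AL2023 naming scheme:

    data "aws_ami" "al2023" {
      most_recent = true
      owners      = ["amazon"]

      filter {
        name   = "name"
        values = ["al2023-ami-2023.*-x86_64"]
      }
    }

    # ami = data.aws_ami.al2023.id

The trade-off is that most_recent can itself force instance recreation whenever a newer AMI is published, so pinning has its own merits.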

instance_type = "t2.micro"
Contributor:

The default block storage on these devices is tiny (~8GB IIRC?); you probably want to expand it by a few GB.

Suggested change
instance_type = "t2.micro"
instance_type = "t2.micro"
root_block_device {
volume_size = 64
volume_type = "gp3"
}

Reply:

A couple GB boot disk should be fine, but slightly bigger might be good. The DB should probably be on a separate volume.
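A minimal sketch of the separate-volume idea, with assumed size and device name:

    resource "aws_ebs_volume" "lnt_db" {
      availability_zone = aws_instance.docker_server.availability_zone
      size              = 64      # GB, assumed
      type              = "gp3"

      tags = { Name = "lnt-db-data" }
    }

    resource "aws_volume_attachment" "lnt_db" {
      device_name = "/dev/xvdf"   # assumed device name
      volume_id   = aws_ebs_volume.lnt_db.id
      instance_id = aws_instance.docker_server.id
    }

The startup script would then need to format the device on first boot and mount it before starting the containers.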

key_name = "test-key-name" # TODO
Member Author:

I'm not sure what to put here; I presume this needs to match a key in the LLVM Foundation's actual AWS account.

Reply:

Any keys should be specified in the provider. I think this is a different type of key.

tags = {
Name = "lnt.llvm.org"
}

user_data_base64 = data.cloudinit_config.startup_scripts.rendered
}
11 changes: 11 additions & 0 deletions docs/developer_guide.rst
@@ -84,3 +84,14 @@ install the development dependencies, and then run the following commands from t
This requires setting up the right API token; see `the official documentation <https://packaging.python.org/en/latest/tutorials/packaging-projects/#uploading-the-distribution-archives>`_
for details. You can replace ``--repository testpypi`` with ``--repository pypi`` once you are actually ready
to publish the package.

Deploying lnt.llvm.org
----------------------

The `lnt.llvm.org <https://lnt.llvm.org>`_ instance gets re-deployed automatically via a GitHub
Action whenever a ``v*`` tag is pushed. Manually deploying the instance is also possible by
using Terraform directly::

cd docker/lnt.llvm.org
terraform init
terraform apply -var <foo> # see docker/lnt.llvm.org/main.tf for required variables
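
For example, with the variables declared in ``docker/lnt.llvm.org/main.tf`` (secret values elided)::

    terraform apply -var lnt_db_password=... -var lnt_auth_token=...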