Commit 040e569

merge
2 parents c4a4847 + 54910a1 commit 040e569

27 files changed (+424, -115 lines changed)

.github/workflows/fatimage.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -117,4 +117,4 @@ jobs:
117117
path: |
118118
./image-id.txt
119119
./image-name.txt
120-
overwrite: true
120+
overwrite: true

README.md

Lines changed: 23 additions & 8 deletions
@@ -55,21 +55,28 @@ You will also need to install [OpenTofu](https://opentofu.org/docs/intro/install
5555

5656
### Create a new environment
5757

58-
Use the `cookiecutter` template to create a new environment to hold your configuration. In the repository root run:
58+
Run the following from the repository root to activate the venv:
5959

6060
. venv/bin/activate
61+
62+
Use the `cookiecutter` template to create a new environment to hold your configuration:
63+
6164
cd environments
6265
cookiecutter skeleton
6366

6467
and follow the prompts to complete the environment name and description.
6568

6669
**NB:** In subsequent sections this new environment is referred to as `$ENV`.
6770

68-
Now generate secrets for this environment:
71+
Activate the new environment:
72+
73+
. environments/$ENV/activate
74+
75+
And generate secrets for it:
6976

7077
ansible-playbook ansible/adhoc/generate-passwords.yml
7178

72-
### Define infrastructure configuration
79+
### Define and deploy infrastructure
7380

7481
Create an OpenTofu variables file to define the required infrastructure, e.g.:
7582

@@ -91,20 +98,28 @@ Create an OpenTofu variables file to define the required infrastructure, e.g.:
9198
}
9299
}
93100

94-
Variables marked `*` refer to OpenStack resources which must already exist. The above is a minimal configuration - for all variables
95-
and descriptions see `environments/$ENV/terraform/terraform.tfvars`.
101+
Variables marked `*` refer to OpenStack resources which must already exist. The above is a minimal configuration - for all variables and descriptions see `environments/$ENV/terraform/terraform.tfvars`.
102+
103+
To deploy this infrastructure, ensure the venv and the environment are [activated](#create-a-new-environment) and run:
96104

97-
### Deploy appliance
105+
export OS_CLOUD=openstack
106+
cd environments/$ENV/terraform/
107+
tofu apply
108+
109+
and follow the prompts. Note the OS_CLOUD environment variable assumes that OpenStack credentials are defined using a [clouds.yaml](https://docs.openstack.org/python-openstackclient/latest/configuration/index.html#clouds-yaml) file in a default location with the default cloud name of `openstack`.
110+
111+
### Configure appliance
112+
113+
To configure the appliance, ensure the venv and the environment are [activated](#create-a-new-environment) and run:
98114

99115
ansible-playbook ansible/site.yml
100116

101-
You can now log in to the cluster using:
117+
Once it completes you can log in to the cluster using:
102118

103119
ssh rocky@$login_ip
104120

105121
where the IP of the login node is given in `environments/$ENV/inventory/hosts.yml`
106122

107-
108123
## Overview of directory structure
109124

110125
- `environments/`: See [docs/environments.md](docs/environments.md).
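
The README changes above ask that the venv be activated and `OS_CLOUD` be exported before running `tofu apply`. A minimal pre-flight check for those two conditions might look like the sketch below. It assumes the venv is a standard Python venv, whose `activate` script exports `VIRTUAL_ENV`. The function name `missing_prerequisites` is hypothetical, and the sketch does not check the `environments/$ENV/activate` step because the diff does not show which variables that script sets.

```python
from typing import Mapping

# VIRTUAL_ENV is exported by a standard Python venv `activate` script.
# OS_CLOUD is the variable the README exports before `tofu apply`.
REQUIRED_VARS = ("VIRTUAL_ENV", "OS_CLOUD")


def missing_prerequisites(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of the repository.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_prerequisites(os.environ)` (after `import os`) before a deploy returns an empty list when both variables are set, and otherwise lists the ones still missing.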

ansible/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -64,3 +64,5 @@ roles/*
6464
!roles/k9s/**
6565
!roles/kube_prometheus_stack
6666
!roles/kube_prometheus_stack/**
67+
!roles/lustre/
68+
!roles/lustre/**

ansible/cleanup.yml

Lines changed: 0 additions & 5 deletions
@@ -38,11 +38,6 @@
3838

3939
- name: Cleanup /tmp
4040
command : rm -rf /tmp/*
41-
42-
- name: Delete ansible-init sentinel file created if ansible-init has run during build
43-
ansible.builtin.file:
44-
path: /var/lib/ansible-init.done
45-
state: absent
4641

4742
- name: Get package facts
4843
package_facts:

ansible/fatimage.yml

Lines changed: 11 additions & 2 deletions
@@ -25,7 +25,7 @@
2525

2626
- hosts: builder
2727
become: yes
28-
gather_facts: no
28+
gather_facts: yes
2929
tasks:
3030
# - import_playbook: iam.yml
3131
- name: Install FreeIPA client
@@ -44,6 +44,11 @@
4444
name: stackhpc.os-manila-mount
4545
tasks_from: install.yml
4646
when: "'manila' in group_names"
47+
- name: Install Lustre packages
48+
include_role:
49+
name: lustre
50+
tasks_from: install.yml
51+
when: "'lustre' in group_names"
4752

4853
- import_playbook: extras.yml
4954

@@ -57,6 +62,7 @@
5762
name: mysql
5863
tasks_from: install.yml
5964
when: "'mysql' in group_names"
65+
6066
- name: OpenHPC
6167
import_role:
6268
name: stackhpc.openhpc
@@ -83,18 +89,21 @@
8389
import_role:
8490
name: openondemand
8591
tasks_from: vnc_compute.yml
92+
8693
when: "'openondemand_desktop' in group_names"
94+
8795
- name: Open Ondemand jupyter node
8896
import_role:
8997
name: openondemand
9098
tasks_from: jupyter_compute.yml
91-
when: "'openondemand' in group_names"
99+
when: "'openondemand_jupyter' in group_names"
92100

93101
# - import_playbook: monitoring.yml:
94102
- import_role:
95103
name: opensearch
96104
tasks_from: install.yml
97105
when: "'opensearch' in group_names"
106+
98107
# slurm_stats - nothing to do
99108
- import_role:
100109
name: filebeat
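
The change to the jupyter `when` condition in `ansible/fatimage.yml` narrows which hosts import `jupyter_compute.yml`. The sketch below models the Ansible test `'<group>' in group_names` as a plain Python membership check to show the difference. The host's group list is a made-up example, and `should_run` is a hypothetical helper, not anything in the repository.

```python
def should_run(required_group: str, group_names: list[str]) -> bool:
    """Model of the Ansible condition `'<group>' in group_names`."""
    return required_group in group_names


# Hypothetical host that is in `openondemand` but not `openondemand_jupyter`.
host_groups = ["builder", "openondemand"]

# Old condition: the jupyter tasks were applied to this host.
ran_before = should_run("openondemand", host_groups)

# New condition: the jupyter tasks are skipped for this host.
runs_now = should_run("openondemand_jupyter", host_groups)
```

Under these assumptions `ran_before` is `True` and `runs_now` is `False`, so only hosts explicitly placed in `openondemand_jupyter` get the jupyter install tasks.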

ansible/filesystems.yml

Lines changed: 10 additions & 0 deletions
@@ -24,3 +24,13 @@
2424
tasks:
2525
- include_role:
2626
name: stackhpc.os-manila-mount
27+
28+
- name: Setup Lustre clients
29+
hosts: lustre
30+
become: true
31+
tags: lustre
32+
tasks:
33+
- include_role:
34+
name: lustre
35+
# NB install is ONLY run in builder
36+
tasks_from: configure.yml

ansible/roles/cluster_infra/templates/resources.tf.j2

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ resource "terraform_data" "k3s_token" {
1515
input = "{{ k3s_token }}"
1616
lifecycle {
1717
ignore_changes = [
18-
input,
18+
input, # makes it a write-once value (set via Ansible)
1919
]
2020
}
2121
}

ansible/roles/k3s/README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ Installs k3s agent and server services on nodes and an ansible-init playbook to
88
Requirements
99
------------
1010

11-
`azimuth_cloud.image_utils.linux_ansible_init` must have been run previously on targeted nodes
11+
`azimuth_cloud.image_utils.linux_ansible_init` must have been run previously on targeted nodes during image build.
1212

1313
Role Variables
1414
--------------

ansible/roles/k3s/defaults/main.yml

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
22
k3s_version: "v1.31.0+k3s1"
33
k3s_selinux_release: v1.6.latest.1
44
k3s_selinux_rpm_version: 1.6-1
5-
rocky_version: "{{ ansible_distribution_major_version }}"
5+
k3s_helm_version: v3.11.0

ansible/roles/k3s/tasks/main.yml

Lines changed: 4 additions & 3 deletions
@@ -5,7 +5,8 @@
55
path: /var/lib/rancher/k3s
66
register: stat_result
77

8-
- name: Download and air-gapped installation of k3s
8+
- name: Perform air-gapped installation of k3s
9+
# Using air-gapped install so containers are pre-installed to avoid rate-limiting from registries on cluster startup
910
when: not stat_result.stat.exists
1011
block:
1112

@@ -19,7 +20,7 @@
1920

2021
- name: Install k3s SELinux policy package
2122
yum:
22-
name: "https://github.com/k3s-io/k3s-selinux/releases/download/{{ k3s_selinux_release }}/k3s-selinux-{{ k3s_selinux_rpm_version }}.el{{ rocky_version }}.noarch.rpm"
23+
name: "https://github.com/k3s-io/k3s-selinux/releases/download/{{ k3s_selinux_release }}/k3s-selinux-{{ k3s_selinux_rpm_version }}.el{{ ansible_distribution_major_version }}.noarch.rpm"
2324
disable_gpg_check: true
2425

2526
- name: Create image directory
@@ -58,7 +59,7 @@
5859

5960
- name: Install helm
6061
unarchive:
61-
src: https://get.helm.sh/helm-v3.11.0-linux-amd64.tar.gz
62+
src: "https://get.helm.sh/helm-{{ k3s_helm_version }}-linux-amd64.tar.gz"
6263
dest: /usr/bin
6364
extra_opts: "--strip-components=1"
6465
owner: root
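
The two templated download URLs in `ansible/roles/k3s/tasks/main.yml` can be previewed outside Ansible. The sketch below rebuilds them with Python string formatting rather than Jinja2, using the defaults from `ansible/roles/k3s/defaults/main.yml`. The major version `"9"` is an assumed example value for the `ansible_distribution_major_version` fact, and the function names are hypothetical. Because that value is an Ansible fact, it is only available once facts have been gathered for the host.

```python
K3S_SELINUX_URL = (
    "https://github.com/k3s-io/k3s-selinux/releases/download/"
    "{release}/k3s-selinux-{rpm_version}.el{major}.noarch.rpm"
)
HELM_URL = "https://get.helm.sh/helm-{version}-linux-amd64.tar.gz"


def k3s_selinux_url(major: str,
                    release: str = "v1.6.latest.1",
                    rpm_version: str = "1.6-1") -> str:
    """Build the SELinux policy RPM URL the way the task template does."""
    return K3S_SELINUX_URL.format(
        release=release, rpm_version=rpm_version, major=major
    )


def helm_url(version: str = "v3.11.0") -> str:
    """Build the helm tarball URL from k3s_helm_version."""
    return HELM_URL.format(version=version)


# "9" is an assumed example for ansible_distribution_major_version.
print(k3s_selinux_url("9"))
print(helm_url())
```

With the default `k3s_helm_version`, the helm URL is identical to the previously hard-coded one, so the change only makes the version overridable.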
