Commit 04a5bf3

Merge branch 'feature/k3s-ansible-init' into feature/k3s-monitoring
2 parents 9c359d9 + d95037b commit 04a5bf3

5 files changed: +42 -4 lines changed


ansible/cleanup.yml

Lines changed: 25 additions & 0 deletions
@@ -43,3 +43,28 @@
   ansible.builtin.file:
     path: /var/lib/ansible-init.done
     state: absent
+
+- name: Get package facts
+  package_facts:
+
+- name: Ensure image summary directory exists
+  file:
+    path: /var/lib/image/
+    state: directory
+    owner: root
+    group: root
+    mode: u=rwX,go=rX
+
+- name: Write image summary
+  copy:
+    content: "{{ image_info | to_nice_json }}"
+    dest: /var/lib/image/image.json
+  vars:
+    image_info:
+      branch: "{{ lookup('pipe', 'git rev-parse --abbrev-ref HEAD') }}"
+      build: "{{ ansible_nodename | split('.') | first }}" # hostname is image name, which contains build info
+      os: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
+      kernel: "{{ ansible_kernel }}"
+      ofed: "{{ ansible_facts.packages['mlnx-ofa_kernel'].0.version | default('-') }}"
+      cuda: "{{ ansible_facts.packages['cuda'].0.version | default('-') }}"
+      slurm-ohpc: "{{ ansible_facts.packages['slurm-ohpc'].0.version | default('-') }}"
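
As a rough illustration of how the summary written by the tasks above could be consumed, here is a minimal sketch of a separate playbook that reads `/var/lib/image/image.json` back and prints it. It is not part of this commit: the `hosts: all` target and the registered variable name `image_summary_raw` are assumptions made for the example.

```yaml
# Hypothetical playbook, not part of this commit.
- hosts: all
  become: true
  gather_facts: false
  tasks:
    - name: Read the image summary written at build time
      ansible.builtin.slurp:
        src: /var/lib/image/image.json
      register: image_summary_raw  # hypothetical variable name

    - name: Show the image summary
      ansible.builtin.debug:
        # slurp returns the file base64-encoded, so decode before parsing
        msg: "{{ image_summary_raw.content | b64decode | from_json }}"
```

Package entries that were not installed in the image appear as `-`, since the cleanup tasks fall back to `default('-')`.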

ansible/fatimage.yml

Lines changed: 2 additions & 2 deletions
@@ -177,9 +177,9 @@
 
 - hosts: builder
   become: yes
-  gather_facts: no
+  gather_facts: yes
+  tags: finalise
   tasks:
-    # - meta: end_here
     - name: Cleanup image
       import_tasks: cleanup.yml
 
ansible/roles/k3s/files/start_k3s.yml

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,11 @@
     k3s_server_name: "{{ os_metadata.meta.k3s_server }}"
     service_name: "{{ 'k3s-agent' if k3s_server_name is defined else 'k3s' }}"
   tasks:
+    - name: Set agent node password as token # uses token to keep password consistent between reimages
+      ansible.builtin.copy:
+        dest: /etc/rancher/node/password
+        content: "{{ k3s_token }}"
+
     - name: Add the token for joining the cluster to the environment
       no_log: true # avoid logging the server token
       ansible.builtin.lineinfile:

docs/k3s.README.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# Overview
+A K3s cluster is deployed alongside the Slurm cluster. Both an agent and a server instance of K3s are installed during image build, and the correct service (determined by OpenStack metadata) is
+enabled during boot. Nodes with the `k3s_server` metadata field defined are configured as K3s agents (this field gives them the address of the server). The Slurm control node is currently configured as a server, while all other nodes are configured as agents. Running multiple K3s servers isn't supported. Currently only the root user on the control node has
+access to the Kubernetes API. The `k3s` role installs Helm for package management. K9s is also installed in the image and can be used by the root user.
+
+# Idempotency
+K3s is intended to be installed only during image build, as it is configured by the appliance on first boot with `azimuth_cloud.image_utils.linux_ansible_init`. Therefore, the `k3s` role isn't
+idempotent, and changes to variables will not be reflected in the image when running `site.yml`. As a further consequence, for changes to role variables to be applied correctly during build, the base image must have `ansible-init` installed but no existing K3s instances.
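
The agent/server selection described in the overview can be tried in isolation. The following is a minimal sketch that reuses the `k3s_server_name` and `service_name` expressions from `start_k3s.yml`. The `os_metadata` value and the address `192.0.2.10` are hypothetical stand-ins for real OpenStack metadata, and the `debug` task is added only for illustration.

```yaml
# Hypothetical standalone play, not part of this commit.
- hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical metadata; on a real node this comes from OpenStack.
    os_metadata:
      meta:
        k3s_server: 192.0.2.10
    # The two expressions below are copied from start_k3s.yml.
    k3s_server_name: "{{ os_metadata.meta.k3s_server }}"
    service_name: "{{ 'k3s-agent' if k3s_server_name is defined else 'k3s' }}"
  tasks:
    - name: Show which K3s service would be selected
      ansible.builtin.debug:
        var: service_name
```

With the sample metadata above the expression selects `k3s-agent`. Removing the `k3s_server` key leaves `k3s_server_name` undefined, which is the case the README describes for the server on the Slurm control node.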

environments/.stackhpc/terraform/main.tf

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@ variable "cluster_image" {
     type = map(string)
     default = {
         # https://github.com/stackhpc/ansible-slurm-appliance/pull/441
-        RL8: "openhpc-ofed-RL8-241002-1144-2ab8e524"
-        RL9: "openhpc-ofed-RL9-241002-1145-2ab8e524"
+        RL8: "openhpc-ofed-RL8-241008-1531-2861edba"
+        RL9: "openhpc-ofed-RL9-241008-1531-2861edba"
     }
 }
 
