@@ -42,10 +42,35 @@ The following roles/groups are currently fully functional:
   node and all compute nodes.
 - `openhpc`: all functionality
 
-# Development/debugging
+All of the above are defined in the skeleton cookiecutter config, and are
+toggleable via a terraform compute_init autovar file. In the .stackhpc
+environment, the compute init roles are set by default to:
+- `enable_compute`: This encompasses the openhpc role functionality while being
+  a global toggle for the entire compute-init script.
+- `etc_hosts`
+- `nfs`
+- `basic_users`
+- `eessi`
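As a sketch only, the autovar toggle described above might look like the following; the variable name and list values here are illustrative assumptions, since the real definitions live in the skeleton cookiecutter config:

```hcl
# Hypothetical terraform autovars fragment enabling the default
# .stackhpc compute-init roles; the actual variable name and shape
# are defined by the skeleton cookiecutter config, not this sketch.
compute_init_enable = ["compute", "etc_hosts", "nfs", "basic_users", "eessi"]
```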
+
+# CI workflow
+
+The compute node rebuild is tested in CI after the tests for rebuilding the
+login and control nodes. The process follows:
+
+1. Compute nodes are reimaged:
+
+       ansible-playbook -v --limit compute ansible/adhoc/rebuild.yml
 
-To develop/debug this without actually having to build an image:
+2. Ansible-init runs against newly reimaged compute nodes
+
+3. Run `sinfo` and check nodes have the expected Slurm state:
+
+       ansible-playbook -v ansible/ci/check_slurm.yml
+
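The state check in step 3 can be sketched in plain Python. This is a hypothetical illustration of the kind of check `ansible/ci/check_slurm.yml` performs, not the playbook's actual logic, and the expected `idle` state is an assumption:

```python
import subprocess

def node_states():
    # Query Slurm for node hostlist and state; "%N %T" are standard
    # sinfo format specifiers for node names and state.
    out = subprocess.run(
        ["sinfo", "--noheader", "--format=%N %T"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Map each hostlist entry to its reported state.
    return dict(line.split(None, 1) for line in out.splitlines() if line.strip())

def all_idle(states):
    # Assumption: rebuilt compute nodes should come back as "idle";
    # an empty result also counts as a failure.
    return bool(states) and all(s == "idle" for s in states.values())
```

In CI the same idea runs via the playbook above; this sketch just makes the pass condition explicit.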
+# Development/debugging
 
+To develop/debug changes to the compute script without actually having to build
+a new image:
 
 1. Deploy a cluster using tofu and ansible/site.yml as normal. This will
    additionally configure the control node to export compute hostvars over NFS.
@@ -103,7 +128,7 @@ as in step 3.
 available vs the current approach:
 
 ```
-[root@rl9-compute-0 rocky]# grep hostvars /mnt/cluster/hostvars/rl9-compute-0/hostvars.yml
+[root@rl9-compute-0 rocky]# grep hostvars /mnt/cluster/hostvars/rl9-compute-0/hostvars.yml
 "grafana_address": "{{ hostvars[groups['grafana'].0].api_address }}",
 "grafana_api_address": "{{ hostvars[groups['grafana'].0].internal_address }}",
 "mysql_host": "{{ hostvars[groups['mysql'] | first].api_address }}",