Commit 1f45851

doc problems with templating out hostvars
1 parent 53a7dc4 commit 1f45851

1 file changed, 49 additions and 4 deletions

docs/experimental/compute-init.md

Lines changed: 49 additions & 4 deletions
@@ -1,14 +1,17 @@
 # compute-init
 
-TODO: describe current status.
+The following roles are currently functional:
+- resolv_conf
+- etc_hosts
+- stackhpc.openhpc
 
 # Development
 
 To develop/debug this without actually having to build an image:
 
 
 1. Deploy a cluster using tofu and ansible/site.yml as normal. This will
-additionally configure the control node to export compute hosts over NFS.
+additionally configure the control node to export compute hostvars over NFS.
 Check the cluster is up.
 
 2. Reimage the compute nodes:
@@ -22,6 +25,10 @@ To develop/debug this without actually having to build an image:
 
 ansible-playbook ansible/fatimage.yml --tags compute_init
 
+NB: This will also re-export the compute hostvars, as the nodes are not
+in the builder group, which conveniently means any changes made to that
+play also get picked up.
+
 5. Fake a reimage of compute to run ansible-init and the compute-init playbook:
 
 On compute node where metadata was added:
@@ -31,8 +38,9 @@ To develop/debug this without actually having to build an image:
 
 Use `systemctl status ansible-init` to view stdout/stderr from Ansible.
 
-Steps 4/5 can be repeated with changes to the compute script. If desirable
-reimage the compute node(s) first as in step 3.
+Steps 4/5 can be repeated with changes to the compute script. If required,
+reimage the compute node(s) first as in step 2 and/or add additional metadata
+as in step 3.
 
 # Results/progress
 
@@ -144,3 +152,40 @@ This commit - shows that hostvars have loaded:
 Dec 13 21:06:20 rl9-compute-0.rl9.invalid ansible-init[27585]: [INFO] ansible-init completed successfully
 Dec 13 21:06:20 rl9-compute-0.rl9.invalid systemd[1]: Finished ansible-init.service.
 
+# Design notes
+
+- In general, we don't want to rely on NFS export. So should e.g. copy files
+from this mount ASAP in the compute-init script. TODO:
+- There are a few possible approaches:
+
+1. Control node copies files resulting from role into cluster exports,
+compute-init copies to local disk. Only works if files are not host-specific
+Examples: etc_hosts, eessi config?
+
+2. Re-implement the role. Works if the role vars are not too complicated,
+(else they all need to be duplicated in compute-init). Could also only
+support certain subsets of role functionality or variables
+Examples: resolv_conf, stackhpc.openhpc
+
+
+# Problems with templated hostvars
+
+Here are all the ones which actually rely on hostvars from other nodes,
+which therefore aren't available:
+
+```
+[root@rl9-compute-0 rocky]# grep hostvars /mnt/cluster/hostvars/rl9-compute-0/hostvars.yml
+"grafana_address": "{{ hostvars[groups['grafana'].0].api_address }}",
+"grafana_api_address": "{{ hostvars[groups['grafana'].0].internal_address }}",
+"mysql_host": "{{ hostvars[groups['mysql'] | first].api_address }}",
+"nfs_server_default": "{{ hostvars[groups['control'] | first ].internal_address }}",
+"openhpc_slurm_control_host": "{{ hostvars[groups['control'].0].api_address }}",
+"openondemand_address": "{{ hostvars[groups['openondemand'].0].api_address if groups['openondemand'] | count > 0 else '' }}",
+"openondemand_node_proxy_directives": "{{ _opeonondemand_unset_auth if (openondemand_auth == 'basic_pam' and 'openondemand_host_regex' and groups['grafana'] | length > 0 and hostvars[ groups['grafana'] | first]._grafana_auth_is_anonymous) else '' }}",
+"openondemand_servername": "{{ hostvars[ groups['openondemand'] | first].ansible_host }}",
+"prometheus_address": "{{ hostvars[groups['prometheus'].0].api_address }}",
+"{{ hostvars[groups['freeipa_server'].0].ansible_host }}"
+```
+
+More generally, there is nothing to stop any group var depending on a
+"{{ hostvars[] }}" interpolation ...
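
The `grep hostvars` check in this commit could be turned into a small script that reports which exported variables still depend on another host's `hostvars`. What follows is only a minimal sketch, not part of the commit. It assumes the exported file holds one `"name": "value"` pair per line, as the grep output suggests. The function name `find_hostvars_refs` and the `cluster_name` sample entry are hypothetical. Values without a key on the same line (such as the `freeipa_server` entry above) are not picked up by this simple pattern.

```python
import re

# Matches a Jinja lookup into another host's variables, e.g. hostvars[groups['control'].0]
HOSTVARS_REF = re.compile(r"\bhostvars\s*\[")
# Matches a line of the form  "name": "value"  with an optional trailing comma
KEY_VALUE = re.compile(r'^\s*"(?P<key>[^"]+)"\s*:\s*"(?P<value>.*)"\s*,?\s*$')


def find_hostvars_refs(text):
    """Return {variable name: raw template} for values that reference hostvars[...]."""
    found = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match and HOSTVARS_REF.search(match.group("value")):
            found[match.group("key")] = match.group("value")
    return found


if __name__ == "__main__":
    # First two lines are taken from the grep output; cluster_name is a made-up entry.
    sample = '''
"grafana_address": "{{ hostvars[groups['grafana'].0].api_address }}",
"mysql_host": "{{ hostvars[groups['mysql'] | first].api_address }}",
"cluster_name": "rl9",
'''
    for name, template in find_hostvars_refs(sample).items():
        print(f"{name}: {template}")
```

Run against the real export, the text would presumably come from reading `/mnt/cluster/hostvars/rl9-compute-0/hostvars.yml`. A proper YAML parser would be more robust than the line-based pattern used here.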

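Approach 1 in the design notes (the control node places role output in the cluster export, and compute-init copies it to local disk as early as possible) might look roughly like the sketch below. It is an illustration under assumptions, not the compute-init implementation. The function name `copy_from_export` and the `files` subdirectory used in the usage note are hypothetical, and files missing from the export are skipped without error.

```python
import shutil
from pathlib import Path


def copy_from_export(export_root, local_root, relative_paths):
    """Copy each listed file from the NFS export to local disk; return the copies made."""
    export_root, local_root = Path(export_root), Path(local_root)
    copied = []
    for rel in relative_paths:
        src = export_root / rel
        if not src.is_file():
            continue  # not present in the export: skip rather than fail
        dest = local_root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 keeps mode and timestamps, not ownership
        copied.append(dest)
    return copied
```

A call such as `copy_from_export("/mnt/cluster/files", "/", ["etc/hosts"])` would then leave the node independent of the mount for that file. As the notes say, this only suits files that are identical on every compute node.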