rebuild
=========

Enables the reboot tool from https://github.com/stackhpc/slurm-openstack-tools.git
to be run from the control node.

Requirements
------------

An OpenStack clouds.yaml file containing credentials for a cloud under the
"openstack" key.
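
For reference, a minimal sketch of such a file is below; all credential values
are placeholders:

```yaml
# ~/.config/openstack/clouds.yaml - the cloud must appear under the
# "openstack" key; the auth values here are placeholders.
clouds:
  openstack:
    auth:
      auth_url: https://keystone.example.org:5000
      application_credential_id: <application-credential-id>
      application_credential_secret: <application-credential-secret>
    auth_type: v3applicationcredential
```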

Role Variables
--------------

The variable below is only used by this role's `main.yml` task file, i.e. when
running the `ansible/site.yml` or `ansible/slurm.yml` playbooks:

- `rebuild_clouds_path`: Optional. Path to the `clouds.yaml` file on the deploy
  host, default `~/.config/openstack/clouds.yaml`. An example override is shown
  below.
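
For example, a sketch of overriding this in extra-vars or group vars (the path
shown is illustrative):

```yaml
# Use a clouds.yaml kept at a non-default location on the deploy host
# (illustrative path).
rebuild_clouds_path: /etc/openstack/clouds.yaml
```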

The variables below are only used by this role's `rebuild.yml` task file, i.e.
when running the `ansible/adhoc/rebuild-via-slurm.yml` playbook:

- `rebuild_job_partitions`: Optional. Comma-separated list of names of rebuild
  partitions defined in `openhpc_slurm_partitions`. Useful as an extra-var for
  limiting rebuilds, as in the sketch below. Default `rebuild`.
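
For instance, to limit a rebuild run to a single partition (the partition name
here is illustrative):

```yaml
# Passed as extra-vars, e.g.:
#   ansible-playbook ansible/adhoc/rebuild-via-slurm.yml -e @rebuild-vars.yml
# "gpu" is a hypothetical entry in openhpc_slurm_partitions.
rebuild_job_partitions: gpu
```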

- `rebuild_job_name`: Optional. Name of rebuild jobs. Default is `rebuild-`
  suffixed with the node name.

- `rebuild_job_command`: Optional. String giving the command to run in the job
  after the node has been rebuilt. Default is to sleep for 5 seconds. Note job
  output is sent to `/dev/null` by default, as the root user running this has
  no shared directory for job output.

- `rebuild_job_reboot`: Optional. A bool controlling whether to add the
  `--reboot` flag to the job to actually trigger a rebuild. Setting this to
  `false` is useful for e.g. testing partition configurations. Default `true`.

- `rebuild_job_options`: Optional. A string giving any other options to pass to
  [sbatch](https://slurm.schedmd.com/sbatch.html). Default is an empty string.
  An example combining this with `rebuild_job_reboot` is shown below.
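
For example, a sketch of a test run which submits the jobs without rebooting
nodes, adding one extra (illustrative) sbatch option:

```yaml
# Submit rebuild jobs without the --reboot flag, capping each job's runtime;
# --time is a standard sbatch option and the value is illustrative.
rebuild_job_reboot: false
rebuild_job_options: --time=00:05:00
```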

- `rebuild_job_user`: Optional. The user to run the rebuild setup and job as.
  Default `root`.

- `rebuild_job_template`: Optional. The string to use to submit the job. See
  [defaults/main.yml](defaults/main.yml).
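
Purely as an illustration of the kind of sbatch command line such a template
composes (this is not the actual default, which lives in `defaults/main.yml`):

```yaml
# Illustrative only: the flags are standard sbatch options and the variables
# are those documented above.
rebuild_job_template: >-
  sbatch {% if rebuild_job_reboot | bool %}--reboot{% endif %}
  --job-name={{ rebuild_job_name }} --output=/dev/null
  {{ rebuild_job_options }} --wrap "{{ rebuild_job_command }}"
```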

- `rebuild_job_hostlist`: Optional. String with a Slurm hostlist expression
  restricting a rebuild to only those nodes (e.g. `tux[1-3]` or `tux1,tux2`).
  If set, `rebuild_job_partitions` must only define a single partition and that
  partition must contain those nodes. Not for routine use, but may be useful
  to e.g. reattempt a rebuild which failed on specific nodes. Default is all
  nodes in the relevant partition.
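
For example, a sketch of retrying a rebuild on specific failed nodes (hostnames
and partition name are illustrative, and the nodes must all belong to the
single listed partition):

```yaml
# Retry only two nodes from the (illustrative) "rebuild" partition.
rebuild_job_partitions: rebuild
rebuild_job_hostlist: tux[2-3]
```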