|
1 | 1 | # Configuration of Persistent State |
2 | 2 |
|
3 | | -To enable cluster state to persist beyond individual node lifetimes (e.g. to survive a cluster deletion or rebuild) set `appliances_state_dir` to the path of a directory on persistent storage. |
| 3 | +To enable cluster state to persist beyond individual node lifetimes (e.g. to survive a cluster deletion or rebuild) set `appliances_state_dir` to the path of a directory on persistent storage, such as an OpenStack volume. |
4 | 4 |
|
5 | | -At present this will affect the following items: |
| 5 | +At present this will affect the following: |
6 | 6 | - `slurmctld` state, i.e. the Slurm queue. |
7 | 7 | - The MySQL database for `slurmdbd`, i.e. Slurm accounting information as shown by the `sacct` command. |
8 | 8 | - Prometheus database |
9 | 9 | - Grafana data |
10 | 10 | - OpenDistro/elasticsearch data |
11 | 11 |
|
12 | | -If using the `environments/common/layout/everything` Ansible groups template (which is the default for a new cookiecutter-produced enviromnent) then these services will all be on the `control` node and hence only this node requires persistent storage. |
| 12 | +If using the `environments/common/layout/everything` Ansible groups template (which is the default for a new cookiecutter-produced environment) then these services will all be on the `control` node and hence only this node requires persistent storage. |
13 | 13 |
|
14 | | -Note that if `appliances_state_dir` is defined, the path it gives must exist and should be owned by root. Directories will be created within this with appropriate permissions for each item of state defined above. |
| 14 | +Note that if `appliances_state_dir` is defined, the path it gives must exist and should be owned by root. Directories will be created within this with appropriate permissions for each item of state defined above. Additionally, the systemd units for the services listed above will be modified to require `appliances_state_dir` to be mounted before service start (via the `systemd` role). |
15 | 15 |
|
16 | 16 | A new cookiecutter-produced environment supports persistent state in the default Terraform (see `environments/skeleton/{{cookiecutter.environment}}/terraform/`) by: |
17 | 17 |
|
18 | 18 | - Defining a volume with a default size of 150GB - this can be controlled by the Terraform variable `state_volume_size`. |
19 | 19 | - Attaching it to the control node. |
20 | | -- Defining cloud-init userdata for the control node which partitions, formats and mounts this volume to `/var/lib/state`. |
21 | | -- Defining `appliances_state_dir` for the control node in the (Terraform-templated) the `inventory/hosts` file. |
| 20 | +- Defining cloud-init userdata for the control node which formats and mounts this volume at `/var/lib/state`. |
| 21 | +- Defining `appliances_state_dir: /var/lib/state` for the control node in the (Terraform-templated) `inventory/hosts` file. |
22 | 22 |
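The last bullet above amounts to a host variable set on the control node. A minimal sketch of what the rendered inventory might contain, assuming a YAML-format inventory and a hypothetical host name `mycluster-control` (the real file is generated from the Terraform template and may be laid out differently):

```yaml
# Sketch only: the host name and layout are assumptions, not the actual templated output.
control:
  hosts:
    mycluster-control:
      # Mount point prepared by the cloud-init userdata described above.
      appliances_state_dir: /var/lib/state
```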
|
23 | | -**NB: The default Terraform is provided as a working example and for internal CI use - therefore this volume is deleted when running `terraform destroy` - this is probably not suitable for |
24 | | -a production environment.** |
| 23 | +**NB: The default Terraform is provided as a working example and for internal CI use - therefore this volume is deleted when running `terraform destroy` - this may not be appropriate for a production environment.** |
25 | 24 |
|
26 | 25 | In general, the Prometheus data is likely to be the only sizeable state stored. The size of this can be influenced through [Prometheus role variables](https://github.com/cloudalchemy/ansible-prometheus#role-variables), e.g.: |
27 | 26 | - `prometheus_storage_retention` - [default](../environments/common/inventory/group_vars/all/prometheus.yml) 31d |
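As a hedged illustration of the retention setting above, an environment could override the default in its own group variables. The file location in the comment is hypothetical, `prometheus_storage_retention_size` is a further variable documented by the linked role, and the values are placeholders rather than recommendations:

```yaml
# Hypothetical location: environments/<environment>/inventory/group_vars/all/prometheus.yml
# Keep two weeks of metrics instead of the appliance default of 31d.
prometheus_storage_retention: "14d"
# Optionally cap the on-disk size as well; size the cap to fit within the state volume.
prometheus_storage_retention_size: "50GB"
```

A shorter retention period or a size cap is mainly useful when the state volume is smaller than the 150GB default.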
|