88 changes: 88 additions & 0 deletions docs/filesystems.md
@@ -0,0 +1,88 @@
# Overview

The Slurm appliance supports multiple ways of configuring shared filesystems, including:

- Configuring the control node as an NFS server. (Default)

- CephFS via Manila

- Lustre

# Manila

The Slurm appliance supports mounting shared filesystems using [CephFS](https://docs.ceph.com/en/latest/cephfs/) via [OpenStack Manila](https://docs.openstack.org/manila/latest/). This section explains:

- How to create the shares in OpenStack Manila.

- How to configure the Slurm Appliance to mount these Manila shares.

- How to switch to a Manila share for a shared home directory.

## Creating shares in OpenStack

The Slurm appliance does not create Manila shares: they must already exist in OpenStack. Create them as described below.

If this is the first time Manila is being used on the system, a CephFS share type will need to be created. You will need admin credentials to do this.

```bash
openstack share type create cephfs-type false --extra-specs storage_protocol=CEPHFS vendor_name=Ceph
```

Once this exists, create a share using credentials for the Slurm project. An access rule also needs to be created, where the `access_to` argument (`openstack share access create <share> <access_type> <access_to>`) is a user that will be created in Ceph. This user name must be globally unique in Ceph, so it needs to be different for each OpenStack project. Ideally, both the share name and the user name should include your environment name. In this example, the environment name is "production".

```bash
openstack share create CephFS 300 --description 'Scratch dir for Slurm prod' --name slurm-production-scratch --share-type cephfs-type --wait
openstack share access create slurm-production-scratch cephx slurm-production
```
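When several environments share one OpenStack deployment, it can help to derive the names from the environment name rather than typing them by hand. The following is a minimal sketch that only prints the commands for review; the helper functions and the `slurm-<environment>-<purpose>` pattern are illustrative assumptions modelled on the example above, not part of the appliance.

```bash
#!/usr/bin/env bash
# Sketch only: build the share name and CephX user name from an environment
# name, then print the matching OpenStack commands without running them.
# All function names here are hypothetical helpers.

share_name() {             # usage: share_name <environment> <purpose>
  printf 'slurm-%s-%s\n' "$1" "$2"
}

access_user() {            # usage: access_user <environment>
  printf 'slurm-%s\n' "$1"
}

print_share_commands() {   # usage: print_share_commands <environment> <purpose> <size_gb>
  local share user
  share="$(share_name "$1" "$2")"
  user="$(access_user "$1")"
  echo "openstack share create CephFS $3 --name ${share} --share-type cephfs-type --wait"
  echo "openstack share access create ${share} cephx ${user}"
}

# Print the commands for the "production" scratch share used in this document.
print_share_commands production scratch 300
```

Review the printed commands, then run them with credentials for the Slurm project.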

## Configuring the Slurm Appliance for Manila

To mount shares onto hosts in a group, add them to the `manila` group.

```ini
# environments/site/inventory/groups:
[manila:children]
login
compute
```

If you are running a different version of Ceph from the defaults in the [os-manila-mount role](https://github.com/stackhpc/ansible-role-os-manila-mount/blob/master/defaults/main.yml), you will need to update the package version by setting:

```yaml
# environments/site/inventory/group_vars/manila.yml:
os_manila_mount_ceph_version: "18.2.4"
```

A [site-specific image](image-build.md) should be built which includes this package; add ``manila`` to the Packer ``inventory_groups`` variable.

Define the list of shares to be mounted, and the paths to mount them to. The example below parameterises the share name using the environment name. See the [stackhpc.os-manila-mount role](https://github.com/stackhpc/ansible-role-os-manila-mount) for further configuration options.

```yaml
# environments/site/inventory/group_vars/manila.yml:
os_manila_mount_shares:
  - share_name: "slurm-{{ appliances_environment_name }}-scratch"
    mount_path: /scratch
```
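Because the shares must exist before the appliance can mount them, a quick pre-flight check may be useful. The sketch below assumes the OpenStack CLI with the Manila plugin is installed and that a usable share reports the status `available`; `share_is_available` and `check_share` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch only: confirm that a Manila share is ready before deploying.

share_is_available() {   # usage: share_is_available <status>
  [ "$1" = "available" ]
}

check_share() {          # usage: check_share <share_name>
  local status
  # "-f value -c status" prints only the share's status field.
  status="$(openstack share show "$1" -f value -c status)"
  if share_is_available "$status"; then
    echo "$1: ok"
  else
    echo "$1: not ready (status: ${status})"
    return 1
  fi
}

# Example (requires OpenStack credentials for the Slurm project):
#   check_share slurm-production-scratch
```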

### Shared home directory

By default, the Slurm appliance configures the control node as an NFS server and exports a directory which is mounted on the other cluster nodes as `/home`. To use a Manila CephFS share for the home directory instead, this default must be disabled by setting the OpenTofu (Terraform) variable `home_volume_provisioning` to `None`.

Some `basic_users_homedir_*` parameters need overriding as the provided defaults are only satisfactory for the default root-squashed NFS share:

```yaml
# environments/site/inventory/group_vars/all/basic_users.yml:
basic_users_homedir_server: "{{ groups['login'] | first }}" # if not mounting /home on control node
basic_users_homedir_server_path: /home
```

Finally, add the home directory share to the list of shares to mount (the share must already exist in OpenStack).

```yaml
# environments/site/inventory/group_vars/all/manila.yml:
os_manila_mount_shares:
  - share_name: "slurm-{{ appliances_environment_name }}-scratch"
    mount_path: /scratch
  - share_name: "slurm-{{ appliances_environment_name }}-home"
    mount_path: /home
```
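After deploying, one way to confirm which filesystem backs each path is to read the node's mount table. This is a minimal sketch, assuming a Linux node where `/proc/mounts` is available and that a kernel CephFS mount is reported with type `ceph` (an NFS-backed `/home` would typically show `nfs` or `nfs4`); `fstype_of` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: report the filesystem type mounted at a path by reading a
# mounts table in /proc/mounts format (device, mount point, type, options...).

fstype_of() {   # usage: fstype_of <mount_path> [mounts_file]
  awk -v path="$1" '
    $2 == path { fstype = $3 }   # keep the last matching entry
    END { if (fstype != "") print fstype; else exit 1 }
  ' "${2:-/proc/mounts}"
}

# Check the two mount paths used in this document.
for path in /scratch /home; do
  if fstype="$(fstype_of "$path")"; then
    echo "${path}: ${fstype}"
  else
    echo "${path}: not mounted"
  fi
done
```

Run this on a login or compute node; both paths should report the CephFS type once the Manila shares are mounted.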
6 changes: 5 additions & 1 deletion environments/common/inventory/group_vars/all/manila.yml
@@ -10,4 +10,8 @@ os_manila_mount_shares: []
# mount_group:
# mount_mode:

# os_manila_mount_ceph_version: nautilus # role default for RockyLinux 8
# os_manila_mount_ceph_version:

# Empty repo lists from stackhpc.ansible-role-os-manila-mount role defaults, as these repofiles are
# now generated by dnf_repos to allow injecting Ark creds:
os_manila_mount_ceph_rpm_repos: []

This file was deleted.