README.md: 20 additions & 14 deletions
@@ -1,8 +1,7 @@
 # Slurm Docker Cluster

-This is a multi-container Slurm cluster using Kubernetes. The Helm chart
-creates a named volume for persistent storage of MySQL data files as well as
-an NFS volume for shared storage.
+This is a multi-container Slurm cluster using Kubernetes. The Slurm cluster Helm chart creates a named volume for persistent storage of MySQL data files. By default, it also installs the
+RookNFS Helm chart (also in this repo) to provide shared storage across the Slurm cluster nodes.

 ## Dependencies

@@ -27,47 +26,51 @@ The Helm chart will create the following named volumes:

 * var_lib_mysql ( -> /var/lib/mysql )

-A named ReadWriteMany (RWX) volume mounted to `/home` is also expected, this can be external or can be deployed using the scripts in the `/nfs`directory (See "Deploying the Cluster")
+A named ReadWriteMany (RWX) volume mounted to `/home` is also expected; this can be external or can be deployed using the provided `rooknfs` chart (see "Deploying the Cluster").

 ## Configuring the Cluster

-All config files in `slurm-cluster-chart/files` will be mounted into the container to configure their respective services on startup. Note that changes to these files will not all be propagated to existing deployments (see "Reconfiguring the Cluster").
-Additional parameters can be found in the `values.yaml` file, which will be applied on a Helm chart deployment. Note that some of these values will also not propagate until the cluster is restarted (see "Reconfiguring the Cluster").
+All config files in `slurm-cluster-chart/files` will be mounted into the container to configure their respective services on startup. Note that changes to these files will not all be propagated to existing deployments (see "Reconfiguring the Cluster"). Additional parameters can be found in the `values.yaml` file for the Helm chart. Note that some of these values will also not propagate until the cluster is restarted (see "Reconfiguring the Cluster").

 ## Deploying the Cluster

 ### Generating Cluster Secrets

 On initial deployment ONLY, run
 ```console
-./generate-secrets.sh
+./generate-secrets.sh [<target-namespace>]
 ```
-This generates a set of secrets. If these need to be regenerated, see "Reconfiguring the Cluster"
+This generates a set of secrets in the target namespace to be used by the Slurm cluster. If these need to be regenerated, see "Reconfiguring the Cluster".

 Be sure to take note of the Open Ondemand credentials, you will need them to access the cluster through a browser

 ### Connecting RWX Volume

-A ReadWriteMany (RWX) volume is required, if a named volume exists, set `nfs.claimName` in the `values.yaml` file to its name. If not, manifests to deploy a Rook NFS volume are provided in the `/nfs` directory. You can deploy this by running
-```console
-./nfs/deploy-nfs.sh
-```
-and leaving `nfs.claimName` as the provided value.
+A ReadWriteMany (RWX) volume is required for shared storage across cluster nodes. By default, the Rook NFS Helm chart is installed as a dependency of the Slurm cluster chart in order to provide an RWX-capable storage class for the required shared volume. If the target Kubernetes cluster has an existing storage class which should be used instead, then `storageClass` in `values.yaml` should be set to the name of this existing class and the RookNFS dependency should be disabled by setting `rooknfs.enabled = false`. In either case, the storage capacity of the provisioned RWX volume can be configured by setting the value of `storage.capacity`.
+
+See the separate RookNFS chart [values.yaml](./rooknfs/values.yaml) for further configuration options when using RookNFS to provide the shared storage volume.

 ### Supplying Public Keys

 To access the cluster via `ssh`, you will need to make your public keys available. All your public keys from localhost can be added by running

 ```console
-./publish-keys.sh
+./publish-keys.sh [<target-namespace>]
 ```
+where `<target-namespace>` is the namespace in which the Slurm cluster chart will be deployed (i.e. using `helm install -n <target-namespace> ...`). This will create a Kubernetes Secret in the appropriate namespace for the Slurm cluster to use. Omitting the namespace argument will install the secrets in the default namespace.

 ### Deploying with Helm

 After configuring `kubectl` with the appropriate `kubeconfig` file, deploy the cluster using the Helm chart:
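The storage options described under "Connecting RWX Volume" might be expressed as a values override along the following lines. This is only a sketch: the key paths (`storageClass`, `rooknfs.enabled`, `storage.capacity`) are taken literally from the README wording in this diff and may be nested differently in the chart's actual `values.yaml`, and the storage class name and capacity shown are hypothetical placeholders.

```yaml
# Sketch of a values override that uses an existing RWX-capable storage
# class instead of the bundled RookNFS chart.
# NOTE: key paths follow the README wording; verify them against the
# chart's values.yaml before use.
storageClass: my-rwx-storage-class   # hypothetical name of an existing storage class
rooknfs:
  enabled: false                     # disable the RookNFS dependency
storage:
  capacity: 20Gi                     # hypothetical size for the provisioned RWX volume
```

Leaving `rooknfs.enabled` at its default and omitting `storageClass` would, per the README text, keep the RookNFS-backed storage class, with `storage.capacity` still controlling the size of the shared volume.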