Commit e1d6d33

Fixed the Acrolinx score
1 parent bfc5cae commit e1d6d33

1 file changed: +32 -36 lines

articles/cyclecloud/slurm.md (32 additions & 36 deletions)
@@ -8,15 +8,15 @@ ms.author: adjohnso

# Slurm

-[//]: # (Need to link to the scheduler README on Github)
+[//]: # (Need to link to the scheduler README on GitHub)

-Slurm is a highly configurable open source workload manager. See the [Slurm project site](https://www.schedmd.com/) for an overview.
+Slurm is a highly configurable open source workload manager. For an overview, see the [Slurm project site](https://www.schedmd.com/).

> [!NOTE]
-> As of CycleCloud 8.4.0, the Slurm integration has been rewritten to support new features and functionality. See the [Slurm 3.0](slurm-3.md) documentation for more information.
+> Starting with CycleCloud 8.4.0, the Slurm integration was rewritten to support new features and functionality. For more information, see the [Slurm 3.0](slurm-3.md) documentation.

::: moniker range="=cyclecloud-7"
-Slurm can easily be enabled on a CycleCloud cluster by modifying the "run_list" in the configuration section of your cluster definition. The two basic components of a Slurm cluster are the 'master' (or 'scheduler') node which provides a shared filesystem on which the Slurm software runs, and the 'execute' nodes which are the hosts that mount the shared filesystem and execute the jobs submitted. For example, a simple cluster template snippet may look like:
+Slurm can be enabled on a CycleCloud cluster by modifying the "run_list" in the configuration section of your cluster definition. A Slurm cluster has two main parts: the master (or scheduler) node, which runs the Slurm software on a shared file system, and the execute nodes, which mount that file system and run the submitted jobs. For example, a simple cluster template snippet may look like:

``` ini
[cluster custom-slurm]
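# Editor's sketch: the hunk truncates the real snippet here. The lines
# below only illustrate the two-part shape described above -- a scheduler
# node plus an execute nodearray, each enabling Slurm through its run_list.
# Role names and VM sizes are assumptions, not the file's actual content.

    [[node master]]
    MachineType = Standard_D4s_v3
        [[[configuration]]]
        run_list = role[slurm_master_role]

    [[nodearray execute]]
    MachineType = Standard_F4s_v2
        [[[configuration]]]
        run_list = role[slurm_execute_role]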
@@ -78,11 +78,11 @@ Slurm can easily be enabled on a CycleCloud cluster by modifying the "run_list"
::: moniker-end

## Editing Existing Slurm Clusters

-Slurm clusters running in CycleCloud versions 7.8 and later implement an updated version of the autoscaling APIs that allows the clusters to utilize multiple nodearrays and partitions. To facilitate this functionality in Slurm, CycleCloud pre-populates the execute nodes in the cluster. Because of this, you need to run a command on the Slurm scheduler node after making any changes to the cluster, such as autoscale limits or VM types.
+Slurm clusters running in CycleCloud versions 7.8 and later implement an updated version of the autoscaling APIs that allows the clusters to use multiple nodearrays and partitions. To facilitate this functionality in Slurm, CycleCloud prepopulates the execute nodes in the cluster. Because of the prepopulation, you need to run a command on the Slurm scheduler node after making any changes to the cluster, such as autoscale limits or VM types.

### Making Cluster Changes

-The Slurm cluster deployed in CycleCloud contains a script that facilitates this. After making any changes to the cluster, run the following as root (e.g., by running `sudo -i`) on the Slurm scheduler node to rebuild the `slurm.conf` and update the nodes in the cluster:
+The Slurm cluster deployed in CycleCloud contains a script that facilitates these changes. After making any changes to the cluster, run the following command as root (for example, by running `sudo -i`) on the Slurm scheduler node to rebuild the `slurm.conf` and update the nodes in the cluster:

::: moniker range="=cyclecloud-7"
@@ -92,12 +92,12 @@ The Slurm cluster deployed in CycleCloud contains a script that facilitates this
```

> [!NOTE]
-> For CycleCloud versions < 7.9.10, the `cyclecloud_slurm.sh` script is located in _/opt/cycle/jetpack/system/bootstrap/slurm_.
+> For CycleCloud versions earlier than 7.9.10, the `cyclecloud_slurm.sh` script is located in _/opt/cycle/jetpack/system/bootstrap/slurm_.

> [!IMPORTANT]
> If you make any changes that affect the VMs for nodes in an MPI partition (such as VM size, image, or cloud-init), the nodes **must** all be terminated first.
-> The `remove_nodes` command prints a warning in this case, but it does not exit with an error.
-> If there are running nodes, you will get an error of `This node does not match existing scaleset attribute` when new nodes are started.
+> The `remove_nodes` command prints a warning in this case, but it doesn't exit with an error.
+> If nodes are running, you get a `This node does not match existing scaleset attribute` error when new nodes are started.

::: moniker-end
@@ -110,26 +110,23 @@ The Slurm cluster deployed in CycleCloud contains a script that facilitates this
> [!NOTE]
> For CycleCloud versions < 8.2, the `cyclecloud_slurm.sh` script is located in _/opt/cycle/jetpack/system/bootstrap/slurm_.

-If you make changes that affect the VMs for nodes in an MPI partition (such as VM size, image, or cloud-init), and the nodes are running, you will get an error of `This node does not match existing scaleset attribute` when new nodes are started. For this reason, the `apply_changes` command makes sure the nodes are terminated, and fails with the following error message if not: _The following nodes must be fully terminated before applying changes_.
+If you make changes that affect the VMs for nodes in an MPI partition (such as VM size, image, or cloud-init), and the nodes are running, you get a `This node does not match existing scaleset attribute` error when new nodes are started. For this reason, the `apply_changes` command makes sure the nodes are terminated, and fails with this error message if not: _The following nodes must be fully terminated before applying changes_.

-If you are making a change that does NOT affect the VM properties for MPI nodes, you do not need to terminate running nodes first.
-In this case, you can make the changes by using the following two commands:
+If you're making a change that does NOT affect the VM properties for MPI nodes, you don't need to terminate running nodes first. In this case, you can make the changes by using these two commands:

``` bash
/opt/cycle/slurm/cyclecloud_slurm.sh remove_nodes
/opt/cycle/slurm/cyclecloud_slurm.sh scale
```

> [!NOTE]
-> The `apply_changes` command only exists in CycleCloud 8.3+, so the only
-> way to make a change in earlier versions is with the above `remove_nodes` + `scale` commands.
-> Make sure that the `remove_nodes` command does not print a warning about nodes that need to be terminated.
+> The `apply_changes` command only exists in CycleCloud 8.3+, so the only way to make a change in earlier versions is with the preceding `remove_nodes` + `scale` commands. Make sure that the `remove_nodes` command doesn't print a warning about nodes that need to be terminated.

::: moniker-end
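The note above names `apply_changes`, but this diff doesn't show its invocation; a minimal sketch, assuming the same script path as the two-step commands (CycleCloud 8.3+ only):

``` bash
# apply_changes verifies that affected MPI nodes are terminated, then
# rebuilds slurm.conf and updates the cluster's nodes in one step.
sudo /opt/cycle/slurm/cyclecloud_slurm.sh apply_changes
```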

-### Creating additional partitions
+### Creating more partitions

-The default template that ships with Azure CycleCloud has two partitions (`hpc` and `htc`), and you can define custom nodearrays that map directly to Slurm partitions. For example, to create a GPU partition, add the following section to your cluster template:
+The default template that ships with Azure CycleCloud has two partitions (`hpc` and `htc`), and you can define custom nodearrays that map directly to Slurm partitions. For example, to create a GPU partition, add the following section to your cluster template:

``` ini
[[nodearray gpu]]
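# Editor's sketch: the hunk truncates the real snippet here. These lines
# only illustrate a custom partition using the per-nodearray options from
# the reference table at the end of this page; the MachineType is an
# assumption, not the file's actual content.
    MachineType = Standard_NC24rs_v3
        [[[configuration]]]
        slurm.autoscale = true
        slurm.default_partition = false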
@@ -151,27 +148,27 @@ The default template that ships with Azure CycleCloud has two partitions (`hpc`
### Memory settings

-CycleCloud automatically sets the amount of available memory for Slurm to use for scheduling purposes. Because the amount of available memory can change slightly due to different Linux kernel options, and the OS and VM can use up a small amount of memory that would otherwise be available for jobs, CycleCloud automatically reduces the amount of memory in the Slurm configuration. By default, CycleCloud holds back 5% of the reported available memory in a VM, but this value can be overridden in the cluster template by setting `slurm.dampen_memory` to the percentage of memory to hold back. For example, to hold back 20% of a VM's memory:
+CycleCloud automatically sets the amount of available memory for Slurm to use for scheduling purposes. Because available memory can vary slightly due to Linux kernel options, and because the OS and VM use a small amount of memory, CycleCloud automatically reduces the memory value in the Slurm configuration. By default, CycleCloud holds back 5% of the reported available memory in a VM, but you can override this value in the cluster template by setting `slurm.dampen_memory` to the percentage of memory to hold back. For example, to hold back 20% of a VM's memory:

``` ini
slurm.dampen_memory=20
```
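As a worked example: on a VM reporting 128 GiB of available memory, `slurm.dampen_memory=20` leaves Slurm scheduling against roughly 102 GiB (illustrative numbers, not values from this article).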

## Disabling autoscale for specific nodes or partitions

-While the built-in CycleCloud "KeepAlive" feature does not currently work for Slurm clusters, it is possible to disable autoscale for a running Slurm cluster by editing the slurm.conf file directly. You can exclude either individual nodes or entire partitions from being autoscaled.
+While the built-in CycleCloud "KeepAlive" feature doesn't currently work for Slurm clusters, you can disable autoscale for a running Slurm cluster by editing the slurm.conf file directly. You can exclude either individual nodes or entire partitions from being autoscaled.

### Excluding a node

-To exclude a node or multiple nodes from autoscale, add `SuspendExcNodes=<listofnodes>` to the Slurm configuration file. For example, to exclude nodes 1 and 2 from the hpc partition, add the following to `/sched/slurm.conf`:
+To exclude one or more nodes from autoscale, add `SuspendExcNodes=<listofnodes>` to the Slurm configuration file. For example, to exclude nodes 1 and 2 from the `hpc` partition, add the following to `/sched/slurm.conf`:

```bash
SuspendExcNodes=hpc-pg0-[1-2]
```
Then restart the `slurmctld` service for the new configuration to take effect.
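The restart step itself isn't shown; a minimal sketch, assuming `slurmctld` runs under systemd on the scheduler node:

```bash
# Restart the Slurm controller so the edited slurm.conf is re-read
# (assumes a systemd-managed scheduler node).
sudo systemctl restart slurmctld
```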
### Excluding a partition
-Excluding entire partitions from autoscale is similar to excluding nodes. To exclude the entire `hpc` partition, add the following to `/sched/slurm.conf`
+Excluding entire partitions from autoscale is similar to excluding nodes. To exclude the entire `hpc` partition, add the following to `/sched/slurm.conf`:

```bash
SuspendExcParts=hpc
@@ -181,9 +178,9 @@ Then restart the `slurmctld` service.
## Troubleshooting

-### UID conflicts for Slurm and Munge users
+### UID conflicts for Slurm and munge users

-By default, this project uses a UID and GID of 11100 for the Slurm user and 11101 for the Munge user. If this causes a conflict with another user or group, these defaults may be overridden.
+By default, this project uses a UID and GID of 11100 for the Slurm user and 11101 for the munge user. If this causes a conflict with another user or group, you can override these defaults.

To override the UID and GID, click the edit button for both the `scheduler` node:
@@ -198,8 +195,7 @@ To override the UID and GID, click the edit button for both the `scheduler` node
And the `execute` nodearray:
![Edit Nodearray](~/articles/cyclecloud/images/slurmnodearraytab.png "Edit nodearray")

-and add the following attributes to the `Configuration` section:
-
+and add the following attributes to the `Configuration` section:
![Edit Configuration](~/articles/cyclecloud/images/slurmnodearrayedit.png "Edit configuration")

``` ini
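# Editor's sketch: the hunk truncates the real block here. The option
# names come from the reference table below; the values are examples only.
slurm.user.uid = 11200
slurm.user.gid = 11200
munge.user.uid = 11201
munge.user.gid = 11201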
@@ -213,33 +209,33 @@ And the `execute` nodearray:
### Autoscale

-CycleCloud uses Slurm's [Elastic Computing](https://slurm.schedmd.com/elastic_computing.html) feature. To debug autoscale issues, there are a few logs on the scheduler node you can check. The first is making sure that the power save resume calls are being made by checking `/var/log/slurmctld/slurmctld.log`. You should see lines like:
+CycleCloud uses Slurm's [Elastic Computing](https://slurm.schedmd.com/elastic_computing.html) feature. To debug autoscale issues, there are a few logs on the scheduler node you can check. First, make sure the power save resume calls are being made by checking `/var/log/slurmctld/slurmctld.log`. You should see lines like:

``` bash
[2019-12-09T21:19:03.400] power_save: pid 8629 waking nodes htc-1
```

-The other log to check is `/var/log/slurmctld/resume.log`. If the resume step is failing, there will also be a `/var/log/slurmctld/resume_fail.log`. If there are messages about unknown or invalid node names, make sure you haven't added nodes to the cluster without following the steps in the "Making Cluster Changes" section above.
+The other log to check is `/var/log/slurmctld/resume.log`. If the resume step fails, there's also a `/var/log/slurmctld/resume_fail.log`. If you see messages about unknown or invalid node names, make sure you didn't add nodes to the cluster without following the steps in the "Making Cluster Changes" section.
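A quick way to walk those logs in order; a minimal sketch using only the paths named above:

``` bash
# Confirm that power-save resume calls are being issued by slurmctld
grep power_save /var/log/slurmctld/slurmctld.log | tail -n 20

# Inspect the resume step, plus the failure log when it exists
tail -n 20 /var/log/slurmctld/resume.log
[ -f /var/log/slurmctld/resume_fail.log ] && tail -n 20 /var/log/slurmctld/resume_fail.log
```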

## Slurm Configuration Reference

-The following are the Slurm specific configuration options you can toggle to customize functionality:
+The following are the Slurm-specific configuration options you can toggle to customize functionality:
| Slurm Specific Configuration Options | Description |
| ------------------------------------ | ----------- |
-| slurm.version | Default: '18.08.7-1'. This is the Slurm version to install and run. This is currently the default and *only* option. In the future additional versions of the Slurm software may be supported. |
-| slurm.autoscale | Default: 'false'. This is a per-nodearray setting that controls whether Slurm should automatically stop and start nodes in this nodearray. |
-| slurm.hpc | Default: 'true'. This is a per-nodearray setting that controls whether nodes in the nodearray will be placed in the same placement group. Primarily used for nodearrays using VM families with InfiniBand. It only applies when slurm.autoscale is set to 'true'. |
-| slurm.default_partition | Default: 'false'. This is a per-nodearray setting that controls whether the nodearray should be the default partition for jobs that don't request a partition explicitly. |
+| slurm.version | Default: '18.08.7-1'. The Slurm version to install and run. This is currently the default and *only* option. More versions of the Slurm software may be supported in the future. |
+| slurm.autoscale | Default: 'false'. A per-nodearray setting that controls whether Slurm should automatically stop and start nodes in this nodearray. |
+| slurm.hpc | Default: 'true'. A per-nodearray setting that controls whether nodes in the nodearray are placed in the same placement group. Primarily used for nodearrays using VM families with InfiniBand. It only applies when slurm.autoscale is set to 'true'. |
+| slurm.default_partition | Default: 'false'. A per-nodearray setting that controls whether the nodearray should be the default partition for jobs that don't request a partition explicitly. |
| slurm.dampen_memory | Default: '5'. The percentage of memory to hold back for OS/VM overhead. |
| slurm.suspend_timeout | Default: '600'. The amount of time (in seconds) between a suspend call and when that node can be used again. |
| slurm.resume_timeout | Default: '1800'. The amount of time (in seconds) to wait for a node to successfully boot. |
-| slurm.install | Default: 'true'. Determines if Slurm is installed at node boot ('true'). If Slurm is installed in a custom image this should be set to 'false'. (proj version 2.5.0+) |
-| slurm.use_pcpu | Default: 'true'. This is a per-nodearray setting to control scheduling with hyperthreaded vcpus. Set to 'false' to set CPUs=vcpus in cyclecloud.conf. |
-| slurm.user.name | Default: 'slurm'. This is the username for the Slurm service to use. |
+| slurm.install | Default: 'true'. Determines if Slurm is installed at node boot ('true'). If Slurm is installed in a custom image, set this option to 'false' (proj version 2.5.0+). |
+| slurm.use_pcpu | Default: 'true'. A per-nodearray setting to control scheduling with hyperthreaded vcpus. Set to 'false' to set CPUs=vcpus in cyclecloud.conf. |
+| slurm.user.name | Default: 'slurm'. The username for the Slurm service to use. |
| slurm.user.uid | Default: '11100'. The User ID to use for the Slurm user. |
| slurm.user.gid | Default: '11100'. The Group ID to use for the Slurm user. |
-| munge.user.name | Default: 'munge'. This is the username for the MUNGE authentication service to use. |
+| munge.user.name | Default: 'munge'. The username for the MUNGE authentication service to use. |
| munge.user.uid | Default: '11101'. The User ID to use for the MUNGE user. |
| munge.user.gid | Default: '11101'. The Group ID to use for the MUNGE user. |