articles/cyclecloud/slurm.md
7 additions & 7 deletions
@@ -16,7 +16,7 @@ Slurm is a highly configurable open source workload manager. For more informatio
16
16
> Starting with CycleCloud 8.4.0, the Slurm integration was rewritten to support new features and functionality. For more information, see [Slurm 3.0](slurm-3.md) documentation.
17
17
18
18
::: moniker range="=cyclecloud-7"
19
-
Slurm can easily be enabled on a CycleCloud cluster by modifying the "run_list", in the configuration section of your cluster definition. A Slurm cluster has two main parts: the master (or scheduler) node, which runs the Slurm software on a shared file system, and the execute nodes, which mount that file system and run the submitted jobs. For example, a simple cluster template snippet may look like:
19
+
To enable Slurm on a CycleCloud cluster, modify the "run_list" in the configuration section of your cluster definition. A Slurm cluster has two main parts: the master (or scheduler) node, which runs the Slurm software on a shared file system, and the execute nodes, which mount that file system and run the submitted jobs. For example, a simple cluster template snippet may look like:
20
20
21
21
```ini
22
22
[cluster custom-slurm]
@@ -78,7 +78,7 @@ Slurm can easily be enabled on a CycleCloud cluster by modifying the "run_list"
78
78
::: moniker-end
79
79
## Editing Existing Slurm Clusters
80
80
81
-
Slurm clusters running in CycleCloud versions 7.8 and later implement an updated version of the autoscaling APIs that allows the clusters to utilize multiple nodearrays and partitions. To facilitate this functionality in Slurm, CycleCloud prepopulates the execute nodes in the cluster. Because of the prepopulation, you need to run a command on the Slurm scheduler node after making any changes to the cluster, such as autoscale limits or VM types.
81
+
Slurm clusters running in CycleCloud versions 7.8 and later implement an updated version of the autoscaling APIs that allows the clusters to utilize multiple nodearrays and partitions. To facilitate this functionality in Slurm, CycleCloud prepopulates the execute nodes in the cluster. Because of the prepopulation, you need to run a command on the Slurm scheduler node after making any changes to the cluster, such as autoscale limits or VM types.
82
82
83
83
### Making Cluster Changes
84
84
@@ -195,7 +195,7 @@ To override the UID and GID, click the edit button for both the `scheduler` node
The other log to check is `/var/log/slurmctld/resume.log`. If the resume step is failing, there is`/var/log/slurmctld/resume_fail.log`. If there're messages about unknown or invalid node names, make sure you haven't added nodes to the cluster without next the steps in the "Making Cluster Changes" section above.
218
+
The other log to check is `/var/log/slurmctld/resume.log`. If the resume step is failing, also check `/var/log/slurmctld/resume_fail.log`. If there are messages about unknown or invalid node names, make sure you haven't added nodes to the cluster without following the steps in the "Making Cluster Changes" section above.
219
219
220
220
## Slurm Configuration Reference
221
221
222
222
The following are the Slurm-specific configuration options you can toggle to customize functionality:
223
223
224
224
| Slurm Specific Configuration Options | Description |
| slurm.version | Default: '18.08.7-1'. The Slurm version to install and run. This is currently the default and *only* option. In the future more versions of the Slurm software may be supported. |
226
+
| slurm.version | Default: '18.08.7-1'. This sets the Slurm version to install and run. Right now, it’s the default and *only* option. More versions may be supported in the future. |
227
227
| slurm.autoscale | Default: 'false'. A per-nodearray setting that controls whether Slurm should automatically stop and start nodes in this nodearray. |
228
-
| slurm.hpc | Default: 'true'.A per-nodearray setting that controls whether nodes in the nodearray will be placed in the same placement group. Primarily used for nodearrays using VM families with InfiniBand. It only applies when slurm.autoscale is set to 'true'. |
228
+
| slurm.hpc | Default: 'true'. A per-nodearray setting that controls whether nodes in the nodearray are placed in the same placement group. Primarily used for nodearrays using VM families with InfiniBand. It only applies when slurm.autoscale is set to 'true'. |
229
229
| slurm.default_partition | Default: 'false'. A per-nodearray setting that controls whether the nodearray should be the default partition for jobs that don't request a partition explicitly. |
230
230
| slurm.dampen_memory | Default: '5'. The percentage of memory to hold back for OS/VM overhead. |
231
231
| slurm.suspend_timeout | Default: '600'. The amount of time (in seconds) between a suspend call and when that node can be used again. |
232
232
| slurm.resume_timeout | Default: '1800'. The amount of time (in seconds) to wait for a node to successfully boot. |
233
-
| slurm.install | Default: 'true'. Determines if the Slurm is installed at node boot ('true'). If Slurm is installed in a custom image this should be set to 'false' (proj version 2.5.0+). |
233
+
| slurm.install | Default: 'true'. Determines whether Slurm is installed at node boot ('true'). If Slurm is installed in a custom image, this configuration option should be set to 'false' (proj version 2.5.0+). |
234
234
| slurm.use_pcpu | Default: 'true'. A per-nodearray setting to control scheduling with hyperthreaded vcpus. Set to 'false' to set CPUs=vcpus in cyclecloud.conf. |
235
235
| slurm.user.name | Default: 'slurm'. The username for the Slurm service to use. |
236
236
| slurm.user.uid | Default: '11100'. The User ID to use for the Slurm user. |
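
As a rough illustration of how the options in this table might be set, the sketch below places them in `[[[configuration]]]` sections of a cluster template. The nodearray name `hpc`, the use of `[[node defaults]]` for the cluster-wide values, and the specific values chosen are assumptions for illustration only; they aren't part of the change shown here.

```ini
[cluster custom-slurm]

# Assumed placement: values that apply to every node in the cluster.
[[node defaults]]
    [[[configuration]]]
    slurm.version = 18.08.7-1
    slurm.install = true
    slurm.suspend_timeout = 600
    slurm.resume_timeout = 1800

# Hypothetical nodearray name; the per-nodearray settings go here.
[[nodearray hpc]]
    [[[configuration]]]
    slurm.autoscale = true
    slurm.hpc = true
    slurm.default_partition = true
    slurm.dampen_memory = 5
```

Most values here simply repeat the defaults listed in the table. In practice, you'd only need to list the options you want to override, such as `slurm.autoscale` and `slurm.default_partition` in this sketch.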