
Commit bfc5cae

Fixed the Acrolinx score
1 parent e008d30 commit bfc5cae


articles/cyclecloud/slurm-3.md

Lines changed: 20 additions & 21 deletions
@@ -11,15 +11,15 @@ ms.author: anhoward
Slurm scheduler support was rewritten as part of the CycleCloud 8.4.0 release. Key features include:

* Support for dynamic nodes, and dynamic partitions via dynamic nodearrays, supporting both single and multiple virtual machine (VM) sizes
-* New slurm versions 23.02 and 22.05.8
+* New Slurm versions 23.02 and 22.05.8
* Cost reporting via `azslurm` CLI
* `azslurm` CLI-based autoscaler
* Ubuntu 20 support
* Removed need for topology plugin, and therefore also any submit plugin

## Slurm Clusters in CycleCloud versions < 8.4.0

-See [Transitioning from 2.7 to 3.0](#transitioning-from-27-to-30) for more information.
+For more information, see [Transitioning from 2.7 to 3.0](#transitioning-from-27-to-30).

### Making Cluster Changes

@@ -30,13 +30,13 @@ The Slurm cluster deployed in CycleCloud contains a cli called `azslurm` to faci
# azslurm scale
```

-This creates the partitions with the correct number of nodes, the proper `gres.conf` and restart the `slurmctld`.
+The command creates the partitions with the correct number of nodes, the proper `gres.conf`, and restarts the `slurmctld`.
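
For illustration, a typical change-and-rescale pass might look like the following sketch; the `sinfo` format string is just one way to check the result.

```bash
# On the scheduler node (as root), after editing the nodearray or cluster template:
azslurm scale

# Confirm the regenerated partitions and node counts:
sinfo -o "%P %D %t"    # partition, node count, state
```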

-### No longer pre-creating execute nodes
+### No longer precreating execute nodes

-Starting CycleCloud version 3.0.0 Slurm project, the nodes aren't pre-creating. Nodes are created when `azslurm resume` is invoked, or by manually creating them in CycleCloud using CLI.
+Starting with the CycleCloud version 3.0.0 Slurm project, nodes aren't precreated. Nodes are created when `azslurm resume` is invoked, or by creating them manually in CycleCloud using the CLI.
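
As a sketch of the two creation paths described above (node and cluster names are examples, and the `cyclecloud add_node` flags are assumptions to verify against your CLI version):

```bash
# Have the cluster create the nodes on demand by resuming them:
/opt/azurehpc/slurm/resume_program.sh htc-[1-4]

# Or create them manually through the CycleCloud CLI (example flags; check `cyclecloud add_node -h`):
cyclecloud add_node my-slurm-cluster --template htc --count 4
```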

-### Creating additional partitions
+### Creating extra partitions

The default template that ships with Azure CycleCloud has three partitions (`hpc`, `htc` and `dynamic`), and you can define custom nodearrays that map directly to Slurm partitions. For example, to create a GPU partition, add the following section to your cluster template:

@@ -60,8 +60,7 @@ The default template that ships with Azure CycleCloud has three partitions (`hpc

### Dynamic Partitions

-Starting CycleCloud version 3.0.1, we support dynamic partitions. You can make a `nodearray` map to a dynamic partition by adding the following.
-Note that `myfeature` could be any desired feature description or more than one feature, separated by a comma.
+Starting with CycleCloud version 3.0.1, we support dynamic partitions. You can make a `nodearray` map to a dynamic partition by adding the following. The `myfeature` value can be any desired feature description, or more than one feature separated by commas.

```ini
[[[configuration]]]
@@ -72,7 +71,7 @@ Note that `myfeature` could be any desired feature description or more than one
slurm.dynamic_config := "-Z --conf \"Feature=myfeature\""
```

-This generates a dynamic partition like the following
+This configuration generates a dynamic partition like the following:

```ini
# Creating dynamic nodeset and partition using slurm.dynamic_config=-Z --conf "Feature=myfeature"
@@ -82,9 +81,9 @@ PartitionName=mydynamicpart Nodes=mydynamicns

### Using Dynamic Partitions to Autoscale

-By default, dynamic partition deosn't inclue any nodes. You can start nodes through CycleCloud or by running `azslurm resume` manually, they'll join the cluster using the name you choose. However, since Slurm isn't aware of these nodes ahead of time, it can't autoscale them up.
+By default, a dynamic partition doesn't include any nodes. You can start nodes through CycleCloud or by running `azslurm resume` manually, and they join the cluster using the name you choose. However, since Slurm isn't aware of these nodes ahead of time, it can't autoscale them up.

-Instead, you can also pre-create node records like so, which allows Slurm to autoscale them up.
+Instead, you can precreate node records like so, which allows Slurm to autoscale them up.

```bash
scontrol create nodename=f4-[1-10] Feature=myfeature State=CLOUD
@@ -100,7 +99,7 @@ scontrol create nodename=f4-[1-10] Feature=myfeature,Standard_F4 State=CLOUD
scontrol create nodename=f8-[1-10] Feature=myfeature,Standard_F8 State=CLOUD
```

-Either way, once you have created these nodes in a `State=Cloud` they're now available to autoscale like other nodes.
+Either way, once you create these nodes in a `State=Cloud`, they become available for autoscaling like other nodes.
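
As a quick check (using the `mydynamicpart` and `f4`/`f8` examples above), the precreated cloud nodes appear powered down until work arrives:

```bash
# List node names and states in the dynamic partition.
sinfo -p mydynamicpart -N -o "%N %t"
# Powered-down cloud nodes report "idle~" and power up when Slurm schedules jobs on them.
```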

To support **multiple VM sizes in a CycleCloud nodearray**, you can alter the template to allow multiple VM sizes by adding `Config.Multiselect = true`.

@@ -113,19 +112,19 @@ To support **multiple VM sizes in a CycleCloud nodearray**, you can alter the te
Config.Multiselect = true
```

-### Dynamic Scaledown
+### Dynamic Scale Down

-By default, all nodes in the dynamic partition scales down just like the other partitions. To disable this, see [SuspendExcParts](https://slurm.schedmd.com/slurm.conf.html).
+By default, all nodes in the dynamic partition scale down just like the other partitions. To disable scale-down for the dynamic partition, see [SuspendExcParts](https://slurm.schedmd.com/slurm.conf.html).
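
For example, a minimal `slurm.conf` setting that keeps the dynamic partition out of scale-down (using the `mydynamicpart` example from earlier) is:

```ini
# Exclude this partition from suspension, so its nodes are never scaled down automatically.
SuspendExcParts=mydynamicpart
```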

### Manual scaling

-If cyclecloud_slurm detects that autoscale is disabled (SuspendTime=-1), it uses the FUTURE state to denote nodes that're powered down instead of relying on the power state in Slurm. That is, when autoscale is enabled, off nodes are denoted as `idle~` in sinfo. When autoscale is disabled, the off nodes will not appear in sinfo at all. You can still see their definition with `scontrol show nodes --future`.
+If cyclecloud_slurm detects that autoscale is disabled (SuspendTime=-1), it uses the FUTURE state to denote nodes that are powered down instead of relying on the power state in Slurm. That is, when autoscale is enabled, off nodes are denoted as `idle~` in sinfo. When autoscale is disabled, the off nodes don't show up in sinfo at all. You can still see their definition with `scontrol show nodes --future`.

-To start new nodes, run `/opt/azurehpc/slurm/resume_program.sh node_list` (e.g. htc-[1-10]).
+To start new nodes, run `/opt/azurehpc/slurm/resume_program.sh node_list` (for example, htc-[1-10]).

-To shutdown nodes, run `/opt/azurehpc/slurm/suspend_program.sh node_list` (e.g. htc-[1-10]).
+To shut down nodes, run `/opt/azurehpc/slurm/suspend_program.sh node_list` (for example, htc-[1-10]).

-To start a cluster in this mode, simply add `SuspendTime=-1` to the additional slurm config in the template.
+To start a cluster in this mode, add `SuspendTime=-1` to the supplemental Slurm config in the template.

To switch a cluster to this mode, add `SuspendTime=-1` to the slurm.conf and run `scontrol reconfigure`. Then run `azslurm remove_nodes && azslurm scale`.
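
Put together, switching an existing cluster to manual scaling might look like this sketch; every command comes from the steps above, and the node list is only an example:

```bash
# On the scheduler node, as root:
# 1. Add SuspendTime=-1 to slurm.conf, then reload the configuration.
scontrol reconfigure

# 2. Rebuild node and partition records for manual mode.
azslurm remove_nodes && azslurm scale

# 3. Powered-down nodes no longer appear in sinfo; inspect them with:
scontrol show nodes --future

# 4. Start or stop nodes explicitly as needed.
/opt/azurehpc/slurm/resume_program.sh htc-[1-10]
/opt/azurehpc/slurm/suspend_program.sh htc-[1-10]
```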

@@ -138,9 +137,9 @@ To switch a cluster to this mode, add `SuspendTime=-1` to the slurm.conf and run
->
`/opt/azurehpc/slurm`

-2. Autoscale logs are now in `/opt/azurehpc/slurm/logs` instead of `/var/log/slurmctld`. Note, that `slurmctld.log` will be in this folder.
+2. Autoscale logs are now in `/opt/azurehpc/slurm/logs` instead of `/var/log/slurmctld`. Note that `slurmctld.log` is in this folder.
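
For example, to inspect the relocated logs (file names other than `slurmctld.log` may vary by release):

```bash
ls /opt/azurehpc/slurm/logs
tail -f /opt/azurehpc/slurm/logs/slurmctld.log
```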

-3. The `cyclecloud_slurm.sh` script no longer available. It's been replaced by a new CLI tool called `azslurm`, which you can be run as root. `azslurm` also supports autocomplete.
+3. The `cyclecloud_slurm.sh` script is no longer available. A new CLI tool called `azslurm` replaced `cyclecloud_slurm.sh`, and you can run it as root. `azslurm` also supports autocomplete.

```bash
[root@scheduler ~]# azslurm
@@ -170,7 +169,7 @@ To switch a cluster to this mode, add `SuspendTime=-1` to the slurm.conf and run

5. CycleCloud no longer creates nodes ahead of time. It only creates them when they're needed.

-6. All slurm binaries are inside the `azure-slurm-install-pkg*.tar.gz` file, under `slurm-pkgs`. They're pulled from a specific binary release. The current binary release is [4.0.0](https://github.com/Azure/cyclecloud-slurm/releases/tag/4.0.0)
+6. All Slurm binaries are inside the `azure-slurm-install-pkg*.tar.gz` file, under `slurm-pkgs`. They're pulled from a specific binary release. The current binary release is [4.0.0](https://github.com/Azure/cyclecloud-slurm/releases/tag/4.0.0).

7. For MPI jobs, the only default network boundary is the partition. Unlike version 2.x, each partition doesn't include multiple "placement groups". So you only have one colocated VMSS per partition. There's no need for the topology plugin anymore, so the job submission plugin isn't needed either. Instead, submitting to multiple partitions is the recommended option for use cases that require job submission to multiple placement groups.
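
As an illustrative sketch (the partition names and job script are placeholders), submitting one job to several partitions lets Slurm start it in whichever placement group is available first:

```bash
# "hpc" and "hpc2" are example partition names; replace them with your own.
sbatch --partition=hpc,hpc2 --nodes=4 my_mpi_job.sh
```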
