Role Variables
--------------
Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
* `name`: The name of the nodes within this group.
* `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
* `ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`). This is set to the Slurm default of `1` if not defined.
For each group (if used) or partition there must be an ansible inventory group `<cluster_name>_<group_name>`. All nodes in this inventory group will be added to the group/partition. Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
* `default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
* `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
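As an illustrative sketch, the attributes above might be combined in a partition object like the following. All values here are hypothetical, not taken from a real deployment:

```yaml
openhpc_slurm_partitions:
  - name: "compute"           # nodes come from inventory group <cluster_name>_compute
    cluster_name: "openhpc"   # optional override of openhpc_cluster_name
    ram_mb: 2048              # optional; slurm.conf RealMemory (Slurm default is 1)
    default: "YES"            # optional; make this the default partition
    maxtime: "01:00:00"       # optional; slurm.conf MaxTime
```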
To deploy, create a playbook which looks like this:

    openhpc_slurm_control_host: "{{ groups['cluster_control'] | first }}"
    openhpc_slurm_partitions:
    - name: "compute"
    openhpc_cluster_name: openhpc
    openhpc_packages: []
    ...
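For the variables above, a matching inventory might look like the following sketch (YAML inventory format; hostnames are hypothetical). The group `openhpc_compute` follows the `<cluster_name>_<group_name>` convention, and its hosts become the nodes of the "compute" partition:

```yaml
# Hypothetical Ansible inventory; hostnames should be lowercase.
cluster_control:
  hosts:
    control-0:
openhpc_compute:      # <cluster_name>_<group_name> for cluster "openhpc", group "compute"
  hosts:
    compute-0:
    compute-1:
```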
To drain nodes, for example, before scaling down the cluster to 6 nodes: