README.md: 34 additions & 20 deletions
@@ -52,30 +52,44 @@ each list element:

 ### slurm.conf

-`openhpc_slurm_partitions`: Optional. List of one or more slurm partitions, default `[]`. Each partition may contain the following values:
-* `groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
-Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
-* `name`: The name of the nodes within this group.
-* `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
-* `extra_nodes`: Optional. A list of additional node definitions, e.g. for nodes in this group/partition not controlled by this role. Each item should be a dict, with keys/values as per the ["NODE CONFIGURATION"](https://slurm.schedmd.com/slurm.conf.html#lbAE) docs for slurm.conf. Note the key `NodeName` must be first.
-* `ram_mb`: Optional. The physical RAM available in each node of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`) in MiB. This is set using ansible facts if not defined, equivalent to `free --mebi` total * `openhpc_ram_multiplier`.
-* `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
+`openhpc_nodegroups`: Optional, default `[]`. List of mappings, each defining a
+unique set of homogeneous nodes:
+* `name`: Required. Name of node group.
+* `ram_mb`: Optional. The physical RAM available in each node of this group
+  in MiB. This is set using Ansible facts if not defined, equivalent to
+  `free --mebi` total * `openhpc_ram_multiplier`.
+* `ram_multiplier`: Optional. An override for the top-level definition
+  `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
 * `gres`: Optional. List of dicts defining [generic resources](https://slurm.schedmd.com/gres.html). Each dict should define:
   - `conf`: A string with the [resource specification](https://slurm.schedmd.com/slurm.conf.html#OPT_Gres_1) but requiring the format `<name>:<type>:<number>`, e.g. `gpu:A100:2`. Note the `type` is an arbitrary string.
-  - `file`: Omit if `openhpc_gres_autodetect` is set. A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
-
+  - `file`: Omit if `gres_autodetect` is set. A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
 Note [GresTypes](https://slurm.schedmd.com/slurm.conf.html#OPT_GresTypes) must be set in `openhpc_config` if this is used.
-
-* `default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
-* `maxtime`: Optional. A partition-specific time limit following the format of [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`. The default value is
+* `node_params`: Optional. Mapping of additional parameters and values for
+- Hosts in a nodegroup are assumed to be homogeneous in terms of processor and memory.
+- Hosts may have arbitrary hostnames, but these should be lowercase to avoid a
+  mismatch between inventory and actual hostname.
+- An inventory group may be missing or empty, in which case the node group
+  contains no hosts.
+- If the inventory group is not empty, the play must contain at least one host.
+  This is used to set `Sockets`, `CoresPerSocket`, `ThreadsPerCore` and
+  optionally `RealMemory` for the nodegroup.
+
+`openhpc_partitions`: Optional, default `[]`. List of mappings, each defining a
+partition. Each partition mapping may contain:
+* `name`: Required. Name of partition.
+* `groups`: Optional. List of nodegroup names. If omitted, the partition name
+  is assumed to match a nodegroup name.
+* `default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
+* `maxtime`: Optional. A partition-specific time limit following the format of [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`. The default value is
   given by `openhpc_job_maxtime`. The value should be quoted to avoid Ansible conversions.
-* `partition_params`: Optional. Mapping of additional parameters and values for [partition configuration](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
-
-For each group (if used) or partition any nodes in an ansible inventory group `<cluster_name>_<group_name>` will be added to the group/partition. Note that:
-- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
-- Nodes in a group are assumed to be homogeneous in terms of processor and memory.
-- An inventory group may be empty or missing, but if it is not then the play must contain at least one node from it (used to set processor information).
-
+* `partition_params`: Optional. Mapping of additional parameters and values for
`openhpc_job_maxtime`: Maximum job time limit, default `'60-0'` (60 days). See [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime` for format. The value should be quoted to avoid Ansible conversions.
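
To illustrate how the variables documented in this diff might fit together, here is a minimal sketch of role variables. It is not taken from the PR: the nodegroup and partition names (`general`, `gpu`, `all`), the RAM figure and the time limits are hypothetical, and it assumes `openhpc_config` accepts a mapping of slurm.conf parameter names to values.

```yaml
# Hypothetical example values; only the key names come from the README text.
openhpc_nodegroups:
  - name: general              # RAM and CPU layout taken from Ansible facts
  - name: gpu
    ram_mb: 256000             # hypothetical; overrides the fact-derived value
    gres:
      - conf: 'gpu:A100:2'     # <name>:<type>:<number>
        file: '/dev/nvidia[0-1]'

openhpc_partitions:
  - name: general              # no `groups`: assumed to match nodegroup `general`
    default: 'YES'
    maxtime: '1-0'             # quoted, MaxTime format (1 day)
  - name: all
    groups:                    # nodegroup names defined above
      - general
      - gpu
    default: 'NO'

openhpc_job_maxtime: '60-0'    # README default, shown for completeness

openhpc_config:                # assumption: mapping of slurm.conf parameters
  GresTypes: gpu
```

The `default` and `maxtime` values are quoted so that they reach the role as strings. Under the YAML 1.1 rules used by Ansible's loader, an unquoted `YES` is read as a boolean and a value such as `1:00:00` may be read as a number.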
NodeName={{ hostlist }} Name={{ gres_name }} Type={{ gres_type }} File={{ gres.file | mandatory('The gres configuration dictionary: ' ~ gres ~ ' is missing the file key, but openhpc_gres_autodetect is set to off. The error occurred on partition: ' ~ part.name ~ '. Please add the file key or set openhpc_gres_autodetect.') }}
{% set first_host = play_group_hosts | first | mandatory('Inventory group "' ~ inventory_group_name ~ '" contains no hosts in this play - was --limit used?') %}