README.md
Lines changed: 11 additions & 8 deletions
Original file line number
Diff line number
Diff line change
@@ -39,32 +39,35 @@ package in the image.
39
39
40
40
`openhpc_module_system_install`: Optional, default true. Whether or not to install an environment module system. If true, lmod will be installed. If false, you can either supply your own module system or go without one.
41
41
42
-
`openhpc_ram_multiplier`: Optional, default `0.95`. Multiplier used in the calculation: `total_memory * openhpc_ram_multiplier` when setting `RealMemory` for the partition in slurm.conf. Can be overridden on a per-partition basis using `openhpc_slurm_partitions.ram_multiplier`. Has no effect if `openhpc_slurm_partitions.ram_mb` is set.
43
-
44
42
### slurm.conf
45
43
46
44
`openhpc_slurm_partitions`: list of one or more slurm partitions. Each partition may contain the following values:
47
45
*`groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
48
46
Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
49
47
*`name`: The name of the nodes within this group.
50
48
*`cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
49
+
*`extra_nodes`: Optional. A list of additional node definitions, e.g. for nodes in this group/partition not controlled by this role. Each item should be a dict, with keys/values as per the ["NODE CONFIGURATION"](https://slurm.schedmd.com/slurm.conf.html#lbAE) docs for slurm.conf. Note the key `NodeName` must be first.
51
50
*`ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`) in MiB. This is set using ansible facts if not defined, equivalent to `free --mebi` total * `openhpc_ram_multiplier`.
52
-
53
-
For each group (if used) or partition there must be an ansible inventory group `<cluster_name>_<group_name>`, with all nodes in this inventory group added to the group/partition. Note that:
54
-
- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
55
-
- Nodes in a group are assumed to be homogeneous in terms of processor and memory.
56
-
- An inventory group may be empty, but if it is not then the play must contain at least one node from it (used to set processor information).
57
-
*`ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
51
+
*`ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
58
52
*`default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
59
53
*`maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
60
54
given by `openhpc_job_maxtime`.
61
55
56
+
For each group (if used) or partition, any nodes in an Ansible inventory group `<cluster_name>_<group_name>` will be added to the group/partition. Note that:
57
+
- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
58
+
- Nodes in a group are assumed to be homogeneous in terms of processor and memory.
59
+
- An inventory group may be empty, but if it is not then the play must contain at least one node from it (used to set processor information).
60
+
- Nodes may not appear in more than one group.
61
+
- A group/partition definition which has neither a corresponding inventory group nor an `extra_nodes` entry will raise an error.
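
As a rough illustration of the partition options described above, the sketch below defines one single-group partition with an `extra_nodes` entry and one partition built from two node groups. The cluster name `testohpc`, the partition and group names, the node name `extra-0` and the address `0.0.0.1` are all hypothetical, and it is an assumption that a partition using `groups` still carries its own `name`.

```yaml
# Hypothetical cluster name: inventory groups are expected to be
# named <cluster_name>_<group_name>, e.g. "testohpc_compute".
openhpc_cluster_name: testohpc

openhpc_slurm_partitions:
  # Single-group partition: nodes are taken from inventory group
  # "testohpc_compute", plus one node not managed by this role.
  - name: compute
    default: 'YES'         # quoted so YAML keeps the literal string
    maxtime: '12:00:00'    # quoted so YAML does not read it as a number
    extra_nodes:
      # Keys follow the slurm.conf "NODE CONFIGURATION" section;
      # NodeName must be the first key.
      - NodeName: extra-0      # hypothetical node name
        State: DOWN
        NodeAddr: 0.0.0.1      # deliberately invalid placeholder address

  # Multi-group partition: nodes are taken from inventory groups
  # "testohpc_small" and "testohpc_large" (hypothetical names).
  - name: mixed
    groups:
      - name: small
      - name: large
```

With this layout no host may be listed in more than one of `testohpc_compute`, `testohpc_small` and `testohpc_large`, and each group is assumed to contain identical hardware.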
62
+
62
63
`openhpc_job_maxtime`: A maximum job time limit in hours, minutes and seconds. The default is `24:00:00`.
63
64
64
65
`openhpc_cluster_name`: name of the cluster
65
66
66
67
`openhpc_config`: Mapping of additional parameters and values for `slurm.conf`. Note these will override any included in `templates/slurm.conf.j2`.
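
A minimal sketch of the top-level variables above, assuming they are set in group or play variables. `SlurmctldDebug` and `SlurmdDebug` are standard `slurm.conf` parameters chosen only for illustration; the values shown are not recommendations.

```yaml
# Quote time limits: unquoted 12:00:00 can be parsed by YAML 1.1
# loaders as a base-60 integer rather than a string.
openhpc_job_maxtime: '12:00:00'

openhpc_cluster_name: testohpc   # hypothetical cluster name

# Extra slurm.conf parameters; these take precedence over any
# matching entries in templates/slurm.conf.j2.
openhpc_config:
  SlurmctldDebug: debug
  SlurmdDebug: debug
```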
67
68
69
+
`openhpc_ram_multiplier`: Optional, default `0.95`. Multiplier used in the calculation: `total_memory * openhpc_ram_multiplier` when setting `RealMemory` for the partition in slurm.conf. Can be overridden on a per-partition basis using `openhpc_slurm_partitions.ram_multiplier`. Has no effect if `openhpc_slurm_partitions.ram_mb` is set.
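
To make the precedence concrete, here is a small worked sketch. The partition names and memory figures are hypothetical, and the arithmetic in the comments ignores whatever rounding the role applies.

```yaml
openhpc_ram_multiplier: 0.9      # overrides the role default of 0.95

openhpc_slurm_partitions:
  # No ram_mb or ram_multiplier: the top-level multiplier applies.
  # If facts report 64000 MiB total, RealMemory is about
  # 64000 * 0.9 = 57600.
  - name: compute

  # Per-partition override of the multiplier.
  # If facts report 256000 MiB total, RealMemory is about
  # 256000 * 0.95 = 243200.
  - name: himem
    ram_multiplier: 0.95

  # ram_mb is set, so RealMemory is 128000 and both multipliers
  # are ignored for this partition.
  - name: fixed
    ram_mb: 128000
```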
70
+
68
71
`openhpc_state_save_location`: Optional. Absolute path for Slurm controller state (`slurm.conf` parameter [StateSaveLocation](https://slurm.schedmd.com/slurm.conf.html#OPT_StateSaveLocation))
control: "{{ inventory_hostname in groups['testohpc_login'] }}"
11
+
batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
12
+
runtime: true
13
+
openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
14
+
openhpc_slurm_partitions:
15
+
- name: "compute"
16
+
extra_nodes:
17
+
# Need to specify IPs for the non-existent State=DOWN nodes, because otherwise even in this state slurmctld will exclude a node with no lookup information from the config.
18
+
# We use invalid IPs here (i.e. starting 0.) to flag the fact the nodes shouldn't exist.
19
+
# Note this has to be done via slurm config rather than /etc/hosts due to Docker limitations on modifying the latter.