Commit 4238c70

Merge remote-tracking branch 'origin/feat/nodegroups' into HEAD
2 parents 1c281c4 + b8c64dc commit 4238c70

4 files changed: +90 -85 lines changed

README.md

Lines changed: 34 additions & 20 deletions
@@ -52,30 +52,44 @@ each list element:
 
 ### slurm.conf
 
-`openhpc_slurm_partitions`: Optional. List of one or more slurm partitions, default `[]`. Each partition may contain the following values:
-* `groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
-Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
-* `name`: The name of the nodes within this group.
-* `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
-* `extra_nodes`: Optional. A list of additional node definitions, e.g. for nodes in this group/partition not controlled by this role. Each item should be a dict, with keys/values as per the ["NODE CONFIGURATION"](https://slurm.schedmd.com/slurm.conf.html#lbAE) docs for slurm.conf. Note the key `NodeName` must be first.
-* `ram_mb`: Optional. The physical RAM available in each node of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`) in MiB. This is set using ansible facts if not defined, equivalent to `free --mebi` total * `openhpc_ram_multiplier`.
-* `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
+`openhpc_nodegroups`: Optional, default `[]`. List of mappings, each defining a
+unique set of homogenous nodes:
+* `name`: Required. Name of node group.
+* `ram_mb`: Optional. The physical RAM available in each node of this group
+([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`)
+in MiB. This is set using ansible facts if not defined, equivalent to
+`free --mebi` total * `openhpc_ram_multiplier`.
+* `ram_multiplier`: Optional. An override for the top-level definition
+`openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
 * `gres`: Optional. List of dicts defining [generic resources](https://slurm.schedmd.com/gres.html). Each dict should define:
 - `conf`: A string with the [resource specification](https://slurm.schedmd.com/slurm.conf.html#OPT_Gres_1) but requiring the format `<name>:<type>:<number>`, e.g. `gpu:A100:2`. Note the `type` is an arbitrary string.
-- `file`: Omit if `openhpc_gres_autodetect` is set. A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
-
+- `file`: Omit if `gres_autodetect` is set. A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
 Note [GresTypes](https://slurm.schedmd.com/slurm.conf.html#OPT_GresTypes) must be set in `openhpc_config` if this is used.
-
-* `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
-* `maxtime`: Optional. A partition-specific time limit following the format of [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`. The default value is
+* `node_params`: Optional. Mapping of additional parameters and values for
+[node configuration](https://slurm.schedmd.com/slurm.conf.html#lbAE).
+
+Each nodegroup will contain hosts from an Ansible inventory group named
+`{{ openhpc_cluster_name }}_{{ group_name}}`. Note that:
+- Each host may only appear in one nodegroup.
+- Hosts in a nodegroup are assumed to be homogenous in terms of processor and memory.
+- Hosts may have arbitrary hostnames, but these should be lowercase to avoid a
+mismatch between inventory and actual hostname.
+- An inventory group may be missing or empty, in which case the node group
+contains no hosts.
+- If the inventory group is not empty the play must contain at least one host.
+This is used to set `Sockets`, `CoresPerSocket`, `ThreadsPerCore` and
+optionally `RealMemory` for the nodegroup.
+
+`openhpc_partitions`: Optional, default `[]`. List of mappings, each defining a
+partition. Each partition mapping may contain:
+* `name`: Required. Name of partition.
+* `groups`: Optional. List of nodegroup names. If omitted, the partition name
+is assumed to match a nodegroup name.
+* `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
+* `maxtime`: Optional. A partition-specific time limit following the format of [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`. The default value is
 given by `openhpc_job_maxtime`. The value should be quoted to avoid Ansible conversions.
-* `partition_params`: Optional. Mapping of additional parameters and values for [partition configuration](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
-
-For each group (if used) or partition any nodes in an ansible inventory group `<cluster_name>_<group_name>` will be added to the group/partition. Note that:
-- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
-- Nodes in a group are assumed to be homogenous in terms of processor and memory.
-- An inventory group may be empty or missing, but if it is not then the play must contain at least one node from it (used to set processor information).
-
+* `partition_params`: Optional. Mapping of additional parameters and values for
+[partition configuration](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
 
 `openhpc_job_maxtime`: Maximum job time limit, default `'60-0'` (60 days). See [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime` for format. The default is 60 days. The value should be quoted to avoid Ansible conversions.

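As a reading aid for the README change above, the following is a minimal sketch of role variables using the new `openhpc_nodegroups` and `openhpc_partitions` layout. The cluster name, nodegroup names, partition names and all values are hypothetical, and the mapping form shown for `openhpc_config` is an assumption (the README text only says that `GresTypes` must be set there).

```yaml
# Hypothetical example values; none of these names come from the commit.
openhpc_cluster_name: hpc

openhpc_nodegroups:
  # Hosts are taken from the inventory group "hpc_general".
  - name: general
  # Hosts are taken from the inventory group "hpc_gpu".
  - name: gpu
    gres:
      - conf: "gpu:A100:2"        # <name>:<type>:<number>
        file: "/dev/nvidia[0-1]"  # omit if gres_autodetect is set

openhpc_partitions:
  # `groups` omitted: the partition name is assumed to match a nodegroup name.
  - name: general
  # A partition spanning both nodegroups.
  - name: all
    groups:
      - general
      - gpu
    default: 'NO'    # quoted so YAML does not convert it to a boolean
    maxtime: '3-0'   # quoted to avoid Ansible conversions

# Assumed form: a mapping of extra slurm.conf parameters.
openhpc_config:
  GresTypes: gpu
```

With the `slurm.conf.j2` template in this commit, each nodegroup is written as a `NodeSet`, and a partition's `Nodes=` value is the comma-joined list of its nodegroup names, so the `all` partition here would reference `general,gpu`.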
defaults/main.yml

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ openhpc_slurm_service_started: "{{ openhpc_slurm_service_enabled }}"
 openhpc_slurm_service:
 openhpc_slurm_control_host: "{{ inventory_hostname }}"
 #openhpc_slurm_control_host_address:
-openhpc_slurm_partitions: []
+openhpc_partitions: []
+openhpc_nodegroups: []
 openhpc_cluster_name:
 openhpc_packages:
 - slurm-libpmi-ohpc

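Because the default `openhpc_slurm_partitions` is replaced by `openhpc_partitions` and `openhpc_nodegroups`, existing variable files would presumably need restructuring. The sketch below is one plausible mapping inferred from the README text in this commit, not a documented migration path; the names `compute`, `small` and `large` and the `ram_mb` value are hypothetical.

```yaml
# Old layout (before this commit), shown as comments for comparison:
#
# openhpc_slurm_partitions:
#   - name: compute
#     groups:
#       - name: small
#       - name: large
#         ram_mb: 256000

# New layout: node properties move to openhpc_nodegroups,
# and partitions refer to nodegroups by name only.
openhpc_nodegroups:
  - name: small
  - name: large
    ram_mb: 256000

openhpc_partitions:
  - name: compute
    groups:
      - small
      - large
```

Note that the old per-group keys `cluster_name` and `extra_nodes` do not appear in the new README text, so this sketch does not attempt to carry them over.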
templates/gres.conf.j2

Lines changed: 16 additions & 28 deletions
@@ -1,29 +1,17 @@
 AutoDetect=off
-{% set donehosts = [] | unique %}
-{% for part in openhpc_slurm_partitions %}
-{% set nodelist = [] %}
-{% for group in part.get('groups', [part]) %}
-{% if 'gres' in group %}
-{% set group_name = group.cluster_name|default(openhpc_cluster_name) ~ '_' ~ group.name %}
-{% set inventory_group_hosts = groups.get(group_name, []) %}
-{% set autodetect_mechanisms = inventory_group_hosts | group_by_gres_autodetect %}
-{% for mechanism, _mechanism_hosts in autodetect_mechanisms.items() %}
-{% set mechanism_hosts = _mechanism_hosts | difference(donehosts) %}
-{% if mechanism != 'off' %}
-{% for hostlist in (mechanism_hosts | hostlist_expression) %}
-NodeName={{ hostlist }} AutoDetect={{ mechanism }}
-{% endfor %}
-{% else %}
-{% for gres in group.gres %}
-{% set gres_name, gres_type, _ = gres.conf.split(':') %}
-{% for hostlist in (mechanism_hosts | hostlist_expression) %}
-NodeName={{ hostlist }} Name={{ gres_name }} Type={{ gres_type }} File={{ gres.file | mandatory('The gres configuration dictionary: ' ~ gres ~ ' is missing the file key, but openhpc_gres_autodetect is set to off. The error occured on partition: ' ~ part.name ~ '. Please add the file key or set openhpc_gres_autodetect.') }}
-
-{% endfor %}
-{% endfor %}
-{% endif %}
-{% set donehosts = donehosts | union(mechanism_hosts) %}
-{% endfor %}
-{% endif %}
-{% endfor %}
-{% endfor %}
+{% for nodegroup in openhpc_nodegroups %}
+{% set gres_list = nodegroup.gres | default([]) %}
+{% set gres_autodetect = nodegroup.gres_autodetect | default('off') %}
+{% if gres_autodetect | default('off') != 'off' %}
+NodeName={{ hostlist }} AutoDetect={{ gres_autodetect }}
+{% else %}
+{% for gres in gres_list %}
+{% set gres_name, gres_type, _ = gres.conf.split(':') %}
+{% set inventory_group_name = openhpc_cluster_name ~ '_' ~ nodegroup.name %}
+{% set inventory_group_hosts = groups.get(inventory_group_name, []) %}
+{% for hostlist in (inventory_group_hosts | hostlist_expression) %}
+NodeName={{ hostlist }} Name={{ gres_name }} Type={{ gres_type }} File={{ gres.file }}
+{% endfor %}{# hostlists #}
+{% endfor %}{# gres #}
+{% endif %}{# autodetect #}
+{% endfor %}{# nodegroup #}

templates/slurm.conf.j2

Lines changed: 38 additions & 36 deletions
@@ -135,55 +135,57 @@ SlurmdSyslogDebug=info
 #SlurmSchedLogFile=
 #SlurmSchedLogLevel=
 #DebugFlags=
-#
-#
-# POWER SAVE SUPPORT FOR IDLE NODES - NOT SUPPORTED IN THIS APPLIANCE VERSION
 
 # LOGIN-ONLY NODES
 # Define slurmd nodes not in partitions for login-only nodes in "configless" mode:
 {%if openhpc_login_only_nodes %}{% for node in groups[openhpc_login_only_nodes] %}
 NodeName={{ node }}
 {% endfor %}{% endif %}
 
-# COMPUTE NODES
-# OpenHPC default configuration
 PropagateResourceLimitsExcept=MEMLOCK
 Epilog=/etc/slurm/slurm.epilog.clean
-{% set donehosts = [] %}
-{% for part in openhpc_slurm_partitions %}
-{% set nodelist = [] %}
-{% for group in part.get('groups', [part]) %}
-{% set group_name = group.cluster_name|default(openhpc_cluster_name) ~ '_' ~ group.name %}
-# openhpc_slurm_partitions group: {{ group_name }}
-{% set inventory_group_hosts = groups.get(group_name, []) %}
-{% if inventory_group_hosts | length > 0 %}
-{% set play_group_hosts = inventory_group_hosts | intersect (play_hosts) %}
-{% set first_host = play_group_hosts | first | mandatory('Group "' ~ group_name ~ '" contains no hosts in this play - was --limit used?') %}
-{% set first_host_hv = hostvars[first_host] %}
-{% set ram_mb = (first_host_hv['ansible_memory_mb']['real']['total'] * (group.ram_multiplier | default(openhpc_ram_multiplier))) | int %}
-{% for hostlist in (inventory_group_hosts | hostlist_expression) %}
-{% set gres = ' Gres=%s' % (','.join(group.gres | map(attribute='conf') )) if 'gres' in group else '' %}
-{% if hostlist not in donehosts %}
-NodeName={{ hostlist }} State=UNKNOWN RealMemory={{ group.get('ram_mb', ram_mb) }} Sockets={{first_host_hv['ansible_processor_count']}} CoresPerSocket={{ first_host_hv['ansible_processor_cores'] }} ThreadsPerCore={{ first_host_hv['ansible_processor_threads_per_core'] }}{{ gres }}
-{% endif %}
-{% set _ = nodelist.append(hostlist) %}
-{% set _ = donehosts.append(hostlist) %}
-{% endfor %}{# nodes #}
-{% endif %}{# inventory_group_hosts #}
-{% for extra_node_defn in group.get('extra_nodes', []) %}
-{{ extra_node_defn.items() | map('join', '=') | join(' ') }}
-{% set _ = nodelist.append(extra_node_defn['NodeName']) %}
-{% endfor %}
-{% endfor %}{# group #}
-{% if not nodelist %}{# empty partition #}
-{% set nodelist = ['""'] %}
-{% endif %}
-PartitionName={{part.name}} Default={{ part.get('default', 'YES') }} MaxTime={{ part.get('maxtime', openhpc_job_maxtime) }} State=UP Nodes={{ nodelist | join(',') }} {{ part.partition_params | default({}) | dict2parameters }}
-{% endfor %}{# partitions #}
+
+# COMPUTE NODES
+# OpenHPC default configuration
+{% for nodegroup in openhpc_nodegroups %}
+{% set inventory_group_name = openhpc_cluster_name ~ '_' ~ nodegroup.name %}
+{% set inventory_group_hosts = groups.get(inventory_group_name, []) %}
+{% if inventory_group_hosts | length > 0 %}
+{% set play_group_hosts = inventory_group_hosts | intersect (play_hosts) %}
+{% set first_host = play_group_hosts | first | mandatory('Inventory group "' ~ inventory_group_name ~ '" contains no hosts in this play - was --limit used?') %}
+{% set first_host_hv = hostvars[first_host] %}
+{% set ram_mb = (first_host_hv['ansible_memory_mb']['real']['total'] * (nodegroup.ram_multiplier | default(openhpc_ram_multiplier))) | int %}
+{% set hostlists = (inventory_group_hosts | hostlist_expression) %}{# hosts in inventory group aren't necessarily a single hostlist expression #}
+{% for hostlist in hostlists %}
+NodeName={{ hostlist }} {{ '' -}}
+State=UNKNOWN {{ '' -}}
+RealMemory={{ nodegroup.ram_mb | default(ram_mb) }} {{ '' -}}
+Sockets={{first_host_hv['ansible_processor_count'] }} {{ '' -}}
+CoresPerSocket={{ first_host_hv['ansible_processor_cores'] }} {{ '' -}}
+ThreadsPerCore={{ first_host_hv['ansible_processor_threads_per_core'] }} {{ '' -}}
+{{ nodegroup.node_params | default({}) | dict2parameters }} {{ '' -}}
+{% if 'gres' in nodegroup %}Gres={{ ','.join(nodegroup.gres | map(attribute='conf')) }}{% endif %}
+{% endfor %}{# hostlists #}
+{% endif %}{# 1 or more hosts in inventory #}
+
+NodeSet={{ nodegroup.name }} Nodes={{ ','.join(hostlists | default(['""'])) }}{# no support for creating nodesets by Feature #}
+
+{% endfor %}
 
 # Define a non-existent node, in no partition, so that slurmctld starts even with all partitions empty
 NodeName=nonesuch
 
+# PARTITIONS
+{% for partition in openhpc_partitions %}
+PartitionName={{partition.name}} {{ '' -}}
+Default={{ partition.get('default', 'YES') }} {{ '' -}}
+MaxTime={{ partition.get('maxtime', openhpc_job_maxtime) }} {{ '' -}}
+State=UP {{ '' -}}
+Nodes={{ partition.get('groups', [partition.name]) | join(',') }} {{ '' -}}
+{{ partition.partition_params | default({}) | dict2parameters }}
+{% endfor %}{# openhpc_partitions #}
+
 {% if openhpc_slurm_configless | bool %}SlurmctldParameters=enable_configless{% endif %}
 
+
 ReturnToService=2
