Commit e3f58ad

Add support for autodetection of gres resources
1 parent 5ceb9e1 commit e3f58ad

2 files changed: +17 -8 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -59,9 +59,10 @@ unique set of homogenous nodes:
 `free --mebi` total * `openhpc_ram_multiplier`.
 * `ram_multiplier`: Optional. An override for the top-level definition
 `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
-* `gres`: Optional. List of dicts defining [generic resources](https://slurm.schedmd.com/gres.html). Each dict must define:
+* `gres_autodetect`: Optional. The [auto detection mechanism](https://slurm.schedmd.com/gres.conf.html#OPT_AutoDetect) to use for the generic resources. Note: you must still define the `gres` dictionary (see below) but you only need to define the `conf` key.
+* `gres`: Optional. List of dicts defining [generic resources](https://slurm.schedmd.com/gres.html). Each dict should define:
 - `conf`: A string with the [resource specification](https://slurm.schedmd.com/slurm.conf.html#OPT_Gres_1) but requiring the format `<name>:<type>:<number>`, e.g. `gpu:A100:2`. Note the `type` is an arbitrary string.
-- `file`: A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
+- `file`: Omit if `gres_autodetect` is set. A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
 Note [GresTypes](https://slurm.schedmd.com/slurm.conf.html#OPT_GresTypes) must be set in `openhpc_config` if this is used.
 * `features`: Optional. List of [Features](https://slurm.schedmd.com/slurm.conf.html#OPT_Features) strings.
 * `node_params`: Optional. Mapping of additional parameters and values for
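
The rules in the README hunk above can be illustrated with a minimal Python sketch. It is not part of the role: `check_nodegroup` and `CONF_PATTERN` are hypothetical names, the nodegroup mappings mirror the YAML structure as plain dicts, and `nvml` is used only as an example `AutoDetect` value. The pattern assumes `<number>` is a plain integer, which is narrower than what Slurm itself accepts.

```python
import re

# Assumed reading of "<name>:<type>:<number>": three colon-separated fields,
# the last one an integer count.
CONF_PATTERN = re.compile(r"^[^:]+:[^:]+:\d+$")


def check_nodegroup(nodegroup):
    """Return a list of problems with a nodegroup's gres settings (hypothetical helper)."""
    problems = []
    autodetect = nodegroup.get("gres_autodetect", "off")
    for gres in nodegroup.get("gres", []):
        conf = gres.get("conf", "")
        if not CONF_PATTERN.match(conf):
            problems.append(f"conf {conf!r} is not <name>:<type>:<number>")
        # `file` is only needed when devices are listed by hand.
        if autodetect == "off" and "file" not in gres:
            problems.append(f"conf {conf!r} needs a file key because gres_autodetect is off")
    return problems


# With autodetection, each gres entry only carries `conf`.
autodetected = {"name": "gpu", "gres_autodetect": "nvml", "gres": [{"conf": "gpu:A100:2"}]}
# Without it, `file` names the device path(s).
manual = {"name": "gpu", "gres": [{"conf": "gpu:A100:2", "file": "/dev/nvidia[0-1]"}]}
# Missing `file` with autodetection off: the case the template's `mandatory` check guards against.
incomplete = {"name": "gpu", "gres": [{"conf": "gpu:A100:2"}]}

for group in (autodetected, manual, incomplete):
    print(check_nodegroup(group))
```

Only the third mapping should report a problem, since it neither sets `gres_autodetect` nor provides `file`.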

templates/gres.conf.j2

Lines changed: 14 additions & 6 deletions
@@ -1,11 +1,19 @@
 AutoDetect=off
 {% for nodegroup in openhpc_nodegroups %}
-{% for gres in nodegroup.gres | default([]) %}
-{% set gres_name, gres_type, _ = gres.conf.split(':') %}
-{% set inventory_group_name = openhpc_cluster_name ~ '_' ~ nodegroup.name %}
-{% set inventory_group_hosts = groups.get(inventory_group_name, []) %}
+{% set gres_list = nodegroup.gres | default([]) %}
+{% set gres_autodetect = nodegroup.gres_autodetect | default('off') %}
+{% set inventory_group_name = openhpc_cluster_name ~ '_' ~ nodegroup.name %}
+{% set inventory_group_hosts = groups.get(inventory_group_name, []) %}
+{% if gres_autodetect | default('off') != 'off' %}
 {% for hostlist in (inventory_group_hosts | hostlist_expression) %}
-NodeName={{ hostlist }} Name={{ gres_name }} Type={{ gres_type }} File={{ gres.file }}
+NodeName={{ hostlist }} AutoDetect={{ gres_autodetect }}
 {% endfor %}{# hostlists #}
-{% endfor %}{# gres #}
+{% else %}
+{% for gres in gres_list %}
+{% set gres_name, gres_type, _ = gres.conf.split(':') %}
+{% for hostlist in (inventory_group_hosts | hostlist_expression) %}
+NodeName={{ hostlist }} Name={{ gres_name }} Type={{ gres_type }} File={{ gres.file | mandatory('The gres configuration dictionary: ' ~ gres ~ ' is missing the file key, but gres_autodetect is set to off. The error occurred on node group: ' ~ nodegroup.name ~ '. Please add the file key or set gres_autodetect.') }}
+{% endfor %}{# hostlists #}
+{% endfor %}{# gres #}
+{% endif %}{# autodetect #}
 {% endfor %}{# nodegroup #}
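
To make the branching in the new template easier to follow, here is a rough Python sketch of the same control flow. It is an illustration under stated assumptions, not the role's implementation: `render_gres_conf` is a hypothetical function, the host lists are passed in already compressed (the template gets them from the `hostlist_expression` filter), and groups are keyed by nodegroup name rather than by `openhpc_cluster_name ~ '_' ~ nodegroup.name`.

```python
def render_gres_conf(nodegroups, hostlists_by_group):
    """Build gres.conf lines following the template's branching (hypothetical sketch)."""
    lines = ["AutoDetect=off"]  # global default, as on the template's first line
    for nodegroup in nodegroups:
        gres_list = nodegroup.get("gres", [])
        autodetect = nodegroup.get("gres_autodetect", "off")
        hostlists = hostlists_by_group.get(nodegroup["name"], [])
        if autodetect != "off":
            # Autodetect branch: one line per host list, no Name/Type/File.
            for hostlist in hostlists:
                lines.append(f"NodeName={hostlist} AutoDetect={autodetect}")
        else:
            # Manual branch: one line per gres entry and host list.
            for gres in gres_list:
                gres_name, gres_type, _ = gres["conf"].split(":")
                for hostlist in hostlists:
                    if "file" not in gres:
                        # Stands in for the template's `mandatory` filter.
                        raise ValueError(
                            f"gres entry {gres} in node group {nodegroup['name']} "
                            "is missing the file key and gres_autodetect is off"
                        )
                    lines.append(
                        f"NodeName={hostlist} Name={gres_name} "
                        f"Type={gres_type} File={gres['file']}"
                    )
    return lines


example_nodegroups = [
    {"name": "a100", "gres_autodetect": "nvml", "gres": [{"conf": "gpu:A100:2"}]},
    {"name": "v100", "gres": [{"conf": "gpu:V100:1", "file": "/dev/nvidia0"}]},
]
example_hostlists = {"a100": ["a100-[0-1]"], "v100": ["v100-0"]}

print("\n".join(render_gres_conf(example_nodegroups, example_hostlists)))
```

As in the template, the missing-`file` check sits inside the host loop, so a nodegroup with no hosts in the inventory never triggers it.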
