Commit cb5a80a

Add openhpc_ram_multiplier (#93)
* Add openhpc_ram_multiplier

  Using total memory as the value of `RealMemory` in slurm.conf does not allow for OS overheads and can cause Slurm to drain the nodes with: `LowRealMemory`. Fixes #92.

* Convert to int
* Filters bind more strongly than multiplication
* Use wording suggested by steve
* Correct parenthesis
1 parent e1245ca commit cb5a80a

3 files changed, with 7 additions and 2 deletions

README.md

Lines changed: 3 additions & 1 deletion
@@ -39,6 +39,8 @@ package in the image.
 
 `openhpc_module_system_install`: Optional, default true. Whether or not to install an environment module system. If true, lmod will be installed. If false, you can either supply your own module system or go without one.
 
+`openhpc_ram_multiplier`: Optional, default `0.95`. Multiplier used in the calculation: `total_memory * openhpc_ram_multiplier` when setting `RealMemory` for the partition in slurm.conf. Can be overridden on a per-partition basis using `openhpc_slurm_partitions.ram_multiplier`. Has no effect if `openhpc_slurm_partitions.ram_mb` is set.
+
 ### slurm.conf
 
 `openhpc_slurm_partitions`: list of one or more slurm partitions. Each partition may contain the following values:
@@ -52,7 +54,7 @@ package in the image.
 - Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
 - Nodes in a group are assumed to be homogeneous in terms of processor and memory.
 - An inventory group may be empty, but if it is not then the play must contain at least one node from it (used to set processor information).
-
+* `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
 * `default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
 * `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
 given by `openhpc_job_maxtime`.

defaults/main.yml

Lines changed: 3 additions & 0 deletions
@@ -52,3 +52,6 @@ openhpc_slurm_configless: false
 openhpc_munge_key: ''
 openhpc_login_only_nodes: ''
 openhpc_module_system_install: true
+
+# Auto detection
+openhpc_ram_multiplier: 0.95

templates/slurm.conf.j2

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ Epilog=/etc/slurm/slurm.epilog.clean
 {% set first_host_hv = hostvars[first_host] %}
 
 NodeName=DEFAULT State=UNKNOWN \
-RealMemory={% if 'ram_mb' in group %}{{group.ram_mb}}{% else %}{{ first_host_hv['ansible_memory_mb']['real']['total'] }}{% endif %} \
+RealMemory={% if 'ram_mb' in group %}{{group.ram_mb}}{% else %}{{ (first_host_hv['ansible_memory_mb']['real']['total'] * group.ram_multiplier | default(openhpc_ram_multiplier)) | int }}{% endif %} \
 Sockets={{first_host_hv['ansible_processor_count']}} \
 CoresPerSocket={{first_host_hv['ansible_processor_cores']}} \
 ThreadsPerCore={{first_host_hv['ansible_processor_threads_per_core']}}
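
As a rough illustration of how the three settings in this change interact, the fragment below sketches a hypothetical set of role variables. Only `openhpc_ram_multiplier`, `ram_multiplier` and `ram_mb` come from the diff; the partition names, the `name` key and every number are assumptions made for the example.

```yaml
# Hypothetical variables; partition names and values are illustrative only.
openhpc_ram_multiplier: 0.9      # role-wide value, replacing the 0.95 default

openhpc_slurm_partitions:
  - name: general                # no override: falls back to openhpc_ram_multiplier
  - name: himem
    ram_multiplier: 0.98         # per-partition override of openhpc_ram_multiplier
  - name: fixed
    ram_mb: 64000                # explicit RealMemory; both multipliers are ignored
```

Under these assumptions, a `general` node whose `ansible_memory_mb.real.total` fact is 100000 would get a `RealMemory` of roughly 90000, because the template multiplies the detected total by the multiplier and truncates the result with `| int`. The `fixed` partition would get exactly the `ram_mb` value, since the template checks for `ram_mb` before it considers any multiplier.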
