Commit 9a76c34

Merge pull request #35 from stackhpc/anyhostname
Allow arbitrary hostnames for compute nodes - num_nodes no longer used
2 parents 3466543 + 5483689 commit 9a76c34

28 files changed: 694 additions & 40 deletions

.travis.yml

Lines changed: 13 additions & 12 deletions
@@ -1,20 +1,18 @@
 ---
 language: python
-python: "2.7"
 
 # Use the new container infrastructure
-sudo: false
-
-# Install ansible
-addons:
-  apt:
-    packages:
-    - python-pip
-
+sudo: required
+services:
+  - docker
+before_install:
+  - sudo apt-get -qq update
 install:
-  # Install ansible
-  - pip install ansible ansible-lint
-
+  - python3 -m pip install ansible
+  - python3 -m pip install ansible-lint
+  - python3 -m pip install molecule
+  - python3 -m pip install docker
+
   # Check ansible version
   - ansible --version
 
@@ -34,5 +32,8 @@ script:
   # Test the custom filters
   - ansible-playbook tests/filter.yml -i tests/inventory -i tests/inventory-mock-groups
 
+  # Run molecule tests
+  - molecule test
+
 notifications:
   webhooks: https://galaxy.ansible.com/api/v1/notifications/

.yamllint

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+---
+# Based on ansible-lint config
+extends: default
+
+rules:
+  braces:
+    max-spaces-inside: 1
+    level: error
+  brackets:
+    max-spaces-inside: 1
+    level: error
+  colons:
+    max-spaces-after: -1
+    level: error
+  commas:
+    max-spaces-after: -1
+    level: error
+  comments: disable
+  comments-indentation: disable
+  document-start: disable
+  empty-lines:
+    max: 3
+    level: error
+  hyphens:
+    level: error
+  indentation: disable
+  key-duplicates: enable
+  line-length: disable
+  new-line-at-end-of-file: disable
+  new-lines:
+    type: unix
+  trailing-spaces: disable
+  truthy: disable

README.md

Lines changed: 2 additions & 9 deletions
@@ -18,10 +18,9 @@ Role Variables
 Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
 * `name`: The name of the nodes within this group.
 * `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
-* `num_nodes`: Nodes within the group are assumed to number `0:num_nodes-1`.
-* `ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`).
+* `ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`). This is set to the Slurm default of `1` if not defined.
 
-For each group (if used) or partition there must be an ansible inventory group `{cluster_name}_{group_name}`. The compute nodes in this group must have hostnames in the form `{cluster_name}-{group_name}-{0..num_nodes-1}`. Note the inventory group uses "_" and the instances use "-".
+For each group (if used) or partition there must be an ansible inventory group `<cluster_name>_<group_name>`. All nodes in this inventory group will be added to the group/partition. Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
 
 * `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
 * `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
@@ -82,16 +81,10 @@ To deploy, create a playbook which looks like this:
 openhpc_slurm_control_host: "{{ groups['cluster_control'] | first }}"
 openhpc_slurm_partitions:
   - name: "compute"
-    num_nodes: 8
 openhpc_cluster_name: openhpc
 openhpc_packages: []
 ...
 
-Note that the "compute" of the openhpc_slurm_partition name and the
-openhpc_cluster_name are used to generate the compute node in the
-slurm config of openhpc-compute-[0:7]. Your inventory entries
-for that partition must match that convention.
-
 To drain nodes, for example, before scaling down the cluster to 6 nodes:
 
 ---
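
The README change above replaces the numbered-hostname convention with a simpler rule: the inventory group is named `<cluster_name>_<group_name>`, and its members may have any hostname, preferably lowercase. The following is a minimal sketch of that rule, not code from the role. The helper names `inventory_group_name` and `non_lowercase_hosts` and the sample inventory are hypothetical and exist only for illustration.

```python
def inventory_group_name(cluster_name, group_name):
    """Build the inventory group name in the <cluster_name>_<group_name> form."""
    return "{}_{}".format(cluster_name, group_name)


def non_lowercase_hosts(hostnames):
    """Return the hostnames that contain uppercase characters, in input order."""
    return [h for h in hostnames if h != h.lower()]


# Hypothetical inventory (assumption): group name -> list of hostnames.
# The hostnames are deliberately not of the form openhpc-compute-N.
inventory = {"openhpc_compute": ["node-a", "gpu07", "BigMem1"]}

group = inventory_group_name("openhpc", "compute")
flagged = non_lowercase_hosts(inventory[group])
```

Here `group` is `"openhpc_compute"` and `flagged` contains only `"BigMem1"`, the one hostname that could mismatch the actual hostname under the lowercase recommendation.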

filter_plugins/group_hosts.py

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ def _group_hosts(hosts):
             r.append(int(suffix))
         else:
             unmatchable.append(v)
-    return ['{}[{}]'.format(k, _group_numbers(v)) for k, v in results.iteritems()] + unmatchable
+    return ['{}[{}]'.format(k, _group_numbers(v)) for k, v in results.items()] + unmatchable
 
 
 def _group_numbers(numbers):
     units = []
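
The one-line change above swaps `dict.iteritems()`, which exists only on Python 2, for `dict.items()`, which is available on both Python 2 and Python 3. This is presumably needed because the Travis job in this commit installs its tooling with `python3 -m pip`. The sketch below illustrates the difference under stated assumptions: the contents of `results` and the helper `format_groups` are hypothetical, and the number formatting is a plain comma join rather than the role's `_group_numbers`.

```python
# Hypothetical stand-in (assumption) for the `results` dict built in _group_hosts:
# hostname prefix -> list of numeric suffixes.
results = {"openhpc-compute-": [0, 1, 2], "openhpc-gpu-": [4]}


def format_groups(results):
    """Format each prefix with its suffixes, iterating with dict.items()."""
    return sorted(
        "{}[{}]".format(prefix, ",".join(str(n) for n in numbers))
        for prefix, numbers in results.items()
    )


formatted = format_groups(results)

# On Python 3 this is False: calling results.iteritems() would raise AttributeError.
has_iteritems = hasattr(results, "iteritems")
```

The output is sorted only to make the result deterministic for the example.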

molecule/README.md

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
Molecule tests for the role.
2+
3+
# Test Matrix
4+
5+
Test options "flow down" through table unless changed.
6+
7+
Test | # Partitions | Groups in partitions? | Other
8+
--- | --- | --- | ---
9+
test1 | 1 | N | 2x compute node, sequential names (default test)
10+
test1b | 1 | N | 1x compute node
11+
test1c | 1 | N | 2x compute nodes, nonsequential names
12+
test2 | 2 | N | 4x compute node, sequential names
13+
test3 | 1 | Y | -
14+
15+
# Local Installation & Running
16+
17+
Local installation on a Centos7 machine looks like:
18+
19+
sudo yum install -y gcc python3-pip python3-devel openssl-devel python3-libselinux
20+
sudo yum install -y docker-ce docker-ce-cli containerd.io
21+
sudo yum install -y yum-utils
22+
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
23+
sudo yum install -y docker-ce docker-ce-cli containerd.io
24+
pip3 install -r molecule/requirements.txt --user
25+
26+
sudo systemctl start docker
27+
sudo usermod -aG docker ${USER}
28+
newgrp docker
29+
docker run hello-world # test docker works without sudo
30+
31+
sudo yum install -y git
32+
git clone [email protected]:stackhpc/ansible-role-openhpc.git
33+
cd ansible-role-openhpc/
34+
35+
Then to run all tests:
36+
37+
cd ansible-role-openhpc/
38+
molecule test --all
39+

molecule/default

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+test1

molecule/requirements.txt

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+pip
+setuptools
+molecule[lint]
+docker

molecule/test1/INSTALL.rst

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+*******
+Docker driver installation guide
+*******
+
+Requirements
+============
+
+* Docker Engine
+
+Install
+=======
+
+Please refer to the `Virtual environment`_ documentation for installation best
+practices. If not using a virtual environment, please consider passing the
+widely recommended `'--user' flag`_ when invoking ``pip``.
+
+.. _Virtual environment: https://virtualenv.pypa.io/en/latest/
+.. _'--user' flag: https://packaging.python.org/tutorials/installing-packages/#installing-to-the-user-site
+
+.. code-block:: bash
+
+    $ python3 -m pip install 'molecule[docker]'

molecule/test1/converge.yml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+---
+- name: Converge
+  hosts: all
+  tasks:
+    - name: Install OpenHPC repository
+      yum:
+        name: "https://github.com/openhpc/ohpc/releases/download/v1.3.GA/ohpc-release-1.3-1.el7.x86_64.rpm"
+        state: present
+    - name: "Include ansible-role-openhpc"
+      include_role:
+        name: "ansible-role-openhpc/"
+      vars:
+        openhpc_enable:
+          control: "{{ inventory_hostname in groups['testohpc_login'] }}"
+          batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
+          runtime: true
+        openhpc_slurm_service_enabled: true
+        openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
+        openhpc_slurm_partitions:
+          - name: "compute"
+        openhpc_cluster_name: testohpc
+

molecule/test1/molecule.yml

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+---
+name: single partition, group is partition
+driver:
+  name: docker
+platforms:
+  - name: testohpc-login-0
+    image: docker.io/pycontribs/centos:7
+    pre_build_image: true
+    groups:
+      - testohpc_login
+    command: /sbin/init
+    tmpfs:
+      - /run
+      - /tmp
+    volumes:
+      - /sys/fs/cgroup:/sys/fs/cgroup:ro
+    networks:
+      - name: net1
+  - name: testohpc-compute-0
+    image: docker.io/pycontribs/centos:7
+    pre_build_image: true
+    groups:
+      - testohpc_compute
+    command: /sbin/init
+    tmpfs:
+      - /run
+      - /tmp
+    volumes:
+      - /sys/fs/cgroup:/sys/fs/cgroup:ro
+    networks:
+      - name: net1
+  - name: testohpc-compute-1
+    image: docker.io/pycontribs/centos:7
+    pre_build_image: true
+    groups:
+      - testohpc_compute
+    command: /sbin/init
+    tmpfs:
+      - /run
+      - /tmp
+    volumes:
+      - /sys/fs/cgroup:/sys/fs/cgroup:ro
+    networks:
+      - name: net1
+provisioner:
+  name: ansible
+verifier:
+  name: ansible
