Commit ab2cfdf

updated operations guide for functionality requiring additional installs
1 parent f5f2c0b commit ab2cfdf

4 files changed

+37
-4
lines changed

docs/environments.md

Lines changed: 4 additions & 1 deletion
@@ -14,7 +14,10 @@ All environments load the inventory from the `common` environment first, with th
1414

1515
The ansible inventory for the environment is in `environments/<environment>/inventory/`. It should generally contain:
1616
- A `hosts` file. This defines the hosts in the appliance. Generally it should be templated out by the deployment automation so it is also a convenient place to define variables which depend on the deployed hosts such as connection variables, IP addresses, ssh proxy arguments etc.
17-
- A `groups` file defining ansible groups, which essentially controls which features of the appliance are enabled and where they are deployed. This repository generally follows a convention where functionality is defined using ansible roles applied to a group of the same name, e.g. `openhpc` or `grafana`. The meaning and use of each group is described in comments in `environments/common/inventory/groups`. As the groups defined there for the common environment are empty, functionality is disabled by default and must be enabled in a specific environment's `groups` file. Two template examples are provided in `environments/commmon/layouts/` demonstrating a minimal appliance with only the Slurm cluster itself, and an appliance with all functionality.
17+
- A `groups` file defining ansible groups, which essentially controls which features of the appliance are enabled and where they are deployed. This repository generally follows a convention where functionality is defined using ansible roles applied to a group
18+
of the same name, e.g. `openhpc` or `grafana`. The meaning and use of each group is described in comments in `environments/common/inventory/groups`. As the groups defined there for the common environment are empty, functionality is disabled by default and must be
19+
enabled in a specific environment's `groups` file. The `site` environment contains an ini file at `environments/site/inventory/groups` which enables groups for default appliance functionality across all environments. Additional groups should generally also be
20+
enabled in this file to avoid divergence between staging and production environments. Note that enabling some groups may require a site-specific image build and Ark credentials (see [operations guide](operations.md)).
1821
- Optionally, group variable files in `group_vars/<group_name>/overrides.yml`, where the group names match the functional groups described above. These can be used to override the default configuration for each functionality, which are defined in `environments/common/inventory/group_vars/all/<group_name>.yml` (the use of `all` here is due to ansible's precedence rules).
1922

2023
Although most of the inventory uses the group convention described above there are a few special cases:

docs/experimental/pulp.md

Lines changed: 11 additions & 2 deletions
@@ -16,7 +16,7 @@ pulp_host ansible_host=<VM-ip-address>
1616
```
1717

1818
> [!WARNING]
19-
> The inventory hostname cannot conflict with group names i.e can't be called `pulp` or `pulp_server`.
19+
> The inventory hostname cannot conflict with group names, i.e. it can't be called `pulp_site` or `pulp_server`.
2020
2121
Once complete, it will print a message giving a value to set for `appliances_pulp_url` (see example config below), assuming the `ansible_host` address is also the address the cluster
2222
should use to reach the Pulp server.
@@ -28,7 +28,7 @@ An existing Pulp server can be used to host Ark repos by overriding `pulp_site_p
2828

2929
## Syncing Pulp content with Ark
3030

31-
If the `pulp` group is added to the Packer build groups, the local Pulp server will be synced with Ark on build. You must authenticate with Ark by overriding `pulp_site_upstream_username` and `pulp_site_upstream_password` with your vault encrypted Ark dev credentials. `dnf_repos_username` and `dnf_repos_password` must remain unset to access content from the local Pulp.
31+
If the `pulp_site` group is added to the Packer build groups, the local Pulp server will be synced with Ark on build. You must authenticate with Ark by overriding `pulp_site_upstream_username` and `pulp_site_upstream_password` with your vault encrypted Ark dev credentials. `dnf_repos_username` and `dnf_repos_password` must remain unset to access content from the local Pulp.
3232

3333
Content can also be synced by running `ansible/adhoc/sync-pulp.yml`. By default this syncs repositories for the latest version of Rocky supported by the appliance but this can be overridden by setting extra variables for `pulp_site_target_arch`, `pulp_site_target_distribution` and `pulp_site_target_distribution_version`.
3434

@@ -40,3 +40,12 @@ appliances_pulp_url: "http://<pulp-host-ip>:8080"
4040
pulp_site_upstream_username: <Ark-username>
4141
pulp_site_upstream_password: <Ark-password>
4242
```
43+
44+
## Installing packages from Pulp at runtime
45+
By default, system repos are overwritten to point at Pulp repos during [image builds](../image-build.md), so using a site Pulp server will require a new fatimage. If you instead wish to install packages at runtime,
46+
you will need to add all host groups on which you will be installing packages to the `dnf_repos` group in `environments/site/inventory/groups`, e.g.:
47+
48+
```
49+
[dnf_repos:children]
50+
cluster
51+
```

docs/operations.md

Lines changed: 19 additions & 1 deletion
@@ -9,7 +9,7 @@ All subsequent sections assume that:
99
- Appropriate OpenStack credentials are available.
1010
- Any non-appliance controlled infrastructure is available (e.g. networks, volumes, etc.).
1111
- `$ENV` is your current, activated environment, as defined by e.g. `environments/production/`.
12-
- `$SITE_ENV` is the base site-specific environment, as defined by e.g. `environments/mysite/`.
12+
- `$SITE_ENV` is the base site-specific environment, as defined by `environments/site/`.
1313
- A string `some/path/to/file.yml:myvar` defines a path relative to the repository root and an Ansible variable in that file.
1414
- Configuration is generally common to all environments at a site, i.e. is made in `environments/$SITE_ENV` not `environments/$ENV`.
1515

@@ -62,6 +62,24 @@ This is a usually a two-step process:
6262

6363
Deploying the additional nodes and applying these changes requires rerunning both OpenTofu and the Ansible site.yml playbook - follow [Deploying a Cluster](#Deploying-a-Cluster).
6464

65+
# Enabling additional functionality
66+
Roles in the appliance which are disabled by default can be enabled by adding the appropriate groups as children of the role's corresponding group in `environments/site/inventory/groups`. For example,
67+
to install a Squid proxy on nodes in the login group, you would modify the `squid` group definition in `environments/site/inventory/groups` to:
68+
69+
```
70+
[squid:children]
71+
# Hosts to run squid proxy
72+
login
73+
```
74+
75+
Note that many non-default roles include package installations from repositories which the appliance overwrites to point at snapshotted mirrors on a Pulp server (by default StackHPC's Ark server), which are
76+
disabled during runtime to prevent Ark credentials from being leaked. To enable this functionality, you must therefore either:
77+
78+
- Create a site-specific fatimage (see [image build docs](image-build.md)) with the appropriate group added to the `inventory_groups` Packer variables.
79+
- If you instead wish roles to perform their installations during runtime, deploy a site Pulp server and sync it with mirrors of the snapshots from the upstream Ark server (see [Pulp docs](experimental/pulp.md)).
80+
81+
In both cases, Ark credentials will be required.
82+
6583
# Adding Additional Packages
6684
By default, the following utility packages are installed during the StackHPC image build:
6785
- htop

environments/site/inventory/groups

Lines changed: 3 additions & 0 deletions
@@ -158,6 +158,9 @@ compute
158158
# Note that this feature currently assumes all compute nodes are VMs, enabling
159159
# when the cluster contains baremetal compute nodes may lead to unexpected scheduling behaviour
160160

161+
[pulp_site]
162+
# Add builder to this group to enable automatic syncing of Pulp during image build
163+
161164
[pulp_server]
162165
# Host to deploy a Pulp server on and sync with mirrors of upstream Ark repositories. Should be a group containing a single VM provisioned
163166
# separately from the appliance. e.g
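
The docs changed in this commit describe enabling functionality by adding child groups under `[<role>:children]` in `environments/site/inventory/groups`. As a rough illustration only, the sketch below lists which groups have children in such a file. It uses Python's standard `configparser` rather than Ansible's own inventory parser, so it only handles simple group files like the examples shown here. The function name `enabled_children` and the `SAMPLE` text are hypothetical and not part of the appliance.

```python
import configparser


def enabled_children(groups_text):
    """Map each `[name:children]` group to its list of child groups.

    Assumption: the file contains only section headers, bare group names
    and full-line `#` comments, as in the examples in this commit.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,       # child groups are bare names with no value
        delimiters=("=",),         # do not split names on ':'
        strict=False,              # tolerate repeated sections
        interpolation=None,
        default_section="__no_default__",
    )
    parser.optionxform = str       # keep group names case-sensitive
    parser.read_string(groups_text)

    suffix = ":children"
    result = {}
    for section in parser.sections():
        if section.endswith(suffix):
            result[section[: -len(suffix)]] = list(parser[section])
    return result


# Hypothetical sample mirroring the snippets in the changed docs.
SAMPLE = """
[squid:children]
# Hosts to run squid proxy
login

[dnf_repos:children]
cluster

[pulp_site]
"""

if __name__ == "__main__":
    for group, children in enabled_children(SAMPLE).items():
        print(f"{group}: {', '.join(children)}")
```

Groups declared without `:children`, such as the empty `[pulp_site]` above, are skipped, which matches the convention that an empty group leaves the functionality disabled.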
