Commit f0bd623

make eeesi/squid autoconfigure for proxy
1 parent 8fe4bab commit f0bd623

9 files changed: +173 -60 lines changed

ansible/roles/eessi/README.md

Lines changed: 14 additions & 4 deletions
@@ -8,10 +8,20 @@ None.

 ## Role Variables

-- `cvmfs_quota_limit_mb`: Optional int. Maximum size of local package cache on each node in MB.
-- `cvmfs_config_overrides`: Optional dict. Set of key-value pairs for additional CernVM-FS settings see [official docs](https://cvmfs.readthedocs.io/en/stable/cpt-configure.html) for list of options.
-Each dict key should correspond to a valid config variable (e.g. `CVMFS_HTTP_PROXY`) and the corresponding dict value will be set as the variable value (e.g. `https://my-proxy.com`).
-These configuration parameters will be written to the `/etc/cvmfs/default.local` config file on each host in the form `KEY=VALUE`.
+All variables relate to [CernVM-FS configuration](https://cvmfs.readthedocs.io/en/stable/cpt-configure.html).
+By default, the configuration is that [recommended by EESSI for single clients](https://www.eessi.io/docs/getting_access/native_installation/#installation-for-single-clients).
+However if `cvmfs_http_proxy` is set to a non-empty string then a configuration
+suitable for using a [squid proxy](https://www.eessi.io/docs/getting_access/native_installation/#configuring-your-client-to-use-a-squid-proxy)
+is applied instead. See [docs/production](../../../docs/eessi.md#eessi-proxy-configuration)
+for guidance on appliance configuration.
+
+- `cvmfs_quota_limit_mb`: Optional int. Maximum size of local package cache on
+each node in MB. Default 10GB.
+- `cvmfs_http_proxy`: Optional string. Value for [CVMFS_HTTP_PROXY](https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists). Quotes are added around the provided value. Default empty string.
+- `cvmfs_config_overrides`: Optional dict. Set of key-value pairs for additional
+CernVM-FS settings, written to `/etc/cvmfs/default.local`. Keys are
+[CVMFS configuration options](https://cvmfs.readthedocs.io/en/stable/cpt-configure.html)
+(e.g. `CVMFS_TIMEOUT_DIRECT`). Default empty dict.

 ## Dependencies
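
As an illustration of the role variables described in this README change, a site might set them in inventory group vars roughly as follows. This is a sketch only: the file path and the chosen values are assumptions, not part of the commit; `CVMFS_TIMEOUT_DIRECT` is simply the example key the README itself mentions.

```yaml
# Hypothetical file: environments/site/inventory/group_vars/all/eessi.yml
cvmfs_quota_limit_mb: 20000   # assumed value: 20GB local cache rather than the 10GB default
cvmfs_config_overrides:
  # Each key/value pair is merged over the role's base configuration
  CVMFS_TIMEOUT_DIRECT: 20    # assumed value, in seconds
```

Leaving `cvmfs_http_proxy` alone keeps whichever base configuration (single-client or proxy) would otherwise be selected; the overrides are combined on top of it.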

ansible/roles/eessi/defaults/main.yaml

Lines changed: 12 additions & 3 deletions
@@ -1,13 +1,22 @@
 ---
 cvmfs_release_version: "6-3"

-# Default to 10GB
-cvmfs_quota_limit_mb: 10000
+cvmfs_quota_limit_mb: 10000 # local cache soft quota in MB (default 10GB)

-cvmfs_config_default:
+# NB: The string omit removes the option. Defined here for both default configs
+# so make swapping between them work properly.
+# TODO explain the omits
+cvmfs_config_single:
   CVMFS_CLIENT_PROFILE: single
   CVMFS_QUOTA_LIMIT: "{{ cvmfs_quota_limit_mb }}"

+cvmfs_http_proxy: '' # as per docs, quotes are added automatically
+# See https://www.eessi.io/docs/getting_access/native_installation/#configuring-your-client-to-use-a-squid-proxy
+cvmfs_config_proxy:
+  CVMFS_QUOTA_LIMIT: "{{ cvmfs_quota_limit_mb }}"
+  CVMFS_HTTP_PROXY: "'{{ cvmfs_http_proxy }}'"
+
+cvmfs_config_default: "{{ cvmfs_config_single if cvmfs_http_proxy == '' else cvmfs_config_proxy }}"
 cvmfs_config_overrides: {}
 cvmfs_config: "{{ cvmfs_config_default | combine(cvmfs_config_overrides) }}"

ansible/roles/eessi/tasks/configure.yml

Lines changed: 10 additions & 8 deletions
@@ -1,21 +1,23 @@
 ---

 - name: Add base CVMFS config
-  community.general.ini_file:
+  ansible.builtin.template:
     dest: /etc/cvmfs/default.local
-    section: null
-    option: "{{ item.key }}"
-    value: "{{ item.value }}"
-    no_extra_spaces: true
-    mode: "0644"
-  loop: "{{ cvmfs_config | dict2items }}"
-
+    src: cvmfs.config.j2
+    mode: u=rw,go=r
+    owner: root
+  register: cvmfs_config

 # NOTE: Not clear how to make this idempotent
 - name: Ensure CVMFS config is setup # noqa: no-changed-when
   ansible.builtin.command:
     cmd: "cvmfs_config setup"

+- name: Reload CVMFS config
+  ansible.builtin.command:
+    cmd: cvmfs_config reload
+  when: cvmfs_config.changed
+
 # configure gpus
 - name: Check for NVIDIA GPU
   ansible.builtin.stat:

ansible/roles/squid/README.md

Lines changed: 18 additions & 14 deletions
@@ -3,13 +3,13 @@
 Deploy a caching proxy.

 **NB:** This role provides two default configurations, selected by setting
-`squid_conf_template`:
-- `squid.conf.j2`: This is aimed at providing a proxy for package installs etc.
+`squid_conf_mode`:
+- `default`: This is aimed at providing a proxy for package installs etc.
 for nodes which do not have direct internet connectivity. It assumes access
 to the proxy is protected by the OpenStack security groups applied to the
 cluster. The generated configuration should be reviewed if this is not case.
-- `squid-eessi.conf.j2`: This provides a proxy server for EESSI clients. It uses
-the [recommended configuration](https://www.eessi.io/docs/tutorial/access/proxy/#configuration)
+- `eessi`: This provides a proxy server for EESSI clients. It uses the
+[recommended configuration](https://www.eessi.io/docs/tutorial/access/proxy/#configuration)
 which assumes a server with:
 - 10Gbit link or faster to the client systems
 - a sufficiently powerful CPU
@@ -19,13 +19,15 @@ Deploy a caching proxy.
 least one for every (100-500) client nodes.

 ## Role Variables
+
+- `squid_conf_mode`: Optional str, `default` (the default) or `eessi`. See above.
 - `squid_conf_template`: Optional str. Path (using Ansible search paths) to
-squid.conf template. Default is in-role `squid.conf.j2` template as above.
+squid.conf template. Default is in-role templates. If this is overriden then
+`squid_conf_mode` has no effect.

-### Role Variables for squid_conf_template=squid.conf.j2
+### Role Variables for squid_conf_mode: default

 Where noted these map to squid parameters of the same name without the `squid_` prefix - see [squid documentation](https://www.squid-cache.org/Doc/config) for details.
-
 - `squid_started`: Optional bool. Whether to start squid service. Default `true`.
 - `squid_enabled`: Optional bool. Whether squid service is enabled on boot. Default `true`.
 - `squid_cache_mem`: Required str. Size of memory cache, e.g "1024 KB", "12 GB" etc. See squid parameter.
@@ -52,11 +54,13 @@ Where noted these map to squid parameters of the same name without the `squid_`

 See squid parameter.

-### Role Variables for squid_conf_template=squid-eessi.conf.j2
+### Role Variables for squid_conf_mode: eessi

-- `squid_eessi_clients`: Optional string. CIDR specifying clients allowed to
-access this proxy. The default is to use the CIDR of the host's default IPv4
-interface, which should allow access from the [cluster network](../../../docs/networks.md).
-For clusters with multiple networks this may not be appropriate.
-- `squid_eessi_stratum_1`: Optional string. Domain (squid `acl dstdomain` format)
-of Stratum 1 replica servers. Default is the upstream EEESI Stratum 1 servers.
+- `squid_eessi_clients`: Optional str. CIDR specifying clients allowed to access
+this proxy. Default is the CIDR for the subnet of the [access network](../../../docs/networks.md),
+i.e. the first cluster network. For clusters with multiple networks this may
+need overriding.
+- `squid_eessi_stratum_1`: Optional str. Domain (in squid `acl dstdomain`
+format) of Stratum 1 replica servers. Defaults to upstream EEESI Stratum 1
+servers.
+- `squid_cache_dir`: See definition for default mode above.
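
To make the `eessi` mode variables in this README change concrete, here is a minimal sketch of site-level overrides for a cluster where the default client CIDR is not appropriate. The file path and the CIDR are assumptions chosen for illustration; only the variable names and the `.eessi.science` value come from the commit.

```yaml
# Hypothetical file: environments/site/inventory/group_vars/all/squid.yml
squid_conf_mode: eessi
# Assumed CIDR, for a cluster whose clients are not on the first cluster network
squid_eessi_clients: "10.20.0.0/16"
# Shown only for completeness; this matches the role default
squid_eessi_stratum_1: '.eessi.science'
```

Setting `squid_conf_template` as well would bypass `squid_conf_mode`, as the README notes, so it is deliberately left unset here.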

ansible/roles/squid/defaults/main.yml

Lines changed: 6 additions & 5 deletions
@@ -1,6 +1,8 @@
 ---
-## squid dnf configuration:
-squid_conf_template: squid.conf.j2
+squid_conf_mode: default # or 'eessi'
+
+## squid_conf_mode=default:
+squid_conf_template: "squid-{{ squid_conf_mode }}.conf.j2"
 squid_started: true
 squid_enabled: true

@@ -25,7 +27,6 @@ squid_http_access: |
   # Finally deny all other access to this proxy
   http_access deny all

-## squid eeesi configuration:
-#squid_conf_template: squid-eessi.conf.j2
-squid_eessi_clients: "{{ ansible_default_ipv4.network }}/{{ ansible_default_ipv4.prefix }}"
+#squid_conf_mode=eessi:
+squid_eessi_clients: "{{ cluster_subnets[0].cidr | mandatory('squid_eessi_clients must be defined when using eeesi squid config') }}"
 squid_eessi_stratum_1: '.eessi.science'

docs/eessi.md

Lines changed: 44 additions & 1 deletion
@@ -2,7 +2,7 @@

 ## How to Load EESSI

-The EESSI environment can be initialise by running:
+The EESSI environment can be initialised by running:

 ```bash
 source /cvmfs/software.eessi.io/versions/2023.06/init/bash
@@ -109,3 +109,46 @@ cmake ..
 make
 ./deviceQuery
 ```
+
+## EESSI Proxy Configuration
+
+EESSI recommend that clusters use a proxy to reduce latency for clients and
+avoid excessive load on the EESSI Stratum 1 servers. Squid can be deployed and
+configured appropriately as this proxy on separate node(s) using
+the OpenTofu variable `additional_nodegroups`, e.g.:
+
+```hcl
+additional_nodegroups = {
+  # EESSI squid proxy
+  squid = {
+    nodes = ["squid-0"]
+    flavor = squid.flavor
+  }
+}
+```
+
+EESSI [recommend](https://www.eessi.io/docs/tutorial/access/proxy/#general-recommendations)
+that:
+> The proxy server should have a 10Gbit link to the client systems, a
+sufficiently powerful CPU, a decent amount of memory for the kernel cache (tens
+of GBs), and fast local storage (SSD or NVMe).
+>
+> As a rule of thumb, it is recommended to have (at least) one proxy server for
+every couple of hundred worker nodes (100-500).
+
+Generally, both the `squid` nodes and the `eeesi` client nodes can be
+appropriately configured simply by setting the squid mode to `eessi`:
+
+```yaml
+# environments/site/inventory/group_vars/all/squid.yml:
+squid_conf_mode: eessi
+```
+
+In this mode, by default:
+- `squid` is configured to allow clients from the access network's CIDR, using
+the EESSI-recommended cache configuration
+- `eessi` is configured to use the `squid` node IPs on the access network (the
+first network in `cluster_networks`) as proxies.
+
+If this is not suitable then override the defaults provided by `environments/common/inventory/group_vars/all/eessi.yml`
+and the `eeesi` and `squid` roles.
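
One way the client-side override mentioned at the end of this new docs section might look is sketched below. The file path and the proxy hostname are placeholders, not values from the commit; per the eessi role README, the value is given without extra quoting because the role adds quotes itself.

```yaml
# Hypothetical file: environments/site/inventory/group_vars/all/eessi.yml
# Point EESSI clients at an explicitly chosen proxy instead of the
# automatically derived squid node IPs. The hostname is a placeholder.
cvmfs_http_proxy: "http://proxy.example.internal:3128"
```

Because a non-empty `cvmfs_http_proxy` selects the proxy configuration in the eessi role, no further client-side setting should be needed for this case.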

docs/production.md

Lines changed: 50 additions & 24 deletions
@@ -163,33 +163,47 @@ will have been generated for you already under

 ## Define and deploy infrastructure

-Create an OpenTofu variables file to define the required infrastructure, e.g.:
-
-```text
-# environments/$ENV/tofu/terraform.tfvars
-cluster_name = "mycluster"
-cluster_networks = [
-    {
-        network = "some_network" # *
-        subnet = "some_subnet" # *
-    }
-]
-key_pair = "my_key" # *
-control_node_flavor = "some_flavor_name"
-login = {
-    # Arbitrary group name for these login nodes
-    interactive = {
-        nodes: ["login-0"]
-        flavor: "login_flavor_name" # *
+Modify the cookiecutter-templtaed OpenTofu configuration to define the required
+infrastructure, e.g.:
+
+```hcl
+# environments/$ENV/tofu/main.tf
+module "cluster" {
+  source = "../../site/tofu/"
+  environment_root = var.environment_root
+
+  cluster_name = "mycluster"
+  cluster_networks = [
+    {
+      network = "some_network" # *
+      subnet = "some_subnet" # *
     }
-}
-cluster_image_id = "rocky_linux_9_image_uuid"
-compute = {
+  ]
+  key_pair = "my_key" # *
+  control_node_flavor = "some_flavor_name"
+  login = {
+    # Arbitrary group name for these login nodes
+    head = {
+      nodes = ["login-0"]
+      flavor = "login_flavor_name" # *
+    }
+  }
+  cluster_image_id = "rocky_linux_9_image_uuid"
+  compute = {
     # Group name used for compute node partition definition
     general = {
-        nodes: ["compute-0", "compute-1"]
-        flavor: "compute_flavor_name" # *
+      nodes = ["compute-0", "compute-1"]
+      flavor = "compute_flavor_name" # *
+    }
+  }
+  additional_nodes = {
+    # Nodes configured to provide a squid proxy for EESSI - for guidance
+    # on number and sizing see [docs/eessi.md](./eessi.md#eessi-proxy-configuration)
+    squid = {
+      nodes = ["squid-0"]
+      flavor = squid_flavor_name # *
     }
+  }
 }
 ```

@@ -203,7 +217,7 @@ Note that:
 - Environment-specific variables (`cluster_name`) should be hardcoded into
 the cluster module block.

-- Environment-independent variables (e.g. maybe `cluster_net` if the same
+- Environment-independent variables (e.g. maybe `cluster_networks` if the same
 is used for staging and production) should be set as _defaults_ in
 `environments/site/tofu/variables.tf`, and then don't need to be passed
 in to the module.
@@ -356,6 +370,16 @@ environments which should be unique, e.g. production and staging.
 not, remove `grafana_auth_anonymous` in
 `environments/$ENV/inventory/group_vars/all/grafana.yml`

+- Configure EESSI to be proxied via the `squid` node(s) defined in the OpenTofu
+configuration:
+
+```yaml
+# environments/site/inventory/group_vars/all/squid.yml:
+squid_conf_mode: eessi
+```
+
+See [docs/eessi](./eessi.md#eessi-proxy-configuration) for more information.
+
 - See the [hpctests docs](../ansible/roles/hpctests/README.md) for advice on
 raising `hpctests_hpl_mem_frac` during tests.

@@ -409,3 +433,5 @@ Once it completes you can log in to the cluster using:

 For further information, including additional configuration guides and
 operations instructions, see the [docs](README.md) directory.
+
+TODO: Add stuff on eessi proxy.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Automatically configure EESSI clients to use any 'squid' node(s) with squid_conf_mode == eessi as a proxy:
+cvmfs_proxy_ip_var: ansible_host # hostvar with proxy IP - default is IP on access network
+cvmfs_proxy_ips: >- # list of IPs for squid nodes with squid_conf_mode == eessi:
+  {{
+    hostvars.values() |
+    selectattr('group_names', 'contains', 'squid') |
+    selectattr('squid_conf_mode', 'eq', 'eessi') |
+    map(attribute=cvmfs_proxy_ip_var)
+  }}
+# proxy string as per EESSI docs (but unquoted), or empty string:
+# TODO: just check final format eg "http://10.8.1.16:3128|http://10.8.1.17:3128" looks ok
+cvmfs_http_proxy: >-
+  {{
+    cvmfs_proxy_ips |
+    map('regex_replace', '^(.*)$', 'http://\1:' ~ (squid_http_port | string)) |
+    join('|')
+  }}
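
As a worked illustration of the two expressions in this file, the values below show what they would be expected to resolve to for a small, assumed inventory. The two-host setup is hypothetical; the IP addresses and the final string format are taken from the TODO comment in the diff, and this has not been checked against a real deployment.

```yaml
# Assumed inventory: two hosts in the 'squid' group, both with
#   squid_conf_mode: eessi
#   squid_http_port: 3128
# and ansible_host values 10.8.1.16 and 10.8.1.17 respectively.

# Intermediate list selected from hostvars:
cvmfs_proxy_ips: ["10.8.1.16", "10.8.1.17"]

# Joined proxy string (the format quoted in the TODO comment):
cvmfs_http_proxy: "http://10.8.1.16:3128|http://10.8.1.17:3128"

# With no matching squid hosts the list is empty and the join yields '',
# so the eessi role would select its single-client configuration.
```

In CernVM-FS proxy syntax, `|` separates proxies within one load-balanced group, so clients would spread requests across both squid nodes.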
Lines changed: 2 additions & 1 deletion
@@ -1,2 +1,3 @@
 ---
-squid_http_port: 3128 # defined here for proxy role
+squid_http_port: 3128 # defined here for proxy/eeesi roles
+squid_conf_mode: default # defined here for eeesi config
