Commit e892bb4

add isolated clusters docs
1 parent baf29bb commit e892bb4

1 file changed

+164
-0
lines changed

@@ -0,0 +1,164 @@
1+
# Isolated Clusters
2+
3+
This document explains how to create clusters which do not have outbound internet
4+
access by default.
5+
6+
The approach is to:
7+
- Create a squid proxy with basic authentication and add a user.
8+
- Configure the appliance to set proxy environment variables via Ansible's
9+
[remote environment support](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_environment.html).
10+
11+
This means that proxy environment variables are not present on the hosts at all
12+
and are only injected when running Ansible, meaning the basic authentication
13+
credentials are not exposed to cluster users.
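
One way to confirm this on a running cluster is to look for proxy variables in an ordinary login shell on a node. The sketch below assumes the conventional variable names (`http_proxy`, `https_proxy`, `no_proxy` and their upper-case forms); the function name `list_proxy_vars` is hypothetical and not part of the appliance.

```shell
# List any proxy-related variables visible in the current environment.
# On a node configured as described here, this is expected to print nothing.
list_proxy_vars() {
  # -i matches both lower- and upper-case names; "|| true" avoids a
  # non-zero exit status when grep finds no match.
  env | grep -i -E '^(http|https|no)_proxy=' || true
}

list_proxy_vars
```

Any output would suggest the variables have been written to the host somewhere (for example a shell profile) rather than injected only for Ansible tasks.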
14+
15+
## Deploying Squid using the appliance
16+
If an external squid is not available, one can be deployed by the cluster on a
17+
dual-homed host. See [docs/networks.md#proxies](../networks.md#proxies) for
18+
guidance, but note a separate host should be used rather than a Slurm node, to
19+
avoid users on that node getting direct access.
20+
21+
If the deploy host is RockyLinux, this could be used as the squid host by adding
22+
it to inventory:
23+
24+
```ini
25+
# environments/$ENV/inventory/squid
26+
[squid]
27+
# configure squid on deploy host
28+
localhost ansible_host=10.20.0.121 ansible_connection=local
29+
```
30+
31+
The IP address should be the deploy host's IP on the cluster network and is used
32+
later to define the proxy address. Other connection variables (e.g. `ansible_user`)
33+
could be set if required.
34+
35+
## Using Squid with basic authentication
36+
37+
First create usernames/passwords on the squid host (tested on RockyLinux 8.9):
38+
39+
```shell
40+
SQUID_USER=rocky
41+
sudo dnf install -y httpd-tools
42+
sudo htpasswd -c /etc/squid/passwords $SQUID_USER # enter password at prompt
43+
sudo chown squid /etc/squid/passwords
44+
sudo chmod u=rw,go= /etc/squid/passwords
45+
```
46+
47+
This can be tested by running:
48+
```shell
49+
/usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
50+
```
51+
52+
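and entering the username and password on one line separated by a space (e.g. `rocky PASSWORD`; the `$SQUID_USER` variable is not expanded at the helper's prompt), to which the helper should respond `OK`.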
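
The same check can be scripted by piping a single credentials line to the helper. This is a minimal sketch: `check_squid_credentials` is a hypothetical function name, the default paths are the ones used above, and passing the password as an argument will leave it in shell history, so it is only suitable for a quick test.

```shell
# Usage: check_squid_credentials USER PASSWORD [HELPER] [PASSWD_FILE]
# Sends one "USER PASSWORD" line to the helper on stdin and prints its reply.
check_squid_credentials() {
  helper="${3:-/usr/lib64/squid/basic_ncsa_auth}"
  passwd_file="${4:-/etc/squid/passwords}"
  printf '%s %s\n' "$1" "$2" | "$helper" "$passwd_file"
}

# Example (must run as root or the squid user, who can read the file):
#   check_squid_credentials rocky 'PASSWORD'
```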
53+
54+
If using the appliance to deploy squid, override the default `squid`
55+
configuration to use basic auth:
56+
57+
```yaml
58+
# environments/$ENV/inventory/group_vars/all/squid.yml:
59+
squid_acls:
60+
- acl ncsa_users proxy_auth REQUIRED
61+
squid_auth_param: |
62+
auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
63+
auth_param basic children 5
64+
auth_param basic credentialsttl 1 minute
65+
```
66+
67+
See the [squid docs](https://wiki.squid-cache.org/ConfigExamples/Authenticate/Ncsa) for more information.
68+
69+
## Proxy Configuration
70+
71+
Enable proxying on all cluster nodes by adding the `cluster` group to the `proxy` group:
72+
73+
```ini
74+
# environments/$ENV/inventory/groups:
75+
...
76+
[proxy:children]
77+
cluster
78+
...
79+
```
80+
81+
Now configure the appliance to set proxy variables via remote environment
82+
rather than by writing them to the hosts, and provide the basic authentication
83+
credentials:
84+
85+
```yaml
86+
# environments/$ENV/inventory/group_vars/all/proxy.yml:
87+
proxy_basic_user: $SQUID_USER
88+
proxy_basic_password: "{{ vault_proxy_basic_password }}"
89+
proxy_plays_only: true
90+
```
91+
92+
```yaml
93+
# environments/$ENV/inventory/group_vars/all/vault_proxy.yml:
94+
vault_proxy_basic_password: $SECRET
95+
```
96+
This latter file should be vault-encrypted.
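
Encryption is normally done with `ansible-vault encrypt`, and it is easy to forget before committing. The sketch below shows the command as a comment and adds a small check, relying on the fact that vault-encrypted files begin with an `$ANSIBLE_VAULT;` header line. The function name `is_vault_encrypted` is hypothetical.

```shell
# Encrypt the file in place (prompts for a vault password unless one is
# already configured for the environment):
#   ansible-vault encrypt environments/$ENV/inventory/group_vars/all/vault_proxy.yml

# Return success if the first line of FILE is an Ansible Vault header.
is_vault_encrypted() {
  head -n 1 "$1" | grep -q '^\$ANSIBLE_VAULT;'
}

# Example:
#   is_vault_encrypted environments/$ENV/inventory/group_vars/all/vault_proxy.yml \
#     || echo "WARNING: vault_proxy.yml is not encrypted"
```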
97+
98+
If using an appliance-deployed squid then the other [proxy role variables](../../ansible/roles/proxy/README.md)
99+
will be automatically constructed (see `environments/common/inventory/group_vars/all/proxy.yml`).
100+
You may need to override `proxy_http_address` if the hostname of the squid node
101+
is not resolvable by the cluster. This is typically the case if squid is deployed
102+
to the deploy host, in which case the IP address may be specified instead using
103+
the above example inventory as:
104+
105+
```yaml
106+
proxy_http_address: "{{ hostvars[groups['squid'] | first].ansible_host }}"
107+
```
108+
109+
If using an external squid, at a minimum set `proxy_http_address`. You may
110+
also need to set `proxy_http_port` or any other [proxy role variables](../../ansible/roles/proxy/README.md)
111+
if the calculated parameters are not appropriate.
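
Once configured, access through the proxy can be checked by hand from a cluster node with `curl`. This is a sketch under some assumptions: squid listens on port 3128 (the port used in the security group rules in this document), the address is the deploy host IP from the example inventory, and `proxy_url` is a hypothetical helper. It does not percent-encode credentials, so a password containing characters such as `@` or `:` would need encoding first.

```shell
# Usage: proxy_url USER PASSWORD HOST [PORT]
# Prints an http:// proxy URL with embedded basic-auth credentials.
proxy_url() {
  printf 'http://%s:%s@%s:%s\n' "$1" "$2" "$3" "${4:-3128}"
}

# Example, from a cluster node:
#   curl --proxy "$(proxy_url rocky 'PASSWORD' 10.20.0.121)" --head https://example.com
# Without the proxy the same request is expected to fail on an isolated cluster:
#   curl --max-time 5 --head https://example.com
```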
112+
113+
## Image build
114+
115+
TODO: probably not currently functional!
116+
117+
## EESSI
118+
119+
Although EESSI will install with the above configuration, as there is no
120+
outbound internet access except for Ansible tasks, making it functional will
121+
require [configuring a proxy for CVMFS](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/access/proxy/#client-system-configuration).
122+
123+
## Isolation Using Security Group Rules
124+
125+
The following shows the security groups/rules (as displayed by Horizon) which can
126+
be used to "isolate" a cluster when using a network which has a subnet gateway
127+
provided by a router to an external network. It therefore also indicates what
128+
access is required for a different networking configuration.
129+
130+
Security group `isolated`:
131+
132+
# allow outbound DNS
133+
ALLOW IPv4 53/tcp to 0.0.0.0/0
134+
ALLOW IPv4 53/udp to 0.0.0.0/0
135+
136+
# allow everything within the cluster:
137+
ALLOW IPv4 from isolated
138+
ALLOW IPv4 to isolated
139+
140+
# allow hosts to reach metadata server (e.g. for cloud-init keys):
141+
ALLOW IPv4 80/tcp to 169.254.169.254/32
142+
143+
# allow hosts to reach squid proxy:
144+
ALLOW IPv4 3128/tcp to 10.179.2.123/32
145+
146+
Security group `isolated-ssh-https` allows inbound SSH and HTTPS (for Open OnDemand):
147+
148+
ALLOW IPv4 443/tcp from 0.0.0.0/0
149+
ALLOW IPv4 22/tcp from 0.0.0.0/0
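
The rules above could be created with the OpenStack CLI roughly as follows. This is a sketch, not a tested procedure: it assumes both security groups already exist (e.g. via `openstack security group create`), and by default it only prints the commands. The helper names `os`, `rule`, `isolated_rules` and `ssh_https_rules` are hypothetical. Newly created groups typically come with default allow-all egress rules, which would need to be removed for the group to actually isolate anything.

```shell
# Prints the OpenStack CLI commands by default; set RUN=1 to execute them
# (needs python-openstackclient and cloud credentials).
SQUID_CIDR="${SQUID_CIDR:-10.179.2.123/32}"  # squid address from the rules above

os() {
  if [ "${RUN:-0}" = "1" ]; then openstack "$@"; else echo "openstack $*"; fi
}

rule() { os security group rule create --ethertype IPv4 "$@"; }

isolated_rules() {
  # outbound DNS
  rule --egress --protocol tcp --dst-port 53 --remote-ip 0.0.0.0/0 isolated
  rule --egress --protocol udp --dst-port 53 --remote-ip 0.0.0.0/0 isolated
  # everything within the cluster
  rule --ingress --protocol any --remote-group isolated isolated
  rule --egress --protocol any --remote-group isolated isolated
  # metadata server
  rule --egress --protocol tcp --dst-port 80 --remote-ip 169.254.169.254/32 isolated
  # squid proxy
  rule --egress --protocol tcp --dst-port 3128 --remote-ip "$SQUID_CIDR" isolated
}

ssh_https_rules() {
  rule --ingress --protocol tcp --dst-port 443 --remote-ip 0.0.0.0/0 isolated-ssh-https
  rule --ingress --protocol tcp --dst-port 22 --remote-ip 0.0.0.0/0 isolated-ssh-https
}

isolated_rules
ssh_https_rules
```

Printing first makes it possible to review the generated commands against the Horizon listing before applying anything.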
150+
151+
152+
Then OpenTofu is configured as:
153+
154+
155+
login_security_groups = [
156+
"isolated", # allow all in-cluster services
157+
"isolated-ssh-https", # access via ssh and ondemand
158+
]
159+
nonlogin_security_groups = [
160+
"isolated"
161+
]
162+
163+
Note that DNS is required (and is configured by the cloud when the subnet has
164+
a gateway) because name resolution happens on the hosts, not on the proxy.
