# Isolated Clusters

This document explains how to create clusters which do not have outbound internet
access by default.

The approach is to:
- Create a squid proxy with basic authentication and add a user.
- Configure the appliance to set proxy environment variables via Ansible's
  [remote environment support](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_environment.html).

This means that proxy environment variables are not present on the hosts at all
and are only injected when running Ansible, meaning the basic authentication
credentials are not exposed to cluster users.

## Deploying Squid using the appliance

If an external squid is not available, one can be deployed by the appliance on a
dual-homed host. See [docs/networks.md#proxies](../networks.md#proxies) for
guidance, but note a separate host should be used rather than a Slurm node, to
avoid users on that node getting direct internet access.

If the deploy host is RockyLinux, it could be used as the squid host by adding
it to the inventory:

```ini
# environments/$ENV/inventory/squid
[squid]
# configure squid on the deploy host
localhost ansible_host=10.20.0.121 ansible_connection=local
```

The IP address should be the deploy host's IP on the cluster network and is used
later to define the proxy address. Other connection variables (e.g. `ansible_user`)
could be set if required.

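With the squid host in inventory, connectivity can be checked with an ad-hoc ping; the inventory path and group name below assume the example layout above:

```shell
# Ad-hoc check that Ansible can reach the squid host:
ansible -i environments/$ENV/inventory squid -m ping
```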
## Using Squid with basic authentication

First create usernames/passwords on the squid host (tested on RockyLinux 8.9):

```shell
SQUID_USER=rocky
sudo dnf install -y httpd-tools
sudo htpasswd -c /etc/squid/passwords $SQUID_USER # enter password at prompt
sudo chown squid /etc/squid/passwords
sudo chmod u=rw,go= /etc/squid/passwords
```

This can be tested by running:

```shell
/usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
```

and entering `$SQUID_USER PASSWORD`, which should respond `OK`.

If using the appliance to deploy squid, override the default `squid`
configuration to use basic auth:

```yaml
# environments/$ENV/inventory/group_vars/all/squid.yml:
squid_acls:
  - acl ncsa_users proxy_auth REQUIRED
squid_auth_param: |
  auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
  auth_param basic children 5
  auth_param basic credentialsttl 1 minute
```

See the [squid docs](https://wiki.squid-cache.org/ConfigExamples/Authenticate/Ncsa) for more information.

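Once squid is running with basic auth, it can be checked end-to-end from any host on the cluster network; the address, user and password below are the example values used earlier, not fixed names:

```shell
# Fetch a page via the authenticated proxy; substitute the real
# squid address, user and password for these example values:
curl -sSf -x http://rocky:PASSWORD@10.20.0.121:3128 https://repo.almalinux.org/ -o /dev/null \
  && echo "proxy OK"
```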
## Proxy Configuration

Configure the appliance to enable proxying on all cluster nodes:

```ini
# environments/$ENV/inventory/groups:
...
[proxy:children]
cluster
...
```

Now configure the appliance to set proxy variables via the remote environment
rather than by writing them to the hosts, and provide the basic authentication
credentials:

```yaml
# environments/$ENV/inventory/group_vars/all/proxy.yml:
proxy_basic_user: $SQUID_USER
proxy_basic_password: "{{ vault_proxy_basic_password }}"
proxy_plays_only: true
```

```yaml
# environments/$ENV/inventory/group_vars/all/vault_proxy.yml:
vault_proxy_basic_password: $SECRET
```

This latter file should be vault-encrypted.

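For example, using the standard `ansible-vault` CLI (this prompts for a vault password):

```shell
# Encrypt the secrets file in place:
ansible-vault encrypt environments/$ENV/inventory/group_vars/all/vault_proxy.yml
```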
If using an appliance-deployed squid then the other [proxy role variables](../../ansible/roles/proxy/README.md)
will be automatically constructed (see `environments/common/inventory/group_vars/all/proxy.yml`).
You may need to override `proxy_http_address` if the hostname of the squid node
is not resolvable by the cluster. This is typically the case if squid is deployed
to the deploy host, in which case the IP address may be specified instead, using
the example inventory above:

```yaml
proxy_http_address: "{{ hostvars[groups['squid'] | first].ansible_host }}"
```

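The resulting address can be checked with an ad-hoc `debug` task against hosts in the `proxy` group (inventory path as in the examples above):

```shell
# Show the proxy address each cluster host will use:
ansible -i environments/$ENV/inventory proxy -m debug -a "var=proxy_http_address"
```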
If using an external squid, at a minimum set `proxy_http_address`. You may
also need to set `proxy_http_port` or other [proxy role variables](../../ansible/roles/proxy/README.md)
if the calculated defaults are not appropriate.

## Image build

TODO: probably not currently functional!

## EESSI

Although EESSI will install with the above configuration, as there is no
outbound internet access except for Ansible tasks, making it functional will
require [configuring a proxy for CVMFS](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/access/proxy/#client-system-configuration).

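As a sketch only: the CVMFS client reads its proxy from the standard `CVMFS_HTTP_PROXY` parameter in `/etc/cvmfs/default.local`. Note the CVMFS client cannot supply basic-auth credentials, so this assumes squid is configured to allow unauthenticated access from cluster hosts to the CVMFS servers; the proxy address is the example value from above:

```shell
# Assumed squid address; CVMFS itself cannot authenticate to the proxy:
sudo tee -a /etc/cvmfs/default.local <<'EOF'
CVMFS_HTTP_PROXY="http://10.20.0.121:3128"
EOF
sudo cvmfs_config reload
```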
## Isolation Using Security Group Rules

The below shows the security groups/rules (as displayed by Horizon) which can
be used to "isolate" a cluster when using a network which has a subnet gateway
provided by a router to an external network. It therefore also indicates what
access is required for a different networking configuration.

Security group `isolated`:

    # allow outbound DNS
    ALLOW IPv4 53/tcp to 0.0.0.0/0
    ALLOW IPv4 53/udp to 0.0.0.0/0

    # allow everything within the cluster:
    ALLOW IPv4 from isolated
    ALLOW IPv4 to isolated

    # allow hosts to reach the metadata server (e.g. for cloud-init keys):
    ALLOW IPv4 80/tcp to 169.254.169.254/32

    # allow hosts to reach the squid proxy:
    ALLOW IPv4 3128/tcp to 10.179.2.123/32

Security group `isolated-ssh-https` allows inbound ssh and https (for Open OnDemand):

    ALLOW IPv4 443/tcp from 0.0.0.0/0
    ALLOW IPv4 22/tcp from 0.0.0.0/0

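The `isolated` group above could be created roughly as follows using the OpenStack CLI. Note a newly-created security group includes default allow-all egress rules which must be deleted first, and the squid address is the example value from above:

```shell
openstack security group create isolated
# delete the default allow-all egress rules from the new group, then:
openstack security group rule create --egress --protocol tcp --dst-port 53 --remote-ip 0.0.0.0/0 isolated
openstack security group rule create --egress --protocol udp --dst-port 53 --remote-ip 0.0.0.0/0 isolated
openstack security group rule create --ingress --remote-group isolated isolated
openstack security group rule create --egress --remote-group isolated isolated
openstack security group rule create --egress --protocol tcp --dst-port 80 --remote-ip 169.254.169.254/32 isolated
openstack security group rule create --egress --protocol tcp --dst-port 3128 --remote-ip 10.179.2.123/32 isolated
```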
Then OpenTofu is configured as:

    login_security_groups = [
      "isolated",           # allow all in-cluster services
      "isolated-ssh-https", # access via ssh and ondemand
    ]
    nonlogin_security_groups = [
      "isolated"
    ]

Note that DNS is required (and is configured by the cloud when the subnet has
a gateway) because name resolution happens on the hosts, not on the proxy.
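
Isolation can be sanity-checked from any cluster node: direct outbound access should time out while DNS still resolves (the target URL is arbitrary):

```shell
# Direct access should fail on an isolated node...
curl --max-time 5 -sSf https://repo.almalinux.org/ -o /dev/null \
  && echo "NOT isolated" || echo "isolated"
# ...but DNS resolution should still work:
getent hosts repo.almalinux.org
```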