Commit 1e3c8f9

define desired behaviour
1 parent f9fc905 commit 1e3c8f9

1 file changed

+85
-9
lines changed

docs/networks.md

Lines changed: 85 additions & 9 deletions
@@ -8,11 +8,13 @@ subnets or associated infrastructure such as routers. The requirements are that:
88
4. At least one network on each node provides outbound internet access (either
99
directly, or via a proxy).
1010

11-
Futhermore, it is recommended that the deploy host has an interface on the
12-
access network. While it is possible to e.g. use a floating IP on a login node
13-
as an SSH proxy to access the other nodes, this can create problems in recovering
14-
the cluster if the login node is unavailable and can make Ansible problems harder
15-
to debug.
11+
Addresses on the "access network" are used for `ansible_host` and `k3s` node IPs.
12+
13+
It is recommended that the deploy host either has a direct connection to the
14+
"access network" or jumps through a host on it which is not part of the appliance.
15+
Using e.g. a floating IP on a login node as a jumphost creates problems in
16+
recovering the cluster if the login node is unavailable and can make Ansible
17+
problems harder to debug.
1618

1719
> [!WARNING]
1820
> If home directories are on a shared filesystem with no authentication (such
@@ -29,8 +31,8 @@ the OpenTofu variables. These will normally be set in
2931
need to be overridden for specific environments, this can be done via an OpenTofu
3032
module as discussed [here](./production.md).
3133

32-
Note that if an OpenStack subnet has a gateway IP defined then nodes with ports
33-
attached to that subnet will get a default route set via that gateway.
34+
Note that if an OpenStack subnet has a gateway IP defined then by default nodes
35+
with ports attached to that subnet get a default route set via that gateway.
3436

3537
## Single network
3638
This is the simplest possible configuration. A single network and subnet is
@@ -77,8 +79,9 @@ vnic_types = {
7779
## Additional networks on some nodes
7880

7981
This example shows how to modify variables for specific node groups. In this
80-
case a baremetal node group has a second network attached. As above, only a
81-
single subnet can have a gateway IP.
82+
case a baremetal node group has a second network attached. Here "subnetA" must
83+
have a gateway IP defined and "subnetB" must not, to avoid routing problems on
84+
the multi-homed compute nodes.
8285

8386
```terraform
8487
cluster_networks = [
@@ -109,3 +112,76 @@ compute = {
109112
}
110113
...
111114
```
115+
116+
## Multiple networks with non-default gateways
117+
118+
In some multiple network configurations it may be necessary to manage default
119+
routes explicitly rather than have them created automatically from a subnet gateway.
120+
This can be done using the OpenTofu variable `gateway_ip`, which can be set for the
121+
cluster and/or overridden on the compute and login groups. If this is set:
122+
- a default route via that address will be created on the appropriate interface
123+
during boot if it does not exist
124+
- any other default routes will be removed
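
The two rules above can be sketched in Python. This is only an illustration of the described behaviour, not the appliance's boot-time implementation: the function name, the `eth0`/`eth1` device names and the `10.0.0.0/24` CIDR are hypothetical, and it assumes the "appropriate interface" is the one whose subnet contains `gateway_ip`.

```python
import ipaddress


def plan_default_routes(gateway_ip, interfaces, existing_defaults):
    """Return the `ip route` commands that leave one default route via gateway_ip.

    interfaces: mapping of device name -> CIDR of the subnet on that device.
    existing_defaults: list of (gateway, device) tuples for current default routes.
    """
    gateway = ipaddress.ip_address(gateway_ip)
    # Assumption: the "appropriate interface" is the one whose subnet contains the gateway.
    matches = [
        dev for dev, cidr in interfaces.items()
        if gateway in ipaddress.ip_network(cidr)
    ]
    if len(matches) != 1:
        raise ValueError(f"expected one interface for {gateway_ip}, found {matches}")
    wanted = (str(gateway), matches[0])

    commands = []
    # Create the desired default route only if it does not already exist.
    if wanted not in existing_defaults:
        commands.append(f"ip route add default via {wanted[0]} dev {wanted[1]}")
    # Remove every other default route.
    for gw, dev in existing_defaults:
        if (gw, dev) != wanted:
            commands.append(f"ip route del default via {gw} dev {dev}")
    return commands


if __name__ == "__main__":
    # Hypothetical node: eth0 on a 172.16.0.0/23 subnet, eth1 on a second subnet
    # which currently supplies the only default route.
    plan = plan_default_routes(
        "172.16.0.1",
        {"eth0": "172.16.0.0/23", "eth1": "10.0.0.0/24"},
        [("10.0.0.1", "eth1")],
    )
    for command in plan:
        print(command)
```

If the desired route already exists and there are no others, the returned plan is empty, which matches the "if it does not exist" wording above.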
125+
126+
For example, the cluster configuration below has a "campus" network, attached
127+
only to the login nodes, whose default gateway provides inbound SSH / ondemand
128+
access and outbound internet, and a "data" network attached to all nodes. The
129+
"data" network has no gateway IP set on its subnet, to avoid dual default routes
130+
and routing conflicts on the multi-homed login nodes, but does have outbound
131+
connectivity via a router:
132+
133+
```terraform
134+
cluster_networks = [
135+
{
136+
network = "data" # access network, CIDR 172.16.0.0/23
137+
subnet = "data_subnet"
138+
}
139+
]
140+
141+
login = {
142+
interactive = {
143+
nodes = ["login-0"]
144+
extra_networks = [
145+
{
146+
network = "campus"
147+
subnet = "campus_subnet"
148+
}
149+
]
150+
}
151+
}
152+
compute = {
153+
general = {
154+
nodes = ["compute-0", "compute-1"]
155+
gateway_ip = "172.16.0.1" # Router interface
156+
}
157+
}
158+
```
159+
160+
If there is no default route at all (either from a subnet gateway or from
161+
`gateway_ip`) then a dummy route is created via the access network interface to
162+
ensure [correct](https://docs.k3s.io/installation/airgap#default-network-route)
163+
`k3s` operation.
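
To check whether a node falls into this "no default route at all" case, one option is to inspect the JSON output of `ip -j route show default`. The sketch below is a hypothetical helper, not part of the appliance, and the sample strings are illustrative rather than captured from a real node.

```python
import json


def has_default_route(ip_route_json):
    """Return True if the JSON text from `ip -j route show` lists a default route."""
    routes = json.loads(ip_route_json)
    # iproute2 reports the default route with "dst" set to "default".
    return any(route.get("dst") == "default" for route in routes)


if __name__ == "__main__":
    # Illustrative samples: one node with a default route, one with none.
    with_route = '[{"dst": "default", "gateway": "172.16.0.1", "dev": "eth0"}]'
    without_route = "[]"
    print(has_default_route(with_route))
    print(has_default_route(without_route))
```

On a live node the input could come from `subprocess.run(["ip", "-j", "route", "show", "default"], capture_output=True, text=True).stdout`.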
164+
165+
## Proxies
166+
167+
If some nodes have no outbound connectivity via any network, the cluster can
168+
be configured to deploy a [squid proxy](https://www.squid-cache.org/) on a node
169+
with outbound connectivity. Assuming the `compute` and `control` nodes have no
170+
outbound connectivity and the `login` node does, the minimal configuration for
171+
this is:
172+
173+
```ini
174+
# environments/$SITE/inventory/groups:
175+
[squid:children]
176+
login
177+
[proxy:children]
178+
control
179+
compute
180+
```
181+
182+
```yaml
183+
# environments/$SITE/inventory/group_vars/all/squid.yml:
184+
# these are just examples
185+
squid_cache_disk: 1024 # MB
186+
squid_cache_mem: '12 GB'
187+
```
