Commit da9a827

Add troubleshooting of guest cluster LB IP is not reachable
Signed-off-by: Jian Wang <[email protected]>
1 parent 7f83cd9 commit da9a827

3 files changed, +61 -1 lines changed

docs/rancher/cloud-provider.md

Lines changed: 3 additions & 1 deletion
@@ -393,7 +393,9 @@ Harvester's built-in load balancer offers both **DHCP** and **Pool** modes, and

 :::note

-Modifying the `IPAM` mode isn't allowed. You must create a new service if you intend to change the `IPAM` mode.
+- Modifying the `IPAM` mode isn't allowed. You must create a new service if you intend to change the `IPAM` mode.
+
+- Refer to [Guest Cluster Loadbalancer IP is not reachable](../troubleshooting/rancher.md#guest-cluster-loadbalancer-ip-is-not-reachable).

 :::


docs/troubleshooting/rancher.md

Lines changed: 58 additions & 0 deletions
@@ -83,3 +83,61 @@ Related issues:

 - Harvester: [#7105](https://github.com/harvester/harvester/issues/7105) and [#7284](https://github.com/harvester/harvester/issues/7284)
 - Rancher: [#45628](https://github.com/rancher/rancher/issues/45628)
+
+## Guest Cluster Loadbalancer IP is not reachable
+
+### Issue Description
+
+1. Create a new [guest cluster](../rancher/node/rke2-cluster.md#create-rke2-kubernetes-cluster) with the default `Container Network: Calico` and the default `Cloud Provider: Harvester`.
+
+1. Deploy `nginx` on the new guest cluster by running `kubectl apply -f https://k8s.io/examples/application/deployment.yaml`.
+
+1. Create a [Load Balancer](../rancher/cloud-provider.md#load-balancer-support) that selects the `nginx` backend.
+
+1. The service becomes ready with an IP allocated from the DHCP server or an IPPool, but the page might fail to load when you click the link.
+
+![](/img/v1.5/troubleshooting/gc-lb-is-not-reachable.png)
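The symptom in step 4 can also be confirmed from a shell instead of a browser. The following is a minimal sketch, not part of the commit: the service name `nginx-lb` and the `default` namespace are hypothetical placeholders, and the check assumes the backend serves plain HTTP on port 80.

```sh
# Hypothetical names: replace with your own Service name and namespace.
SVC_NAME="${SVC_NAME:-nginx-lb}"
SVC_NAMESPACE="${SVC_NAMESPACE:-default}"

# Print the external IP that the load balancer assigned to the Service.
lb_ip() {
  kubectl -n "$SVC_NAMESPACE" get service "$SVC_NAME" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Report whether an HTTP request to the given IP succeeds within 5 seconds.
check_lb() {
  if curl --silent --fail --max-time 5 --output /dev/null "http://$1/"; then
    echo "reachable: $1"
  else
    echo "NOT reachable: $1"
  fi
}

# Usage, from a machine on the same network as the load balancer IP:
#   check_lb "$(lb_ip)"
```

Run `lb_ip` with a kubeconfig that points at the guest cluster, since the Service lives there.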
+
+### Root Cause
+
+In the example below, the IP of the guest cluster node (a Harvester VM) is `10.115.1.46`, and a new load balancer IP `10.115.6.200` is later added to a new interface such as `vip-fd8c28ce (@enp1s0)`. However, the `calico` controller takes over the load balancer IP and uses it as the local address of `vxlan.calico`, which makes the load balancer IP unreachable.
+
+```sh
+$ ip -d link show dev vxlan.calico
+44: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
+link/ether 66:a7:41:00:1d:ba brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
+info: Using default fan map value (33)
+vxlan id 4096 local 10.115.6.200 dev vip-8a928fa0 srcport 0 0 dstport 4789 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
+```
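The relevant fields in the output above are `local` and `dev` on the `vxlan id` line. As a rough sketch, not part of the commit, they can be extracted and classified as follows. The function names are hypothetical, and the check assumes that load balancer interfaces are always named `vip-*`, as in the examples here.

```sh
# Read `ip -d link show dev vxlan.calico` output on stdin and print
# "<local IP> <underlying device>" taken from the "vxlan id ..." line.
vxlan_binding() {
  awk '$1 == "vxlan" {
         for (i = 1; i < NF; i++) {
           if ($i == "local") ip = $(i + 1)
           if ($i == "dev")   dev = $(i + 1)
         }
       }
       END { print ip, dev }'
}

# Classify a binding: a vip-* device means the load balancer IP was picked.
binding_status() {
  case "$2" in
    vip-*) echo "affected: VXLAN uses load balancer IP $1 on $2" ;;
    *)     echo "ok: VXLAN uses $1 on $2" ;;
  esac
}

# Usage on a guest cluster node:
#   binding_status $(ip -d link show dev vxlan.calico | vxlan_binding)
```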
+
+### Workaround
+
+For existing clusters, run `kubectl edit installation`, go to `.spec.calicoNetwork.nodeAddressAutodetectionV4`, remove any existing line such as `firstFound: true`, add the new line `skipInterface: vip.*`, and save.
+
+After a short wait, the DaemonSet `calico-system/calico-node` is updated through a rolling update, and the related pods then use the node IP for VXLAN.
+
+```sh
+$ ip -d link show dev vxlan.calico
+45: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
+link/ether 66:a7:41:00:1d:ba brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
+info: Using default fan map value (33)
+vxlan id 4096 local 10.115.1.46 dev enp1s0 srcport 0 0 dstport 4789 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
+```
+
+The load balancer IP is reachable again.
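The interactive edit for existing clusters could also be scripted. The sketch below is not part of the commit and rests on two assumptions: the Installation resource is named `default`, and `firstFound` is the only autodetection method currently set. Any other method key would need its own `null` entry in the patch.

```sh
# JSON merge patch: a null value deletes the key, so firstFound is removed
# and skipInterface is set in a single request.
PATCH='{"spec":{"calicoNetwork":{"nodeAddressAutodetectionV4":{"firstFound":null,"skipInterface":"vip.*"}}}}'

# Assumption: the Installation resource is named "default".
INSTALLATION_NAME="${INSTALLATION_NAME:-default}"

apply_skip_vip() {
  kubectl patch installation "$INSTALLATION_NAME" --type merge -p "$PATCH" &&
    # Block until the calico-node DaemonSet finishes its rolling update.
    kubectl -n calico-system rollout status daemonset/calico-node
}

# Usage, with a kubeconfig that points at the guest cluster:
#   apply_skip_vip
```

Waiting on `rollout status` replaces the manual wait, so the `ip -d link show dev vxlan.calico` check can run as soon as the command returns.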
+
+
+When creating new clusters in `Rancher Manager`, click **Add-on: Calico** and add the following two lines to `.installation.calicoNetwork`, so that the `calico` controller does not accidentally take over the load balancer IP.
+
+```yaml
+installation:
+  backend: VXLAN
+  calicoNetwork:
+    bgp: Disabled
+    nodeAddressAutodetectionV4: # add this line
+      skipInterface: vip.* # add this line
+```
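To see which interface names the `vip.*` pattern excludes, it can be tried against the names that appear in this document. This is an illustrative sketch, not part of the commit. It uses `grep -E` as a stand-in for Calico's own matcher, and it assumes the pattern must match the whole interface name.

```sh
# Return success if the interface name ($2) matches the skip pattern ($1).
# Assumption of this sketch: the pattern must match the whole name.
is_skipped() {
  printf '%s\n' "$2" | grep -Eq "^($1)$"
}

for iface in enp1s0 vip-8a928fa0 vip-fd8c28ce; do
  if is_skipped 'vip.*' "$iface"; then
    echo "$iface: skipped"
  else
    echo "$iface: candidate for autodetection"
  fi
done
```

Only `enp1s0` remains a candidate, which matches the corrected `local 10.115.1.46 dev enp1s0` binding shown in the workaround output.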
+
+### Related Issue
+
+https://github.com/harvester/harvester/issues/8072
Binary file, 311 KB