Service connectivity from robot agent #1891
Hi, networking is not exactly my strong suit, and it seems I have reached the end of my (limited) wisdom with the following issue: I added a robot node, configured everything according to the docs, and got the robot node to show up in the cluster with pods being scheduled on it. So far so good. If I ssh into the robot node, I can ping all my nodes and pods. What I can't ping, and this is where my issue starts, are the service IPs (everything in 10.43.0.0/16). This causes the longhorn pod created by a DaemonSet to fail with:
Clearly, this is a networking issue. I use the default. I'm currently testing with 'not sharing routes with vSwitch' in hetzner/console/network (unsuccessfully, I might add). Did I miss something? Do I have to enable some additional routing? HCCM seems to be configured correctly (networking is enabled, clusterCIDR is set correctly)... Any suggestion or idea would be greatly appreciated.
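For anyone debugging something similar: as far as I know, ping is not a conclusive test for service IPs, because with kube-proxy in iptables mode a ClusterIP is a virtual address that typically does not answer ICMP. A TCP connection to a service port is a more reliable check. Below is a minimal sketch using only the Python standard library. The function names are made up for illustration, and the CIDRs are simply the two ranges mentioned in this thread.

```python
import ipaddress
import socket

# Ranges taken from this thread: 10.42.0.0/16 is the pod range (clusterCIDR),
# 10.43.0.0/16 is the service range. Adjust if your cluster differs.
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")
SERVICE_CIDR = ipaddress.ip_network("10.43.0.0/16")


def classify(ip: str) -> str:
    """Return 'pod', 'service', or 'other' for an IPv4 address."""
    addr = ipaddress.ip_address(ip)
    if addr in POD_CIDR:
        return "pod"
    if addr in SERVICE_CIDR:
        return "service"
    return "other"


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a TCP connection; True if the handshake completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable routes, and timeouts.
        return False
```

Run on the robot node, something like can_connect("10.43.0.1", 443) would test the in-cluster API service, assuming it sits at that address in your setup (check with kubectl get svc). If pod IPs connect but service IPs do not, the problem is more likely in the service routing on that node than in the underlying network.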
Replies: 1 comment
It seems I managed to solve it. For future reference, in kube.tf:
hetzner_ccm_use_helm = true
hetzner_ccm_values = <<EOT
args:
  cloud-provider: hcloud
  webhook-secure-port: "0"
kind: Deployment
replicaCount: 1
env:
  HCLOUD_TOKEN:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: token
  ROBOT_USER:
    valueFrom:
      secretKeyRef:
        name: robot-secret
        key: robot-username
        optional: true
  ROBOT_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: robot-secret
        key: robot-password
        optional: true
  HCLOUD_LOAD_BALANCERS_ENABLED:
    value: "true"
  HCLOUD_LOAD_BALANCERS_LOCATION:
    value: "nbg1" # your value
  HCLOUD_LOAD_BALANCERS_USE_PRIVATE_IP:
    value: "true"
  HCLOUD_LOAD_BALANCERS_DISABLE_PRIVATE_INGRESS:
    value: "true"
  HCLOUD_NETWORK_ROUTES_ENABLED:
    value: "false" # this is not needed anyway if you just use flannel and not any other fancy networking layer like cilium
image:
  repository: docker.io/hetznercloud/hcloud-cloud-controller-manager
monitoring:
  enabled: true
  podMonitor:
    enabled: false
networking:
  enabled: true
  clusterCIDR: 10.42.0.0/16 # this is the default value set by k3s and kube-hetzner uses it as well
  network:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: network
resources:
  requests:
    cpu: 100m
    memory: 50Mi
additionalTolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
priorityClassName: "system-cluster-critical"
robot:
  enabled: false # if you enable this on cluster creation, you have to make sure the secret referenced in the env vars is available, otherwise the deploy will run into a timeout... in my cluster I redeployed the helm chart with robot enabled and the secret available
rbac:
  create: true
EOT
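The env vars above expect a secret named robot-secret with the keys robot-username and robot-password. A rough sketch of how such a Secret manifest could be generated is below. The helper name is hypothetical, and the kube-system namespace is an assumption on my part (it should be whichever namespace the CCM runs in).

```python
import json


def robot_secret_manifest(username: str, password: str,
                          namespace: str = "kube-system") -> dict:
    """Build a Secret manifest matching the secretKeyRef entries in the values.

    The namespace default is an assumption; use the CCM's namespace.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "robot-secret", "namespace": namespace},
        "type": "Opaque",
        # stringData lets the API server handle base64 encoding.
        "stringData": {
            "robot-username": username,
            "robot-password": password,
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl apply accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Printing to_json(robot_secret_manifest(user, password)) and piping the result into kubectl apply -f - should create the secret before the chart is redeployed with robot enabled.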
Once you have made sure your CCM is deployed correctly, make sure the network interfaces on the robot node are not cluttered by previous attempts and your k3s-agent is properly configured (you should use