Description
Hello!
Thanks for preparing this example. I'm testing it on GKE and ran into a problem where the webhook service isn't reachable.
[root@gke-client-tf admission-controller-webhook-demo]# k create -f examples/pod-with-override.yaml -n webhook-demo
Error from server (InternalError): error when creating "examples/pod-with-override.yaml": Internal error occurred: failed calling webhook "webhook-server.webhook-demo.svc": Post https://webhook-server.webhook-demo.svc:443/mutate?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I struggled to find the real root cause at first, since the YAML has failurePolicy: Ignore. I had to change it to failurePolicy: Fail to get to where I am now.
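For reference, this is roughly the part of the webhook configuration I changed (reconstructed from memory, so the exact names and layout may differ slightly from the repo's template):

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: demo-webhook
webhooks:
  - name: webhook-server.webhook-demo.svc
    clientConfig:
      service:
        name: webhook-server
        namespace: webhook-demo
        path: "/mutate"
      caBundle: <base64 CA cert injected by the deploy script>
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    # Changed from Ignore so a broken webhook surfaces as an error
    # instead of pods being admitted silently without mutation.
    failurePolicy: Fail
```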
Here's what I've tried:
Get the pod IP of the webhook service
[root@gke-client-tf leilichao]# k describe svc webhook-server -n webhook-demo
Name: webhook-server
Namespace: webhook-demo
Labels: <none>
Annotations: <none>
Selector: app=webhook-server
Type: ClusterIP
IP: 10.10.11.218
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 10.1.0.168:8443
Session Affinity: None
Events: <none>
10.1.0.168 is the pod IP
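To cross-check that the Service endpoint really is the webhook pod, I would also compare it against the pod list directly (the app=webhook-server selector is taken from the Service output above):

```sh
# Endpoints behind the Service vs. the pod IPs it should be selecting
kubectl -n webhook-demo get endpoints webhook-server
kubectl -n webhook-demo get pods -l app=webhook-server -o wide
```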
Launch a busybox pod
I launched a pod in the same namespace as the webhook and curled from it with the following:
[root@gke-client-tf leilichao]# kubectl -n webhook-demo run curl --image=radial/busyboxplus:curl -i --tty
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ hostname -i
10.1.0.37
[ root@curl:/ ]$ curl -k 10.1.0.168:8443
[ root@curl:/ ]$ curl -k 10.1.0.168:8443
[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc
Server: 10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local
Name: webhook-server.webhook-demo.svc
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping webhook-server.webhook-demo.svc.cluster.local
PING webhook-server.webhook-demo.svc.cluster.local (10.10.11.218): 56 data bytes
^C
--- webhook-server.webhook-demo.svc.cluster.local ping statistics ---
6 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc.cluster.local
Server: 10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local
Name: webhook-server.webhook-demo.svc.cluster.local
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping 10.10.11.218
PING 10.10.11.218 (10.10.11.218): 56 data bytes
^C
--- 10.10.11.218 ping statistics ---
11 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ ping 10.1.0.168
PING 10.1.0.168 (10.1.0.168): 56 data bytes
64 bytes from 10.1.0.168: seq=0 ttl=63 time=0.211 ms
64 bytes from 10.1.0.168: seq=1 ttl=63 time=0.126 ms
64 bytes from 10.1.0.168: seq=2 ttl=63 time=0.086 ms
64 bytes from 10.1.0.168: seq=3 ttl=63 time=0.074 ms
64 bytes from 10.1.0.168: seq=4 ttl=63 time=0.084 ms
^C
--- 10.1.0.168 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.074/0.116/0.211 ms
A few observations here:
- the curl pod is launched at pod IP 10.1.0.37
- the FQDN webhook-server.webhook-demo.svc.cluster.local is resolved by DNS
- the connection to the pod IP 10.1.0.168 works
- ping to the ClusterIP 10.10.11.218 does not work (of course — ClusterIPs are virtual and don't answer ICMP, so the real test is hitting the service port, as sketched below)
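The piece I still need to exercise is the same path the apiserver uses — HTTPS to the service port — rather than ICMP. From the curl pod that would look like this (the /mutate path and port 443 come straight from the apiserver error above):

```sh
# Talk TLS to the Service name and to the ClusterIP directly.
# -k skips certificate verification since the cert is signed by the demo's own CA.
curl -kv https://webhook-server.webhook-demo.svc:443/mutate
curl -kv https://10.10.11.218:443/mutate
```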
See the webhook pod log from another session
You can see the traffic is coming through:
[root@gke-client-tf admission-controller-webhook-demo]# k get pods -n webhook-demo
NAME READY STATUS RESTARTS AGE
curl 1/1 Running 0 54m
webhook-server-6696bf7b88-gnlkr 1/1 Running 0 8h
[root@gke-client-tf admission-controller-webhook-demo]# k logs -n webhook-demo webhook-server-6696bf7b88-gnlkr
2020/07/20 13:30:43 http: TLS handshake error from 10.1.0.37:36756: tls: first record does not look like a TLS handshake
2020/07/20 13:30:59 http: TLS handshake error from 10.1.0.37:36812: tls: first record does not look like a TLS handshake
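Those handshake errors look expected to me: the curl commands above spoke plain HTTP to a port that only serves TLS, so the server rejected the first record. Repeating the request over https (a sketch, with the /mutate path taken from the webhook configuration) should produce a real HTTP response instead:

```sh
# HTTPS straight to the webhook pod; -k because the certificate is issued
# by the demo's self-generated CA rather than a public one.
curl -kv https://10.1.0.168:8443/mutate
```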
Either way, I can confirm the traffic went through from the curl pod.
Now that the pod IP is confirmed reachable and DNS works fine, I suspect there's something wrong with kube-proxy. However, I'm on GKE, where the master node isn't accessible, so I can't check kube-proxy there for details.
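What I can still check from the worker side (assuming GKE runs kube-proxy as static pods that show up in kube-system; node names below are placeholders):

```sh
# List the kube-proxy instance running on each worker node
kubectl -n kube-system get pods -o wide | grep kube-proxy
# Inspect the logs of the instance on the node that hosts the webhook pod
kubectl -n kube-system logs kube-proxy-<node-name>
```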
[root@gke-client-tf leilichao]# gcloud container clusters describe my-private-cluster --zone europe-west2-c
addonsConfig:
kubernetesDashboard:
disabled: true
networkPolicyConfig: {}
autoscaling: {}
binaryAuthorization: {}
clusterIpv4Cidr: 10.1.0.0/16
createTime: '2020-06-24T10:59:41+00:00'
currentMasterVersion: 1.15.11-gke.15
currentNodeCount: 3
currentNodeVersion: 1.15.11-gke.15
I'm on 1.15.11-gke.15
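Since this is a private cluster, one more thing I could check on the GCP side is whether the automatically created firewall rules allow the master to reach the nodes on the webhook's target port 8443 (the name filter below is a guess based on the usual gke-<cluster-name>- prefix):

```sh
# Show the GKE-created firewall rules for this cluster and the ports they allow
gcloud compute firewall-rules list --filter="name~gke-my-private-cluster"
```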
Could you please advise where I should look to fix this issue?
Thanks in advance!