Description
Hello!
Thanks for preparing this example. I'm testing it on GKE and ran into a problem where the webhook service isn't reachable.
[root@gke-client-tf admission-controller-webhook-demo]# k create -f examples/pod-with-override.yaml -n webhook-demo
Error from server (InternalError): error when creating "examples/pod-with-override.yaml": Internal error occurred: failed calling webhook "webhook-server.webhook-demo.svc": Post https://webhook-server.webhook-demo.svc:443/mutate?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I struggled to find the real root cause at first, since the YAML has failurePolicy: Ignore. I had to change it to failurePolicy: Fail to get to where I am now.
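For reference, this is roughly the part of the webhook configuration I changed (reconstructed from memory, so the exact names and layout may differ slightly from the repo's template):

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: demo-webhook
webhooks:
  - name: webhook-server.webhook-demo.svc
    clientConfig:
      service:
        name: webhook-server
        namespace: webhook-demo
        path: "/mutate"
      caBundle: <base64 CA cert injected by the deploy script>
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    # Changed from Ignore so a broken webhook surfaces as an error
    # instead of pods being admitted silently without mutation.
    failurePolicy: Fail
```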
Here's what I've tried:
Get the pod IP of the webhook service
[root@gke-client-tf leilichao]# k describe svc webhook-server -n webhook-demo
Name: webhook-server
Namespace: webhook-demo
Labels: <none>
Annotations: <none>
Selector: app=webhook-server
Type: ClusterIP
IP: 10.10.11.218
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 10.1.0.168:8443
Session Affinity: None
Events: <none>
10.1.0.168 is the pod IP
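To cross-check that the Service endpoint really is the webhook pod, I would also compare it against the pod list directly (the app=webhook-server selector is taken from the Service output above):

```sh
# Endpoints behind the Service vs. the pod IPs it should be selecting
kubectl -n webhook-demo get endpoints webhook-server
kubectl -n webhook-demo get pods -l app=webhook-server -o wide
```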
Launch a busybox pod
I launched a pod in the same namespace as the webhook and curled from it with the following:
[root@gke-client-tf leilichao]# kubectl -n webhook-demo run curl --image=radial/busyboxplus:curl -i --tty
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ hostname -i
10.1.0.37
[ root@curl:/ ]$ curl -k 10.1.0.168:8443
[ root@curl:/ ]$ curl -k 10.1.0.168:8443
[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc
Server: 10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local
Name: webhook-server.webhook-demo.svc
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping webhook-server.webhook-demo.svc.cluster.local
PING webhook-server.webhook-demo.svc.cluster.local (10.10.11.218): 56 data bytes
^C
--- webhook-server.webhook-demo.svc.cluster.local ping statistics ---
6 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc.cluster.local
Server: 10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local
Name: webhook-server.webhook-demo.svc.cluster.local
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping 10.10.11.218
PING 10.10.11.218 (10.10.11.218): 56 data bytes
^C
--- 10.10.11.218 ping statistics ---
11 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ ping 10.1.0.168
PING 10.1.0.168 (10.1.0.168): 56 data bytes
64 bytes from 10.1.0.168: seq=0 ttl=63 time=0.211 ms
64 bytes from 10.1.0.168: seq=1 ttl=63 time=0.126 ms
64 bytes from 10.1.0.168: seq=2 ttl=63 time=0.086 ms
64 bytes from 10.1.0.168: seq=3 ttl=63 time=0.074 ms
64 bytes from 10.1.0.168: seq=4 ttl=63 time=0.084 ms
^C
--- 10.1.0.168 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.074/0.116/0.211 ms
A few observations here:
- the curl pod is launched at pod IP 10.1.0.37
- the FQDN webhook-server.webhook-demo.svc.cluster.local is resolved by DNS
- the connection to the pod IP 10.1.0.168 works
- ping to the ClusterIP 10.10.11.218 does not work (of course — ClusterIPs are virtual and don't answer ICMP, so the real test is hitting the service port, as sketched below)
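The piece I still need to exercise is the same path the apiserver uses — HTTPS to the service port — rather than ICMP. From the curl pod that would look like this (the /mutate path and port 443 come straight from the apiserver error above):

```sh
# Talk TLS to the Service name and to the ClusterIP directly.
# -k skips certificate verification since the cert is signed by the demo's own CA.
curl -kv https://webhook-server.webhook-demo.svc:443/mutate
curl -kv https://10.10.11.218:443/mutate
```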
See the webhook pod log from another session
You can see the traffic is coming through:
[root@gke-client-tf admission-controller-webhook-demo]# k get pods -n webhook-demo
NAME READY STATUS RESTARTS AGE
curl 1/1 Running 0 54m
webhook-server-6696bf7b88-gnlkr 1/1 Running 0 8h
[root@gke-client-tf admission-controller-webhook-demo]# k logs -n webhook-demo webhook-server-6696bf7b88-gnlkr
2020/07/20 13:30:43 http: TLS handshake error from 10.1.0.37:36756: tls: first record does not look like a TLS handshake
2020/07/20 13:30:59 http: TLS handshake error from 10.1.0.37:36812: tls: first record does not look like a TLS handshake
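Those handshake errors look expected to me: the curl commands above spoke plain HTTP to a port that only serves TLS, so the server rejected the first record. Repeating the request over https (a sketch, with the /mutate path taken from the webhook configuration) should produce a real HTTP response instead:

```sh
# HTTPS straight to the webhook pod; -k because the certificate is issued
# by the demo's self-generated CA rather than a public one.
curl -kv https://10.1.0.168:8443/mutate
```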
Either way, I can confirm the traffic went through from the curl pod.
Now that the pod IP is confirmed reachable and DNS works fine, I suspect there's something wrong with kube-proxy. However, I'm on GKE, where the master node isn't accessible, so I can't check kube-proxy there for details.
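What I can still check from the worker side (assuming GKE runs kube-proxy as static pods that show up in kube-system; node names below are placeholders):

```sh
# List the kube-proxy instance running on each worker node
kubectl -n kube-system get pods -o wide | grep kube-proxy
# Inspect the logs of the instance on the node that hosts the webhook pod
kubectl -n kube-system logs kube-proxy-<node-name>
```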
[root@gke-client-tf leilichao]# gcloud container clusters describe my-private-cluster --zone europe-west2-c
addonsConfig:
kubernetesDashboard:
disabled: true
networkPolicyConfig: {}
autoscaling: {}
binaryAuthorization: {}
clusterIpv4Cidr: 10.1.0.0/16
createTime: '2020-06-24T10:59:41+00:00'
currentMasterVersion: 1.15.11-gke.15
currentNodeCount: 3
currentNodeVersion: 1.15.11-gke.15
I'm on 1.15.11-gke.15
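Since this is a private cluster, one more thing I could check on the GCP side is whether the automatically created firewall rules allow the master to reach the nodes on the webhook's target port 8443 (the name filter below is a guess based on the usual gke-<cluster-name>- prefix):

```sh
# Show the GKE-created firewall rules for this cluster and the ports they allow
gcloud compute firewall-rules list --filter="name~gke-my-private-cluster"
```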
Could you please advise where I should look to fix this issue?
Thanks in advance!