
pod cluster network does not work when number of worker nodes > 1 #194

@doschkinow

Description

Terraform Version

[pdos@ol7 terraform-kubernetes-installer]$ terraform -v
Terraform v0.11.3

  • provider.null v1.0.0
  • provider.oci v2.1.4
  • provider.random v1.2.0
  • provider.template v1.0.0
  • provider.tls v1.1.0

OCI Provider Version

[pdos@ol7 terraform-kubernetes-installer]$ ls -l terraform.d/plugins/linux_amd64/terraform-provider-oci_v2.1.4
-rwxr-xr-x. 1 pdos pdos 28835846 Apr 10 09:33 terraform.d/plugins/linux_amd64/terraform-provider-oci_v2.1.4

Terraform Installer for Kubernetes Version

v1.3.0

Input Variables

[pdos@ol7 terraform-kubernetes-installer]$ cat terraform.tfvars

# OCI authentication

region = "us-ashburn-1"
tenancy_ocid = "ocid1.tenancy.oc1..aaaaaaaa4jaw55rds22u6yaiy5fxt5qxjr2ja4l5fzkv4hci7kwmexv3hpqq"
compartment_ocid = "ocid1.compartment.oc1..aaaaaaaakvhehb5u7nrupwuunhefoedpbegvbnysvz5pdfluxt5wxl5aquwa"
fingerprint = "15:fd:5a:0f:7b:f7:c8:d0:82:f5:20:f8:97:07:42:02"
private_key_path = "/home/pdos/.oci/oci_api_key.pem"
user_ocid = "ocid1.user.oc1..aaaaaaaai3a6zzhjw23wncjhk5ogvjmk4x22zsws6xn4ydmzzlxoo6rthxya"

#tenancy_ocid = "ocid1.tenancy.oc1..aaaaaaaa763cu5f3m7qpzwnvr2shs3o26ftrn7fkgz55cpzgxmglgtui3v7q"
#compartment_ocid = "ocid1.compartment.oc1..aaaaaaaaidy3jl7bdmiwfryo6myhdnujcuug5zxzoclsz7vpfzw4bggng7iq"
#fingerprint = "ed:51:83:3b:d2:04:f4:af:9d:7b:17:96:dd:8a:99:bc"
#private_key_path = "/tmp/oci_api_key.pem"
#user_ocid = "ocid1.user.oc1..aaaaaaaa5fy2l5aki6z2bzff5yrrmlahiif44vzodeetygxmpulq3mbnckya"

# CCM user

#cloud_controller_user_ocid = "ocid1.tenancy.oc1..aaaaaaaa763cu5f3m7qpzwnvr2shs3o26ftrn7fkgz55cpzgxmglgtui3v7q"
#cloud_controller_user_fingerprint = "ed:51:83:3b:d2:04:f4:af:9d:7b:17:96:dd:8a:99:bc"
#cloud_controller_user_private_key_path = "/tmp/oci_api_key.pem"

etcdShape = "VM.Standard1.1"
k8sMasterShape = "VM.Standard1.1"
k8sWorkerShape = "VM.Standard2.1"

etcdAd1Count = "0"
etcdAd2Count = "0"
etcdAd3Count = "1"

k8sMasterAd1Count = "0"
k8sMasterAd2Count = "0"
k8sMasterAd3Count = "1"

k8sWorkerAd1Count = "0"
k8sWorkerAd2Count = "1"
k8sWorkerAd3Count = "1"

etcdLBShape = "400Mbps"
k8sMasterLBShape = "400Mbps"
#etcd_ssh_ingress = "10.0.0.0/16"
etcd_ssh_ingress = "0.0.0.0/0"
#etcd_cluster_ingress = "10.0.0.0/16"
master_ssh_ingress = "0.0.0.0/0"
worker_ssh_ingress = "0.0.0.0/0"
master_https_ingress = "0.0.0.0/0"
worker_nodeport_ingress = "0.0.0.0/0"
#worker_nodeport_ingress = "10.0.0.0/16"

control_plane_subnet_access = "public"
k8s_master_lb_access = "public"
#natInstanceShape = "VM.Standard1.2"
#nat_instance_ad1_enabled = "true"
#nat_instance_ad2_enabled = "false"
#nat_instance_ad3_enabled = "true"
#nat_ssh_ingress = "0.0.0.0/0"
public_subnet_http_ingress = "0.0.0.0/0"
public_subnet_https_ingress = "0.0.0.0/0"

#worker_iscsi_volume_create is a bool not a string
#worker_iscsi_volume_create = true
#worker_iscsi_volume_size = 100

#etcd_iscsi_volume_create = true
#etcd_iscsi_volume_size = 50

Description of issue:

Pods deployed to a worker node different from the node where the kube-dns pod is running are not able to resolve kubernetes.default. This is true even when the worker nodes are in the same availability domain.
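
A quick way to confirm this is a pod network problem rather than a DNS misconfiguration is to bypass DNS and ping the kube-dns pod IP directly from the failing pod (a minimal sketch; the pod name and the 10.99.78.2 address are taken from the listings below):

[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-lvz4z -- ping -c 3 10.99.78.2

If cross-node pod traffic is broken, this ping from the pod on k8s-worker-ad3-1 to the kube-dns pod on k8s-worker-ad3-0 should fail as well, while the same ping from busybox-56b5f5cd9d-6brvj (co-located with kube-dns) should succeed.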

Steps to reproduce:

  • terraform apply (with a terraform.tfvars similar to the one above)
  • deploy a busybox pod in the cluster and scale it to 2 replicas, so there is a pod on each worker node (see the sketch after this list)
  • note which node the kube-dns pod is deployed on
  • exec into the busybox pod on the same node - here "nslookup kubernetes.default" works
  • exec into the busybox pod on the other worker node - here "nslookup kubernetes.default" fails
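
One way to create that deployment (a sketch, assuming a kubectl version where run still creates a Deployment, as it did when this installer shipped, and using busybox:1.28, whose nslookup is known to work):

[opc@k8s-master-ad3-0 ~]$ kubectl run busybox --image=busybox:1.28 --replicas=2 -- sleep 3600
[opc@k8s-master-ad3-0 ~]$ kubectl get pod -o wide    # check that the two replicas landed on different worker nodes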

Below is a log of kubectl commands that illustrates this:
[opc@k8s-master-ad3-0 ~]$ kubectl -n kube-system get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kube-apiserver-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com 1/1 Running 0 4m 10.0.32.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-controller-manager-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com 1/1 Running 0 4m 10.0.32.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-dns-596797cd48-lghdb 3/3 Running 0 5m 10.99.78.2 k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com 1/1 Running 0 4m 10.0.32.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com 1/1 Running 0 3m 10.0.42.3 k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-proxy-k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com 1/1 Running 0 3m 10.0.42.2 k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
kube-scheduler-k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com 1/1 Running 0 4m 10.0.32.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
kubernetes-dashboard-796487df76-d8q7f 1/1 Running 0 5m 10.99.69.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
oci-cloud-controller-manager-sdqv5 1/1 Running 0 5m 10.0.32.2 k8s-master-ad3-0.k8smasterad3.k8sbmcs.oraclevcn.com
oci-volume-provisioner-66f47d7fcf-ks6pk 1/1 Running 0 5m 10.99.17.2 k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com

[opc@k8s-master-ad3-0 ~]$ kubectl scale deployment busybox --replicas=2
deployment "busybox" scaled
[opc@k8s-master-ad3-0 ~]$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
busybox-56b5f5cd9d-6brvj 1/1 Running 0 6s 10.99.78.4 k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
busybox-56b5f5cd9d-lvz4z 1/1 Running 1 13m 10.99.17.5 k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-7z772 1/1 Running 0 15m 10.99.17.3 k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-c6lrj 1/1 Running 0 15m 10.99.78.3 k8s-worker-ad3-0.k8sworkerad3.k8sbmcs.oraclevcn.com
nginx-7cbc4b4d9c-k2kjr 1/1 Running 0 15m 10.99.17.4 k8s-worker-ad3-1.k8sworkerad3.k8sbmcs.oraclevcn.com
[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-6brvj nslookup kubernetes.default
Server: 10.21.21.21
Address 1: 10.21.21.21 kube-dns.kube-system.svc.cluster.local

Name: kubernetes.default
Address 1: 10.21.0.1 kubernetes.default.svc.cluster.local
[opc@k8s-master-ad3-0 ~]$ kubectl exec -it busybox-56b5f5cd9d-lvz4z nslookup kubernetes.default
Server: 10.21.21.21
Address 1: 10.21.21.21

nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1
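
Since the lookup times out rather than returning a wrong answer, the UDP packets from the pod on k8s-worker-ad3-1 apparently never make it across nodes. Two node-level checks worth running on each worker (a hypothetical checklist, assuming the flannel-style overlay this installer sets up; 10.99.0.0/16 is the pod CIDR inferred from the pod IPs above):

[opc@k8s-worker-ad3-1 ~]$ ip route | grep 10.99.               # expect a route covering the other worker's pod subnet
[opc@k8s-worker-ad3-1 ~]$ sudo iptables -S FORWARD | head -1   # -P FORWARD DROP here blocks pod-to-pod traffic

A default DROP policy on the FORWARD chain (set by some Docker versions) produces exactly this cross-node symptom; iptables -P FORWARD ACCEPT is a quick way to test that hypothesis.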
