Incorrect MAC Address observed for flannel interface across kubernetes cluster nodes #1788

@amolmishra23

Description

Incorrect MAC Address observed for flannel interface across cluster nodes.

Expected Behavior

When we perform ARP resolution for the flannel interface's IP from every node in the cluster, each node should report the same, correct MAC address. That is not happening here: we observe different MAC addresses for the same flannel IP across the nodes.

Current Behavior

$ arp -an | grep 192.168.143.192
// Result of the above command, run on every node
=========== 10.14.7.42 ===========
? (192.168.143.192) at 32:48:e7:bb:74:d8 [ether] PERM on flannel.1
=========== 10.14.7.44 ===========
? (192.168.143.192) at 52:83:c1:6b:df:08 [ether] PERM on flannel.1
=========== 10.14.7.55 ===========
? (192.168.143.192) at 32:48:e7:bb:74:d8 [ether] PERM on flannel.1
=========== 10.14.7.56 ===========
? (192.168.143.192) at 52:83:c1:6b:df:08 [ether] PERM on flannel.1
=========== 10.14.7.62 ===========
? (192.168.143.192) at 32:48:e7:bb:74:d8 [ether] PERM on flannel.1
=========== 10.14.7.63 ===========
? (192.168.143.192) at 2a:36:32:7b:41:32 [ether] PERM on flannel.1
=========== 10.14.7.64 ===========
? (192.168.143.192) at 32:48:e7:bb:74:d8 [ether] PERM on flannel.1
=========== 10.14.7.43 ===========
Non-zero exit status: 1

$ ifconfig flannel.1
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 192.168.143.192  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::2836:32ff:fe7b:4132  prefixlen 64  scopeid 0x20<link>
        ether 2a:36:32:7b:41:32  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 5 overruns 0  carrier 0  collisions 0

For reference, the values stored in etcd are correct:

$ etcdctl get --prefix /flannel/network
/flannel/network/config
{
      "EnableIPv4": true,
      "Network": "192.168.128.0/18",
      "SubnetLen": 26,
      "Backend": {
          "Type": "vxlan",
          "DirectRouting": false
      }
}
/flannel/network/subnets/192.168.134.192-26
{"PublicIP":"10.14.7.44","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"b6:f1:e3:69:06:ad"}}
/flannel/network/subnets/192.168.137.64-26
{"PublicIP":"10.14.7.55","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"66:14:5a:2b:ae:b0"}}
/flannel/network/subnets/192.168.140.192-26
{"PublicIP":"10.14.7.62","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"f2:17:19:89:06:eb"}}
/flannel/network/subnets/192.168.143.192-26
{"PublicIP":"10.14.7.43","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"2a:36:32:7b:41:32"}}
/flannel/network/subnets/192.168.144.128-26
{"PublicIP":"10.14.7.42","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"46:6e:38:8a:7e:c6"}}
/flannel/network/subnets/192.168.146.192-26
{"PublicIP":"10.14.7.64","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"1a:01:1e:7f:fa:1d"}}
/flannel/network/subnets/192.168.148.128-26
{"PublicIP":"10.14.7.56","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"02:26:18:53:4f:a8"}}
/flannel/network/subnets/192.168.152.128-26
{"PublicIP":"10.14.7.63","PublicIPv6":null,"BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"06:e9:34:03:4e:f4"}}
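The mismatch can be checked mechanically by comparing each node's ARP entry against the VtepMAC that etcd holds for the owning subnet. A rough sketch, assuming passwordless SSH to the nodes and etcdctl v3 on the local host (the node list, VTEP IP, and key prefix are the ones from this report and will differ in other environments):

```shell
#!/bin/sh
# Sketch: compare the ARP entry for one flannel VTEP IP on every node
# against the VtepMAC recorded in etcd. Skipped entirely if etcdctl is
# not installed on this host.

VTEP_IP="192.168.143.192"
NODES="10.14.7.42 10.14.7.43 10.14.7.44 10.14.7.55 10.14.7.56 10.14.7.62 10.14.7.63 10.14.7.64"

# Extract the VtepMAC field from a single-line flannel subnet record.
vtep_mac() {
    sed -n 's/.*"VtepMAC":"\([^"]*\)".*/\1/p'
}

if command -v etcdctl >/dev/null 2>&1; then
    # 10.14.7.43 is the node that owns 192.168.143.192 per the dump above.
    expected=$(etcdctl get --prefix /flannel/network/subnets/ --print-value-only \
               | grep '"PublicIP":"10.14.7.43"' | vtep_mac)
    echo "etcd says $VTEP_IP should resolve to $expected"
    for node in $NODES; do
        seen=$(ssh "$node" "arp -an" | grep "($VTEP_IP)" | awk '{print $4}')
        [ "$seen" = "$expected" ] || echo "MISMATCH on $node: got $seen"
    done
fi
```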

Possible Solution

Periodically resync the ARP entries.
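As a stop-gap, a stale entry can also be repaired by hand. A hedged sketch of what such a resync could look like for one peer, assuming root on the affected node and that the correct VtepMAC/PublicIP pair has been read out of etcd (the values below are taken from the dump above; the commands are only echoed here, drop the `echo` to actually apply them):

```shell
#!/bin/sh
# Sketch: manually resync the ARP and FDB entries for one flannel peer.
# PEER_IP / PEER_MAC / PEER_PUBLIC come from the etcd subnet record;
# applying these requires root and the iproute2 tools.

PEER_IP="192.168.143.192"          # flannel.1 address of the peer node
PEER_MAC="2a:36:32:7b:41:32"       # VtepMAC from etcd for that subnet
PEER_PUBLIC="10.14.7.43"           # PublicIP from etcd

# Replace the permanent neighbour (ARP) entry on the VXLAN device...
echo ip neigh replace "$PEER_IP" lladdr "$PEER_MAC" dev flannel.1 nud permanent
# ...and the forwarding-database entry that steers VXLAN traffic for
# that MAC to the peer's underlay address.
echo bridge fdb replace "$PEER_MAC" dev flannel.1 dst "$PEER_PUBLIC"
```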

Steps to Reproduce (for bugs)

Nothing special was done to reproduce this, but it was observed regularly in our environment.

Context

This regularly causes communication issues between pods in our environment.
As part of a preliminary investigation, we checked the flannel logs in journalctl and observed the following:

Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: E0727 05:23:15.200382   27730 iptables.go:307] Failed to bootstrap IPTables: failed to apply partial iptables-restore unable to run iptables-restore (, ): exit status 4
Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: I0727 05:23:15.245615   27730 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: E0727 05:23:15.276349   27730 iptables.go:307] Failed to bootstrap IPTables: failed to apply partial iptables-restore unable to run iptables-restore (, ): exit status 4
Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: E0727 05:23:15.313107   27730 iptables.go:320] Failed to ensure iptables rules: error setting up rules: failed to apply partial iptables-restore unable to run iptables-restore (, ): exit status 4
Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: I0727 05:23:15.320792   27730 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
Jul 27 05:23:15 system-test-03-cc30210035-node-2 flanneld[27730]: I0727 05:23:15.491607   27730 iptables.go:283] bootstrap done 

We then tried to manually apply the firewall rules it was failing on, by running iptables-restore ourselves, and hit the same issue:

$ cat test1.txt
*filter
-D FORWARD -m comment --comment "flanneld forward" -j FLANNEL-FWD
-A FORWARD -m comment --comment "flanneld forward" -j FLANNEL-FWD
-A FLANNEL-FWD -s 192.168.192.0/18 -m comment --comment "flanneld forward" -j ACCEPT
-A FLANNEL-FWD -d 192.168.192.0/18 -m comment --comment "flanneld forward" -j ACCEPT
COMMIT

$ sudo iptables-restore < test1.txt
iptables-restore v1.4.21: Couldn't load target `FLANNEL-FWD':No such file or directory
Error occurred at line: 2
Try `iptables-restore -h' or 'iptables-restore --help' for more information.
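The "Couldn't load target" error here most likely means the FLANNEL-FWD chain does not exist at the time the file is restored: without --noflush, iptables-restore replaces the whole filter table with only what the file declares, so any chain the rules jump to must be declared in the file itself with a `:CHAIN` line. A sketch of an input that should restore cleanly under that assumption (note also that these rules use 192.168.192.0/18, while the etcd config above says the flannel network is 192.168.128.0/18, which may be worth double-checking):

```
*filter
:FLANNEL-FWD - [0:0]
-A FORWARD -m comment --comment "flanneld forward" -j FLANNEL-FWD
-A FLANNEL-FWD -s 192.168.192.0/18 -m comment --comment "flanneld forward" -j ACCEPT
-A FLANNEL-FWD -d 192.168.192.0/18 -m comment --comment "flanneld forward" -j ACCEPT
COMMIT
```

The -D FORWARD line from the original file is only meaningful with `iptables-restore --noflush` against a live ruleset that already contains the rule, so it is dropped here.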

Your Environment

  • Flannel version: 0.22.0
  • Backend used (e.g. vxlan or udp): VXLAN
  • Etcd version: 3.5.8
  • Kubernetes version (if used): 1.27.3
  • Operating System and version: CentOS Linux release 7.9.2009 (Core)
  • Link to your project (optional):
