Fix Cilium lxc* network interface filtering crash #724
Forgot to attach, it just happens that a pod ssh xxx
$ ip -o link show | awk -F': ' '/mtu (1450|1280)/ {print $2}' | grep -Ev 'cilium|br|flannel|docker|veth'
lxc6b9e41974beb@if1025
enp7s0
lxc2ea2f0368cbe@if90
lxcfe7fafe80413@if94
lxc5f5f942b8937@if98
lxc0cde58a77769@if100
lxcb7e446c960e4@if102
lxc06bac5834002@if104
lxc098c18cde290@if106
lxc8bc8349de570@if108
lxc55921d5b14bb@if110
lxc188c516a08e3@if667
lxc_health@if669
lxcc22128de63df@if223
lxcbca769de0c00@if225
With the fix we only get:
$ ip -o link show | awk -F': ' '/mtu (1450|1280)/ {print $2}' | grep -Ev 'cilium|lxc|br|flannel|docker|veth'
enp7s0

Thanks! I am surprised it wasn't reported before. I've merged the PR for now, will make a release perhaps over the weekend :)

Me too! I actually had to delete the pod that had this interface, found via:
$ cilium-dbg endpoint list -o json | jq '.[] | select(.status.networking."interface-name" == "lxc6b9e41974beb") | .status."external-identifiers"."k8s-pod-name"'
^ By running this in the Cilium pod on the bad node. The pod got recreated by the ReplicaSet and got a different lxc interface name that sorted below enp7s0, and the deploy passed haha, but obviously
@vitobotta can you please take a look at the Grow Partition ClusterAutoscaler fix PR too? #699

I haven't had a chance to test that one yet. I'll see if I have time over the weekend.



We just faced a critical bug which crashes hetzner-k3s create --config on an existing k3s cluster with Cilium CNI:
Cilium docs state that it creates a network interface for each pod with the lxcXXXXXX naming schema:
And we do have tons of these on each node:
(We don't run anything via LXC, only k3s with Cilium CNI).
This PR adds the lxc prefix to the grep filter in the master & worker install scripts so that Cilium lxc* interfaces are safely ignored. Previously they could be picked up as the private network interface on the node, which crashed the deploy.
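
To illustrate the effect of the filter change, here is a minimal sketch that runs the old and new grep patterns against a canned sample instead of a live node. The sample lines, their index numbers, and the helper names sample_links and filter_links are hypothetical; only the awk and grep expressions are taken from the commands in this thread. How the install scripts choose among the remaining names is not shown here.

```shell
#!/bin/sh
# Hypothetical, trimmed sample of `ip -o link show` output.
# Real output carries more fields per line; indices are made up.
sample_links() {
cat <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP
3: lxc6b9e41974beb@if1025: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
4: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP
5: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
6: lxc_health@if669: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
EOF
}

# Keep interface names whose line has mtu 1450 or 1280,
# then drop names matching the exclusion pattern passed as $1.
filter_links() {
  awk -F': ' '/mtu (1450|1280)/ {print $2}' | grep -Ev "$1"
}

# Pattern before the fix: lxc* names slip through.
old=$(sample_links | filter_links 'cilium|br|flannel|docker|veth')
# Pattern after the fix: lxc is excluded as well.
new=$(sample_links | filter_links 'cilium|lxc|br|flannel|docker|veth')

printf 'old filter:\n%s\n\n' "$old"
printf 'new filter:\n%s\n' "$new"
```

With this sample, the old pattern lets the lxc names through alongside enp7s0, while the new pattern leaves only enp7s0, matching what was observed on the node above.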