Device manager does not provide devices to the application container after reboot #63

@ratermir

Description

I am not sure whether this is a real issue or whether I am doing something wrong, but:

I have prepared my single-node microk8s cluster for Home Assistant, installed this device plugin, and propagated /dev/ttyUSB0 and /dev/zigbee2 (a symlink to the first one) to the "zigbee2mqtt" pod.

After the first installation everything worked well, but after a reboot the "zigbee2mqtt" pod (with /dev/ttyUSB0 and /dev/zigbee2 imported) didn't start. The pod stays in the state "UnexpectedAdmissionError"; another pod is created in the state "Pending", it fails too, a new one is created, and so on.

zmq-5984c5f8cd-fxxhl    0/1     Pending                    0               3m14s
zmq-5984c5f8cd-z7dvd    0/1     UnexpectedAdmissionError   0               9m24s

The pod description shows the following error (there is no log, since the pod didn't start):

Events:
  Type     Reason                    Age   From     Message
  ----     ------                    ----  ----     -------
  Warning  UnexpectedAdmissionError  48s   kubelet  Allocate failed due to no healthy devices present; cannot allocate unhealthy devices squat.ai/serial, which is unexpected

The situation repeats after each reboot.

When I kill all pods manually (the device manager one and also the application pods that don't work), new pods are started and everything works.
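
The manual workaround above can be sketched as a small dry-run script that only prints the delete commands. This is a sketch under assumptions: the `kube-system` namespace for the plugin pod is guessed from the shell prompt in the logs, `default` is a placeholder namespace for the application pod, and on microk8s the command may need to be `microk8s kubectl` instead of plain `kubectl`.

```python
import shlex


def delete_commands(pods_by_namespace):
    """Build one `kubectl delete pod` command per pod (nothing is executed)."""
    commands = []
    for namespace, pods in pods_by_namespace.items():
        for pod in pods:
            commands.append(
                ["kubectl", "delete", "pod", pod, "--namespace", namespace]
            )
    return commands


# Pod names are taken from this report; the namespaces are assumptions.
stuck_pods = {
    "kube-system": ["device-plugin-zigbee-4vdbz"],  # assumed namespace
    "default": ["zmq-5984c5f8cd-z7dvd"],            # placeholder namespace
}

for command in delete_commands(stuck_pods):
    # Print instead of running, so the commands can be reviewed first.
    print(shlex.join(command))
```

The controllers that own the pods recreate them after deletion, which matches the behaviour described above.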

Here is the log of the device manager container after the first boot (the situation in which it doesn't work, i.e. it doesn't mount the devices into the application container):

ha-test@zmh-lip:/home/k8s/_system/kube-system$ kcn logs device-plugin-zigbee-4vdbz
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"squat.ai/zigbee\".","ts":"2024-03-07T13:23:42.934752026Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"squat.ai/serial\".","ts":"2024-03-07T13:23:41.735530601Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"squat.ai/serial","socket":"/var/lib/kubelet/device-plugins/gdp-c3F1YXQuYWkvc2VyaWFs-1709817821.sock","ts":"2024-03-07T13:23:45.835273691Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"squat.ai/zigbee","socket":"/var/lib/kubelet/device-plugins/gdp-c3F1YXQuYWkvemlnYmVl-1709817821.sock","ts":"2024-03-07T13:23:43.93416597Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"squat.ai/zigbee","ts":"2024-03-07T13:23:57.034351851Z"}
{"caller":"plugin.go:176","level":"info","msg":"waiting for the gRPC server to be ready","resource":"squat.ai/zigbee","ts":"2024-03-07T13:23:57.034340518Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"squat.ai/serial","ts":"2024-03-07T13:23:58.334925943Z"}
{"caller":"plugin.go:176","level":"info","msg":"waiting for the gRPC server to be ready","resource":"squat.ai/serial","ts":"2024-03-07T13:23:58.434129443Z"}
{"caller":"plugin.go:188","level":"info","msg":"the gRPC server is ready","resource":"squat.ai/serial","ts":"2024-03-07T13:24:00.434686183Z"}
{"caller":"plugin.go:188","level":"info","msg":"the gRPC server is ready","resource":"squat.ai/zigbee","ts":"2024-03-07T13:24:01.334968608Z"}
{"caller":"plugin.go:226","level":"info","msg":"registering plugin with kubelet","resource":"squat.ai/zigbee","ts":"2024-03-07T13:24:01.335104052Z"}
{"caller":"plugin.go:226","level":"info","msg":"registering plugin with kubelet","resource":"squat.ai/serial","ts":"2024-03-07T13:24:01.237744071Z"}
ha-test@zmh-lip:/home/k8s/_system/kube-system$

This leads to the state described above (non-working application container).

Here is the same log from the container after the first one was killed (and re-created by k8s):

ha-test@zmh-lip:/home/k8s/_system/kube-system$ kcn logs device-plugin-zigbee-tkgsv
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"squat.ai/zigbee\".","ts":"2024-03-07T13:26:41.838459105Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"squat.ai/zigbee","socket":"/var/lib/kubelet/device-plugins/gdp-c3F1YXQuYWkvemlnYmVl-1709818001.sock","ts":"2024-03-07T13:26:41.839476438Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"squat.ai/zigbee","ts":"2024-03-07T13:26:41.840105846Z"}
{"caller":"plugin.go:176","level":"info","msg":"waiting for the gRPC server to be ready","resource":"squat.ai/zigbee","ts":"2024-03-07T13:26:41.840379364Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"squat.ai/serial\".","ts":"2024-03-07T13:26:41.934053346Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"squat.ai/serial","socket":"/var/lib/kubelet/device-plugins/gdp-c3F1YXQuYWkvc2VyaWFs-1709818001.sock","ts":"2024-03-07T13:26:41.934398123Z"}
{"caller":"plugin.go:188","level":"info","msg":"the gRPC server is ready","resource":"squat.ai/zigbee","ts":"2024-03-07T13:26:41.936471975Z"}
{"caller":"plugin.go:226","level":"info","msg":"registering plugin with kubelet","resource":"squat.ai/zigbee","ts":"2024-03-07T13:26:41.936560327Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"squat.ai/serial","ts":"2024-03-07T13:26:42.034449846Z"}
{"caller":"plugin.go:176","level":"info","msg":"waiting for the gRPC server to be ready","resource":"squat.ai/serial","ts":"2024-03-07T13:26:42.034767179Z"}
{"caller":"plugin.go:188","level":"info","msg":"the gRPC server is ready","resource":"squat.ai/serial","ts":"2024-03-07T13:26:42.03771179Z"}
{"caller":"plugin.go:226","level":"info","msg":"registering plugin with kubelet","resource":"squat.ai/serial","ts":"2024-03-07T13:26:42.037867012Z"}
{"caller":"generic.go:232","level":"info","msg":"starting listwatch","resource":"squat.ai/zigbee","ts":"2024-03-07T13:26:42.53712666Z"}
{"caller":"generic.go:232","level":"info","msg":"starting listwatch","resource":"squat.ai/serial","ts":"2024-03-07T13:26:42.634323382Z"}
ha-test@zmh-lip:/home/k8s/_system/kube-system$
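
One visible difference between the two logs is that the log after the reboot ends at "registering plugin with kubelet", while the log after the restart also contains "starting listwatch" for both resources. Whether this is related to the failure is not confirmed here. A minimal sketch for checking a saved log for that difference, assuming the JSON-lines format shown above (the function name `resources_without` and the abridged sample lines are illustrative only):

```python
import json


def resources_without(log_text, msg="starting listwatch"):
    """Return resources that logged something but never logged `msg`."""
    seen = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts and blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        resource = entry.get("resource")
        if resource is not None:
            seen.setdefault(resource, set()).add(entry.get("msg"))
    return sorted(r for r, msgs in seen.items() if msg not in msgs)


# Abridged from the logs above: only "msg" and "resource" are kept.
FIRST_BOOT = """
{"msg":"registering plugin with kubelet","resource":"squat.ai/zigbee"}
{"msg":"registering plugin with kubelet","resource":"squat.ai/serial"}
"""
AFTER_RESTART = """
{"msg":"registering plugin with kubelet","resource":"squat.ai/zigbee"}
{"msg":"registering plugin with kubelet","resource":"squat.ai/serial"}
{"msg":"starting listwatch","resource":"squat.ai/zigbee"}
{"msg":"starting listwatch","resource":"squat.ai/serial"}
"""

print(resources_without(FIRST_BOOT))     # ['squat.ai/serial', 'squat.ai/zigbee']
print(resources_without(AFTER_RESTART))  # []
```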

My environment is a Raspberry Pi 4/8GB (Arm64) with DietPi OS (a Debian variant) and a USB drive. The system doesn't show any other issues.
I am not very experienced with k8s devices, so I am not sure what could cause this strange behaviour.
