Summary
My single-node microk8s cluster goes down a few times a day. The main messages I see during that time (not all in the same order, just a sampling) are:
E0715 20:15:05.336637 3731005 controller.go:195] "Failed to update lease" err="Put \"https://127.0.0.1:16443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/node?timeout=10s\": net/http: request canceled (Clien
E0715 20:15:06.620303 3731005 timeout.go:140] "Post-timeout activity" timeElapsed="3.710459ms" method="PUT" path="/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/snapshot-controller-leader" result=null
E0715 20:15:06.641907 3731005 status.go:71] "Unhandled Error" err="apiserver received an error that is not an metav1.Status: &errors.errorString{s:\"http: Handler timeout\"}: http: Handler timeout"
E0715 20:15:05.364152 3731005 writers.go:123] "Unhandled Error" err="apiserver was unable to write a JSON response: http: Handler timeout"
W0716 19:25:09.276703 2864938 reflector.go:492] object-"kube-system"/"coredns": watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
What Should Happen Instead?
I'm guessing that the cluster going down even once a day is not desired behaviour.
Reproduction Steps
- ...
- ...
Introspection Report
Can you suggest a fix?
I can't figure out what is causing those errors, but I am happy to work on this with guidance. I have traced the calls up to k8s-dqlite, but the version bundled with microk8s 1.32 doesn't seem to print the underlying errors:
time="2025-07-16T19:24:24+08:00" level=error msg="failed to list /registry/snapshot.storage.k8s.io/volumesnapshots/ for revision 215962876"
If you can guide me on how to deploy a later/patched version of k8s-dqlite, I could continue digging.
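To illustrate what I mean by "doesn't print the errors", here is a rough sketch of the kind of logging change I would try, assuming the list path logs through logrus and currently drops the underlying error. `listForRevision` and `queryRows` are placeholder names I made up for this example; they are not the actual k8s-dqlite symbols.

```go
// Sketch only: attach the underlying error to the "failed to list ... for revision ..."
// log line so it shows up in the journal. Function names are placeholders, not the
// real k8s-dqlite code.
package main

import (
	"context"
	"errors"
	"fmt"

	"github.com/sirupsen/logrus"
)

// queryRows stands in for the real datastore query; here it always fails
// so the example has something to log.
func queryRows(ctx context.Context, prefix string, revision int64) error {
	return errors.New("context deadline exceeded")
}

func listForRevision(ctx context.Context, prefix string, revision int64) error {
	if err := queryRows(ctx, prefix, revision); err != nil {
		// What my logs currently show (cause dropped):
		//   level=error msg="failed to list <prefix> for revision <rev>"
		// Proposed: include the underlying error in the log entry and wrap it
		// for the caller.
		logrus.WithError(err).Errorf("failed to list %s for revision %d", prefix, revision)
		return fmt.Errorf("list %s at revision %d: %w", prefix, revision, err)
	}
	return nil
}

func main() {
	err := listForRevision(context.Background(),
		"/registry/snapshot.storage.k8s.io/volumesnapshots/", 215962876)
	fmt.Println("returned:", err)
}
```

With something like this in the bundled k8s-dqlite, I could correlate the datastore failure with the apiserver timeouts shown above instead of guessing at the cause.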
Are you interested in contributing with a fix?
Yes.