Description
My cluster stopped running because the etcd service was failing. The failure was related to the enable-v2 flag, as described in #5209, so I removed the --enable-v2=true flag from the /var/snap/microk8s/8384/args/etcd file to match the fix from #5212. However, etcd now fails with a new error: illegal v2store content.
According to https://etcd.io/docs/v3.6/upgrades/upgrade_3_6/, a migration should be performed. I cannot find any trace of a migration command (ETCDCTL_API=3 etcdctl migrate) in the microk8s repository, so perhaps the migration step is missing?
Below are the logs of the failed etcd startup, taken from /var/log/syslog:
Sep 8 13:04:12 ckube-1 systemd[1]: Started Service for snap application microk8s.daemon-etcd.
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"warn","ts":"2025-09-08T13:04:12.556709Z","caller":"embed/config.go:1209","msg":"Running http and grpc server on single port. This is not recommended for production."}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"warn","ts":"2025-09-08T13:04:12.557227Z","caller":"embed/config.go:1320","msg":"it isn't recommended to use default name, please set a value for --name. Note that etcd might run into issue when multiple members have the same default name","name":"default"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.557365Z","caller":"etcdmain/etcd.go:64","msg":"Running: ","args":["/snap/microk8s/8384/etcd","--data-dir=/var/snap/microk8s/common/var/run/etcd","--advertise-client-urls=https://192.168.5.70:12379","--listen-client-urls=https://0.0.0.0:12379","--client-cert-auth","--trusted-ca-file=/var/snap/microk8s/8384/certs/ca.crt","--cert-file=/var/snap/microk8s/8384/certs/server.crt","--key-file=/var/snap/microk8s/8384/certs/server.key"]}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.557580Z","caller":"etcdmain/etcd.go:107","msg":"server has already been initialized","data-dir":"/var/snap/microk8s/common/var/run/etcd","dir-type":"member"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"warn","ts":"2025-09-08T13:04:12.557738Z","caller":"embed/config.go:1209","msg":"Running http and grpc server on single port. This is not recommended for production."}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"warn","ts":"2025-09-08T13:04:12.557857Z","caller":"embed/config.go:1320","msg":"it isn't recommended to use default name, please set a value for --name. Note that etcd might run into issue when multiple members have the same default name","name":"default"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.557974Z","caller":"embed/etcd.go:138","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.558546Z","caller":"embed/etcd.go:146","msg":"configuring client listeners","listen-client-urls":["https://0.0.0.0:12379"]}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.558843Z","caller":"embed/etcd.go:323","msg":"starting an etcd server","etcd-version":"3.6.4","git-sha":"5400cdc","go-version":"go1.23.11","go-os":"linux","go-arch":"amd64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":true,"name":"default","data-dir":"/var/snap/microk8s/common/var/run/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/snap/microk8s/common/var/run/etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":10000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://192.168.5.70:12379"],"listen-client-urls":["https://0.0.0.0:12379"],"listen-metrics-urls":[],"experimental-local-address":"","cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"feature-gates":"","initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","discovery-token":"","discovery-endpoints":"","discovery-dial-timeout":"2s","discovery-request-timeout":"5s","discovery-keepalive-time":"2s","discovery-keepalive-timeout":"6s","discovery-insecure-transport":true,"discovery-insecure-skip-tls-verify":false,"discovery-cert":"","discovery-key":"","discovery-cacert":"","discovery-user":"","downgrade-check-interval":"5s","max-learners":1,"v2-deprecation":"write-only"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.559479Z","logger":"bbolt","caller":"backend/backend.go:203","msg":"Opening db file (/var/snap/microk8s/common/var/run/etcd/member/snap/db) with mode -rw------- and with options: {Timeout: 0s, NoGrowSync: false, NoFreelistSync: true, PreLoadFreelist: false, FreelistType: hashmap, ReadOnly: false, MmapFlags: 8000, InitialMmapSize: 10737418240, PageSize: 0, NoSync: false, OpenFile: 0x0, Mlock: false, Logger: 0xc0003d0118}"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.577801Z","logger":"bbolt","caller":"bbolt@v1.4.2/db.go:321","msg":"Opening bbolt db (/var/snap/microk8s/common/var/run/etcd/member/snap/db) successfully"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.577867Z","caller":"storage/backend.go:80","msg":"opened backend db","path":"/var/snap/microk8s/common/var/run/etcd/member/snap/db","took":"18.489834ms"}
Sep 8 13:04:12 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:12.577900Z","caller":"etcdserver/bootstrap.go:220","msg":"restore consistentIndex","index":491955990}
Sep 8 13:04:13 ckube-1 microk8s.daemon-etcd[136524]: {"level":"error","ts":"2025-09-08T13:04:13.111288Z","caller":"etcdserver/bootstrap.go:409","msg":"illegal v2store content","error":"detected disallowed custom content in v2store for stage --v2-deprecation=write-only","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.recoverSnapshot\n\tgo.etcd.io/etcd/server/v3/etcdserver/bootstrap.go:409\ngo.etcd.io/etcd/server/v3/etcdserver.bootstrapBackend\n\tgo.etcd.io/etcd/server/v3/etcdserver/bootstrap.go:225\ngo.etcd.io/etcd/server/v3/etcdserver.bootstrap\n\tgo.etcd.io/etcd/server/v3/etcdserver/bootstrap.go:80\ngo.etcd.io/etcd/server/v3/etcdserver.NewServer\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:307\ngo.etcd.io/etcd/server/v3/embed.StartEtcd\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:262\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcd\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:207\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:114\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:272"}
Sep 8 13:04:13 ckube-1 microk8s.daemon-etcd[136524]: {"level":"error","ts":"2025-09-08T13:04:13.118940Z","caller":"etcdserver/server.go:309","msg":"bootstrap failed","error":"detected disallowed custom content in v2store for stage --v2-deprecation=write-only","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.NewServer\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:309\ngo.etcd.io/etcd/server/v3/embed.StartEtcd\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:262\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcd\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:207\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:114\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:272"}
Sep 8 13:04:13 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:13.119030Z","caller":"embed/etcd.go:426","msg":"closing etcd server","name":"default","data-dir":"/var/snap/microk8s/common/var/run/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://192.168.5.70:12379"]}
Sep 8 13:04:13 ckube-1 microk8s.daemon-etcd[136524]: {"level":"info","ts":"2025-09-08T13:04:13.119151Z","caller":"embed/etcd.go:428","msg":"closed etcd server","name":"default","data-dir":"/var/snap/microk8s/common/var/run/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://192.168.5.70:12379"]}
Sep 8 13:04:13 ckube-1 microk8s.daemon-etcd[136524]: {"level":"fatal","ts":"2025-09-08T13:04:13.119188Z","caller":"etcdmain/etcd.go:183","msg":"discovery failed","error":"detected disallowed custom content in v2store for stage --v2-deprecation=write-only","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:183\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:272"}
Sep 8 13:04:13 ckube-1 systemd[1]: snap.microk8s.daemon-etcd.service: Main process exited, code=exited, status=1/FAILURE
Sep 8 13:04:13 ckube-1 systemd[1]: snap.microk8s.daemon-etcd.service: Failed with result 'exit-code'.
Sep 8 13:04:13 ckube-1 systemd[1]: snap.microk8s.daemon-etcd.service: Scheduled restart job, restart counter is at 1.
Sep 8 13:04:13 ckube-1 systemd[1]: Stopped Service for snap application microk8s.daemon-etcd.
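The startup log above is long, and the relevant entries are the ones at the "error" and "fatal" levels. As a rough sketch for anyone triaging the same failure, the helper below filters syslog lines down to those entries. The function name etcd_failures is hypothetical and not part of microk8s or etcd; the sketch assumes each etcd entry is a single JSON object following the syslog prefix, as in the lines above.

```python
import json


def etcd_failures(lines):
    """Return (level, msg, error) tuples for error/fatal etcd log entries.

    Hypothetical helper: assumes the syslog format shown in this report,
    where the etcd JSON payload starts at the first "{" on the line.
    """
    failures = []
    for line in lines:
        start = line.find("{")
        if "microk8s.daemon-etcd" not in line or start == -1:
            continue  # systemd lines and other services carry no JSON payload
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # not a structured etcd entry
        if entry.get("level") in ("error", "fatal"):
            failures.append(
                (entry["level"], entry.get("msg", ""), entry.get("error", ""))
            )
    return failures
```

It can be fed an open file, for example etcd_failures(open("/var/log/syslog")). On the log above it would report the "illegal v2store content", "bootstrap failed" and "discovery failed" entries, all carrying the same v2store error.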
Summary
The new version of etcd (3.6.4 in the logs above) expects storage to have been migrated off v2, but that migration never seems to be executed when upgrading existing microk8s deployments.
What Should Happen Instead?
microk8s should perform the etcd storage upgrade prior to removal of the v2 storage.
Reproduction Steps
I have 3 clusters. The 2 that use etcd are experiencing the same issue; the one using Dqlite is unaffected.
My snap reports
snap-id: EaXqgt1lyCaxKaQCU349mlodBkDCXRcg
tracking: latest/stable
refresh-date: today at 10:42 UTC
installed: v1.34.0 (8384) 183MB classic
Current content of my etcd args file (/var/snap/microk8s/8384/args/etcd):
--data-dir=${SNAP_COMMON}/var/run/etcd
--advertise-client-urls=https://${DEFAULT_INTERFACE_IP_ADDR}:12379
--listen-client-urls=https://0.0.0.0:12379
--client-cert-auth
--trusted-ca-file=${SNAP_DATA}/certs/ca.crt
--cert-file=${SNAP_DATA}/certs/server.crt
--key-file=${SNAP_DATA}/certs/server.key
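To confirm on other nodes whether the flag from #5209 is still present, a small check like the following could be used. This is only a sketch: find_flag is a hypothetical helper, and it assumes the args file holds one flag per line, as in the listing above.

```python
def find_flag(args_text, flag="--enable-v2"):
    """Return the lines of an args file that set the given flag.

    Hypothetical helper: matches the bare flag or the flag=value form,
    assuming one flag per line.
    """
    hits = []
    for raw in args_text.splitlines():
        line = raw.strip()
        if line == flag or line.startswith(flag + "="):
            hits.append(line)
    return hits
```

For example, find_flag(open("/var/snap/microk8s/8384/args/etcd").read()) returns an empty list for the file shown above, which matches the state after the workaround from #5212.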
Introspection Report
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-flanneld is running
FAIL: Service snap.microk8s.daemon-etcd is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-etcd
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
WARNING: Maximum number of inotify user instances is less than the recommended value of 1024.
Increase the limit with:
echo fs.inotify.max_user_instances=1024 | sudo tee -a /etc/sysctl.conf
sudo sysctl --system
WARNING: Maximum number of inotify user watches is less than the recommended value of 1048576.
Increase the limit with:
echo fs.inotify.max_user_watches=1048576 | sudo tee -a /etc/sysctl.conf
sudo sysctl --system
Building the report tarball
Report tarball is at /var/snap/microk8s/8384/inspection-report-20250908_132532.tar.gz
Can you suggest a fix?
It seems that the migration from v2 to v3 storage should be performed at some point before v2 storage is turned off.
Are you interested in contributing with a fix?
Yes, but I would need some guidance on where to put the migration script.