Cloudstack 4.19.2.0 - KVM Cloudstack Agent Stuck #10771
-
Hi, we are running ACS 4.19.2.0 with KVM hypervisors (Ubuntu 24.04 LTS), using Linstor as hyperconverged storage and NFS as secondary storage. We are seeing a phenomenon where one of our hosts seems to have a cloudstack-agent problem. We noticed this because agent.log stopped being written (no new entries and no logrotate) and volume snapshots started to fail and ended in the error state. We can also see that communication with the (redundant) virtual routers on this host is disrupted: health checks fail and the requested host is not responding. After the failed health checks the router gets replaced and the old one should be expunged, but the old router remains in the expunging state. I have attached a management-server log entry regarding virtual router "r-749-VM" not being able to perform health checks. After restarting the cloudstack-agent, logging starts to work again, but the router state is still "expunging". Has anyone had any experience with this? Thanks! BR
Replies: 2 comments 2 replies
-
Hi, as additional information, we now see the following entry in the cloudstack-agent log while creating a volume snapshot: 2025-04-24 11:55:00,183 INFO [resource.wrapper.LinstorBackupSnapshotCommandWrapper] (agentRequest-Handler-5:null) (logid:23fbdeeb) Src: /dev/mapper/vg_nvme-cs--53eced61--c5b3--4cb0--85c3--b7013e1d6ca6_00000_cs--a2db7f5c--b44c--48b4--bbdf--c39ad8e49318 | wilken-test. In addition, I cannot run the command: virsh pool-list --all
-
@wverleger, I think you should probably also restart qemu/libvirtd to resolve the issues. If there are no vital VMs on the machine, you could even reboot it, though this should not be necessary.
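Before restarting anything, it may help to confirm whether libvirt itself has stopped answering, since a virsh call that never returns would point at libvirtd rather than only at the agent. The following is a minimal sketch, not part of CloudStack: the helper name command_responds and the 10-second default timeout are assumptions made for illustration.

```python
import subprocess


def command_responds(cmd, timeout_s=10.0):
    """Return True if cmd exits with status 0 within timeout_s seconds.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command hung past the deadline; subprocess.run kills the child.
        return False
    except OSError:
        # The binary is missing or could not be started.
        return False
    return result.returncode == 0


# Example on the affected host (the command mentioned in the thread):
#   command_responds(["virsh", "pool-list", "--all"])
```

If the check returns False on the affected host but True on a healthy one, that would be consistent with the suggestion above to restart libvirtd in addition to the cloudstack-agent.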