Closed
Description
ISSUE TYPE
- Bug Report
COMPONENT NAME
Agent
KVM hypervisor
CLOUDSTACK VERSION
4.19.1
CONFIGURATION
I set up one zone with one KVM host, using local disk as the backing storage for instances.
OS / ENVIRONMENT
Ubuntu 22.04 on both the management server and the KVM hypervisor
SUMMARY
STEPS TO REPRODUCE
Bring the management server down: the host disconnects.
Bring the management server back up: the host is stuck in the Alert state.
EXPECTED RESULTS
The agent on the host reconnects normally.
ACTUAL RESULTS
The agent on the host is stuck in the Alert state.
Management server log:
`
2025-01-01 13:59:33,355 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-10755:ctx-76fa1734) (logid:f8b676f6) Failed to handle host connection:
com.cloud.utils.exception.CloudRuntimeException: Unable to connect 73
at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:591)
at com.cloud.agent.manager.AgentManagerImpl.sendReadyAndGetAttache(AgentManagerImpl.java:1150)
at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:1168)
at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1252)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException
2025-01-01 13:59:33,355 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-10755:ctx-76fa1734) (logid:f8b676f6) Failed to send ready command:java.nio.channels.ClosedChannelException
2025-01-01 13:59:33,356 WARN [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-10755:ctx-76fa1734) (logid:f8b676f6) Unable to create attache for agent: Seq 0-4730: { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1, [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"2","cpus":"56","speed":"2600","memory":"404267446272","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.15.0-127-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"22.04","host.volume.encryption":"true","host.instance.conversion":"true","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"5","pod":"5","cluster":"5","guid":"LibvirtComputingResource","name":"","id":"0","version":"","iqn":"i","privateIpAddress":"","privateMacAddress":"","privateNetmask":"","storageIpAddress":"","storageNetmask":"","storageMacAddress":"","resourceName":"LibvirtComputingResource","gatewayIpAddress":"","msHostList":"@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0 bytes) 0","poolInfo":{"uuid":"","host":"","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(437.51 GB) 469771632640","availableBytes":"(423.22 GB) 454427123712"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"5","pod":"5","guid":"872c55c9-d6f1-390a-a270-d21d74f017cf-LibvirtComputingResource","name":"cd-kvm05","id":"0","version":"4.19.1.2","resourceName":"LibvirtComputingResource","msHostList":"@static","wait":"0","bypassHostMaintenance":"false"}}] }`
Agent service log:
`
Jan 01 07:00:29 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) Connected to managerment:8250
Jan 01 07:00:29 hostnameagent java[9624]: INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:) (logid:) Fetching CPU speed from command "lscpu".
Jan 01 07:00:29 hostnameagent java[9624]: INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:) (logid:) Command [lscpu | grep -i 'Model name' | head -n 1 | egrep -o '[[:digit:]].[[:digit:]]+GHz' | sed 's/GHz//g'] resulted in the value [2600] for CPU speed.
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) Host uses control group [cgroup2fs].
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) Calculating the max shares of the host.
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) The max shares of the host is [145600].
Jan 01 07:00:29 hostnameagent sudo[196375]: root : PWD=/ ; USER=root ; COMMAND=/usr/bin/grep InitiatorName= /etc/iscsi/initiatorname.iscsi
Jan 01 07:00:29 hostnameagent sudo[196375]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Jan 01 07:00:29 hostnameagent sudo[196375]: pam_unix(sudo:session): session closed for user root
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Attempting to create storage pool 53f3fab7-ff3f-41d3-afc(Filesystem) in libvirt
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Found existing defined storage pool 53f3fab7-ff3f-41d3-afcf-, using it.
Jan 01 07:00:29 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Trying to fetch storage pool 53f3fab7-ff3f-41d3-afcf-from libvirt
Jan 01 07:00:29 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Process agent startup answer, agent id = 0
Jan 01 07:00:29 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Set agent id 0
Jan 01 07:00:29 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0
Jan 01 07:00:29 hostnameagent java[9624]: WARN [cloud.agent.Agent] (Agent-Handler-5:) (logid:481bb3fe) Unable to send response: null
Jan 01 07:00:34 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Connected to the host: managerment
Jan 01 07:00:34 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Lost connection to host: managerment. Attempting reconnection while we still have 0 commands in progress.
Jan 01 07:00:34 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) NioClient connection closed
Jan 01 07:00:34 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Reconnecting to host:managerment
Jan 01 07:00:34 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) Connecting to managerment:8250
Jan 01 07:00:34 hostnameagent java[9624]: INFO [utils.nio.Link] (Agent-Handler-2:) (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
Jan 01 07:00:35 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) SSL: Handshake done
Jan 01 07:00:35 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) Connected to managerment:8250
Jan 01 07:00:35 hostnameagent java[9624]: INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:) (logid:) Fetching CPU speed from command "lscpu".
Jan 01 07:00:35 hostnameagent java[9624]: INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:) (logid:) Command [lscpu | grep -i 'Model name' | head -n 1 | egrep -o '[[:digit:]].[[:digit:]]+GHz' | sed 's/GHz//g'] resulted in the value [2600] for CPU speed.
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) Host uses control group [cgroup2fs].
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) Calculating the max shares of the host.
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:) (logid:) The max shares of the host is [145600].
Jan 01 07:00:35 hostnameagent sudo[196458]: root : PWD=/ ; USER=root ; COMMAND=/usr/bin/grep InitiatorName= /etc/iscsi/initiatorname.iscsi
Jan 01 07:00:35 hostnameagent sudo[196458]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Jan 01 07:00:35 hostnameagent sudo[196458]: pam_unix(sudo:session): session closed for user root
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Attempting to create storage pool 53f3fab7-ff3f-41d3-afcf-(Filesystem) in libvirt
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Found existing defined storage pool 53f3fab7-ff3f-41d3-afcf-], using it.
Jan 01 07:00:35 hostnameagent java[9624]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Trying to fetch storage pool 53f3fab7-ff3f-41d3-afcf-from libvirt
Jan 01 07:00:35 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Process agent startup answer, agent id = 0
Jan 01 07:00:35 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Set agent id 0
Jan 01 07:00:35 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0`
I have removed or changed some information (IPs, hostnames, UUIDs) to comply with my organization's policy.
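For anyone triaging a similar loop, the agent log in this report repeats the same cycle: the NIO client connects, the startup response comes back with agent id = 0, and the connection is lost a few seconds later. Below is a minimal Python sketch for counting those cycles in a saved agent log. The regular expressions are based only on the message texts quoted in this report, and the `summarize_agent_log` helper and `SAMPLE` lines are hypothetical names introduced for illustration. It is not part of CloudStack.

```python
import re

# Patterns derived from the agent log messages quoted in this report (assumption:
# other CloudStack versions may word these messages differently).
CONNECTED = re.compile(r"\[utils\.nio\.NioClient\].*Connected to ")
LOST = re.compile(r"\[cloud\.agent\.Agent\].*Lost connection to host")
AGENT_ID = re.compile(r"Startup Response Received: agent id = (\d+)")


def summarize_agent_log(lines):
    """Count NIO connects, lost connections, and collect the agent ids received."""
    summary = {"connected": 0, "lost": 0, "agent_ids": []}
    for line in lines:
        if CONNECTED.search(line):
            summary["connected"] += 1
        elif LOST.search(line):
            summary["lost"] += 1
        else:
            match = AGENT_ID.search(line)
            if match:
                summary["agent_ids"].append(int(match.group(1)))
    return summary


# Shortened sample lines modeled on the log in this report (hypothetical input).
SAMPLE = [
    "Jan 01 07:00:29 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) Connected to managerment:8250",
    "Jan 01 07:00:29 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0",
    "Jan 01 07:00:34 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Lost connection to host: managerment. Attempting reconnection while we still have 0 commands in progress.",
    "Jan 01 07:00:35 hostnameagent java[9624]: INFO [utils.nio.NioClient] (Agent-Handler-2:) (logid:) Connected to managerment:8250",
    "Jan 01 07:00:35 hostnameagent java[9624]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0",
]

print(summarize_agent_log(SAMPLE))
```

To run it against a real log, pass the lines of the saved file (for example, the output of the agent's journal) instead of `SAMPLE`. A connect count that keeps growing together with the lost count, while the agent id stays the same, matches the behavior described here.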