@nvazquez
Contributor

Description

This PR fixes an NPE observed on KVM agent connection when:

  • the host details do not include an entry named 'host.uefi.enable', and
  • virt-v2v and/or ovftool are installed on the host
2025-09-10 16:22:32,648 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-2:[]) (logid:cd17e03c) Request:Seq -1--1:  { Cmd , MgmtId: -1, via: -1, Ver: v1, Flags: 111, [{"com.cloud.agent.api.ReadyCommand":{"_details":"java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because "uefiEnabled" is null","wait":"0","bypassHostMaintenance":"false"}}] }
2025-09-10 16:22:32,648 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-2:[]) (logid:cd17e03c) Processing command: com.cloud.agent.api.ReadyCommand
2025-09-10 16:22:32,648 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-2:[]) (logid:cd17e03c) Not ready to connect to mgt server: java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because "uefiEnabled" is null
2025-09-10 16:22:32,648 INFO  [cloud.agent.Agent] (AgentShutdownThread:[]) (logid:) Stopping the agent: Reason = sig.kill
2025-09-10 16:22:32,649 DEBUG [cloud.agent.Agent] (AgentShutdownThread:[]) (logid:) Sending shutdown to management server
2025-09-10 16:22:32,696 DEBUG [utils.nio.NioClient] (Agent-NioConnectionHandler-1:[]) (logid:) Location 1: Socket Socket[addr=/10.4.19.4,port=8250,localport=54680] closed on read.  Probably -1 returned: Connection closed with -1 on reading size.
2025-09-10 16:22:32,697 DEBUG [utils.nio.NioClient] (Agent-NioConnectionHandler-1:[]) (logid:) Closing socket Socket[addr=/10.4.19.4,port=8250,localport=54680]
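
The NullPointerException in the log above is raised because `equals` is invoked on a `uefiEnabled` string that is null when the 'host.uefi.enable' detail is absent. The following is a minimal sketch of a null-safe check, not the actual change made in `AgentManagerImpl`; the class name `UefiDetailSketch` and the helper `isUefiEnabled` are hypothetical, and only the detail key and the sample virt-v2v detail are taken from this conversation.

```java
import java.util.HashMap;
import java.util.Map;

public class UefiDetailSketch {
    // Detail key named in the PR description.
    static final String HOST_UEFI_ENABLE = "host.uefi.enable";

    // Hypothetical helper: true only when the detail exists and is "true".
    static boolean isUefiEnabled(Map<String, String> hostDetails) {
        if (hostDetails == null) {
            return false;
        }
        // May be null when the host never reported the detail.
        String uefiEnabled = hostDetails.get(HOST_UEFI_ENABLE);
        // Calling the comparison on the literal never dereferences uefiEnabled,
        // so a missing detail yields false instead of an NPE.
        return "true".equalsIgnoreCase(uefiEnabled);
    }

    public static void main(String[] args) {
        Map<String, String> details = new HashMap<>();
        // Host reports virt-v2v but no UEFI detail, as in the failing scenario.
        details.put("host.virtv2v.version", "1.42.0rhel=8");
        System.out.println(isUefiEnabled(details)); // prints false

        details.put(HOST_UEFI_ENABLE, "true");
        System.out.println(isUefiEnabled(details)); // prints true
    }
}
```

Putting the literal on the left-hand side of the comparison is one common way to avoid this class of NPE; an explicit null check or `Boolean.parseBoolean` would work equally well in this sketch.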

Fixes: #11604

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@nvazquez
Contributor Author

@blueorangutan package

Contributor

@Pearl1594 Pearl1594 left a comment

code lgtm

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov bot commented Sep 10, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 16.17%. Comparing base (5d32492) to head (270ab55).
⚠️ Report is 2 commits behind head on 4.20.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/com/cloud/agent/manager/AgentManagerImpl.java | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #11610      +/-   ##
============================================
- Coverage     16.17%   16.17%   -0.01%     
  Complexity    13295    13295              
============================================
  Files          5656     5656              
  Lines        498100   498100              
  Branches      60424    60424              
============================================
- Hits          80572    80570       -2     
- Misses       408561   408562       +1     
- Partials       8967     8968       +1     
| Flag | Coverage Δ |
|---|---|
| uitests | 4.00% <ø> (ø) |
| unittests | 17.02% <0.00%> (-0.01%) ⬇️ |

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 14943

@nvazquez
Contributor Author

@blueorangutan test

@blueorangutan

@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Contributor

@harikrishna-patnala harikrishna-patnala left a comment

code LGTM

Contributor

@shwstppr shwstppr left a comment

code lgtm

Member

@weizhouapache weizhouapache left a comment

code lgtm

@weizhouapache
Member

Tested OK

  • rename /etc/cloudstack/agent/uefi.properties
  • install virt-v2v
  • restart cloudstack-agent

without this change

2025-09-11T07:50:43,734 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-5:[]) (logid:) Seq 2-1742611580815671308:  { Ans: , MgmtId: 32986372244225, via: 2, Ver: v1, Flags: 110, [{"com.cloud.agent.api.ReadyAnswer":{"detailsMap":{"host.virtv2v.version":"1.42.0rhel=8"},"result":"true","wait":"0","bypassHostMaintenance":"false"}}] }
2025-09-11T07:50:43,740 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-1:[]) (logid:) Request:Seq -1--1:  { Cmd , MgmtId: -1, via: -1, Ver: v1, Flags: 111, [{"com.cloud.agent.api.ReadyCommand":{"_details":"java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because "uefiEnabled" is null","wait":"0","bypassHostMaintenance":"false"}}] }
2025-09-11T07:50:43,741 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-1:[]) (logid:) Processing command: com.cloud.agent.api.ReadyCommand
2025-09-11T07:50:43,741 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-1:[]) (logid:) Not ready to connect to mgt server: java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because "uefiEnabled" is null
2025-09-11T07:50:43,741 INFO  [cloud.agent.Agent] (AgentShutdownThread:[]) (logid:) Stopping the agent: Reason = sig.kill
2025-09-11T07:50:43,742 DEBUG [cloud.agent.Agent] (AgentShutdownThread:[]) (logid:) Sending shutdown to management server

with this change: no errors found

2025-09-11T08:20:43,535 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-5:[]) (logid:) Seq 2-7903535871058509837:  { Ans: , MgmtId: 32986372244225, via: 2, Ver: v1, Flags: 110, [{"com.cloud.agent.api.ReadyAnswer":{"detailsMap":{"host.virtv2v.version":"1.42.0rhel=8"},"result":"true","wait":"0","bypassHostMaintenance":"false"}}] }

2025-09-11T08:20:59,854 DEBUG [kvm.resource.LibvirtComputingResource] (AgentOutRequest-Handler-1:[]) (logid:) Executing command [/usr/share/cloudstack-common/scripts/vm/network/security_group.py get_rule_logs_for_vms ].

@weizhouapache weizhouapache self-assigned this Sep 11, 2025
@weizhouapache
Member

We had some issues with the testing environments.
Merging based on the approvals and manual test results. I will keep an eye on the smoke tests of the health check PR.

@weizhouapache weizhouapache merged commit 036fd00 into apache:4.20 Sep 11, 2025
40 of 42 checks passed
@blueorangutan

[SF] Trillian test result (tid-14284)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 56298 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11610-t14284-kvm-ol8.zip
Smoke tests completed. 127 look OK, 14 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_reboot_router | Error | 226.32 | test_network.py |
| test_releaseIP | Error | 23.65 | test_network.py |
| test_releaseIP_using_IP | Error | 13.62 | test_network.py |
| ContextSuite context=TestRouterRules>:setup | Error | 39.26 | test_network.py |
| test_01_deployVMInSharedNetwork | Failure | 18.74 | test_network.py |
| test_02_verifyRouterIpAfterNetworkRestart | Failure | 11.38 | test_network.py |
| test_03_destroySharedNetwork | Failure | 1.10 | test_network.py |
| ContextSuite context=TestSharedNetwork>:teardown | Error | 2.24 | test_network.py |
| ContextSuite context=TestSharedNetworkWithConfigDrive>:setup | Error | 1519.36 | test_network.py |
| test_01_nic | Error | 63.13 | test_nic.py |
| ContextSuite context=TestNonStrictAffinityGroups>:setup | Error | 0.00 | test_nonstrict_affinity_group.py |
| ContextSuite context=TestPrivateGwACL>:setup | Error | 0.00 | test_privategw_acl.py |
| ContextSuite context=TestIsolatedNetworksPasswdServer>:setup | Error | 0.00 | test_password_server.py |
| test_01_isolated_persistent_network | Error | 2.50 | test_persistent_network.py |
| test_03_deploy_and_destroy_VM_and_verify_network_resources_persist | Failure | 6.77 | test_persistent_network.py |
| test_03_deploy_and_destroy_VM_and_verify_network_resources_persist | Error | 6.77 | test_persistent_network.py |
| ContextSuite context=TestL2PersistentNetworks>:teardown | Error | 6.84 | test_persistent_network.py |
| ContextSuite context=TestPortForwardingRules>:setup | Error | 0.00 | test_portforwardingrules.py |
| test_01_add_primary_storage_disabled_host | Error | 0.40 | test_primary_storage.py |
| test_01_primary_storage_nfs | Error | 0.32 | test_primary_storage.py |
| ContextSuite context=TestStorageTags>:setup | Error | 0.57 | test_primary_storage.py |
| test_01_primary_storage_scope_change | Error | 0.25 | test_primary_storage_scope.py |
| test_09_project_suspend | Error | 1.10 | test_projects.py |
| test_10_project_activation | Error | 1.08 | test_projects.py |
| ContextSuite context=TestCpuCapServiceOfferings>:setup | Error | 0.00 | test_service_offerings.py |
| test_01_deploy_vm_on_specific_host | Error | 0.12 | test_vm_deployment_planner.py |
| test_04_deploy_vm_on_host_override_pod_and_cluster | Error | 0.14 | test_vm_deployment_planner.py |
| test_01_migrate_VM_and_root_volume | Error | 98.55 | test_vm_life_cycle.py |
| test_02_migrate_VM_with_two_data_disks | Error | 50.96 | test_vm_life_cycle.py |
| test_01_secure_vm_migration | Error | 77.19 | test_vm_life_cycle.py |
| test_02_unsecure_vm_migration | Error | 221.62 | test_vm_life_cycle.py |
| test_04_nonsecured_to_secured_vm_migration | Error | 147.94 | test_vm_life_cycle.py |
| test_08_migrate_vm | Error | 0.07 | test_vm_life_cycle.py |
| test_01_migrate_vm_strict_tags_success | Error | 0.27 | test_vm_strict_host_tags.py |
| test_02_migrate_vm_strict_tags_failure | Error | 0.23 | test_vm_strict_host_tags.py |
| test_01_restore_vm_strict_tags_success | Error | 0.31 | test_vm_strict_host_tags.py |
| test_02_restore_vm_strict_tags_failure | Error | 0.27 | test_vm_strict_host_tags.py |
| test_01_scale_vm_strict_tags_success | Error | 0.28 | test_vm_strict_host_tags.py |
| test_02_scale_vm_strict_tags_failure | Error | 0.27 | test_vm_strict_host_tags.py |
| ContextSuite context=TestVMDeploymentPlannerStrictTags>:setup | Error | 1605.31 | test_vm_strict_host_tags.py |

@DaanHoogland DaanHoogland deleted the 420-fix-npe-agent-connection branch September 11, 2025 15:45
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Sep 15, 2025
@weizhouapache weizhouapache linked an issue Sep 26, 2025 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

virt-v2v conflicts with cloudstack-agent in version 4.21

6 participants