@sudo87
Collaborator

@sudo87 sudo87 commented Jun 10, 2025

Description

This PR introduces a new value, "COMBINED", for the config "host.capacityType.to.order.clusters". It orders clusters, pods and hosts based on both CPU and memory.

COMBINED works together with "host.capacityType.to.order.clusters.cputomemoryweight": the overall capacity of a cluster/pod/host is computed from CPU and memory using this weight factor.

The allocator first needs to calculate the combined allocation/usage metric (as follows) before sorting the clusters/pods/hosts, so that it returns an ordered list of hosts by this metric. For example, for each host, define the metric as:

Metric = CPU * weight + Memory * (1-weight)
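
As a rough illustration of the formula above (not the code from this PR), the sketch below computes the weighted metric per host and sorts host IDs by it. The class and method names, the use of fractional utilisation values between 0 and 1, the range check on the weight, and the ascending sort direction are all assumptions made for this example; the PR's actual logic lives in FirstFitPlanner and FirstFitAllocator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the CloudStack code base.
final class CombinedCapacityOrdering {

    // Metric = CPU * weight + Memory * (1 - weight)
    static double combinedMetric(double cpu, double memory, double cpuToMemoryWeight) {
        // Assumption: the weight is a fraction between 0 and 1.
        if (cpuToMemoryWeight < 0.0 || cpuToMemoryWeight > 1.0) {
            throw new IllegalArgumentException("weight must be between 0 and 1");
        }
        return cpu * cpuToMemoryWeight + memory * (1 - cpuToMemoryWeight);
    }

    // usageById maps an ID to {cpuUsage, memoryUsage}.
    // Assumption: IDs are returned lowest combined metric first.
    static List<Long> orderByCombinedMetric(Map<Long, double[]> usageById, double cpuToMemoryWeight) {
        List<Long> ids = new ArrayList<>(usageById.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> {
            double[] usage = usageById.get(id);
            return combinedMetric(usage[0], usage[1], cpuToMemoryWeight);
        }));
        return ids;
    }

    public static void main(String[] args) {
        Map<Long, double[]> usage = new LinkedHashMap<>();
        usage.put(1L, new double[] {0.80, 0.20}); // metric at weight 0.5: 0.50
        usage.put(2L, new double[] {0.30, 0.40}); // metric at weight 0.5: 0.35
        usage.put(3L, new double[] {0.50, 0.90}); // metric at weight 0.5: 0.70
        System.out.println(orderByCombinedMetric(usage, 0.5)); // prints [2, 1, 3]
    }
}
```

With a weight of 1 the metric reduces to CPU only, and with a weight of 0 to memory only, so in this sketch the two single-resource orderings are the endpoints of the combined one.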

Doc PR: apache/cloudstack-documentation#524

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

host.capacityType.to.order.clusters config will support new algorithm: COMBINED
which will work with host.capacityType.to.order.clusters.cputomemoryweight and capacity will be
computed based on CPU and memory both and using weight factor
@codecov

codecov bot commented Jun 10, 2025

Codecov Report

❌ Patch coverage is 76.10063% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 16.58%. Comparing base (fb6adac) to head (b040691).
⚠️ Report is 177 commits behind head on main.

Files with missing lines Patch % Lines
...rc/main/java/com/cloud/deploy/FirstFitPlanner.java 71.08% 22 Missing and 2 partials ⚠️
...gent/manager/allocator/impl/FirstFitAllocator.java 48.14% 14 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #10997      +/-   ##
============================================
+ Coverage     16.56%   16.58%   +0.01%     
- Complexity    14010    14036      +26     
============================================
  Files          5758     5758              
  Lines        511578   511717     +139     
  Branches      62192    62216      +24     
============================================
+ Hits          84756    84870     +114     
- Misses       417350   417374      +24     
- Partials       9472     9473       +1     
Flag Coverage Δ
uitests 3.91% <ø> (ø)
unittests 17.48% <76.10%> (+0.01%) ⬆️

@github-actions

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@sudo87
Collaborator Author

sudo87 commented Jun 12, 2025

@blueorangutan package

@blueorangutan

@sudo87 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13745

@sudo87
Collaborator Author

sudo87 commented Jun 12, 2025

@blueorangutan package

@blueorangutan

@sudo87 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13751

@sudo87 sudo87 marked this pull request as ready for review June 13, 2025 04:30
@sudo87 sudo87 requested a review from weizhouapache June 13, 2025 04:30
@DaanHoogland
Contributor

@sudo87 will you create a doc PR for this as well?

@sudo87
Collaborator Author

sudo87 commented Jun 13, 2025

> @sudo87 will you create a doc PR for this as well?

Yes @DaanHoogland, a doc PR will be needed for this change.

@sudo87
Collaborator Author

sudo87 commented Jun 17, 2025

@blueorangutan package

@blueorangutan

@sudo87 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13821

@sudo87
Collaborator Author

sudo87 commented Jun 18, 2025

@blueorangutan test

@blueorangutan

@sudo87 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13554)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 2456 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10997-t13554-kvm-ol8.zip
Smoke tests completed. 8 look OK, 133 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
runTest Error 0.00 test_2fa.py
runTest Error 0.00 test_account_access.py
runTest Error 0.00 test_accounts.py
runTest Error 0.00 test_affinity_groups_projects.py
runTest Error 0.00 test_affinity_groups.py
runTest Error 0.00 test_annotations.py
runTest Error 0.00 test_async_job.py
runTest Error 0.00 test_attach_multiple_volumes.py
runTest Error 0.00 test_backup_recovery_dummy.py
runTest Error 0.00 test_backup_recovery_veeam.py
runTest Error 0.00 test_direct_download.py
runTest Error 0.00 test_certauthority_root.py
runTest Error 0.00 test_cluster_drs.py
runTest Error 0.00 test_console_endpoint.py
runTest Error 0.00 test_create_list_domain_account_project.py
runTest Error 0.00 test_create_network.py
runTest Error 0.00 test_deploy_vgpu_enabled_vm.py
runTest Error 0.00 test_deploy_virtio_scsi_vm.py
runTest Error 0.00 test_deploy_vm_extra_config_data.py
runTest Error 0.00 test_deploy_vm_iso.py
runTest Error 0.00 test_deploy_vm_iso_uefi.py
runTest Error 0.00 test_deploy_vm_root_resize.py
runTest Error 0.00 test_deploy_vms_in_parallel.py
runTest Error 0.00 test_deploy_vms_with_varied_deploymentplanners.py
runTest Error 0.00 test_deploy_vm_with_userdata.py
runTest Error 0.00 test_diagnostics.py
runTest Error 0.00 test_disk_offerings.py
runTest Error 0.00 test_disk_provisioning_types.py
runTest Error 0.00 test_domain_disk_offerings.py
runTest Error 0.00 test_domain_network_offerings.py
runTest Error 0.00 test_domain_service_offerings.py
runTest Error 0.00 test_guest_os.py
runTest Error 0.00 test_domain_vpc_offerings.py
runTest Error 0.00 test_enable_account_settings_for_domain.py
runTest Error 0.00 test_metrics_api.py
runTest Error 0.00 test_events_resource.py
runTest Error 0.00 test_gateway_on_shared_networks.py
runTest Error 0.00 test_global_acls.py
runTest Error 0.00 test_global_settings.py
runTest Error 0.00 test_guest_vlan_range.py
runTest Error 0.00 test_host_control_state.py
runTest Error 0.00 test_hostha_simulator.py
runTest Error 0.00 test_host_ping.py
runTest Error 0.00 test_image_store_object_migration.py
runTest Error 0.00 test_import_unmanage_volumes.py
runTest Error 0.00 test_internal_lb.py
runTest Error 0.00 test_ipv4_routing.py
runTest Error 0.00 test_ipv6_infra.py
runTest Error 0.00 test_iso.py
runTest Error 0.00 test_kubernetes_clusters.py
runTest Error 0.00 test_kubernetes_supported_versions.py
runTest Error 0.00 test_list_accounts.py
runTest Error 0.00 test_list_disk_offerings.py
runTest Error 0.00 test_list_domains.py
runTest Error 0.00 test_list_hosts.py
runTest Error 0.00 test_list_ids_parameter.py
runTest Error 0.00 test_list_service_offerings.py
runTest Error 0.00 test_list_storage_pools.py
runTest Error 0.00 test_list_volumes.py
runTest Error 0.00 test_loadbalance.py
runTest Error 0.00 test_login.py
runTest Error 0.00 test_migration.py
runTest Error 0.00 test_ms_maintenance_and_safe_shutdown.py
runTest Error 0.00 test_multipleips_per_nic.py
runTest Error 0.00 test_nested_virtualization.py
runTest Error 0.00 test_network_acl.py
runTest Error 0.00 test_network_ipv6.py
runTest Error 0.00 test_network_permissions.py
runTest Error 0.00 test_network.py
runTest Error 0.00 test_nic_adapter_type.py
runTest Error 0.00 test_nic.py
runTest Error 0.00 test_non_contigiousvlan.py
runTest Error 0.00 test_nonstrict_affinity_group.py
runTest Error 0.00 test_outofbandmanagement_nestedplugin.py
runTest Error 0.00 test_outofbandmanagement.py
runTest Error 0.00 test_over_provisioning.py
runTest Error 0.00 test_password_server.py
runTest Error 0.00 test_persistent_network.py
runTest Error 0.00 test_portable_publicip.py
runTest Error 0.00 test_portforwardingrules.py
runTest Error 0.00 test_primary_storage.py
runTest Error 0.00 test_primary_storage_scope.py
runTest Error 0.00 test_privategw_acl_ovs_gre.py
runTest Error 0.00 test_privategw_acl.py
runTest Error 0.00 test_projects.py
runTest Error 0.00 test_public_ip_range.py
runTest Error 0.00 test_purge_expunged_vms.py
runTest Error 0.00 test_pvlan.py
runTest Error 0.00 test_quarantined_ips.py
runTest Error 0.00 test_regions.py
runTest Error 0.00 test_register_userdata.py
runTest Error 0.00 test_reset_configuration_settings.py
runTest Error 0.00 test_reset_vm_on_reboot.py
runTest Error 0.00 test_resource_accounting.py
runTest Error 0.00 test_resource_detail.py
runTest Error 0.00 test_resource_names.py
runTest Error 0.00 test_restore_vm.py
runTest Error 0.00 test_router_dhcphosts.py
runTest Error 0.00 test_router_dns.py
runTest Error 0.00 test_router_dnsservice.py
runTest Error 0.00 test_routers_iptables_default_policy.py
runTest Error 0.00 test_routers_network_ops.py
runTest Error 0.00 test_routers.py
runTest Error 0.00 test_scale_vm.py
runTest Error 0.00 test_secondary_storage.py
runTest Error 0.00 test_service_offerings.py
runTest Error 0.00 test_set_sourcenat.py
runTest Error 0.00 test_sharedfs_lifecycle.py
runTest Error 0.00 test_snapshots.py
runTest Error 0.00 test_ssvm.py
runTest Error 0.00 test_storage_policy.py
runTest Error 0.00 test_templates.py
runTest Error 0.00 test_update_security_group.py
runTest Error 0.00 test_usage_events.py
runTest Error 0.00 test_usage.py
runTest Error 0.00 test_vm_autoscaling.py
runTest Error 0.00 test_vm_deployment_planner.py
runTest Error 0.00 test_vm_life_cycle.py
runTest Error 0.00 test_vm_lifecycle_unmanage_import.py
runTest Error 0.00 test_vm_schedule.py
runTest Error 0.00 test_vm_snapshot_kvm.py
runTest Error 0.00 test_vm_snapshots.py
runTest Error 0.00 test_vm_strict_host_tags.py
runTest Error 0.00 test_vnf_templates.py
runTest Error 0.00 test_volumes.py
runTest Error 0.00 test_vpc_ipv6.py
runTest Error 0.00 test_vpc_redundant.py
runTest Error 0.00 test_vpc_router_nics.py
runTest Error 0.00 test_vpc_vpn.py
runTest Error 0.00 test_webhook_delivery.py
runTest Error 0.00 test_webhook_lifecycle.py
runTest Error 0.00 test_host_maintenance.py
runTest Error 0.00 test_hostha_kvm.py

Member

@weizhouapache weizhouapache left a comment

@blueorangutan package

@sudo87
Collaborator Author

sudo87 commented Jun 19, 2025

@blueorangutan package

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14022

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for a new "COMBINED" ordering mode that ranks clusters, pods, and hosts by a weighted sum of CPU and memory usage, introduces a config key for the CPU-to-memory weight, updates the DAO and schema to fetch both capacity types, and adds related unit tests.

  • Added COMBINED option and HostCapacityTypeCpuMemoryWeight config key
  • Extended FirstFitPlanner, FirstFitAllocator, and CapacityDao to compute and sort by combined metrics
  • Updated tests, configuration enum, and DB migration for the new behavior

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
server/src/main/java/com/cloud/deploy/FirstFitPlanner.java Added helpers for combined CPU/memory ordering and refactored cluster/pod listing
server/src/main/java/com/cloud/agent/manager/allocator/impl/FirstFitAllocator.java Added combined-capacity ordering and helper methods for hosts
engine/components-api/src/main/java/com/cloud/configuration/ConfigurationManager.java Introduced HostCapacityTypeCpuMemoryWeight config key
server/src/main/java/com/cloud/configuration/Config.java Updated Config.HostCapacityTypeToOrderClusters to include COMBINED
engine/schema/src/main/resources/META-INF/db/schema-42010to42100.sql Migrated config description to mention COMBINED
engine/schema/src/main/java/com/cloud/capacity/dao/CapacityDao.java Updated DAO interface signatures for capacity listing
engine/schema/src/main/java/com/cloud/capacity/dao/CapacityDaoImpl.java Added methods to list capacities by types and removed obsolete signatures
server/src/test/java/com/cloud/vm/FirstFitPlannerTest.java Added tests for combined ordering on clusters and pods
server/src/test/java/com/cloud/agent/manager/allocator/impl/FirstFitAllocatorTest.java Added tests for combined ordering on hosts
engine/schema/src/test/java/com/cloud/capacity/dao/CapacityDaoImplTest.java Extended DAO tests for new listing methods
Comments suppressed due to low confidence (4)

server/src/main/java/com/cloud/deploy/FirstFitPlanner.java:526

  • Missing import for ArrayList: add 'import java.util.ArrayList;' to ensure this compiles.
        return new Pair<>(new ArrayList<>(podsByCombinedCapacities.keySet()), podsByCombinedCapacities);

server/src/main/java/com/cloud/deploy/FirstFitPlanner.java:564

  • Missing import for ArrayList: add 'import java.util.ArrayList;' to the top of the file.
        return new Pair<>(new ArrayList<>(clusterByCombinedCapacities.keySet()), clusterByCombinedCapacities);

server/src/main/java/com/cloud/agent/manager/allocator/impl/FirstFitAllocator.java:424

  • Missing import for HashMap: add 'import java.util.HashMap;' so this compiles without errors.
        Map<Long, Double> hostByComputedCapacity = new HashMap<>();

server/src/main/java/com/cloud/agent/manager/allocator/impl/FirstFitAllocator.java:419

  • Missing import for ArrayList: please add 'import java.util.ArrayList;'.
        return new Pair<>(new ArrayList<>(hostByComputedCapacity.keySet()), hostByComputedCapacity);

@sudo87
Collaborator Author

sudo87 commented Jul 9, 2025

@blueorangutan package

@blueorangutan

@sudo87 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14116

return new Pair<>(new ArrayList<>(podsByCombinedCapacities.keySet()), podsByCombinedCapacities);
}

// order pods by combining cpu and memory capacity considering cpuToMemoeryWeight
Contributor

Should this be Javadoc? (The method name already seems to be self-documenting.)

Collaborator Author

Sure @DaanHoogland, I will remove the comment in the next commit.

@rosi-shapeblue
Collaborator

@blueorangutan test

@blueorangutan

@rosi-shapeblue a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13752)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 55396 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10997-t13752-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_create_template Error 15.63 test_templates.py
test_CreateTemplateWithDuplicateName Error 21.24 test_templates.py
test_02_create_template_with_checksum_sha1 Error 65.63 test_templates.py
test_03_create_template_with_checksum_sha256 Error 65.65 test_templates.py

@sudo87
Collaborator Author

sudo87 commented Jul 14, 2025

@blueorangutan test

@blueorangutan

@sudo87 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Collaborator

@rosi-shapeblue rosi-shapeblue left a comment

LGTM. Verification passed.

@blueorangutan

[SF] Trillian test result (tid-13764)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 87838 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10997-t13764-kvm-ol8.zip
Smoke tests completed. 130 look OK, 11 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_nic_secondaryip_add_remove Error 1517.30 test_multipleips_per_nic.py
ContextSuite context=TestNestedVirtualization>:setup Error 0.00 test_nested_virtualization.py
ContextSuite context=TestNetworkACL>:setup Error 0.00 test_network_acl.py
ContextSuite context=TestIpv6Network>:setup Error 0.00 test_network_ipv6.py
test_delete_account Error 1516.61 test_network.py
test_delete_network_while_vm_on_it Error 1.23 test_network.py
test_deploy_vm_l2network Error 1.19 test_network.py
test_l2network_restart Error 2.31 test_network.py
ContextSuite context=TestPortForwarding>:setup Error 3.53 test_network.py
ContextSuite context=TestPublicIP>:setup Error 11.38 test_network.py
test_reboot_router Failure 0.09 test_network.py
test_releaseIP Error 5.63 test_network.py
test_releaseIP_using_IP Error 5.95 test_network.py
ContextSuite context=TestRouterRules>:setup Error 6.03 test_network.py
ContextSuite context=TestSharedNetworkWithConfigDrive>:setup Error 1520.97 test_network.py
ContextSuite context=TestPrivateGwACL>:setup Error 0.00 test_privategw_acl.py
ContextSuite context=TestAdapterTypeForNic>:setup Error 0.00 test_nic_adapter_type.py
ContextSuite context=TestNonStrictAffinityGroups>:setup Error 0.00 test_nonstrict_affinity_group.py
ContextSuite context=TestIsolatedNetworksPasswdServer>:setup Error 0.00 test_password_server.py
ContextSuite context=TestPortForwardingRules>:setup Error 0.00 test_portforwardingrules.py
ContextSuite context=TestProjectSuspendActivate>:setup Error 1527.13 test_projects.py

@github-actions

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@rohityadavcloud rohityadavcloud merged commit e8ab0ae into apache:main Jul 15, 2025
23 of 24 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Apache CloudStack 4.21.0 Jul 15, 2025
@rohityadavcloud rohityadavcloud deleted the clusterOrderVMAlloc branch July 15, 2025 11:10
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Aug 1, 2025
* CPU to Memory weight based algorithm to order cluster
host.capacityType.to.order.clusters config will support new algorithm: COMBINED
which will work with host.capacityType.to.order.clusters.cputomemoryweight and capacity will be
computed based on CPU and memory both and using weight factor

* minor changes

* add unit tests

* update desc and add validation

* handle copilot review comments

* add log indicating chosen capacityType for ordering

---------

Co-authored-by: Rohit Yadav <[email protected]>

Projects

No open projects
Status: Done

7 participants