@sureshanaparti
Contributor

@sureshanaparti sureshanaparti commented Feb 6, 2025

Description

This PR validates the file format (QCOW2) of a direct-download template only if the template file exists; otherwise it skips validation and logs a warning.
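
As a rough illustration of the behaviour described above (not the code merged in this PR), a minimal Java sketch might look like the following. It assumes that checking the four-byte QCOW2 magic (`QFI` followed by `0xFB`) is an acceptable stand-in for the real format validation; the class `DirectDownloadTemplateCheck`, its methods, and the `Result` enum are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.logging.Logger;

// Illustrative sketch only: hypothetical names, not CloudStack APIs.
final class DirectDownloadTemplateCheck {
    enum Result { VALID, INVALID, SKIPPED }

    private static final Logger LOG =
            Logger.getLogger(DirectDownloadTemplateCheck.class.getName());

    // A QCOW2 image starts with the bytes 'Q', 'F', 'I', 0xFB.
    private static final byte[] QCOW2_MAGIC = {0x51, 0x46, 0x49, (byte) 0xFB};

    // Returns true when the given header bytes begin with the QCOW2 magic.
    static boolean hasQcow2Magic(byte[] header) {
        return header != null
                && header.length >= QCOW2_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, QCOW2_MAGIC.length), QCOW2_MAGIC);
    }

    // Validates the file only if it exists; otherwise logs a warning and skips.
    static Result validateIfPresent(Path templatePath) {
        if (templatePath == null || !Files.isRegularFile(templatePath)) {
            LOG.warning("Skipped QCOW2 validation, template path is not valid: " + templatePath);
            return Result.SKIPPED;
        }
        try (InputStream in = Files.newInputStream(templatePath)) {
            // Read at most the first four bytes; a shorter file cannot be QCOW2.
            byte[] header = in.readNBytes(QCOW2_MAGIC.length);
            return hasQcow2Magic(header) ? Result.VALID : Result.INVALID;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a caller would treat `SKIPPED` as non-fatal and leave it to later steps to report a missing file, while `INVALID` would be the case that rejects the template.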

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@apache apache deleted a comment from blueorangutan Feb 6, 2025
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@sureshanaparti sureshanaparti added this to the 4.20.1 milestone Feb 6, 2025
@codecov

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.

Project coverage is 15.99%. Comparing base (c5afee2) to head (75eaa5b).
Report is 26 commits behind head on 4.20.

Files with missing lines Patch % Lines
...ud/hypervisor/kvm/storage/KVMStorageProcessor.java 0.00% 7 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10332      +/-   ##
============================================
- Coverage     16.00%   15.99%   -0.01%     
- Complexity    13062    13064       +2     
============================================
  Files          5644     5644              
  Lines        494915   494920       +5     
  Branches      59960    59962       +2     
============================================
- Hits          79187    79185       -2     
- Misses       406891   406897       +6     
- Partials       8837     8838       +1     
Flag Coverage Δ
uitests 4.01% <ø> (ø)
unittests 16.83% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.


@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12356

@sureshanaparti
Contributor Author

@blueorangutan test

@blueorangutan

@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-12322)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 59569 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12322-kvm-ol8.zip
Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_11_isolated_network_with_dynamic_routed_mode Error 2.29 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
test_06_purge_expunged_vm_background_task Failure 397.35 test_purge_expunged_vms.py
test_12_start_vm_multiple_volumes_allocated Error 15.10 test_vm_life_cycle.py

Member

@rohityadavcloud rohityadavcloud left a comment

LGTM

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-12339)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 57382 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12339-kvm-ol8.zip
Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_11_isolated_network_with_dynamic_routed_mode Error 2.29 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.35 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.36 test_ipv4_routing.py
test_06_purge_expunged_vm_background_task Failure 382.78 test_purge_expunged_vms.py
test_12_start_vm_multiple_volumes_allocated Error 14.88 test_vm_life_cycle.py

@DaanHoogland
Contributor

> [SF] Trillian test result (tid-12339)
> Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
> Total time taken: 57382 seconds
> Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12339-kvm-ol8.zip
> Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
> Only failed and skipped tests results shown below:
>
> Test Result Time (s) Test File
> test_11_isolated_network_with_dynamic_routed_mode Error 2.29 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.35 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.36 test_ipv4_routing.py
> test_06_purge_expunged_vm_background_task Failure 382.78 test_purge_expunged_vms.py
> test_12_start_vm_multiple_volumes_allocated Error 14.88 test_vm_life_cycle.py

These errors seem consistent. I'm not sure what the state of 4.20 is right now, though.

@weizhouapache
Member

> [SF] Trillian test result (tid-12339)
> Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
> Total time taken: 57382 seconds
> Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12339-kvm-ol8.zip
> Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
> Only failed and skipped tests results shown below:
>
> Test Result Time (s) Test File
> test_11_isolated_network_with_dynamic_routed_mode Error 2.29 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.35 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.36 test_ipv4_routing.py
> test_06_purge_expunged_vm_background_task Failure 382.78 test_purge_expunged_vms.py
> test_12_start_vm_multiple_volumes_allocated Error 14.88 test_vm_life_cycle.py
>
> These errors seem consistent. I'm not sure what the state of 4.20 is right now, though.

#10252

@kiranchavala
Contributor

@blueorangutan package

@blueorangutan

@kiranchavala a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12398

@kiranchavala
Contributor

@blueorangutan test

@blueorangutan

@kiranchavala a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Contributor

@kiranchavala kiranchavala left a comment

LGTM, was able to test it with a debugger attached

  1. Register a template as direct download

  2. Deploy a vm from the Template

  3. Get the details of the template local path from the database

mysql > select * from template_spool_ref where template_id=<>;

*************************** 3. row ***************************
               id: 3
          pool_id: 1
      template_id: 203
          created: 2025-02-11 08:12:07
     last_updated: NULL
           job_id: NULL
     download_pct: 100
   download_state: DOWNLOADED
        error_str: NULL
       local_path: 53a5bace-b791-4e83-9d30-67848b0df3e5
     install_path: 53a5bace-b791-4e83-9d30-67848b0df3e5
    template_size: 2147418112
    marked_for_gc: 0
            state: Ready
     update_count: 0
          updated: NULL
deployment_option: NULL

  4. Log in to the KVM host and delete the template

rm -rf /mnt/d94e3a76-d728-320c-8a27-b0ee6e03757b/53a5bace-b791-4e83-9d30-67848b0df3e5

  5. During VM deployment, CloudStack checks whether the QCOW2 file exists and fails the operation if the file is missing

Agent Logs

2025-02-11 08:12:07,293 WARN  [kvm.storage.KVMStorageProcessor] (AgentRequest-Handler-5:[]) (logid:) Skipped validation whether downloaded file is QCOW2 for template 53a5bace-b791-4e83-9d30-67848b0df3e5, due to downloaded template path is not valid: /mnt/d94e3a76-d728-320c-8a27-b0ee6e03757b/53a5bace-b791-4e83-9d30-67848b0df3e5

2025-02-11 08:17:12,441 DEBUG [kvm.storage.KVMStorageProcessor] (AgentRequest-Handler-1:[]) (logid:) Failed to create volume: com.cloud.utils.exception.CloudRuntimeException: Can't find volume:53a5bace-b791-4e83-9d30-67848b0df3e5
	at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.cloneVolumeFromBaseTemplate(KVMStorageProcessor.java:454)
2025-02-11 08:22:22,237 DEBUG [kvm.storage.KVMStorageProcessor] (AgentRequest-Handler-2:[]) (logid:) Failed to create volume: com.cloud.utils.exception.CloudRuntimeException: Can't find volume:53a5bace-b791-4e83-9d30-67848b0df3e5

@blueorangutan

[SF] Trillian test result (tid-12369)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 62950 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12369-kvm-ol8.zip
Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_11_isolated_network_with_dynamic_routed_mode Error 2.25 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
test_06_purge_expunged_vm_background_task Failure 388.54 test_purge_expunged_vms.py
test_12_start_vm_multiple_volumes_allocated Error 13.09 test_vm_life_cycle.py

@sureshanaparti sureshanaparti marked this pull request as ready for review February 12, 2025 09:28
@sureshanaparti
Contributor Author

@blueorangutan test

@blueorangutan

@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@sureshanaparti
Contributor Author

sureshanaparti commented Feb 13, 2025

> [SF] Trillian test result (tid-12369)
> Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
> Total time taken: 62950 seconds
> Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12369-kvm-ol8.zip
> Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
> Only failed and skipped tests results shown below:
>
> Test Result Time (s) Test File
> test_11_isolated_network_with_dynamic_routed_mode Error 2.25 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode Error 2.37 test_ipv4_routing.py
> test_06_purge_expunged_vm_background_task Failure 388.54 test_purge_expunged_vms.py
> test_12_start_vm_multiple_volumes_allocated Error 13.09 test_vm_life_cycle.py

These failures are not related to this PR's changes; the 4.20 health check shows the same results here: #10006 (comment). They seem to be consistent, as per the results above. cc @DaanHoogland @weizhouapache @kiranchavala

@rohityadavcloud
Member

LGTM, thanks for the PR and testing @sureshanaparti @kiranchavala

@rohityadavcloud rohityadavcloud merged commit 8c4a085 into apache:4.20 Feb 13, 2025
25 of 26 checks passed
@rohityadavcloud rohityadavcloud deleted the direct_downloaded_template_qcow2_validation branch February 13, 2025 06:39
@blueorangutan

[SF] Trillian test result (tid-12379)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 59879 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10332-t12379-kvm-ol8.zip
Smoke tests completed. 138 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_11_isolated_network_with_dynamic_routed_mode Error 2.29 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.39 test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode Error 3.39 test_ipv4_routing.py
test_06_purge_expunged_vm_background_task Failure 397.65 test_purge_expunged_vms.py
test_12_start_vm_multiple_volumes_allocated Error 15.17 test_vm_life_cycle.py

@Pearl1594 Pearl1594 moved this to Done in ACS 4.20.1 Mar 17, 2025
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Jun 19, 2025
…mplate file exists (apache#10332)

* Validate the direct downloaded template file format (QCOW2) if the template file exists

* string format not required
Projects

Status: Done

6 participants