Skip to content

Conversation

rushikeshjadhav
Copy link
Contributor

Added test_linstor_sr_pool_update for rolling pool update with VM availability check.

It performs

  • A rolling update and reboot of all hosts in a LINSTOR SR pool (starting with the master).
  • Verifies VM can start and shutdown successfully on each host after update.
  • Ensures SR remains usable throughout the process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky part here is ensuring that we do have packages to test the rolling pool update. That means starting with a pool which has an older version of the packages compared to the ones we want to test. That also likely means moving this kind of test to a specific job, or have an orchestrator test (or fixtures) that prepares a nested pool for this test.

See with @ydirson, as I think there are common challenges with the installation and upgrade tests.

Copy link
Contributor Author

@rushikeshjadhav rushikeshjadhav May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we limit this to xcp-ng-linstor-testing packages to keep it in linstor tests? Along side of main update tests, this can be specific for linstor related testing package tests. For this test pool_with_linstor fixture will have to run without enablerepo="xcp-ng-linstor-testing", so that we can reliably test "upgrade" from main repo linstor vs new one.

Larger installation and upgrade tests can happen in different job.

# repoquery -a --repoid=xcp-ng-linstor-testing
linstor-common-0:1.29.2-1.el7_9.noarch
linstor-controller-0:1.29.2-1.el7_9.noarch
linstor-satellite-0:1.29.2-1.el7_9.noarch
# repoquery -a --repoid=xcp-ng-linstor
linstor-common-0:1.29.0-1.el7_9.noarch
linstor-controller-0:1.29.0-1.el7_9.noarch
linstor-satellite-0:1.29.0-1.el7_9.noarch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stormi @ydirson can you check if the latest approach of using a marker upgrade_test to conditionally select testing repo is useful?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comments, I don't really grasp what you tried to achieve with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky part here is ensuring that we do have packages to test the rolling pool update. That means starting with a pool which has an older version of the packages compared to the ones we want to test. That also likely means moving this kind of test to a specific job, or have an orchestrator test (or fixtures) that prepares a nested pool for this test.

What we want to do depends on what we want to test. If we're testing that a rolling update works, one very important parameter is which version we're updating from, which raises the question of which such updates are supported (and incidentally, what does the product to, and what we should check, for unsupported updates. Need input from @Wescoeur and @Nambrok here.

My feeling here is that we want to setup specific pools first (nested, I'd say), and setup the linstor cluster with the specific version (a test parameter) that we want to upgrade from.

As to which repo to take the update from, we may want to allow devs to select it (ie have control on what they want to test), so I'd rather have this simply controlled by data.py.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to test progressive updates to not complicate our lives for the moment. The advantage of nested pools allows us to have an image at a given version and to easily rollback (and of course to select any LINSTOR version). Now this does not allow us to test on "real machines", if we want to do without nested ones, it would have to save the initial /var/lib/linstor folder as well as several config files, it is not necessarily easy to do...

If we don't have a lot of time for now, the nested solution "should" be ok.

Comment on lines 152 to 156
h.install_updates()
h.reboot(verify=True)
vm.start(on=h.uuid)
vm.wait_for_os_booted()
vm.shutdown(verify=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like the proper way to upgrade a pool with linstor. They don't support rolling upgrade, so you have to update all satellites and restart them, at some point. @Wescoeur has this documented somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's https://docs.xcp-ng.org/management/updates/#rolling-pool-update-rpu I have updated the code considering above.

@rushikeshjadhav rushikeshjadhav force-pushed the feat-storage-linstor-619 branch from 25af1c3 to 2ec6fe5 Compare May 27, 2025 13:42
…po in pool_with_linstor

This mechanism allows the test to start with our without testing repo packages

Signed-off-by: Rushikesh Jadhav <[email protected]>
Tests that want to test linstor upgrade can enablerepo during execution

Signed-off-by: Rushikesh Jadhav <[email protected]>
…ity check.

- Uses `@pytest.mark.upgrade_test` to mark need of upgrade during the test
- Updates of all hosts in a LINSTOR SR pool (starting with the master)
- Reboots updated hosts
- Verifies VM can start and shutdown successfully on each host after update
- Ensures SR remains usable throughout the process

Signed-off-by: Rushikesh Jadhav <[email protected]>
@rushikeshjadhav rushikeshjadhav force-pushed the feat-storage-linstor-619 branch from 2ec6fe5 to f2a8994 Compare May 27, 2025 18:28
Copy link
Contributor

@ydirson ydirson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, please format the commit message as per the standard converntions

Comment on lines +52 to +55
def pytest_configure(config):
config._linstor_upgrade_test = False

def pytest_collection_modifyitems(config, items):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a note that those are pytest hooks (and, for the 2nd one, when it is called) it would be useful to the reader

import concurrent.futures

dont_use_testing_repo = request.config._linstor_upgrade_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear at first read what this does: this line should have an explanation for future readers of the file, and the commit message should give an overview. And using a variable with a negative name does not help: something like use_testing_repo = not request.config._linstor_upgrade_test would be more readable - but that would not be enough in itself to understand the logic.

That's especially hard to follow, as the extra testing repo seems to be configured only for the upgrade test.

logging.info("Install updates on host %s" % self)
return self.ssh(['yum', 'update', '-y'])
enablerepo_cmd = ['--enablerepo=%s' % enablerepo] if enablerepo is not None else []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is not a command, should not be named _cmd

@@ -359,9 +359,10 @@ def pool_has_vm(self, vm_uuid, vm_type='vm'):
else:
return self.xe('vm-list', {'uuid': vm_uuid}, minimal=True) == vm_uuid

def install_updates(self):
def install_updates(self, enablerepo=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enablerepo sounds like a bool. extra_repo, maybe?
Also adding a type hint would help

@@ -5,6 +5,7 @@ markers =
default_vm: mark a test with a default VM in case no --vm parameter was given.

# *** Markers used to select tests at collect stage ***
upgrade_test: mark a test which will upgrade packages from testing repo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line does not warrant a separate commit, it logically belongs to the next commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comments, I don't really grasp what you tried to achieve with that.

3. Reboot all hosts.
4. Sequentially ensure that the VM can start on all hosts.
"""
import concurrent.futures, threading
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not much reason not to do that at top of file

@@ -5,6 +5,7 @@ markers =
default_vm: mark a test with a default VM in case no --vm parameter was given.

# *** Markers used to select tests at collect stage ***
upgrade_test: mark a test which will upgrade packages from testing repo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I gather this is about an "update" (ie using yum) not an "upgrade" (which uses the installer ISO)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky part here is ensuring that we do have packages to test the rolling pool update. That means starting with a pool which has an older version of the packages compared to the ones we want to test. That also likely means moving this kind of test to a specific job, or have an orchestrator test (or fixtures) that prepares a nested pool for this test.

What we want to do depends on what we want to test. If we're testing that a rolling update works, one very important parameter is which version we're updating from, which raises the question of which such updates are supported (and incidentally, what does the product to, and what we should check, for unsupported updates. Need input from @Wescoeur and @Nambrok here.

My feeling here is that we want to setup specific pools first (nested, I'd say), and setup the linstor cluster with the specific version (a test parameter) that we want to upgrade from.

As to which repo to take the update from, we may want to allow devs to select it (ie have control on what they want to test), so I'd rather have this simply controlled by data.py.

Comment on lines +159 to +160
if host.has_updates(enablerepo="xcp-ng-linstor-testing"):
host.install_updates(enablerepo="xcp-ng-linstor-testing")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want a test for updates just doing nothing when by mistake it was given no updates, that should be a test error

@rushikeshjadhav rushikeshjadhav marked this pull request as draft June 16, 2025 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants