
Step 3: Coordinating multiple VMs

Plamen Dimitrov edited this page Dec 23, 2020 · 2 revisions

Objective

In the third iteration we will write a test involving multiple virtual machines. The goal is to have one VM establish a connection to another via ping and to check for the corresponding files and hostnames on the two VMs acting as client and server. We will accomplish this goal in three steps:

  1. Set up the target VMs, a server one as well as a client one.
  2. Verify connectivity by mutual ping.
  3. Test the presence of correct files as well as hostnames in various ways.

Cartesian config

The complexity of the third test justifies listing it as a toplevel variant rather than a subtest of the quicktest class. The usual approach when defining new variants in groups.cfg is to start with the simpler ones and to place the more complex ones further down. Without further ado, here is the entry:

- tutorial3:
    get_state_vm1 = on_connect
    get_state_vm2 = on_customize
    type = tutorial_step_3
    vms = vm1 vm2
    roles = client server
    client = vm2
    server = vm1
    host_dhcp_service = yes

The type parameter defines the script to run on the host for this test. Note the vms parameter, whose value is a whitespace-delimited list of identifiers specifying the virtual machines we will use.
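How the vms and roles parameters resolve into role-to-VM mappings can be sketched roughly as follows (map_roles is a hypothetical helper written only for this illustration; the values mirror the config entry above):

```python
# Hypothetical helper illustrating the vms/roles parameters above: each role
# parameter (e.g. "server = vm1") names one of the VMs listed in "vms".
def map_roles(params):
    vms = params["vms"].split()
    roles = params["roles"].split()
    mapping = {}
    for role in roles:
        vm_name = params[role]
        assert vm_name in vms, "role %s points to unknown VM %s" % (role, vm_name)
        mapping[role] = vm_name
    return mapping

config = {"vms": "vm1 vm2", "roles": "client server",
          "client": "vm2", "server": "vm1"}
assert map_roles(config) == {"client": "vm2", "server": "vm1"}
```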

The host test script

The first part of the test script that runs on the host is quite simple. It will first start the VMs' sessions and assign some roles:

def run(test, params, env):
    vmnet = env.get_vmnet()
    vmnet.start_all_sessions()
    vms = vmnet.get_vms()
    server_vm = vms.server
    client_vm = vms.client
    vmnet.ping_all()

    # call to a function shared among tests
    sleep(3)

Those roles let the test code refer to VMs quickly through their roles rather than directly by hard-coded names. Once the participants and the network of the test are set up, the test does a quick mutual ping among all of them to ensure full connectivity, and reuses the sleep() function from the sample_utility (again without a clear purpose, purely for illustration). The main part of the test then remains simple:

tmp_server = server_vm.session.cmd("ls " + server_vm.params["tmp_dir"])
tmp_client = client_vm.session.cmd("dir " + client_vm.params["tmp_dir"])
deployed_folders = ("data", "utils", "packages")
for folder in deployed_folders:
    if folder not in tmp_server:
        raise exceptions.TestFail("No deployed %s was found on the server" % folder)
    if folder not in tmp_client:
        raise exceptions.TestFail("No deployed %s was found on the client" % folder)

If any of the data, utils, or packages folders deployed during the customization stage is missing, the test will fail.

Enhanced remote checks

In addition to the basic session and session ops approaches described in the previous tutorial, we can use four more advanced and thus more flexible methods of executing code remotely, which we will try out in the following order:

  1. using a remote utility call
  2. using a remote decorator
  3. using a remote control file
  4. using a remote object

These are reflected in a four-tier additional remote subvariant in the Cartesian config of the test, which is not included in a normal run since it relies entirely on additional dependencies that offer these enhanced features:

- tutorial3:
    ...
    variants:
        - @no_remote:
            enhanced_remote_checks = no
        - remote:
            enhanced_remote_checks = yes
            variants:
                - @no_util:
                    remote_util_check = no
                - util:
                    remote_util_check = yes
            variants:
                - @no_decorator:
                    remote_decorator_check = no
                - decorator:
                    remote_decorator_check = yes
                    walk_prefix = /etc
                    must_exist_in_walk = fstab
            variants:
                - @no_control:
                    remote_control_check = no
                - control:
                    remote_control_check = yes
                    root_dir = /tmp
                    control_file = tutorial_step_3.control
            variants:
                - @no_object:
                    remote_object_check = no
                - object:
                    remote_object_check = yes

Parsing the above definitions combines them into a Cartesian product like the following:

VT 1-vm1vm2-all.tutorial3.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 2-vm1vm2-all.tutorial3.remote.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 3-vm1vm2-all.tutorial3.remote.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 4-vm1vm2-all.tutorial3.remote.decorator.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 5-vm1vm2-all.tutorial3.remote.decorator.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 6-vm1vm2-all.tutorial3.remote.control.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 7-vm1vm2-all.tutorial3.remote.control.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 8-vm1vm2-all.tutorial3.remote.control.decorator.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 9-vm1vm2-all.tutorial3.remote.control.decorator.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 10-vm1vm2-all.tutorial3.remote.object.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 11-vm1vm2-all.tutorial3.remote.object.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 12-vm1vm2-all.tutorial3.remote.object.decorator.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 13-vm1vm2-all.tutorial3.remote.object.decorator.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 14-vm1vm2-all.tutorial3.remote.object.control.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 15-vm1vm2-all.tutorial3.remote.object.control.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 16-vm1vm2-all.tutorial3.remote.object.control.decorator.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64
VT 17-vm1vm2-all.tutorial3.remote.object.control.decorator.util.vm1.virtio_blk.smp2.virtio_net.CentOS.7.0.x86_64.vm2.smp2.Win10.x86_64

In the additionally generated tutorial3 tests, each feature combination is tested separately, with the "@" token marking a default variant whose name is dropped from the final variant name. Now let's look at each of the four tested features.
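This expansion can be modeled roughly with itertools.product: each two-way variant block contributes either its @-default (an empty name component) or its named variant, which is why 1 + 2×2×2×2 = 17 tests come out of the definitions above (a rough model for intuition, not the actual Cartesian parser):

```python
import itertools

# Rough model of the variant expansion above: each two-way block contributes
# either its @-default (empty name component) or its named variant.
tiers = [["", "object"], ["", "control"], ["", "decorator"], ["", "util"]]
names = ["tutorial3"]                      # the @no_remote default variant
for combo in itertools.product(*tiers):
    names.append(".".join(["tutorial3", "remote"] + [c for c in combo if c]))

assert len(names) == 17                    # matches VT 1 through VT 17 above
```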

Remote utility calls

The simplest remote execution we can perform is a single call to a utility or module. Under the hood, this is similar to running a Python script on the VM consisting of just a few lines that import the module or utility and call the desired function. The tutorial code performs two such exemplary calls:

if params.get_boolean("remote_util_check"):
    door.run_remote_util(
        server_vm.session,
        "os",
        "listdir",
        server_vm.params["tmp_dir"].replace("\\", r"\\"),
    )
    door.run_remote_util(
        server_vm.session,
        "subprocess",
        "call",
        "dir " + client_vm.params["tmp_dir"].replace("\\", r"\\"),
        shell=True
    )

For instance, the script that the second call creates is similar to the one below:

import subprocess
subprocess.call("dir <tmp_dir>", shell=True)

As you can see, it is a simple subprocess call that will run the dir command on the Windows client, while the first call will do something similar on the Linux server.
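The assembly of such one-off scripts can be sketched roughly like this (make_remote_script is a hypothetical helper written only for this illustration; the actual remote door machinery is more involved):

```python
def make_remote_script(module, function, *args, **kwargs):
    """Assemble a one-off script like those shipped to the VM
    (illustrative sketch only, not the actual door implementation)."""
    rendered = [repr(a) for a in args]
    rendered += ["%s=%r" % item for item in kwargs.items()]
    return "import {0}\n{0}.{1}({2})\n".format(module, function, ", ".join(rendered))

# mirrors the first run_remote_util call above, with a generic directory
script = make_remote_script("os", "listdir", ".")
```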

Remote decorator functions

The second easiest remote method involves decorating a function that will in turn be executed remotely on the VM. This only requires the VM session to be passed as the first positional argument

if params.get_boolean("remote_decorator_check"):
    check_walk(server_vm.session, params)

where, to check through Python code on the remote machine, we first have to decorate the function to be run there with the "run_remotely" decorator, and then write a function body that does not assume any previous host-bound imports:

@door.run_remotely
def check_walk(params):
    walk_prefix = params["walk_prefix"]
    walk_goal = params["must_exist_in_walk"]

    import os
    for base_path, dir_names, file_names in os.walk(walk_prefix):
        if walk_goal in file_names:
            break
    else:
        raise AssertionError("Couldn't find %s inside %s" % (walk_goal, walk_prefix))

The same applies to exceptions, similarly to the way we will write a control file for the third remote method below.

The check_walk() function is defined in the helpers module section and checks for the existence of a file by walking through directories, using Python code written on the host but run on the guest. It does this with the two additional walk_prefix and must_exist_in_walk parameters, passed together as the only real argument to the decorated function.

Remote control files

For the third advanced remote method we will execute a control file directly on the remote VM. Note that we get the name of the control file from the Cartesian configuration and call functions like set_subcontrol_parameter with it. This causes the file controls/tutorial_step_3.control to be preprocessed for execution on the guest machine. For the curious, the resulting control file can be inspected in the results folder of the current test (tutorial3) for debugging purposes (check the job.log for more details on all the control files used in the background by most of these remote methods).

if params.get_boolean("remote_control_check"):
    control_path = server_vm.params["control_file"]
    control_path = door.set_subcontrol_parameter(control_path, "EXTRA_SLEEP", 2)
    control_path = door.set_subcontrol_parameter(control_path, "ROOT_DIR", params["root_dir"])
    control_path = door.set_subcontrol_parameter_list(control_path, "DETECT_DIRS", ["data", "utils"])
    control_path = door.set_subcontrol_parameter_dict(control_path, "SIMPLE_PARAMS",
                                                      {"client": server_vm.params["client"],
                                                       "server": server_vm.params["server"]})
    door.run_subcontrol(server_vm.session, control_path)

Finally, this initiates the guest test run by calling door.run_subcontrol(). The control file is just a Python script that we write entirely ourselves and thus have full control over, so we have more flexibility than with the previous two remote methods. However, control files are the least preferred way of running code on guest VMs. They should only be used as a last resort when there is no better way (such as remote objects or remote shell commands), because they can make the solution overly complex, they don't have access to the Avocado library of functions, and they don't provide good test feedback.

Let's now take a closer look at the code in a simple control file like tutorial_step_3.control, used in all control-enabled tutorial3 tests, where all significant non-template code is located between the first and last logging messages:

# CONSTANTS

SLEEP_TIME = 3


# HELPERS

def read_pipe(cmd):
    pipe = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE).stdout
    if not pipe:
        raise IOError("failed to open pipe '%s'" % cmd)
    output = pipe.read().decode()
    return output


# MAIN PART

logging.info("Sleeping for %s seconds from the control file", SLEEP_TIME+EXTRA_SLEEP)
sample_utility.sleep(SLEEP_TIME)
sample_utility.sleep(EXTRA_SLEEP)

logging.info("List current directory and some files in it")
present_entries = os.listdir(ROOT_DIR)
wanted_entries = DETECT_DIRS
for entry in wanted_entries:
    if entry not in present_entries:
        raise AssertionError("Wanted entry '%s' not detected in %s" % (entry, present_entries))

params = SIMPLE_PARAMS
own_name = params.get("server", "vm2")
other_name = params.get("client", "vm1")
logging.info("Identified own name '%s' and peer name '%s'", own_name, other_name)
hostname = read_pipe("hostname").rstrip()
assert hostname == own_name, "%s != %s" % (hostname, own_name)
assert hostname != other_name, "%s == %s" % (hostname, other_name)

We have similar but simplified module sections with constants and helpers, and we reuse the deployed sample utility for some sleep calls; most importantly, though, we perform another subdirectory detection on the guest and compare the expected own and peer hostnames.

Note here that we don't raise TestFail but AssertionError instead. This is because guest VMs don't have Avocado installed and thus cannot use its custom exceptions. This is just one example of the limitations faced by control files and all code executed through them.

Remote objects

This last remote method is far preferable to the third, which should only be used as a last resort if everything else fails. It is contained for the most part in get_remote_object(), through which we obtain a remote sysmisc module that we can call and manipulate for as long as we like later on.

if params.get_boolean("remote_object_check"):
    if not guest_serialization or not host_serialization:
        raise exceptions.TestSkipError("The remote door object backend (pyro) is not available")
    sysmisc = door.get_remote_object("sample_utility",
                                     session=server_vm.wait_for_login(),
                                     host=server_vm.params["ip_" + server_vm.params["ro_nic"]],
                                     port=server_vm.params["ro_port"])
    sysmisc.sleep(5)

The main difference between using a remote object and a simple remote utility call (as in the first remote method) is that the remote object offers data persistence: it keeps a memory of its previous state, whereas each remote utility call is an execution of a memoryless procedure. In our case, we perform a single call to the sleep method of the sample utility (here called sysmisc to emphasize that this sample utility is located on the VM and not on the host, where we have already imported a local sample_utility), which could just as well be implemented in easier ways. However, the importance of this data persistence becomes even clearer when comparing with the remote decorator and control methods: while those methods support multi-line remote code, they cannot provide any feedback on the execution other than a simple error. In contrast, the host test code can go back and forth between local and remote calls and thus manipulate data of both native and foreign origin.
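The contrast can be made concrete with two toy models (both classes and functions here are hypothetical, purely for illustration, and have nothing to do with the actual pyro-backed door objects):

```python
class RemoteObjectSketch:
    """Toy model of a remote object's data persistence:
    state survives between calls on the same proxy."""
    def __init__(self):
        self.history = []
    def record(self, value):
        self.history.append(value)
        return len(self.history)

def remote_util_call_sketch(value):
    """Toy model of a remote utility call: a fresh, memoryless execution."""
    history = []                      # recreated from scratch on every call
    history.append(value)
    return len(history)

proxy = RemoteObjectSketch()
assert proxy.record("a") == 1 and proxy.record("b") == 2    # state persists
assert remote_util_call_sketch("a") == remote_util_call_sketch("b") == 1
```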

Bonus notes on guest tests

Ultimately, in practice we could mix whichever of the above methods are most helpful for a given situation. For instance, we could use the remote object's advantage of data persistence to return useful information about the state of execution of a control file by sharing the test parameters as a remote object from the host code to the VM (control) code. This is one very useful mixture of the third and fourth remote methods for even more interesting yet practical situations where a remote object cannot be used due to serialization restrictions but a remote control file has to be more interactive. A final block implementing this can be found in the host test script:

if host_serialization and guest_serialization:
    control_path = server_vm.params["control_file"].replace("step_3", "step_3.2")
    control_path = door.set_subcontrol_parameter_object(control_path,
                                                        server_vm.params)
    door.run_subcontrol(server_vm.session, control_path)
    failed_checks = server_vm.params["failed_checks"]
    if failed_checks > 0:
        raise exceptions.TestFail("%s hostname checks failed" % failed_checks)

The second control file, called tutorial_step_3.2.control, is prepared using a single parameter, URI, which is the URI of the test parameters shared from the host as a remote object. We restore the parameters through a proxy extension for dictionaries, then use the previously required keys (from the more basic control) like the "server" and "client" roles for the same hostname comparison tests:

params = door.params_from_uri(URI)
failed_checks = 0

own_name = params.get("server", "vm2")
other_name = params.get("client", "vm1")

hostname = process.run("hostname", shell=True).stdout_text.rstrip()
failed_checks += 1 if hostname != own_name else 0
failed_checks += 1 if hostname == other_name else 0

params["failed_checks"] = failed_checks

The difference now is that we can also count the number of failed checks encountered, which will be used by the host test once the control file completes. We do this by setting a new network-provided parameter, making sure this count is available for further validation to the (host) test running the control file, thus emulating a return value from the control file. Needless to say, this can be used for all sorts of communication back from the control file, but a pure remote object still provides the freedom to switch back and forth between local and remote execution.
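The writeback idea can be sketched with a toy dict proxy (SharedParamsSketch is a hypothetical class for this illustration only; the real params proxy travels over the remote object backend):

```python
class SharedParamsSketch(dict):
    """Toy model of the network-shared params proxy: writes made by the
    guest-side control code remain visible to the host afterwards."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.written_keys = []
    def __setitem__(self, key, value):
        self.written_keys.append(key)
        super().__setitem__(key, value)

# the "host" side shares the parameters with the control code ...
params = SharedParamsSketch(server="vm1", client="vm2")
# ... and the "guest" control code reports back through the same mapping
params["failed_checks"] = 0
assert params["failed_checks"] == 0 and "failed_checks" in params.written_keys
```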
