
Commit 9f1ebad

testscript: improve wait_for_ssh() network diagnosis and logging
This commit uses the new helpers in wait_for_ssh() to improve network diagnosis and logging in the case of an error. We will be able to better understand whether a network issue is caused by the VM host or the VM. More specifically, we will see whether a network issue is caused by the host losing an IP on one of its interfaces for whatever reason.

wait_for_ssh() now:
- checks if the host shares the same IPv4 net with the IP
- checks if the VM is pingable
- then tries to SSH into it

A possible "missing IP in VM host" error might be reported like this:

```
======================================================================
ERROR: test_hotplug (builtins.LibvirtTests.test_hotplug)
Tests device hot plugging with multiple devices of different types:
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 391, in test_hotplug
  File "<string>", line 2000, in wait_for_ssh
  File "<string>", line 2285, in assert_ip_in_local_192_168_net24
RuntimeError: The VM host is not in the same network as IP 192.168.5.2! It may has lost a IP?!
networks: [IPv4Network('192.168.1.0/24'), IPv4Network('192.168.2.0/24'), IPv4Network('192.168.3.0/24'), IPv4Network('192.168.4.0/24'), IPv4Network('192.168.100.0/24')]
`ip a`:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    altname ens3
    altname enx525400123456
    inet 10.0.2.15/24 metric 1024 brd 10.0.2.255 scope global dynamic eth0
       valid_lft 86360sec preferred_lft 86360sec
    inet6 fec0::b396:b54d:c532:85b8/64 scope site temporary dynamic
       valid_lft 86363sec preferred_lft 14363sec
    inet6 fec0::5054:ff:fe12:3456/64 scope site dynamic mngtmpaddr noprefixroute
       valid_lft 86363sec preferred_lft 14363sec
    inet6 fe80::5054:ff:fe12:3456/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:12:01:02 brd ff:ff:ff:ff:ff:ff
    altname enp0s9
    altname ens9
    altname enx525400120102
    inet 192.168.100.1/24 brd 192.168.100.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe12:102/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 22:0b:45:80:ae:68 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global br0
       valid_lft forever preferred_lft forever
    inet 192.168.2.1/24 brd 192.168.2.255 scope global br0
       valid_lft forever preferred_lft forever
    inet6 fe80::200b:45ff:fe80:ae68/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
5: br4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ea:27:4a:87:ff:89 brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.1/24 brd 192.168.4.255 scope global br4
       valid_lft forever preferred_lft forever
    inet6 fe80::e827:4aff:fe87:ff89/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
6: br3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 52:54:00:be:b0:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.1/24 brd 192.168.3.255 scope global br3
       valid_lft forever preferred_lft forever
7: vtap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:e5:b8:01 brd ff:ff:ff:ff:ff:ff
8: vtap1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:e5:b8:02 brd ff:ff:ff:ff:ff:ff
9: tap3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br3 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:e5:b8:03 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fee5:b803/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
10: tap4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br4 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:e5:b8:04 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fee5:b804/64 scope link proto kernel_ll
       valid_lft forever preferred_lft foreve
```

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
1 parent 99683c6 commit 9f1ebad
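
The body of `assert_ip_in_local_192_168_net24` is not part of this commit, so the following is only a minimal sketch of how such a "does the host share a network with this IP" check could be written with Python's standard `ipaddress` module. The function names `host_networks` and `assert_ip_in_networks` are hypothetical, and the address list is copied by hand from the `ip a` output in the commit message rather than read from a machine.

```python
import ipaddress


def host_networks(cidr_addresses):
    """Return the IPv4 networks for a list of 'address/prefix' strings."""
    interfaces = [ipaddress.ip_interface(cidr) for cidr in cidr_addresses]
    # Only IPv4 is considered here; IPv6 addresses are skipped.
    return [iface.network for iface in interfaces if iface.version == 4]


def assert_ip_in_networks(ip, networks):
    """Raise RuntimeError if `ip` is not inside any of `networks`."""
    addr = ipaddress.ip_address(ip)
    if not any(addr in net for net in networks):
        raise RuntimeError(
            f"The VM host is not in the same network as IP {ip}! "
            f"networks: {networks}"
        )


# Host addresses as they appear in the `ip a` output of the commit message.
networks = host_networks(
    [
        "192.168.1.1/24",
        "192.168.2.1/24",
        "192.168.3.1/24",
        "192.168.4.1/24",
        "192.168.100.1/24",
    ]
)

# 192.168.1.2 lies in 192.168.1.0/24, so this call passes silently.
assert_ip_in_networks("192.168.1.2", networks)
```

With the same address list, checking `192.168.5.2` raises `RuntimeError`, which mirrors the failure in the traceback: no host interface holds an address in `192.168.5.0/24`.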


1 file changed: +15 -6 lines changed


tests/testscript.py

Lines changed: 15 additions & 6 deletions
@@ -1983,25 +1983,34 @@ def wait_for_ping(machine: Machine, ip="192.168.1.2"):
 
 def wait_for_ssh(machine: Machine, user="root", password="root", ip="192.168.1.2"):
     """
-    Waits for SSH to become available to connect into the Cloud Hypervisor VM
-    hosted on the corresponding machine.
+    Waits for the VM to become accessible via SSH.
 
-    Effectively we use it to wait until the Cloud Hypervisor VM's network is up
-    and available.
+    It first checks whether the VM responds to ping, and then attempts to
+    establish an SSH connection using the provided credentials.
 
     :param machine: VM host
     :param user: user for SSH login
     :param password: password for SSH login
     :param ip: SSH host to log into
     """
     retries = 100
+
+    # Sometimes we experienced test runs where the host lost IPs. We therefore
+    # check that early and always for better debuggability.
+    assert_ip_in_local_192_168_net24(machine, ip)
+    wait_for_ping(machine, ip)
+
+    print(f"Waiting for ssh connection into VM with IP {ip} ...")
     for i in range(retries):
-        print(f"Wait for ssh {i}/{retries}")
+        print(f"Wait for ssh ({i + 1}/{retries}) ...")
         status, _ = ssh(machine, "echo hello", user, password, ip)
         if status == 0:
             return
         time.sleep(0.1)
-    raise RuntimeError(f"Could not establish SSH connection to {ip}")
+
+    raise RuntimeError(
+        f"Could not establish SSH connection to {ip} after {retries} attempts"
+    )
 
 
 def ssh(machine: Machine, cmd, user="root", password="root", ip="192.168.1.2"):
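
The retry loop changed above follows a common pattern: try a check a bounded number of times, log a 1-based attempt counter, and raise an error that states how many attempts were made. As an illustration only, here is a self-contained sketch of that pattern. The helper `wait_until` and its parameters are hypothetical and not part of `tests/testscript.py`; unlike `wait_for_ssh()`, it returns the attempt count so the behavior is easy to observe.

```python
import time


def wait_until(check, retries=100, delay=0.1, what="condition"):
    """Call `check()` up to `retries` times and return once it is truthy.

    Returns the 1-based number of the successful attempt and raises
    RuntimeError if every attempt fails.
    """
    for i in range(retries):
        # 1-based counter, so the last line reads "(retries/retries)".
        print(f"Wait for {what} ({i + 1}/{retries}) ...")
        if check():
            return i + 1
        time.sleep(delay)

    raise RuntimeError(f"Could not reach {what} after {retries} attempts")


# Stand-in for the SSH probe: fails twice, then succeeds.
outcomes = iter([False, False, True])
attempts = wait_until(lambda: next(outcomes), retries=5, delay=0.0, what="ssh")
```

Here `attempts` ends up as 3, because the stand-in check succeeds on its third call.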
