Combine SSHConnection._init_connection
and Microvm.wait_for_ssh_up
#4850
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We use
Microvm.wait_for_ssh_up
to wait for a booted guest's userland to be initialized enough that it can process incoming SSH connections. However, it turns out that we werent waiting the intended 10s, we were waiting 3s. The reason for this is that before thetrue
command inwait_for_ssh_up
is even sent to the guest, we go through theMicrovm.ssh
property, which calls theMicrovm.ssh_face
function, which constructs anSSHConnection
object, whose construtor callsSSHConnection._init_connection
. And inside_init_connection
, we already try to send atrue
command to the guest, to ascertain it is up and running. However, here we do it in a retry loop (needed because if port 22 is still closed,ssh
exits with non-zero code and "connection refuses") that does 20 retries at 0.15s intervals. Or tries to 3 seconds. Most importantly, the retries in_init_connection
are done without timeouts, so even if we actually hung during connection establishment,wait_for_ssh_up
wouldn't abort after 10 seconds, because it would get stuck inside of_init_connection
, which doesn't know anything about a 10s timeout.Fix all this up by merging
_init_connection
andwait_for_ssh_up
. Skip sending thetrue
command inwait_for_ssh_up
, and just callssh_iface(0)
directly. Increment the retry time in_init_connection
to actually be 10s. And add a 10s timeout to the commands in_init_connection
.License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
PR.
CHANGELOG.md
.TODO
s link to an issue.contribution quality standards.
rust-vmm
.