Commit fc85170

fix: try to make UFFD handlers more robust
According to our UFFD protocol, UFFD handlers negotiate with Firecracker during initialization and wait for it to send, over a UDS, the UFFD file descriptor along with the memory mappings that are handled over the UFFD. During this handshake, our (testing-only, not production-grade) UFFD handlers issue what is essentially a `recvmsg` that should return the UFFD fd and the mappings.

Sometimes the `recvmsg` wrapper returns `None` instead of the file descriptor. When this happens, the UFFD handler crashes and the Firecracker process hangs.

According to `man recv(2)`:

```
Datagram sockets in various domains (e.g., the UNIX and Internet domains) permit zero-length datagrams. When such a datagram is received, the return value is 0.
```

which means it is possible to receive a zero-length message (we are communicating with Firecracker over a UDS).

Add logic to our UFFD handlers to retry the negotiation with Firecracker up to 5 times before giving up. This helps make them (slightly) more robust. Also add some logging to the receive logic so that we can inspect failures post-mortem.

Signed-off-by: Babis Chalios <[email protected]>
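The man page behaviour quoted in the message can be observed with the standard library alone. The following is a minimal sketch, not part of this commit: the function name `recv_len_of_empty_datagram` is hypothetical, and it uses a `UnixDatagram` pair, whereas the handler itself reads from a `UnixStream`. It only illustrates that a receive can succeed while reporting 0 bytes; it does not claim to reproduce the failure described above.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;

/// Sends an empty datagram over a connected pair of Unix datagram sockets
/// and returns the byte count reported by the receiving side.
/// (Hypothetical helper for illustration only.)
fn recv_len_of_empty_datagram() -> io::Result<usize> {
    // Two anonymous, already-connected datagram sockets.
    let (tx, rx) = UnixDatagram::pair()?;

    // A zero-length payload is a valid datagram.
    tx.send(&[])?;

    // The receive succeeds; the returned length is what the caller must check.
    let mut buf = [0u8; 16];
    rx.recv(&mut buf)
}
```

A caller that only checks for `Err` would treat this receive as a success, which is why the length (and, in the handler, the optional file descriptor) has to be inspected separately.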
1 parent 07c07bd commit fc85170

1 file changed, +37 -13 lines changed


src/firecracker/examples/uffd/uffd_utils.rs

Lines changed: 37 additions & 13 deletions
@@ -52,21 +52,45 @@ pub struct UffdHandler {
 }
 
 impl UffdHandler {
-    pub fn from_unix_stream(stream: &UnixStream, backing_buffer: *const u8, size: usize) -> Self {
-        let mut message_buf = vec![0u8; 1024];
-        let (bytes_read, file) = stream
-            .recv_with_fd(&mut message_buf[..])
-            .expect("Cannot read from a stream");
-        message_buf.resize(bytes_read, 0);
+    fn get_mappings_and_file(stream: &UnixStream) -> (String, File) {
+        // Sometimes, reading from the stream succeeds but we don't receive any
+        // UFFD descriptor. We don't really have a good understanding why this is
+        // happening, but let's try to be a bit more robust and retry a few times
+        // before we declare defeat.
+        for _ in 1..=5 {
+            let mut message_buf = vec![0u8; 1024];
+            let (bytes_read, file) = match stream.recv_with_fd(&mut message_buf[..]) {
+                Ok(res) => res,
+                Err(err) => {
+                    println!("Could not receive message from stream: {err}");
+                    continue;
+                }
+            };
+            message_buf.resize(bytes_read, 0);
+
+            // We do not expect to receive non-UTF-8 data from Firecracker, so this is probably
+            // an error we can't recover from. Just immediately abort
+            let body = String::from_utf8(message_buf.clone()).unwrap_or_else(|_| {
+                panic!(
+                    "Received body is not a utf-8 valid string. Raw bytes received: \
+                     {message_buf:#?}"
+                )
+            });
+            let file = match file {
+                Some(file) => file,
+                None => {
+                    println!("Did not receive Uffd from UDS. Received body: {body}");
+                    continue;
+                }
+            };
+            return (body, file);
+        }
 
-        let body = String::from_utf8(message_buf.clone()).unwrap_or_else(|_| {
-            panic!(
-                "Received body is not a utf-8 valid string. Raw bytes received: {message_buf:#?}"
-            )
-        });
-        let file =
-            file.unwrap_or_else(|| panic!("Did not receive Uffd from UDS. Received body: {body}"));
+        panic!("Could not get UFFD and mappings after 5 retries");
+    }
 
+    pub fn from_unix_stream(stream: &UnixStream, backing_buffer: *const u8, size: usize) -> Self {
+        let (body, file) = Self::get_mappings_and_file(stream);
         let mappings =
             serde_json::from_str::<Vec<GuestRegionUffdMapping>>(&body).unwrap_or_else(|_| {
                 panic!("Cannot deserialize memory mappings. Received body: {body}")
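The bounded retry in `get_mappings_and_file` can also be written as a small generic helper. The sketch below is one possible variation using only the standard library, not the code from this commit: the names `retry` and `fake_negotiation` are hypothetical, and `fake_negotiation` merely stands in for a receive that sometimes arrives without a file descriptor.

```rust
use std::fmt::Display;

/// Calls `op` up to `max_attempts` times and returns the first success.
/// Every failure except the last is logged; if all attempts fail, the
/// last error is handed back to the caller.
/// (Hypothetical helper, not part of the commit.)
fn retry<T, E: Display>(
    max_attempts: usize,
    mut op: impl FnMut(usize) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "need at least one attempt");
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Attempts remain: log the failure and try again.
            Err(err) if attempt < max_attempts => {
                println!("attempt {attempt}/{max_attempts} failed: {err}");
                attempt += 1;
            }
            // Out of attempts: surface the last error.
            Err(err) => return Err(err),
        }
    }
}

/// Stand-in for one negotiation attempt (an assumption for illustration):
/// it fails until the attempt number reaches `succeed_on`.
fn fake_negotiation(attempt: usize, succeed_on: usize) -> Result<String, String> {
    if attempt >= succeed_on {
        Ok(format!("mappings received on attempt {attempt}"))
    } else {
        Err("no file descriptor in message".to_string())
    }
}
```

For example, `retry(5, |n| fake_negotiation(n, 3))` succeeds on the third attempt, while `retry(5, |n| fake_negotiation(n, 6))` returns the last error. Unlike the handler in the diff, which panics after the fifth failure, this variant returns a `Result` and leaves the decision to abort to the caller.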
