
Commit a3328e2

fix: try to make UFFD handlers more robust
According to our UFFD protocol, UFFD handlers negotiate with Firecracker during initialization and wait for it to send, over a UDS, the UFFD file descriptor along with the memory mappings that are handled over the UFFD. During this handshake, our (testing only/not production grade) UFFD handlers issue what is essentially a `recvmsg` that should return the UFFD fd and the mappings.

Sometimes the `recvmsg` wrapper returns `None` instead of the file descriptor. When this happens, the UFFD handler crashes and the Firecracker process hangs.

According to `man recv(2)`:

```
Datagram sockets in various domains (e.g., the UNIX and Internet domains) permit zero-length datagrams. When such a datagram is received, the return value is 0.
```

which means it is possible to receive a zero-length message (we are communicating with Firecracker over a UDS).

Add logic to our UFFD handlers to retry the negotiation with Firecracker up to 5 times before giving up. This helps make them (slightly) more robust. Also add some logging in the receive logic so that we can inspect failures post-mortem.

Signed-off-by: Babis Chalios <[email protected]>
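The man page behavior quoted in the message can be observed with the standard library alone. The sketch below is an illustration only: it uses `UnixDatagram::pair()`, whereas the handlers in this commit read from a `UnixStream`, and the helper name `recv_len_of_empty_datagram` is hypothetical. It shows that a receive call can succeed while reporting zero bytes.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;
use std::time::Duration;

/// Sends one empty datagram over a connected socket pair and returns the
/// byte count reported by the receiving side.
/// (Hypothetical helper, not part of the Firecracker code base.)
fn recv_len_of_empty_datagram() -> io::Result<usize> {
    let (tx, rx) = UnixDatagram::pair()?;

    // Guard against blocking forever if the datagram is never delivered.
    rx.set_read_timeout(Some(Duration::from_secs(1)))?;

    // A zero-length payload is a valid datagram on UNIX domain sockets.
    tx.send(&[])?;

    let mut buf = [0u8; 16];
    rx.recv(&mut buf)
}
```

A caller that only checks for `Ok(..)` cannot tell this case apart from a real message, which is why the length and any ancillary data need their own checks.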
1 parent 431b829 commit a3328e2


1 file changed (+32 -9 lines changed)


src/firecracker/examples/uffd/uffd_utils.rs

Lines changed: 32 additions & 9 deletions
@@ -52,21 +52,44 @@ pub struct UffdHandler {
 }
 
 impl UffdHandler {
-    pub fn from_unix_stream(stream: &UnixStream, backing_buffer: *const u8, size: usize) -> Self {
-        let mut message_buf = vec![0u8; 1024];
-        let (bytes_read, file) = stream
-            .recv_with_fd(&mut message_buf[..])
-            .expect("Cannot read from a stream");
-        message_buf.resize(bytes_read, 0);
+    fn get_mappings_and_file(stream: &UnixStream) -> (String, File) {
+        // Sometimes, reading from the stream succeeds but we don't receive any
+        // UFFD descriptor. We don't really have a good understanding why this is
+        // happening, but let's try to be a bit more robust and retry a few times
+        // before we declare defeat.
+        for _ in 1..=5 {
+            let mut message_buf = vec![0u8; 1024];
+            let (bytes_read, file) = match stream.recv_with_fd(&mut message_buf[..]) {
+                Ok(res) => res,
+                Err(err) => {
+                    println!("Could not receive message from stream: {err}");
+                    continue;
+                }
+            };
+            message_buf.resize(bytes_read, 0);
 
-        let body = String::from_utf8(message_buf.clone()).unwrap_or_else(|_| {
+            // We do not expect to receive non-UTF-8 data from Firecracker, so this is probably
+            // an error we can't recover from. Just immediately abort
+            let body = String::from_utf8(message_buf.clone()).unwrap_or_else(|_| {
                 panic!(
                     "Received body is not a utf-8 valid string. Raw bytes received: {message_buf:#?}"
                 )
             });
-        let file =
-            file.unwrap_or_else(|| panic!("Did not receive Uffd from UDS. Received body: {body}"));
+            let file = match file {
+                Some(file) => file,
+                None => {
+                    println!("Did not receive Uffd from UDS. Received body: {body}");
+                    continue;
+                }
+            };
+            return (body, file);
+        }
 
+        panic!("Could not get UFFD and mappings after 5 retries");
+    }
+
+    pub fn from_unix_stream(stream: &UnixStream, backing_buffer: *const u8, size: usize) -> Self {
+        let (body, file) = Self::get_mappings_and_file(stream);
         let mappings =
             serde_json::from_str::<Vec<GuestRegionUffdMapping>>(&body).unwrap_or_else(|_| {
                 panic!("Cannot deserialize memory mappings. Received body: {body}")
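The bounded-retry shape used by `get_mappings_and_file` can be sketched in isolation. This is a minimal sketch, not code from this commit: `retry_up_to` and its closure parameter are hypothetical names, and unlike the handler it returns `None` when the attempts are exhausted instead of panicking.

```rust
/// Calls `attempt` up to `max_attempts` times and returns the first `Some`
/// value it produces, or `None` if every attempt comes back empty.
/// The closure receives the 1-based attempt number.
/// (Hypothetical helper, not part of the Firecracker code base.)
fn retry_up_to<T>(
    max_attempts: usize,
    mut attempt: impl FnMut(usize) -> Option<T>,
) -> Option<T> {
    for n in 1..=max_attempts {
        match attempt(n) {
            Some(value) => return Some(value),
            // Log each failed attempt so failures can be inspected afterwards.
            None => println!("attempt {n} of {max_attempts} produced nothing"),
        }
    }
    None
}
```

Returning `Option` leaves the decision to the caller: a test-only handler might call `.expect("Could not get UFFD and mappings")` on the result, while other callers could report the failure without aborting.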
