Skip to content

AsyncFd doesn't wake up on some CAN bus socket errors #7938

@scootermon

Description

@scootermon

Version
v1.49.0

Platform
Host platform: Linux 1805e0da56bc 6.12.68-linuxkit #1 SMP [SNIP] aarch64 GNU/Linux
Target platform: Linux mdm9607 3.18.48 #1 PREEMPT [SNIP] armv7l GNU/Linux

Description

We're using AsyncFd around a Linux SocketCAN socket and we noticed the following issue: If we unregister the network device from Linux (i.e. the interface gets deleted) socket.readable().await never completes.

Here's what our test code looks like:

use socketcan::{CanSocket, Socket as _};
use tokio::io::unix::AsyncFd;

#[tokio::main]
pub async fn main() {
    let socket = CanSocket::open("can0").unwrap();
    socket.set_nonblocking(true).unwrap();
    let socket = AsyncFd::new(socket).unwrap();

    loop {
        let mut guard = socket.readable().await.unwrap();
        let Ok(res) = guard.try_io(|inner| inner.get_ref().read_frame()) else {
            // Outer error is only used in case of a spurious wakeup.
            continue;
        };
        println!("read frame: {res:?}");
    }
}

To reproduce the issue we then run the following:

# Create a new virtual can
ip link add name can0 type vcan && ip link set can0 up

# Start the example code from above in the background
cargo run &

# Send some CAN frames to demonstrate that the basic functionality works
cansend can0 123#00 && cansend can0 123#01

# Now delete the CAN interface and observe that the code is stuck
ip link delete can0

The deletion itself DOES generate an epoll event, but it's specifically epoll_event { events: 8, u64: ... } (i.e. EPOLLERR). mio and in turn tokio do not consider this as "readable" by default.

If we manually register our interest in Interest::ERROR it works:

use socketcan::{CanSocket, Socket as _};
use tokio::io::Interest;
use tokio::io::unix::AsyncFd;

#[tokio::main]
pub async fn main() {
    let socket = CanSocket::open("can0").unwrap();
    socket.set_nonblocking(true).unwrap();
    let socket = AsyncFd::new(socket).unwrap();

    loop {
        let mut guard = socket
            .ready(Interest::READABLE | Interest::ERROR)
            .await
            .unwrap();
        let Ok(res) = guard.try_io(|inner| inner.get_ref().read_frame()) else {
            // Outer error is only used in case of a spurious wakeup.
            continue;
        };
        println!("read frame: {res:?}");
    }
}

With the same test steps as above we get the following output:

TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE
TRACE mio::poll: registering event source with poller: token=Token(2137272736), interests=READABLE | WRITABLE
read frame: Ok(Data(CanDataFrame { 123#00 }))
read frame: Ok(Data(CanDataFrame { 123#00 }))
<<<<<< Interface deleted here >>>>>
read frame: Err(Os { code: 100, kind: NetworkDown, message: "Network is down" })
read frame: Err(Os { code: 19, kind: Uncategorized, message: "No such device" })

I haven't had the chance to test this on a TCP/UDP socket, but I strongly assume that the same issue would not present itself there.

While debugging this issue I tested quite a few approaches. For reference, the code can be found here: https://gist.github.com/scootermon/f70bd124ee6bc86603b7dbe4ba30ee67

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-tokioArea: The main tokio crateC-bugCategory: This is a bug.M-ioModule: tokio/io

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions