
msg-sim revamp, Linux support only #152

Merged
thedevbirb merged 42 commits into main from lore/feat/msg-sim
Jan 14, 2026

Contributor

@thedevbirb thedevbirb commented Jan 4, 2026

Supersedes #72.

This PR is a complete rework of the msg-sim crate. First, macOS support has been dropped. This lets us focus on the networking stack we're most interested in and maximise the feature completeness of the crate itself. Second, compared to #72, this approach doesn't rely on wrappers over the tc binary; instead, it builds the appropriate rtnetlink requests to manipulate the host's networking stack. Moreover, the library offers an API to create a network of veth-linked devices where impairments can be added to individual links. Each "peer" in the network has a dedicated network device which lives in a completely isolated network namespace, so we can guarantee no interference with the host environment.

The network follows a central hub topology: one namespace holds a single bridge/switch to which every peer's veth device attaches. This is the simplest design that allows discovery and network impairments between any two peers.
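To make the addressing on the hub concrete, here is a minimal sketch of how a per-peer address could be derived from the subnet. The function name `peer_address` and the "network address plus peer id" scheme are assumptions for illustration only; the PR does not show how `veth_address` is actually computed.

```rust
use std::net::Ipv4Addr;

/// Hypothetical helper: maps a peer id to a host address inside `base/prefix_len`.
/// Returns `None` when the id would collide with the network or broadcast
/// address, or would not fit in the host part of the subnet.
pub fn peer_address(base: Ipv4Addr, prefix_len: u8, peer_id: u32) -> Option<Ipv4Addr> {
    // Prefixes longer than /30 leave no room for two usable hosts.
    if prefix_len > 30 {
        return None;
    }
    // Bits available for hosts, e.g. 0x0000_FFFF for a /16.
    let host_mask = u32::MAX >> prefix_len;
    // Clear any host bits the caller may have left set in the base address.
    let network = u32::from(base) & !host_mask;
    // Id 0 is the network address, `host_mask` is the broadcast address.
    if peer_id == 0 || peer_id >= host_mask {
        return None;
    }
    Some(Ipv4Addr::from(network | peer_id))
}
```

With the `14.0.0.0/16` subnet used in the test, peer id 1 would map to `14.0.0.1` under this scheme.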

The Network abstraction is flexible enough to run arbitrary tasks in the network namespace of a selected peer, without needing to create additional processes, runtimes, etc. each time.
Here is an example of what it looks like in action, with a 1s latency impairment (from a test):

    #[tokio::test(flavor = "multi_thread")]
    async fn simulate_reqrep_netem_delay_works() {
        let _ = tracing_subscriber::fmt::try_init();

        let subnet = Subnet::new(Ipv4Addr::new(14, 0, 0, 0).into(), 16);
        let mut network = Network::new(subnet).await.unwrap();

        let peer_1 = network.add_peer().await.unwrap();
        let peer_2 = network.add_peer().await.unwrap();

        // 1s latency.
        let sec_in_us = 1_000_000;
        let impairment = LinkImpairment { latency: sec_in_us, ..Default::default() };
        network.apply_impairment(Link::new(peer_1, peer_2), impairment).await.unwrap();

        let address_2 = peer_2.veth_address(subnet);
        let port_2 = 12345;

        let task1 = network
            .run_in_namespace(peer_2, move |_ctx| {
                Box::pin(async move {
                    let mut rep_socket = RepSocket::new(Tcp::default());
                    rep_socket.bind(SocketAddr::new(address_2, port_2)).await.unwrap();

                    // Given the delay in peer1-peer2 link, this should hit timeout
                    tokio::time::timeout(Duration::from_micros((sec_in_us / 2).into()), async {
                        if let Some(request) = rep_socket.next().await {
                            let msg = request.msg().clone();
                            request.respond(msg).unwrap();
                        }
                    })
                    .await
                    .unwrap_err();

                    if let Some(request) = rep_socket.next().await {
                        let msg = request.msg().clone();
                        request.respond(msg).unwrap();
                    }
                })
            })
            .await
            .unwrap();

        let task2 = network
            .run_in_namespace(peer_1, move |_ctx| {
                Box::pin(async move {
                    let mut req_socket = ReqSocket::new(Tcp::default());

                    req_socket.connect_sync(SocketAddr::new(address_2, port_2));
                    req_socket.request("hello".into()).await.unwrap();
                })
            })
            .await
            .unwrap();

        tokio::try_join!(task1, task2).unwrap();
    }
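For readers wondering how tasks can run inside a peer's namespace without spawning a process each time, one common pattern is a long-lived worker thread per namespace that receives closures over a channel. The sketch below illustrates only that pattern with the standard library. `NamespaceWorker` is a hypothetical name, the step that would enter the namespace is a comment, and this is not claimed to be how `run_in_namespace` is implemented in the crate.

```rust
use std::sync::mpsc;
use std::thread;

/// A unit of work shipped to the worker thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical worker: one long-lived thread standing in for one namespace.
pub struct NamespaceWorker {
    jobs: Option<mpsc::Sender<Job>>,
    handle: Option<thread::JoinHandle<()>>,
}

impl NamespaceWorker {
    pub fn spawn(name: &str) -> std::io::Result<Self> {
        let (tx, rx) = mpsc::channel::<Job>();
        let handle = thread::Builder::new().name(name.to_owned()).spawn(move || {
            // Assumption: a real implementation would move this thread into
            // the peer's network namespace here (e.g. via setns(2)).
            for job in rx {
                job();
            }
        })?;
        Ok(Self { jobs: Some(tx), handle: Some(handle) })
    }

    /// Runs `f` on the worker thread and returns a receiver for its result.
    pub fn run<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (result_tx, result_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; that is not an error.
            let _ = result_tx.send(f());
        });
        self.jobs
            .as_ref()
            .expect("worker is running")
            .send(job)
            .expect("worker thread is alive");
        result_rx
    }
}

impl Drop for NamespaceWorker {
    fn drop(&mut self) {
        // Closing the job channel ends the worker loop, then we wait for it.
        drop(self.jobs.take());
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}
```

A caller would create one worker per peer and call `worker.run(|| ...)` as often as needed; the thread (and whatever namespace it entered) is reused across calls.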

As of now, the supported impairments are the ones provided by netem(8): latency, limit, loss, gap, duplicate and jitter.

@thedevbirb thedevbirb mentioned this pull request Jan 4, 2026
6 tasks
sudo ifconfig lo0 mtu 16384
# Remove the dummynet pipes
sudo dnctl pipe delete 1
sudo HOME=$HOME $(which cargo) test # add your arguments here
Contributor

I'm not super liking this - isn't there a better way? Maybe asking the kernel for capabilities? Or elevating privileges inside the binary?

Contributor Author

@thedevbirb thedevbirb Jan 9, 2026

I hear that. This has been a forever pain point since I started using namespaces. Using plain sudo is just bad, I know. But you need capabilities to fiddle with the networking stack, like CAP_NET_ADMIN (and something more). So an approach I tried briefly is to first compile the test binaries (--no-run flag), identify them, grant them privileges with sudo, and then run them.

Still not incredible DX, so I'm postponing the problem for now until I think of something better.
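As a small aid to this discussion, a test binary can at least detect whether it holds CAP_NET_ADMIN before touching the networking stack. The sketch below reads the effective capability mask from `/proc/self/status`; the function names are hypothetical and nothing like this is shown in the PR. It covers only CAP_NET_ADMIN (bit 12 per capabilities(7)); any additional capabilities the crate needs would be checked the same way.

```rust
use std::fs;

/// Bit index of CAP_NET_ADMIN in the Linux capability sets.
const CAP_NET_ADMIN: u32 = 12;

/// Extracts the effective capability bitmask from the text of
/// `/proc/<pid>/status` (the line looks like `CapEff:\t000001ffffffffff`).
pub fn parse_cap_eff(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("CapEff:"))
        .and_then(|hex| u64::from_str_radix(hex.trim(), 16).ok())
}

/// Returns true if capability number `cap` is set in `mask`.
pub fn has_capability(mask: u64, cap: u32) -> bool {
    cap < 64 && (mask & (1u64 << cap)) != 0
}

/// Returns true if the current process holds CAP_NET_ADMIN.
/// Any failure to read or parse the status file is treated as "no".
pub fn can_administer_network() -> bool {
    fs::read_to_string("/proc/self/status")
        .ok()
        .and_then(|status| parse_cap_eff(&status))
        .map_or(false, |mask| has_capability(mask, CAP_NET_ADMIN))
}
```

A test could call `can_administer_network()` first and return early with a clear message when it is false, instead of failing deep inside a netlink call.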

@thedevbirb thedevbirb force-pushed the lore/feat/msg-sim branch 2 times, most recently from db13e1c to 36632e9 Compare January 9, 2026 14:15
@thedevbirb thedevbirb marked this pull request as ready for review January 9, 2026 14:55
Contributor

@mempirate mempirate left a comment

Lgtm. I mainly reviewed the architecture and didn't go into TC and netem specifics too much; if it works, it's good. Two points:

  • Maybe add an extensive, real-world example somewhere in the msg/examples repo. It should be realistic.
  • I'd like a builder pattern on LinkImpairment, like LinkImpairment::default().with_latency(x).with_bandwidth(y). But can be in a follow-up if you want
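The builder suggested in the second point could look roughly like the sketch below. The struct here is a stand-in: only the `latency` field and its microsecond unit are taken from the test in the PR description, while the field types, the `jitter` and `loss` fields as written, and the `with_*` method names are assumptions.

```rust
/// Stand-in for the crate's `LinkImpairment`; field set and types are assumed.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct LinkImpairment {
    /// One-way delay in microseconds (unit taken from the PR's test).
    pub latency: u32,
    /// Delay variation in microseconds (assumed unit).
    pub jitter: u32,
    /// Packet loss as a percentage in 0.0..=100.0 (assumed representation).
    pub loss: f64,
}

impl LinkImpairment {
    /// Sets the latency and returns the updated value for chaining.
    #[must_use]
    pub fn with_latency(mut self, micros: u32) -> Self {
        self.latency = micros;
        self
    }

    /// Sets the jitter and returns the updated value for chaining.
    #[must_use]
    pub fn with_jitter(mut self, micros: u32) -> Self {
        self.jitter = micros;
        self
    }

    /// Sets the loss percentage, clamped to the valid range.
    #[must_use]
    pub fn with_loss(mut self, percent: f64) -> Self {
        self.loss = percent.clamp(0.0, 100.0);
        self
    }
}
```

Usage would then read `LinkImpairment::default().with_latency(1_000_000).with_loss(2.5)`, which stays compatible with the existing `..Default::default()` struct-update style.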


let class_request = DrrClassRequest::new(
QdiscRequestInner::new(if_index)
.with_parent(TcHandle::from(0x0001_0000)) // Parent: drr root (1:0)
Contributor

I think you already have this as a constant somewhere? Or maybe as a function...
In any case, I'd say it's best to have a unified definition where possible.
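A unified definition could be as small as the sketch below. Traffic-control handles are 32-bit values with the major number in the upper 16 bits and the minor number in the lower 16, so `1:0` is `0x0001_0000`. The names `tc_handle`, `tc_handle_parts` and `DRR_ROOT` are hypothetical and not taken from the crate.

```rust
/// Packs a `major:minor` traffic-control handle into its 32-bit form.
pub const fn tc_handle(major: u16, minor: u16) -> u32 {
    ((major as u32) << 16) | (minor as u32)
}

/// Splits a 32-bit handle back into `(major, minor)`.
pub const fn tc_handle_parts(handle: u32) -> (u16, u16) {
    ((handle >> 16) as u16, (handle & 0xFFFF) as u16)
}

/// Handle of the DRR root qdisc, `1:0`, defined once and reused everywhere.
pub const DRR_ROOT: u32 = tc_handle(1, 0);
```

The call site in the diff could then pass `DRR_ROOT` instead of the literal `0x0001_0000`, keeping the comment and the value from drifting apart.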

@claude

claude bot commented Jan 14, 2026

Code review

I found 2 issues related to CLAUDE.md compliance:

Issue 1: MSRV Documentation Mismatch

  • CLAUDE.md line 74 states MSRV is Rust 1.75
  • Cargo.toml line 15 sets rust-version to 1.86
  • Recommendation: Update CLAUDE.md to reflect MSRV 1.86

Issue 2: macOS Support Documentation No Longer Accurate

  • CLAUDE.md lines 50 and 73 document macOS support for msg-sim
  • This PR removes all macOS code (Linux-only now)
  • Recommendation: Update CLAUDE.md to reflect Linux-only architecture


@thedevbirb thedevbirb merged commit 4827ec2 into main Jan 14, 2026
13 checks passed
@thedevbirb thedevbirb deleted the lore/feat/msg-sim branch January 14, 2026 11:22