Post Migration Announce #66

amphi · 2026-01-16T15:36:57Z

This PR adds a post migration announce to Cloud Hypervisor. After a successful live migration, Cloud Hypervisor now sends reverse ARP packets on all virtio-net devices. That way switches should learn the new location of the VM faster.

phip1611 · 2026-01-19T06:21:20Z

Could you please rephrase your PR title, the commits, and the corresponding internals to something more descriptive than "post migration announce"? Something like "Post-migration ARP Advertisment Packets"?

phip1611

Nice work! Before reviewing this further, we should discuss if the proposed design is something we want to go with. I prefer to use the existing resume() path. Bonus point: it would also work after a cold migration.

Are there any implications I'm missing here by chance?

PS: You also need to do the same for the Neighbor Discovery Protocol (ARP equivalent in IPv6).

phip1611 · 2026-01-16T15:46:28Z

virtio-devices/src/device.rs

+
+    /// Some devices can announce their location after a live migration to
+    /// speed up normal execution.
+    fn post_migration_announcer(&mut self) -> std::option::Option<Box<dyn PostMigrationAnnouncer>> {


Please use Option and not the fully qualified path

phip1611 · 2026-01-16T15:47:04Z

virtio-devices/src/net.rs

+    fn build_rarp_announce(&self) -> [u8; 60] {
+        const ETH_P_RARP: u16 = 0x8035; // Ethertype RARP
+        const ARP_HTYPE_ETH: u16 = 0x1; // Hardware type Ethernet
+        const ARP_PTYPE_IP: u16 = 0x0800; // Protocol type IPv4


do we need to do the same for the ARP equivalent of IPv6?

As far as I know this works for IPv4 and IPv6. We never put any IP addresses into the packet, only MAC addresses.

But IPv6 doesn't use ARP AFAICT. ipv6 uses the Neighbor Discovery Protocol (NDP).

I am not 100% with that, but we should clarify and make sure your PR reaches your desired network discovery speedup if guest VMs only uses IPv6!

Also the IPv6 message would need the IPv6 Address of the Link, and we have no way of knowing this address. Also this is mainly for the switches in the network, which only need the MAC addresses.

My thought is: When the VM has an IPv6, and someone wants to connect the VM right after migration, some network node will use NDP (not ARP!) to ask "who has $myipv6". Therefore, we also need to populate NDP caches in all switches and not just ARP - correct? (Again, I am not 100% sure, but it seems very plausible to me!).

I think: After a live migration, switches and routers may still have the VM’s old MAC-to-port mappings, causing traffic drops. For IPv4, sending "hello" ARP updates the ARP caches in the network. For IPv6, you similarly need to send NDP messages so that neighbor caches in all relevant nodes and switches learn the VM's new location.

For Layer 2 switch learning RARP is enough, because the switches only care about MAC addresses.

For Layer 3 neighbor learning, we would always need an IP address. We do not have those, so we can only fix the Layer 2 switches.

you convinced me :) seems about right.

phip1611 · 2026-01-19T06:23:22Z

vmm/src/lib.rs

                    // The unwrap is safe, because the state machine makes sure we called
                    // vm_receive_state before, which creates the VM.
                    let vm = self.vm.vm_mut().unwrap();
+                    vm.announce();


I am very surprised about this fully-blown internal abstraction. I think this can be build on the existing resume() mechanism that Cloud Hypervisor already has?

Before I review this further, we should discuss this fundamental design discussion.

We can put this somewhere else, I can also put this into the resume function.

Isn't resume also used for when you just pause the VM which does not necessarily have to entail a live migration?

It is, but I don't see how these packets could be harmful and why we need an extra condition whether to send them or not. Also, the traffic is negligible. Right?

Yeah there are only a few packets per NIC, so I don't think we cause harm by adding it to resume.

have you measured the costs of sending these packets? is it worth it to do this in a dedicated thread?

The problem is that it has to happen periodic with a delay in between, I don't know how to do that if not with a separate thread.

ah yes, makes sense to have a dedicated thread then.

But the post migration announce has to happen more than just once, this is the reason why I start a thread that sends the announcements a few more times. I think that wouldn't really work with your solution? At least I don't see how to make it work.

The fact that the method was named announce_once made me initially think that you only announce once per device. If you indeed need to do more announcements per device, then there are ways to do that with what I outlined, but it is not necessarily better than what you already implemented.

Scheduling the post migration announcers is now done as part of DeviceManager::resume()

olivereanderson · 2026-01-19T10:36:55Z

virtio-devices/src/net.rs

+            let _ = unsafe {
+                libc::write(
+                    tap.as_raw_fd(),
+                    buf.as_ptr() as *const libc::c_void,
+                    buf.len(),
+                )
+            };


Given the function signature it seems like there is in principle nothing stopping this from being called in parallel from multiple threads. Is that safe?

it is missing a // SAFETY comment (see cargo clippy).

It is safe, but we need a nice compact comment explaining why it is :)

good point Oliver!

I added the // SAFETY comment. I also removed Sync and made self &mut self, to signal that it is not intended to call this function in parallel from multiple threads.

phip1611 · 2026-01-19T13:39:57Z

virtio-devices/src/net.rs

+        let mut buf = [0u8; 60];
+
+        // Ethernet header
+        buf[0..6].copy_from_slice(&[0xff; MAC_ADDR_LEN]); // This is a broadcast


have you used this as templat?

https://gitlab.com/qemu-project/qemu/-/blob/master/net/announce.c#L98-121

I looked whether qemu does the same, but during implementation I looked at https://datatracker.ietf.org/doc/html/rfc903 and what chatgpt told me.

olivereanderson · 2026-01-19T15:25:29Z

virtio-devices/src/device.rs

+pub trait PostMigrationAnnouncer: Send {
+    // Sending the announces is done on a best-effort basis, so we ignore
+    // errors.
+    fn announce_once(&mut self);


Could we change the name to announce?

The trailing once makes me think that it is only intended to be called once (but cannot take self: Self for one reason or another).

Sure, I changed it.

Some devices can do announcements after a live migration, e.g. a network device announces its new location in a network. With this change a device can return an announcer to do that. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

To announce the new location in a network, virtio-net devices can send reverse ARP packets. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

After this change switches in a network should find the new location of a VM quicker. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

amphi requested review from hertrste, olivereanderson, phip1611 and scholzp January 16, 2026 15:36

amphi self-assigned this Jan 16, 2026

amphi mentioned this pull request Jan 16, 2026

Post Migration Announce cyberus-technology/libvirt-tests#129

Open

phip1611 requested changes Jan 19, 2026

View reviewed changes

olivereanderson reviewed Jan 19, 2026

View reviewed changes

phip1611 reviewed Jan 19, 2026

View reviewed changes

amphi force-pushed the migration-rarp branch 3 times, most recently from c0f9be6 to 92f7fbf Compare January 19, 2026 15:12

olivereanderson reviewed Jan 19, 2026

View reviewed changes

amphi added 3 commits January 19, 2026 16:28

virtio-devices: implement a PostMigrationAnnouncer for virtio-net

d24961b

To announce the new location in a network, virtio-net devices can send reverse ARP packets. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

devices: schedule PostMigrationAnnouncer of all devices

33f2440

After this change switches in a network should find the new location of a VM quicker. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

amphi force-pushed the migration-rarp branch from 92f7fbf to 33f2440 Compare January 19, 2026 15:28

Post Migration Announce #66

Are you sure you want to change the base?

Post Migration Announce #66

Conversation

amphi commented Jan 16, 2026

Uh oh!

phip1611 commented Jan 19, 2026

Uh oh!

phip1611 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

phip1611 Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

phip1611 Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

amphi Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

olivereanderson Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

phip1611 left a comment •

edited

Loading

phip1611 Jan 19, 2026 •

edited

Loading

phip1611 Jan 19, 2026 •

edited

Loading

amphi Jan 19, 2026 •

edited

Loading

olivereanderson Jan 19, 2026 •

edited

Loading