-
Notifications
You must be signed in to change notification settings - Fork 159
[VMBus] Add NUMA support in proxyintegration #2411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds NUMA (Non-Uniform Memory Access) support to vmbusproxy by implementing a VP-to-NUMA node mapping mechanism. Previously, all channels had their NUMA node index hardcoded to 0.
Key changes:
- Adds a new IOCTL (
IOCTL_VMBUS_PROXY_GET_NUMA_MAP) to retrieve VP-to-NUMA node mappings from vmbusproxy - Implements
NumaNodeMapstructure to store and query the mapping - Updates channel creation and restoration logic to use actual NUMA nodes based on target VPs
Reviewed Changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| vm/devices/vmbus/vmbus_server/src/proxyintegration.rs | Adds NumaNodeMap struct and integrates it into ProxyTask to assign proper NUMA nodes when opening or restoring channels |
| vm/devices/vmbus/vmbus_proxy/src/proxyioctl.rs | Defines new IOCTL constant and output structure for getting NUMA node mappings |
| vm/devices/vmbus/vmbus_proxy/src/lib.rs | Implements get_numa_node_map() method to call the new IOCTL with dynamic buffer resizing |
| vm/devices/vmbus/vmbus_proxy/Cargo.toml | Adds headervec dependency for dynamic header/tail buffer management |
| Cargo.lock | Updates lockfile to reflect new dependency |
| unsafe { | ||
| // This is a synchronous operation, so don't use the async IO infrastructure. | ||
| let mut output = | ||
| HeaderVec::<proxyioctl::VMBUS_PROXY_GET_NUMA_MAP_OUTPUT, u8, 1>::with_capacity( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set the static capacity to be the same as the initial capacity to avoid allocation.
| Some(&mut bytes), | ||
| None, | ||
| ) { | ||
| let error = GetLastError(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get this from e.
| if error == ERROR_BUFFER_OVERFLOW || error == ERROR_MORE_DATA { | ||
| // The buffer was too small, resize and try again. | ||
| let offset = offset_of!(proxyioctl::VMBUS_PROXY_GET_NUMA_MAP_OUTPUT, NumaNodes); | ||
| let required_size = bytes as usize - offset; | ||
| output.reserve_tail(required_size / size_of::<u8>() - output.tail_capacity()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this works. DeviceIoControl will only set bytes on ERROR_MORE_DATA (which comes from STATUS_BUFFER_OVERFLOW which is a warning status), and it will be set to the number of bytes actually written to the output, not the required number of bytes.
The correct pattern here is to have the kernel return the node (actually VP) count in the partially-filled struct, and calculate the required size from that.
| } | ||
| } | ||
|
|
||
| output.set_tail_len( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use the count in the struct to size this (ensuring it does not exceed the returned size).
| #[repr(C)] | ||
| #[derive(Copy, Clone)] | ||
| pub struct VMBUS_PROXY_GET_NUMA_MAP_OUTPUT { | ||
| pub NodeCount: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually a count of VPs, since what you're returning is a map of VP-to-physical-node.
| #[derive(Copy, Clone)] | ||
| pub struct VMBUS_PROXY_GET_NUMA_MAP_OUTPUT { | ||
| pub NodeCount: u32, | ||
| pub NumaNodes: [u8; 0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we use 0-sized arrays like this anywhere. I would just remove the field.
| output.set_tail_len( | ||
| (bytes as usize | ||
| - offset_of!(proxyioctl::VMBUS_PROXY_GET_NUMA_MAP_OUTPUT, NumaNodes)) | ||
| / size_of::<u8>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can safely assume that size_of::<u8>() will always be 1. :)
| } | ||
|
|
||
| struct NumaNodeMap { | ||
| nodes: Vec<u8>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a map of vp-to-physical-node, so the name is misleading.
| let (send, recv) = mesh::channel(); | ||
| let proxy = Arc::new(proxy); | ||
| let numa_node_map = NumaNodeMap::new(proxy.get_numa_node_map().unwrap_or_else(|_| { | ||
| tracing::warn!("failed to get NUMA node map from proxy"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include the error in the trace so we can distinguish "old proxy driver" from other cases. Maybe even special case ERROR_INVALID_FUNCTION (log at info instead) so we don't confuse anyone using OpenVMM into thinking there's a problem.
This change adds NUMA support in proxyintegration, calling a new vmbusproxy IOCTL which returns a VP -> NUMA node mapping, which we use when creating or restoring channels. Previously, these channels had their NUMA node index set to 0.