# NUMA Node Location API for PCI Devices

## Overview

The `getNUMANode()` API retrieves the NUMA (Non-Uniform Memory Access) node location of a PCI device identified by its segment:bus:device:function coordinates.

## Background

- The **PciHandle** and **PciHandleMM** classes are abstractions of PCI configuration space registers
- Each PCI device has a unique location: `segment:bus:device:function`
- **segment** is also known as **group number** or **domain** (synonyms: groupnr, groupnr_)

## API Usage

### Method Signature

```cpp
int32 PciHandle::getNUMANode() const;
int32 PciHandleMM::getNUMANode() const;
```

### Return Value

- **>= 0**: The NUMA node ID where the PCI device is located
- **-1**: NUMA information not available or not applicable

### Example

```cpp
#include <iostream>

#include "pci.h"

using namespace pcm;

int main() {
    // Open a PCI device at segment 0, bus 0, device 0, function 0
    PciHandleType handle(0, 0, 0, 0);

    // Get the NUMA node
    const int32 numa_node = handle.getNUMANode();

    if (numa_node >= 0) {
        std::cout << "Device is on NUMA node: " << numa_node << "\n";
    } else {
        std::cout << "NUMA information not available\n";
    }
    return 0;
}
```

## Platform-Specific Implementation

### Linux

- **Method**: Reads from `/sys/bus/pci/devices/<domain>:<bus>:<device>.<function>/numa_node`
- **Fallback**: Also tries the `/pcm/sys/bus/pci/devices/...` path
- **Return**:
  - NUMA node ID (typically 0, 1, 2, ...) if available
  - -1 if the file doesn't exist or can't be read

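
The Linux lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names `sysfsPciName`, `readNumaNodeFile`, and `numaNodeFromSysfs` are hypothetical, and the fallback path is left out because its full form is not given here.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: format "<domain>:<bus>:<device>.<function>" the way
// sysfs names PCI devices (lowercase hex, e.g. "0000:00:1f.3").
std::string sysfsPciName(uint32_t domain, uint32_t bus, uint32_t device, uint32_t function) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%04x:%02x:%02x.%x", domain, bus, device, function);
    return buf;
}

// Hypothetical helper: read one integer from a numa_node file.
// Returns -1 if the file is missing, unreadable, or holds a negative value.
int32_t readNumaNodeFile(const std::string& path) {
    std::ifstream file(path);
    int32_t node = -1;
    if (!(file >> node)) {
        return -1;
    }
    return node < 0 ? -1 : node;
}

// Hypothetical helper: combine the two steps for one device.
int32_t numaNodeFromSysfs(uint32_t domain, uint32_t bus, uint32_t device, uint32_t function) {
    return readNumaNodeFile("/sys/bus/pci/devices/" +
                            sysfsPciName(domain, bus, device, function) + "/numa_node");
}
```

Calling `numaNodeFromSysfs(0, 0, 0, 0)` reads the entry for device `0000:00:00.0`. The kernel itself writes -1 into `numa_node` when no affinity is known, so the sketch folds that case into the same "not available" result.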
### Windows

- **Method**: Reads the SRAT (System Resource Affinity Table) from ACPI firmware using the `GetSystemFirmwareTable` API
- **Implementation**:
  - Parses the SRAT to extract PCI device affinity entries. The type is given as 2; in the ACPI specification, however, type 2 is Processor Local x2APIC Affinity, and PCI device handles appear in the Generic Initiator Affinity Structure (type 5, ACPI 6.3 and later), so confirm the value against the implementation
  - Builds a mapping from PCI device location (segment:bus:device:function) to NUMA node (proximity domain)
  - Caches the mapping on first call for performance
- **Return**:
  - NUMA node ID (proximity domain) if the device is found in the SRAT
  - -1 if the SRAT is not available or the device is not listed
- **Requirements**: Windows Vista or later (for the `GetSystemFirmwareTable` API)

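
The "build once, then look up" pattern from the Windows bullets can be sketched in portable C++. This is an illustration under assumptions, not the actual code: `loadNumaMap` is a hypothetical stand-in for the SRAT parsing step and returns a hard-coded entry, and `g_loadCount` exists only to show that the load happens once.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Key: (segment, bus, device, function). Value: NUMA node (proximity domain).
using PciLocation = std::tuple<uint32_t, uint32_t, uint32_t, uint32_t>;
using NumaMap = std::map<PciLocation, int32_t>;

int g_loadCount = 0;  // demonstration only: counts how often the map is built

// Hypothetical stand-in for parsing the SRAT; a real version would fill the
// map from the firmware table instead of returning fixed data.
NumaMap loadNumaMap() {
    ++g_loadCount;
    NumaMap map;
    map[std::make_tuple(0u, 0x80u, 0u, 0u)] = 1;
    return map;
}

// Looks up a device, building the map on the first call only.
// A function-local static is initialized once, thread-safely, since C++11.
int32_t lookupNumaNode(uint32_t segment, uint32_t bus, uint32_t device, uint32_t function) {
    static const NumaMap cache = loadNumaMap();
    const auto it = cache.find(std::make_tuple(segment, bus, device, function));
    return it == cache.end() ? -1 : it->second;
}
```

A function-local static keeps the cache private to the lookup and avoids explicit locking; the trade-off is that the mapping cannot be refreshed without restarting the process.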
### FreeBSD / DragonFly

- **Method**: Queries the system via `sysctlbyname()` for NUMA domain information
- **Implementation**:
  - First checks whether NUMA is enabled via the `vm.ndomains` sysctl
  - Attempts to query the PCI device-specific NUMA domain using multiple sysctl path formats
  - Tries `hw.pci.X.Y.Z.W.numa_domain` and `hw.pci.X:Y:Z.W.numa_domain`
- **Return**:
  - NUMA node ID if available and the system has NUMA enabled
  - -1 if NUMA is disabled or not supported, or if device affinity information is unavailable
- **Note**: FreeBSD doesn't have a standardized sysctl path for PCI device NUMA affinity across all versions

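
The try-several-names logic above can be sketched without FreeBSD headers by passing the sysctl query in as a callback. Several details are assumptions made for illustration: the function and type names are hypothetical, X, Y, Z, and W are rendered as decimal segment, bus, device, and function, and "NUMA enabled" is taken to mean `vm.ndomains` reports at least 2.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Callback that reads an integer sysctl by name; returns false on failure.
using SysctlQuery = std::function<bool(const std::string& name, int32_t& value)>;

// Builds the two candidate names (decimal formatting is an assumption).
std::vector<std::string> candidateSysctlNames(uint32_t seg, uint32_t bus,
                                              uint32_t dev, uint32_t fn) {
    const std::string s = std::to_string(seg);
    const std::string b = std::to_string(bus);
    const std::string d = std::to_string(dev);
    const std::string f = std::to_string(fn);
    return {
        "hw.pci." + s + "." + b + "." + d + "." + f + ".numa_domain",
        "hw.pci." + s + ":" + b + ":" + d + "." + f + ".numa_domain",
    };
}

// Returns the first non-negative domain reported, or -1 if NUMA looks
// disabled (assumed: vm.ndomains < 2) or no candidate name answers.
int32_t queryPciNumaDomain(uint32_t seg, uint32_t bus, uint32_t dev, uint32_t fn,
                           const SysctlQuery& query) {
    int32_t domains = 0;
    if (!query("vm.ndomains", domains) || domains < 2) {
        return -1;
    }
    for (const std::string& name : candidateSysctlNames(seg, bus, dev, fn)) {
        int32_t node = -1;
        if (query(name, node) && node >= 0) {
            return node;
        }
    }
    return -1;
}
```

On FreeBSD the callback would typically wrap `sysctlbyname(name.c_str(), &value, &len, nullptr, 0)`; keeping it injectable also lets the name-selection logic be exercised on other platforms with a fake query.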
### macOS

- **Method**: No lookup is performed; macOS typically doesn't expose NUMA information for PCI devices
- **Return**: Always -1 (not applicable)

## Use Cases

1. **Performance Optimization**: Place processing threads on the same NUMA node as the device
2. **Memory Allocation**: Allocate buffers on the same NUMA node for optimal DMA performance
3. **System Topology Discovery**: Map out the relationship between PCI devices and NUMA nodes
4. **Monitoring and Analytics**: Identify cross-NUMA traffic patterns

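
As an illustration of the first use case on Linux, the sketch below pins the calling thread to the CPUs of a given NUMA node, such as the value returned by `getNUMANode()`. It is one possible approach under stated assumptions rather than part of the API: `parseCpuList` and `pinCurrentThreadToNode` are hypothetical helpers, and the CPU list is taken from the kernel's `/sys/devices/system/node/node<N>/cpulist` file.

```cpp
#include <sched.h>  // cpu_set_t, CPU_SET, sched_setaffinity (Linux, glibc)

#include <cstddef>
#include <exception>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expand a sysfs CPU list such as "0-3,8-11" into
// individual CPU numbers. Returns an empty vector on malformed input.
std::vector<int> parseCpuList(const std::string& list) {
    std::vector<int> cpus;
    std::istringstream in(list);
    std::string range;
    while (std::getline(in, range, ',')) {
        if (range.empty()) continue;
        const std::size_t dash = range.find('-');
        try {
            const int first = std::stoi(range.substr(0, dash));
            const int last = (dash == std::string::npos)
                                 ? first
                                 : std::stoi(range.substr(dash + 1));
            for (int cpu = first; cpu <= last; ++cpu) cpus.push_back(cpu);
        } catch (const std::exception&) {
            return {};
        }
    }
    return cpus;
}

// Hypothetical helper: restrict the calling thread to the CPUs of `node`.
// Returns false if the node is negative, unknown, or the call fails.
bool pinCurrentThreadToNode(int node) {
    if (node < 0) return false;  // -1 means no NUMA information
    std::ifstream file("/sys/devices/system/node/node" + std::to_string(node) + "/cpulist");
    std::string line;
    if (!file || !std::getline(file, line)) return false;
    const std::vector<int> cpus = parseCpuList(line);
    if (cpus.empty()) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu : cpus) {
        if (cpu < CPU_SETSIZE) CPU_SET(cpu, &set);
    }
    return sched_setaffinity(0, sizeof(set), &set) == 0;  // 0 = calling thread
}
```

A typical call would be `pinCurrentThreadToNode(handle.getNUMANode())`, treating a `false` result as "leave the thread's affinity unchanged".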
## Building the Example

```bash
cd examples
g++ -std=c++11 -I../src numa_node_example.cpp -o numa_node_example -L../build/lib -lpcm -lpthread
LD_LIBRARY_PATH=../build/lib ./numa_node_example
```

## Notes

- Requires appropriate permissions to access PCI configuration space
- On Linux, run with `sudo` or ensure `/sys/bus/pci` is accessible
- The NUMA node value is looked up at runtime on each call, except on Windows, where the SRAT-derived mapping is cached after the first call
- A return value of -1 does not by itself indicate an error; it means NUMA information is not available or not applicable, which includes the case where the underlying source (for example the sysfs file) cannot be read

## Related APIs

- `PciHandle::read32()` - Read a 32-bit value from PCI configuration space
- `PciHandle::write32()` - Write a 32-bit value to PCI configuration space
- `PciHandle::read64()` - Read a 64-bit value from PCI configuration space
- `PciHandle::exists()` - Check if a PCI device exists

## See Also

- Linux kernel documentation: `Documentation/ABI/testing/sysfs-bus-pci`
- ACPI SRAT (System Resource Affinity Table) specification
- PCI Express Base Specification