Add support for nvidia vGPU support with vendor specific framework #11432
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #11432 +/- ##
=========================================
Coverage 17.36% 17.36%
Complexity 15236 15236
=========================================
Files 5886 5886
Lines 525645 525648 +3
Branches 64156 64157 +1
=========================================
+ Hits 91257 91258 +1
- Misses 424093 424094 +1
- Partials 10295 10296 +1
@blueorangutan package
@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
@blueorangutan package
Pull Request Overview
This PR adds support for NVIDIA SR-IOV vGPUs that use the vendor-specific VFIO framework. It integrates nvidia-smi data to report richer GPU profile information and changes libvirt XML generation so that NVIDIA GPUs are handled correctly.
- Adds nvidia-smi vGPU profile parsing to extract detailed profile information
- Modifies libvirt XML generation to use managed='no' for NVIDIA SR-IOV GPUs
- Enhances VF discovery to include profile details like max instances, video RAM, and resolution limits
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| scripts/vm/hypervisor/kvm/gpudiscovery.sh | Adds nvidia-smi vGPU profile parsing and enhanced VF discovery with profile details |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtGpuDef.java | Modifies libvirt XML generation for NVIDIA GPU support and fixes display attribute |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14605
@blueorangutan test
@vishesh92 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
LGTM - tested and verified the PR
Test Results Summary
Enhanced GPU Discovery Script Functionality:
- nvidia-smi vGPU profile parsing integration working correctly
- Enhanced VF discovery with detailed profile information (max_instances, video_ram, max_heads, max_resolution_x/y)
- NVIDIA vendor ID detection (10de) and profile mapping functional
- Selective enhancement - only configured VFs show enhanced data, unconfigured VFs show null values
- Profile name extraction from nvidia-smi instead of generic lspci descriptions
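To make the "selective enhancement" behaviour concrete, here is a minimal Python sketch of the idea. It is not the logic in gpudiscovery.sh (which is a shell script); the `PROFILES` table, the `describe_vf` helper, and the convention that a type of 0 or `None` means "unconfigured" are all assumptions made for illustration. The field values are the ones reported for the A10-6Q profile in this review.

```python
from typing import Optional

# Hypothetical lookup table keyed by vGPU type ID. The values mirror the
# A10-6Q profile fields shown in the test evidence of this review.
PROFILES = {
    0x252: {
        "vf_profile": "NVIDIA A10-6Q",
        "max_instances": 4,
        "video_ram": 6144,
        "max_heads": 4,
        "max_resolution_x": 7680,
        "max_resolution_y": 4320,
    },
}

# Shape reported for VFs that have no vGPU type configured.
UNCONFIGURED = {
    "vf_profile": "",
    "max_instances": None,
    "video_ram": None,
    "max_heads": None,
    "max_resolution_x": None,
    "max_resolution_y": None,
}


def describe_vf(vf_pci_address: str, vgpu_type: Optional[int]) -> dict:
    """Return profile details for a VF, or null fields if it is unconfigured.

    Assumption: a vgpu_type of 0 or None means no profile is assigned.
    """
    details = PROFILES.get(vgpu_type, UNCONFIGURED) if vgpu_type else UNCONFIGURED
    return {"vf_pci_address": vf_pci_address, **details}
```

With this sketch, `describe_vf("d8:00.4", 594)` carries the A10-6Q details, while `describe_vf("d8:01.6", 0)` keeps every profile field null, matching the pattern in the discovery output.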
LibVirt XML Generation for NVIDIA SR-IOV:
- NVIDIA SR-IOV vGPUs correctly generate managed='no' in libvirt XML
- VFIO driver assignment working properly
- PCI device addressing and assignment functional
- Display attribute configuration correct
System Integration:
- CloudStack GPU device discovery and assignment working
- VM deployment with vGPU successful
- No regression in existing functionality
CloudStack UI Integration:
- GPU cards/devices properly discovered and listed in UI
- GPU devices with enhanced profile information visible
- Compute offerings with vGPU profiles created successfully
- VM deployment through UI with vGPU assignment functional
Test Evidence
GPU Discovery Script Enhancement
[oracle@gpu1 ~]$ sudo /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/gpudiscovery.sh
{ "gpus": [
{
"pci_address":"af:00.0",
"vendor_id":"10de",
"device_id":"2236",
"vendor":"NVIDIA Corporation",
"device":"GA102GL [A10]",
"driver":"nvidia",
"pci_class":"3D controller [0302]",
"iommu_group":"19",
"pci_root":"0000:ae:00.0",
"numa_node":1,
"sriov_totalvfs":32,
"sriov_numvfs":0,
"max_instances":null,
"video_ram":null,
"max_heads":null,
"max_resolution_x":null,
"max_resolution_y":null,
"full_passthrough": {
"enabled":1,
"libvirt_address": {
"domain":"0x0000",
"bus":"0xaf",
"slot":"0x00",
"function":"0x0"
},
"used_by_vm":null
},
"vgpu_instances":[],
"vf_instances":[]
}
,
{
"pci_address":"d8:00.0",
"vendor_id":"10de",
"device_id":"2236",
"vendor":"NVIDIA Corporation",
"device":"GA102GL [A10]",
"driver":"nvidia",
"pci_class":"3D controller [0302]",
"iommu_group":"17",
"pci_root":"0000:d7:00.0",
"numa_node":1,
"sriov_totalvfs":32,
"sriov_numvfs":32,
"max_instances":null,
"video_ram":null,
"max_heads":null,
"max_resolution_x":null,
"max_resolution_y":null,
"full_passthrough": {
"enabled":0,
"libvirt_address": {
"domain":"0x0000",
"bus":"0xd8",
"slot":"0x00",
"function":"0x0"
},
"used_by_vm":null
},
"vgpu_instances":[],
"vf_instances":[
{"vf_pci_address":"d8:00.4","vf_profile":"NVIDIA A10-6Q","max_instances":4,"video_ram":6144,"max_heads":4,"max_resolution_x":7680,"max_resolution_y":4320,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x00","function":"0x4"},"used_by_vm":null},
{"vf_pci_address":"d8:00.5","vf_profile":"NVIDIA A10-6Q","max_instances":4,"video_ram":6144,"max_heads":4,"max_resolution_x":7680,"max_resolution_y":4320,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x00","function":"0x5"},"used_by_vm":null},
{"vf_pci_address":"d8:01.6","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x6"},"used_by_vm":null},
{"vf_pci_address":"d8:01.7","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x7"},"used_by_vm":null},
{"vf_pci_address":"d8:02.0","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x0"},"used_by_vm":null},
{"vf_pci_address":"d8:02.1","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x1"},"used_by_vm":null},
{"vf_pci_address":"d8:02.2","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x2"},"used_by_vm":null},
{"vf_pci_address":"d8:02.3","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x3"},"used_by_vm":null},
{"vf_pci_address":"d8:02.4","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x4"},"used_by_vm":null},
{"vf_pci_address":"d8:02.5","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x5"},"used_by_vm":null},
{"vf_pci_address":"d8:02.6","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x6"},"used_by_vm":null},
{"vf_pci_address":"d8:02.7","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x02","function":"0x7"},"used_by_vm":null},
{"vf_pci_address":"d8:00.6","vf_profile":"NVIDIA A10-6Q","max_instances":4,"video_ram":6144,"max_heads":4,"max_resolution_x":7680,"max_resolution_y":4320,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x00","function":"0x6"},"used_by_vm":null},
{"vf_pci_address":"d8:03.0","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x0"},"used_by_vm":null},
{"vf_pci_address":"d8:03.1","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x1"},"used_by_vm":null},
{"vf_pci_address":"d8:03.2","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x2"},"used_by_vm":null},
{"vf_pci_address":"d8:03.3","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x3"},"used_by_vm":null},
{"vf_pci_address":"d8:03.4","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x4"},"used_by_vm":null},
{"vf_pci_address":"d8:03.5","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x5"},"used_by_vm":null},
{"vf_pci_address":"d8:03.6","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x6"},"used_by_vm":null},
{"vf_pci_address":"d8:03.7","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x03","function":"0x7"},"used_by_vm":null},
{"vf_pci_address":"d8:04.0","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x04","function":"0x0"},"used_by_vm":null},
{"vf_pci_address":"d8:04.1","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x04","function":"0x1"},"used_by_vm":null},
{"vf_pci_address":"d8:00.7","vf_profile":"NVIDIA A10-6Q","max_instances":4,"video_ram":6144,"max_heads":4,"max_resolution_x":7680,"max_resolution_y":4320,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x00","function":"0x7"},"used_by_vm":null},
{"vf_pci_address":"d8:04.2","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x04","function":"0x2"},"used_by_vm":null},
{"vf_pci_address":"d8:04.3","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x04","function":"0x3"},"used_by_vm":null},
{"vf_pci_address":"d8:01.0","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x0"},"used_by_vm":null},
{"vf_pci_address":"d8:01.1","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x1"},"used_by_vm":null},
{"vf_pci_address":"d8:01.2","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x2"},"used_by_vm":null},
{"vf_pci_address":"d8:01.3","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x3"},"used_by_vm":null},
{"vf_pci_address":"d8:01.4","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x4"},"used_by_vm":null},
{"vf_pci_address":"d8:01.5","vf_profile":"","max_instances":null,"video_ram":null,"max_heads":null,"max_resolution_x":null,"max_resolution_y":null,"libvirt_address":{"domain":"0x0000","bus":"0xd8","slot":"0x01","function":"0x5"},"used_by_vm":null}
]
}
]}
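For readers checking the output above by hand: each `libvirt_address` is just the PCI address split into its parts and prefixed with `0x`. The following Python sketch shows that mapping. The `to_libvirt_address` helper is hypothetical and written only for this comment; defaulting the domain to `0000` when it is omitted is an assumption based on the addresses shown here.

```python
def to_libvirt_address(pci_address: str) -> dict:
    """Split a PCI address into the fields libvirt expects.

    Accepts "bus:slot.function" (as printed by the discovery script) or
    "domain:bus:slot.function". Assumption: a missing domain means 0000.
    """
    parts = pci_address.split(":")
    if len(parts) == 2:
        domain, bus, slot_function = "0000", parts[0], parts[1]
    elif len(parts) == 3:
        domain, bus, slot_function = parts
    else:
        raise ValueError(f"unexpected PCI address: {pci_address!r}")

    slot, function = slot_function.split(".")
    return {
        "domain": f"0x{domain}",
        "bus": f"0x{bus}",
        "slot": f"0x{slot}",
        "function": f"0x{function}",
    }
```

For example, `to_libvirt_address("d8:00.4")` gives domain `0x0000`, bus `0xd8`, slot `0x00`, function `0x4`, which is what the first VF entry reports.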
NVIDIA-SMI Integration Verification
[oracle@gpu1 ~]$ nvidia-smi vgpu -s -v | grep -A 15 "Type ID.*0x252"
vGPU Type ID : 0x252
Name : NVIDIA A10-6Q
Class : Quadro
GPU Instance Profile ID : N/A
Max Instances : 4
Max Instances Per VM : 16
Max Instances Per GI : N/A
Multi vGPU Exclusive : False
vGPU Exclusive Type : False
vGPU Exclusive Size : False
Device ID : 0x223610de
Sub System ID : 0x223614bc
FB Memory : 6144 MiB
BAR1 size : 256 MB
Display Heads : 4
Maximum X Resolution : 7680
--
vGPU Type ID : 0x252
Name : NVIDIA A10-6Q
Class : Quadro
GPU Instance Profile ID : N/A
Max Instances : 4
Max Instances Per VM : 16
Max Instances Per GI : N/A
Multi vGPU Exclusive : False
vGPU Exclusive Type : False
vGPU Exclusive Size : False
Device ID : 0x223610de
Sub System ID : 0x223614bc
FB Memory : 6144 MiB
BAR1 size : 256 MB
Display Heads : 4
Maximum X Resolution : 7680
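As a rough illustration of how output in this shape can be turned into a profile table, here is a small Python sketch. It is not the parser used by gpudiscovery.sh; `parse_vgpu_types` and `SAMPLE` are names made up for this comment, and the sketch only handles the `Key : Value` lines visible in the excerpt above (other lines in real nvidia-smi output may need extra handling).

```python
# Excerpt copied from the nvidia-smi output shown in this review.
SAMPLE = """\
vGPU Type ID : 0x252
Name : NVIDIA A10-6Q
Class : Quadro
Max Instances : 4
FB Memory : 6144 MiB
Display Heads : 4
Maximum X Resolution : 7680
--
vGPU Type ID : 0x252
Name : NVIDIA A10-6Q
Class : Quadro
Max Instances : 4
FB Memory : 6144 MiB
Display Heads : 4
Maximum X Resolution : 7680
"""


def parse_vgpu_types(text: str) -> dict:
    """Group "Key : Value" lines into one dict per vGPU type ID."""
    profiles = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # Lines without a colon (such as grep's "--" separator) are skipped.
            continue
        key, value = key.strip(), value.strip()
        if key == "vGPU Type ID":
            # The same type is listed once per physical GPU, so reuse the entry.
            current = profiles.setdefault(int(value, 16), {})
        elif current is not None:
            current[key] = value
    return profiles
```

Because both A10 cards advertise the same type, the two blocks in the excerpt collapse into a single entry keyed by `0x252`.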
vGPU Type Assignment Verification
[oracle@gpu1 ~]$ for vf in d8:00.4 d8:00.5 d8:00.6 d8:00.7; do
    echo "=== VF $vf ==="
    if [ -f "/sys/bus/pci/devices/0000:$vf/nvidia/current_vgpu_type" ]; then
        cat "/sys/bus/pci/devices/0000:$vf/nvidia/current_vgpu_type"
    else
        echo "No current_vgpu_type file found"
    fi
done
=== VF d8:00.4 ===
594
=== VF d8:00.5 ===
594
=== VF d8:00.6 ===
594
=== VF d8:00.7 ===
594
[oracle@gpu1 ~]$ printf "0x%x\n" 594
0x252
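The same check can be scripted so the decimal value from sysfs is compared directly with the hexadecimal type ID that nvidia-smi prints. This is only a sketch: `current_vgpu_type_hex` is a hypothetical helper, the sysfs path is the one used in the loop above, and the `sysfs_root` parameter exists purely so the function can be pointed at a different directory.

```python
from pathlib import Path
from typing import Optional


def current_vgpu_type_hex(
    vf_pci_address: str, sysfs_root: str = "/sys/bus/pci/devices"
) -> Optional[str]:
    """Return the VF's current vGPU type as a hex string, or None if absent.

    The file holds a decimal number (594 in this review); nvidia-smi shows
    the same ID in hexadecimal (0x252).
    """
    path = Path(sysfs_root) / f"0000:{vf_pci_address}" / "nvidia" / "current_vgpu_type"
    if not path.is_file():
        return None
    return hex(int(path.read_text().strip()))
```

On the host used for this review, `current_vgpu_type_hex("d8:00.4")` would be expected to match the `0x252` type ID, since the file contains 594.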
VM Deployment with vGPU
(localcloud) 🐱 > deploy virtualmachine name=test-vgpu-vm-shared templateid=32131e9e-2451-470d-b04a-70eb27cd77e6 serviceofferingid=2dd17938-6e46-45ac-9ddb-d18f9fe8a1f6 zoneid=317051d1-d7ce-45ac-a717-0bb8d6888011 networkids=bb5b3aa7-93f5-452b-846d-6c11803a34b0
{
"virtualmachine": {
"account": "admin",
"affinitygroup": [],
"arch": "x86_64",
"cpunumber": 2,
"cpuspeed": 2000,
"created": "2025-08-12T12:00:42+0200",
"deleteprotection": false,
"details": {
"rootDiskController": "osdefault"
},
"displayname": "test-vgpu-vm-shared",
"displayvm": true,
"domain": "ROOT",
"domainid": "0cbd02a9-761a-11f0-9e29-e4434bdc05b0",
"domainpath": "/",
"gpucardid": "ce297a19-a76d-4d6e-b4a7-62649ea7532c",
"gpucardname": "NVIDIA Corporation GA102GL [A10]",
"gpucount": 1,
"guestosid": "844a1c55-761a-11f0-9e29-e4434bdc05b0",
"haenable": false,
"hasannotations": false,
"hostcontrolstate": "Enabled",
"hostid": "3a9ca6d3-c1e0-4efa-940c-66bec16e6f2a",
"hostname": "gpu1",
"hypervisor": "KVM",
"id": "8ced9f7a-e697-467b-9896-be642228afc3",
"instancename": "i-2-33-VM",
"ipaddress": "172.20.0.112",
"isdynamicallyscalable": false,
"jobid": "9bd0aba6-3e35-45d1-acca-4e8e2645aa2f",
"jobstatus": 0,
"lastupdated": "2025-08-12T12:00:49+0200",
"maxheads": 4,
"maxresolutionx": 7680,
"maxresolutiony": 4320,
"memory": 8192,
"name": "test-vgpu-vm-shared",
"nic": [
{
"broadcasturi": "vlan://untagged",
"deviceid": "0",
"extradhcpoption": [],
"gateway": "172.20.0.1",
"id": "51081bf2-49bb-4dee-8677-3a6a4d069e0b",
"ipaddress": "172.20.0.112",
"isdefault": true,
"isolationuri": "vlan://untagged",
"macaddress": "1e:00:18:00:00:2b",
"netmask": "255.255.0.0",
"networkid": "bb5b3aa7-93f5-452b-846d-6c11803a34b0",
"networkname": "Network",
"secondaryip": [],
"traffictype": "Guest",
"type": "Shared"
}
],
"osdisplayname": "Ubuntu 24.04 LTS",
"ostypeid": "844a1c55-761a-11f0-9e29-e4434bdc05b0",
"password": "6CGkTe",
"passwordenabled": true,
"pooltype": "NetworkFilesystem",
"receivedbytes": 0,
"rootdeviceid": 0,
"rootdevicetype": "ROOT",
"securitygroup": [],
"sentbytes": 0,
"serviceofferingid": "2dd17938-6e46-45ac-9ddb-d18f9fe8a1f6",
"serviceofferingname": "Test",
"state": "Running",
"tags": [],
"templatedisplaytext": "Ubuntu 24.04",
"templateformat": "RAW",
"templateid": "32131e9e-2451-470d-b04a-70eb27cd77e6",
"templatename": "Ubuntu 24.04",
"templatetype": "USER",
"userid": "8ef12156-761a-11f0-9e29-e4434bdc05b0",
"username": "admin",
"vgpu": "NVIDIA A10-6Q",
"vgpuprofileid": "a5ff3a77-7c52-4403-a7bb-2edbcb877c88",
"vgpuprofilename": "NVIDIA A10-6Q",
"videoram": 6144,
"zoneid": "317051d1-d7ce-45ac-a717-0bb8d6888011",
"zonename": "DC"
}
}
CloudStack UI Verification
GPU Cards Discovery:
(localcloud) 🐱 > list gpucards
{
"count": 1,
"gpucard": [
{
"deviceid": "2236",
"devicename": "GA102GL [A10]",
"id": "ce297a19-a76d-4d6e-b4a7-62649ea7532c",
"name": "NVIDIA Corporation GA102GL [A10]",
"vendorid": "10de",
"vendorname": "NVIDIA Corporation"
}
]
}
GPU Devices with Enhanced Profile Information:
(localcloud) 🐱 > list gpudevices
{
"count": 6,
"gpudevice": [
{
"busaddress": "af:00.0",
"gpucardid": "ce297a19-a76d-4d6e-b4a7-62649ea7532c",
"gpucardname": "NVIDIA Corporation GA102GL [A10]",
"gpudevicetype": "PCI",
"hostid": "3a9ca6d3-c1e0-4efa-940c-66bec16e6f2a",
"hostname": "gpu1",
"id": "58a59c35-acf2-435b-ad5f-87e6e35dd206",
"managedstate": "Managed",
"numanode": "1",
"state": "Free",
"vgpuprofileid": "0a857a31-afd1-4e3b-a42a-3257315c9edf",
"vgpuprofilename": "passthrough"
},
{
"busaddress": "d8:00.4",
"gpucardid": "ce297a19-a76d-4d6e-b4a7-62649ea7532c",
"gpucardname": "NVIDIA Corporation GA102GL [A10]",
"gpudevicetype": "PCI",
"hostid": "3a9ca6d3-c1e0-4efa-940c-66bec16e6f2a",
"hostname": "gpu1",
"id": "b0b0a6fe-621c-4f4a-80bf-c54a0f325e3a",
"managedstate": "Managed",
"numanode": "1",
"parentgpudeviceid": "9b415971-871d-4b72-acf8-1e8ae9c08e4b",
"state": "Free",
"vgpuprofileid": "a5ff3a77-7c52-4403-a7bb-2edbcb877c88",
"vgpuprofilename": "NVIDIA A10-6Q"
}
]
}
Service Offerings with vGPU Configuration:
(localcloud) 🐱 > list serviceofferings
{
"serviceoffering": [
{
"cacheMode": "none",
"cpunumber": 2,
"cpuspeed": 2000,
"created": "2025-08-12T00:30:28+0200",
"defaultuse": false,
"diskofferingdisplaytext": "Test",
"diskofferingid": "cf9da5bd-056d-4b8b-b18f-05554c715c10",
"diskofferingname": "Test",
"diskofferingstrictness": false,
"displaytext": "Test",
"dynamicscalingenabled": true,
"encryptroot": false,
"gpucardid": "ce297a19-a76d-4d6e-b4a7-62649ea7532c",
"gpucardname": "NVIDIA Corporation GA102GL [A10]",
"gpucount": 1,
"gpudisplay": false,
"hasannotations": false,
"id": "2dd17938-6e46-45ac-9ddb-d18f9fe8a1f6",
"iscustomized": false,
"issystem": false,
"isvolatile": false,
"limitcpuuse": false,
"maxheads": 4,
"maxresolutionx": 7680,
"maxresolutiony": 4320,
"memory": 8192,
"name": "Test",
"offerha": false,
"provisioningtype": "thin",
"rootdisksize": 0,
"state": "Active",
"storagetype": "shared",
"vgpuprofileid": "a5ff3a77-7c52-4403-a7bb-2edbcb877c88",
"vgpuprofilename": "NVIDIA A10-6Q",
"videoram": 6144
}
]
}
LibVirt XML Validation - Core PR Change
[oracle@gpu1 ~]$ sudo virsh dumpxml i-2-33-VM | grep -A 15 hostdev
<hostdev mode='subsystem' type='pci' managed='no' display='off'>
<driver name='vfio'/>
<source>
<address domain='0x0000' bus='0xd8' slot='0x00' function='0x5'/>
</source>
<alias name='hostdev0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</hostdev>
<watchdog model='i6300esb' action='none'>
<alias name='watchdog0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</watchdog>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+0:+0</label>
<imagelabel>+0:+0</imagelabel>
</seclabel>
</domain>
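To show the shape of the element being validated here, the following Python sketch builds an equivalent `<hostdev>` fragment. It is not the code in LibvirtGpuDef.java; `build_hostdev` and its `nvidia_sriov_vf` flag are hypothetical, and using managed='yes' for every other device is an assumption about the pre-existing behaviour rather than something shown in this thread.

```python
import xml.etree.ElementTree as ET


def build_hostdev(address: dict, nvidia_sriov_vf: bool, display: str = "off") -> str:
    """Build a PCI <hostdev> element similar to the one in the dump above.

    Assumption: only NVIDIA SR-IOV VFs get managed='no'; everything else is
    left for libvirt to manage (managed='yes').
    """
    hostdev = ET.Element(
        "hostdev",
        {
            "mode": "subsystem",
            "type": "pci",
            "managed": "no" if nvidia_sriov_vf else "yes",
            "display": display,
        },
    )
    ET.SubElement(hostdev, "driver", {"name": "vfio"})
    source = ET.SubElement(hostdev, "source")
    # address holds domain, bus, slot and function as "0x.." strings.
    ET.SubElement(source, "address", address)
    return ET.tostring(hostdev, encoding="unicode")


xml_text = build_hostdev(
    {"domain": "0x0000", "bus": "0xd8", "slot": "0x00", "function": "0x5"},
    nvidia_sriov_vf=True,
)
```

With managed='no', libvirt does not detach or rebind the device itself, which is presumably why it is needed when the VF has to stay under the vendor's own VFIO driver.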
Conclusion
This PR successfully implements NVIDIA SR-IOV vGPU support with vendor-specific VFIO framework integration. The key enhancement of setting managed='no' for NVIDIA SR-IOV GPUs is working correctly, and the nvidia-smi data integration provides rich vGPU profile information for better resource management.
Ready for merge.
[SF] Trillian test result (tid-14054)
clgtm. I think we should modularise the gpudiscovery.sh script sometime soon, though; it is getting a bit out of hand. (new PR/issue)
Description
This PR adds support for NVIDIA SR-IOV vGPUs that use the vendor-specific VFIO framework.