Commit b7f4eb1

Fix pci.alias/reporting config for nova gpu DT/VA
Add the missing [pci]report_in_placement flag. Also configure the pci.alias config option on the controller. This configuration should match the configuration found on the compute nodes.

Signed-off-by: Bohdan Dobrelia <[email protected]>
1 parent 2ca49e5 commit b7f4eb1

4 files changed: +15 -4 lines changed
examples/dt/nova/nova04delta/README.md

Lines changed: 6 additions & 4 deletions
@@ -32,30 +32,32 @@ That is a contrary to the legacy mode where PCI devices used to be requested thr
 
 ### Control Plane (`examples/dt/nova/nova04delta/service-values.yaml`)
 
-* `[pci]alias`: Creates an alias for a specific GPU type. This allows users to request a GPU by a friendly name (e.g., `nvidia_a2`) when creating a VM.
+* `[pci]alias`: Creates an alias for a specific GPU type. This allows users to request a GPU by a friendly name (e.g., `nvidia_a2`) when creating a VM. This configuration should match the configuration found on the compute nodes.
 ```yaml
 nova:
   apiServiceTemplate:
     customServiceConfig: |
       [pci]
       alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
 ```
-* `[filter_scheduler]enabled_filters`: Ensures that `PciPassthroughFilter` is enabled in the Nova scheduler.
+* `[filter_scheduler]pci_in_placement`: Enables PCI in Placement. Enable it only after every compute node in the system is configured to report PCI inventory to Placement by setting `[pci]report_in_placement` in the EDPM nodeset configuration. This ordering must be enforced only during major upgrades, where the data plane deployment that updates the EDPM compute configuration must come before the control plane resources are reconfigured.
 * `device_type` in the alias is dependent on the actual hardware:
   * `type-PF`: The device supports SR-IOV and is the parent or root device.
   * `type-VF`: The device is a child device of a device that supports SR-IOV.
   * `type-PCI`: The device does not support SR-IOV. This is the value you should use, or simply omit setting `device_type`, in a full device passthrough scenario.
 
 ### Compute Node (`examples/dt/nova/nova04delta/edpm/nodeset/values.yaml`)
 
+* `[pci]report_in_placement`: Required for PCI in Placement to work.
 * `[pci]device_spec`: Whitelists the physical GPUs that are available for passthrough. You must create a `device_spec` entry for each physical GPU you want to make available. For example:
 ```yaml
 nova:
   pci:
     conf: |
       [pci]
-      device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "0000:04:00.0", "physical_network":null}
-      device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "0000:82:00.0", "physical_network":null}
+      device_spec = { "vendor_id":"10de", "product_id":"20f1", "address": "0000:04:00.0", "physical_network":null }
+      device_spec = { "vendor_id":"10de", "product_id":"20f1", "address": "0000:82:00.0", "physical_network":null }
+      alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
 ```
 
 In addition to PCI device configuration, the `nova.compute.conf` section includes parameters for resource management on the compute node:
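
The README change above says the control plane `[pci]alias` should match the alias configured on the compute nodes. One way to check that before deploying is sketched below in Python. This is only an illustration, not part of the commit: the function names `parse_aliases` and `aliases_match` and the two sample strings are hypothetical, and the snippets are assumed to be the plain INI text taken from `customServiceConfig` and `nova.pci.conf`.

```python
import json


def parse_aliases(conf_text):
    """Return every [pci] alias entry in an INI snippet as a dict.

    Hypothetical helper: alias values are assumed to be single-line JSON
    objects, as in the examples in this commit.
    """
    aliases = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        key, sep, value = line.partition("=")
        if sep and section == "pci" and key.strip() == "alias":
            aliases.append(json.loads(value))
    return aliases


def aliases_match(control_plane_conf, compute_conf):
    """True when both snippets define the same set of aliases."""
    def normalize(conf_text):
        return sorted(json.dumps(a, sort_keys=True) for a in parse_aliases(conf_text))
    return normalize(control_plane_conf) == normalize(compute_conf)


# Sample snippets copied from the examples in this commit.
CONTROL_PLANE_CONF = """
[pci]
alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
"""

COMPUTE_CONF = """
[pci]
device_spec = { "vendor_id":"10de", "product_id":"20f1", "address": "0000:04:00.0", "physical_network":null }
alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
report_in_placement = True
"""

print(aliases_match(CONTROL_PLANE_CONF, COMPUTE_CONF))
```

The comparison normalizes each alias to sorted-key JSON, so key order and whitespace inside the braces do not cause false mismatches.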

examples/dt/nova/nova04delta/edpm/nodeset/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -148,8 +148,11 @@ data:
 # "product_id": The product ID of the specific GPU model (e.g., "20f1" for an NVIDIA A100).
 # "address": The PCI address of the GPU on the host machine. You can find this using `lspci | grep -i nvidia`.
 # "physical_network": This is used for SR-IOV networking passthrough. For a GPU, you can typically leave this as null.
+# "alias": must match the alias configuration in the API service-values.yaml
 conf: |
   # CHANGEME
   [pci]
   device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "CHANGEME", "physical_network":null}
+  alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
+  report_in_placement = True

examples/dt/nova/nova04delta/edpm/nodeset2/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -148,7 +148,10 @@ data:
 # "product_id": The product ID of the specific GPU model (e.g., "20f1" for an NVIDIA A100).
 # "address": The PCI address of the GPU on the host machine. You can find this using `lspci | grep -i nvidia`.
 # "physical_network": This is used for SR-IOV networking passthrough. For a GPU, you can typically leave this as null.
+# "alias": must match the alias configuration in the API service-values.yaml
 conf: |
   # CHANGEME
   [pci]
   device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "CHANGEME", "physical_network":null}
+  alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
+  report_in_placement = True

examples/va/nvidia-vfio-passthrough/edpm/nodeset/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -148,7 +148,10 @@ data:
 # "product_id": The product ID of the specific GPU model (e.g., "20f1" for an NVIDIA A100).
 # "address": The PCI address of the GPU on the host machine. You can find this using `lspci | grep -i nvidia`.
 # "physical_network": This is used for SR-IOV networking passthrough. For a GPU, you can typically leave this as null.
+# "alias": must match the alias configuration in the API service-values.yaml
 conf: |
   # CHANGEME
   [pci]
   device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "CHANGEME", "physical_network":null}
+  alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }
+  report_in_placement = True
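
All three nodeset files in this commit set `report_in_placement = True`, and the README says `[filter_scheduler]pci_in_placement` should be enabled only once every compute node reports PCI inventory to Placement. A rough pre-flight check for that ordering could look like the Python sketch below. It is an assumption-laden illustration, not tooling from this repository: the helper names and the `nodeset_confs` mapping are hypothetical, and each value is assumed to be the unindented INI text of a nodeset's `nova.pci.conf`.

```python
import configparser


def reports_in_placement(compute_conf):
    """True when the snippet sets [pci]report_in_placement to a true value."""
    # strict=False tolerates repeated keys such as multiple device_spec lines;
    # interpolation=None leaves the JSON-like values untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(compute_conf)
    return parser.getboolean("pci", "report_in_placement", fallback=False)


def safe_to_enable_pci_in_placement(nodeset_confs):
    """Return (ok, missing): ok is True only if every nodeset reports to Placement."""
    missing = [name for name, conf in nodeset_confs.items()
               if not reports_in_placement(conf)]
    return (not missing), missing


# Hypothetical input: one updated nodeset and one still lacking the flag.
nodeset_confs = {
    "nodeset": (
        "# CHANGEME\n"
        "[pci]\n"
        'device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "CHANGEME", "physical_network":null}\n'
        'alias = { "vendor_id":"10de", "product_id":"20f1", "device_type":"type-PCI", "name":"nvidia_a2" }\n'
        "report_in_placement = True\n"
    ),
    "nodeset2": (
        "[pci]\n"
        'device_spec = {"vendor_id":"10de", "product_id":"20f1", "address": "CHANGEME", "physical_network":null}\n'
    ),
}

ok, missing = safe_to_enable_pci_in_placement(nodeset_confs)
print(ok, missing)
```

Returning the list of nodesets that lack the flag, and not just a boolean, makes it easier to see which data plane deployment still has to run before the control plane is reconfigured.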
