@@ -106,7 +106,7 @@ role from Ansible Galaxy:
106
106
kernel_cmdline_remove:
107
107
- iommu
108
108
- intel_iommu
109
- - vfio-pci.ids:
109
+ - vfio-pci.ids
110
110
111
111
Kernel Device Management
112
112
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -185,14 +185,38 @@ Once this code has taken effect (after a reboot), the VFIO kernel drivers should
185
185
vfio 32768 2 vfio_iommu_type1,vfio_pci
186
186
irqbypass 16384 5 vfio_pci,kvm
187
187
188
+ # lspci -nnk -s 3d:00.0
189
+ 3d:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
190
+ Subsystem: NVIDIA Corporation Tesla M10 [10de:1160]
191
+ Kernel driver in use: vfio-pci
192
+ Kernel modules: nouveau
193
+
194
+ IOMMU should also be enabled at the kernel level. This can be verified on the compute host:
195
+
196
+ .. code-block:: text
197
+
198
+ # docker exec -it nova_libvirt virt-host-validate | grep IOMMU
199
+ QEMU: Checking for device assignment IOMMU support : PASS
200
+ QEMU: Checking if IOMMU is enabled by kernel : PASS
201
+
188
202
OpenStack Nova configuration
189
203
----------------------------
190
204
191
- Scheduler Filters
192
- ~~~~~~~~~~~~~~~~~
205
+ Configure nova-scheduler
206
+ ~~~~~~~~~~~~~~~~~~~~~~~~
207
+
208
+ The nova-scheduler service must be configured to enable the ``PciPassthroughFilter``.
209
+ To enable it, add it to the list of filters in the Kolla-Ansible configuration file
210
+ ``etc/kayobe/kolla/config/nova.conf``, for instance:
211
+
212
+ .. code-block:: ini
213
+
214
+ [filter_scheduler]
215
+ available_filters = nova.scheduler.filters.all_filters
216
+ enabled_filters = AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
193
217
194
- Hypervisor Resource Tracking
195
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218
+ Configure nova-compute
219
+ ~~~~~~~~~~~~~~~~~~~~~~
196
220
197
221
Configuration can be applied in flexible ways using Kolla-Ansible's
198
222
methods for `inventory-driven customisation of configuration
@@ -203,7 +227,7 @@ passthrough of GPU devices for hosts in a group named ``compute_gpu``.
203
227
Again, the 4-digit PCI Vendor ID and Device ID extracted from ``lspci
204
228
-nn`` can be used here to specify the GPU device(s).
205
229
206
- .. code-block:: yaml
230
+ .. code-block:: jinja
207
231
208
232
[pci]
209
233
{% raw %}
@@ -223,6 +247,43 @@ Again, the 4-digit PCI Vendor ID and Device ID extracted from ``lspci
223
247
{% endif %}
224
248
{% endraw %}
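The vendor and device IDs are the colon-separated pair in square brackets on each ``lspci -nn`` line. As a minimal sketch, assuming output shaped like the ``lspci -nnk`` example earlier in this guide, they could be extracted as follows. The helper name ``parse_pci_ids`` is hypothetical and is not part of Kayobe, Kolla-Ansible or Nova.

```python
import re

# Matches the "[vendor:device]" pair, e.g. "[10de:13bd]". The class code
# "[0300]" contains no colon, so it is not matched.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_pci_ids(lspci_line):
    """Return (vendor_id, product_id) from one line of `lspci -nn` output.

    Hypothetical helper for illustration only.
    """
    match = PCI_ID_RE.search(lspci_line)
    if match is None:
        raise ValueError("no [vendor:device] pair found in line")
    return match.group(1).lower(), match.group(2).lower()


# Sample line taken from the lspci output shown earlier in this guide.
SAMPLE = (
    "3d:00.0 VGA compatible controller [0300]: NVIDIA Corporation "
    "GM107GL [Tesla M10] [10de:13bd] (rev a2)"
)

vendor_id, product_id = parse_pci_ids(SAMPLE)
print(vendor_id, product_id)
```

The two values map onto the ``vendor_id`` and ``product_id`` fields used in the ``[pci]`` configuration.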
225
249
250
+ Configure nova-api
251
+ ~~~~~~~~~~~~~~~~~~
252
+
253
+ ``pci.alias`` also needs to be configured on the controller.
254
+ It should match the alias configuration on the compute nodes.
255
+ Add it to the Kolla-Ansible configuration file
256
+ ``etc/kayobe/kolla/config/nova-api/nova.conf``, for instance:
257
+
258
+ .. code-block:: ini
259
+
260
+ [pci]
261
+ alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"gpu-v100-16" }
262
+ alias = { "vendor_id":"10de", "product_id":"1db5", "device_type":"type-PCI", "name":"gpu-v100-32" }
263
+ alias = { "vendor_id":"10de", "product_id":"15f8", "device_type":"type-PCI", "name":"gpu-p100" }
264
+
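Because the aliases on the controller must match those on the compute nodes, a small consistency check can help catch drift. The following is a rough sketch only: it assumes both files have already been rendered to plain ``nova.conf`` text (no Jinja), and the function names ``extract_aliases`` and ``mismatched_aliases`` are hypothetical. The standard ``configparser`` module is avoided because ``alias`` is repeated within the ``[pci]`` section.

```python
import json


def extract_aliases(conf_text):
    """Collect [pci] alias entries from nova.conf-style text, keyed by name."""
    aliases = {}
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if section == "pci" and sep and key.strip() == "alias":
            entry = json.loads(value)  # each alias value is a JSON object
            aliases[entry["name"]] = entry
    return aliases


def mismatched_aliases(api_text, compute_text):
    """Return sorted alias names that are missing or differ between the two."""
    api = extract_aliases(api_text)
    compute = extract_aliases(compute_text)
    names = set(api) | set(compute)
    return sorted(n for n in names if api.get(n) != compute.get(n))


# The API entries mirror the example above; the compute text is an
# illustrative assumption, not taken from a real deployment.
API_CONF = """
[pci]
alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"gpu-v100-16" }
alias = { "vendor_id":"10de", "product_id":"15f8", "device_type":"type-PCI", "name":"gpu-p100" }
"""
COMPUTE_CONF = """
[pci]
alias = { "vendor_id":"10de", "product_id":"15f8", "device_type":"type-PCI", "name":"gpu-p100" }
"""

print(mismatched_aliases(API_CONF, COMPUTE_CONF))
```

An empty list would indicate that both files define the same aliases.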
265
+ Reconfigure nova service
266
+ ~~~~~~~~~~~~~~~~~~~~~~~~
267
+
268
+ .. code-block:: text
269
+
270
+ kayobe overcloud service reconfigure -kt nova --kolla-skip-tags common --skip-precheck
271
+
272
+ Configure a flavor
273
+ ~~~~~~~~~~~~~~~~~~
274
+ For example, to request two GPUs with the alias ``gpu-p100``:
275
+
276
+ .. code-block:: text
277
+
278
+ openstack flavor set m1.medium --property "pci_passthrough:alias"="gpu-p100:2"
279
+
280
+
281
+ Create instance with GPU passthrough
282
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
283
+
284
+ .. code-block:: text
285
+
286
+ openstack server create --flavor m1.medium --image ubuntu2004 --wait test-pci
226
287
227
288
Testing GPU in a Guest VM
228
289
-------------------------
@@ -250,4 +311,3 @@ For PCI Passthrough and GPUs in OpenStack:
250
311
* https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt
251
312
* https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/installation_guide/appe-configuring_a_hypervisor_host_for_pci_passthrough
252
313
* https://www.gresearch.co.uk/article/utilising-the-openstack-placement-service-to-schedule-gpu-and-nvme-workloads-alongside-general-purpose-instances/
253
-