Commit 251f695

feat: Add BareMetalHost configuration to NVIDIA VFIO example
Co-authored-by: aider (gemini/gemini-2.5-pro) <[email protected]>
Signed-off-by: Bohdan Dobrelia <[email protected]>
1 parent 5fb0949 commit 251f695

6 files changed: +279 -1 lines changed

examples/va/nvidia-vfio-passthrough/README.md

Lines changed: 21 additions & 0 deletions
@@ -29,6 +29,27 @@ the appropriate native NVIDIA driver installed. You will need a standard NVIDIA
 driver. Do not use vGPU-enabled guest drivers. The GPU will appear as a physical
 PCI device within the guest.

+### Host Configuration (`examples/va/nvidia-vfio-passthrough/edpm/nodeset/values.yaml`)
+
+The following parameters are crucial for host-level configuration:
+
+* **BareMetalHost configuration**: The `baremetalhosts` section contains the information Metal3 requires to provision bare-metal nodes.
+    * `bmc.address`: The IP address of the Baseboard Management Controller (BMC).
+    * `bootMACAddress`: The MAC address of the network interface that the node will use to PXE boot.
+    * `rootDeviceHints`: Hints that let Metal3 identify the root device for the OS installation.
+    * `preprovisioningNetworkData`: Network configuration to be applied to the node for provisioning.
+
+* `edpm_kernel_args`: Appends the kernel arguments needed for VFIO passthrough.
+    * `intel_iommu=on iommu=pt`: Enables the Intel IOMMU (`intel_iommu=on`) in pass-through mode (`iommu=pt`) for device passthrough.
+    * `vfio-pci.ids=10de:20f1`: Instructs the `vfio-pci` driver to claim the specified GPU(s) by their PCI vendor and device IDs at boot time. The example IDs `10de:20f1` are for an NVIDIA A100 GPU.
+    * `rd.driver.pre=vfio-pci`: Loads the `vfio-pci` kernel module early in boot, which avoids race conditions with other drivers.
+
+* `edpm_tuned_profile` and `edpm_tuned_isolated_cores`: These parameters configure the `tuned` service.
+    * `edpm_tuned_profile` is set to `cpu-partitioning-powersave` to enable CPU isolation features.
+    * `edpm_tuned_isolated_cores` specifies the cores to be isolated. For CPU isolation, we strongly recommend the Tuned approach over the `isolcpus` kernel argument.
+
+* **VFIO-PCI Binding Service**: The `vfio-pci-bind` service in `va/nvidia-vfio-passthrough/edpm/nodeset/nova_gpu.yaml` blacklists the `nouveau` and `nvidia` kernel modules so that they do not interfere with the `vfio-pci` driver. The service also regenerates the initramfs and the GRUB configuration to apply these changes. A reboot is required for the changes to take effect.
+
 ### Nodes

 | Role | Machine Type | Count |
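
As a rough illustration of the host parameters this README section describes, the fragment below shows how they might be written as variables. Only the kernel arguments and the tuned profile name come from the README text. Where these keys sit inside `values.yaml` is not visible in this diff and is left out here, and the isolated core range is a hypothetical placeholder.

```yaml
# Sketch only: the placement of these keys within values.yaml is an assumption.
edpm_kernel_args: "intel_iommu=on iommu=pt vfio-pci.ids=10de:20f1 rd.driver.pre=vfio-pci"
edpm_tuned_profile: "cpu-partitioning-powersave"
# Hypothetical core range; choose the cores to isolate for your own hardware.
edpm_tuned_isolated_cores: "4-23"
```

If a host carries more than one GPU model, `vfio-pci.ids` accepts a comma-separated list of `vendor:device` pairs.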

examples/va/nvidia-vfio-passthrough/data-plane-pre.md

Lines changed: 6 additions & 1 deletion
@@ -14,7 +14,12 @@ Change to the nvidia-vfio-passthrough directory
 ```
 cd architecture/examples/va/nvidia-vfio-passthrough
 ```
-Edit the [edpm/nodeset/values.yaml](edpm/nodeset/values.yaml) and [edpm/deployment/values.yaml](edpm/deployment/values.yaml) files to suit your environment.
+Edit the `edpm/nodeset/values.yaml` and `edpm/deployment/values.yaml` files to suit your environment.
+
+In `edpm/nodeset/values.yaml`, pay special attention to the `baremetalhosts` section. You will need to provide details for each of your baremetal compute nodes, including:
+- `bmc.address`: The IP address of the Baseboard Management Controller (BMC).
+- `bootMACAddress`: The MAC address of the network interface that the node will use to PXE boot.
+- Other parameters as described in the main [README.md](README.md).
 ```
 vi edpm/nodeset/values.yaml
 vi edpm/deployment/values.yaml
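
To make the `CHANGEME` fields more concrete, here is a minimal sketch of one filled-in host entry. Every value is hypothetical. The text above describes `bmc.address` as an IP address; Metal3 also accepts a protocol-specific URL, and the Redfish form shown here is one such option, chosen only for illustration.

```yaml
# Hypothetical values for illustration; replace them with your own.
baremetalhosts:
  edpm-compute-0:
    bmc:
      # Redfish-style BMC URL; the host and system ID are made up.
      address: redfish://192.0.2.10/redfish/v1/Systems/1
    # Quoted so that a MAC made only of digits is always read as a string.
    bootMACAddress: "52:54:00:00:00:01"
```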

examples/va/nvidia-vfio-passthrough/edpm/nodeset/values.yaml

Lines changed: 48 additions & 0 deletions
@@ -10,6 +10,54 @@ metadata:
 data:
   root_password: cmVkaGF0Cg==
   preProvisioned: false
+  metal3_inspection: disabled
+  baremetalhosts:
+    edpm-compute-0:
+      labels:
+        nodeName: edpm-compute-0
+      bmc:
+        address: CHANGEME
+      bootMACAddress: CHANGEME
+      rootDeviceHints:
+        deviceName: /dev/vda
+      preprovisioningNetworkData: |
+        networkData:
+          links:
+            - id: provisioning
+              type: phy
+              name: CHANGEME
+          networks:
+            - id: provisioning
+              type: ipv4
+              link: provisioning
+              ip_address: 172.22.0.100 # CHANGEME
+              netmask: 255.255.255.0
+              routes:
+                - destination: 0.0.0.0/0
+                  next_hop: 172.22.0.1 # CHANGEME
+    edpm-compute-1:
+      labels:
+        nodeName: edpm-compute-1
+      bmc:
+        address: CHANGEME
+      bootMACAddress: CHANGEME
+      rootDeviceHints:
+        deviceName: /dev/vda
+      preprovisioningNetworkData: |
+        networkData:
+          links:
+            - id: provisioning
+              type: phy
+              name: CHANGEME
+          networks:
+            - id: provisioning
+              type: ipv4
+              link: provisioning
+              ip_address: 172.22.0.101 # CHANGEME
+              netmask: 255.255.255.0
+              routes:
+                - destination: 0.0.0.0/0
+                  next_hop: 172.22.0.1 # CHANGEME
   baremetalSetTemplate:
     ctlplaneInterface: eno2 # CHANGEME
     cloudUserName: cloud-admin
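
The example sets `rootDeviceHints.deviceName` to `/dev/vda`, the name Linux gives to virtio disks, which physical servers are unlikely to expose. As a hedged sketch, other hint fields from the Metal3 BareMetalHost API can be used to select the root disk instead. The values below are hypothetical.

```yaml
# Hypothetical hints for a physical disk; use the ones that match your hardware.
rootDeviceHints:
  wwn: "0x5000c500a1b2c3d4"   # made-up World Wide Name
  minSizeGigabytes: 200       # made-up minimum size
```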
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: edpm-compute-0-preprovision-network-data
+  namespace: openstack
+type: Opaque
+stringData: {}
+---
+apiVersion: metal3.io/v1alpha1
+kind: BareMetalHost
+metadata:
+  labels: {}
+  name: edpm-compute-0
+  namespace: openstack
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: edpm-compute-1-preprovision-network-data
+  namespace: openstack
+type: Opaque
+stringData: {}
+---
+apiVersion: metal3.io/v1alpha1
+kind: BareMetalHost
+metadata:
+  labels: {}
+  name: edpm-compute-1
+  namespace: openstack
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+---
+apiVersion: metal3.io/v1alpha1
+kind: BareMetalHost
+metadata:
+  labels: {}
+  name: _ignored_
+  namespace: openstack
+  annotations:
+    inspect.metal3.io: _replaced_
+spec:
+  architecture: x86_64
+  automatedCleaningMode: metadata
+  bmc:
+    address: _replaced_
+    credentialsName: bmc-secret
+  bootMACAddress: _replaced_
+  bootMode: UEFI
+  rootDeviceHints: {}
+  online: false
+  preprovisioningNetworkDataName: _replaced_
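
The template points at a credentials Secret named `bmc-secret`, which this diff does not define. It may already exist elsewhere in the repository or be expected from the operator. Assuming it has to be created by hand, a minimal sketch with the `username` and `password` keys that Metal3 reads could look like this:

```yaml
# Sketch only: assumes bmc-secret is not provided elsewhere in the deployment.
apiVersion: v1
kind: Secret
metadata:
  name: bmc-secret
  namespace: openstack
type: Opaque
stringData:
  username: CHANGEME
  password: CHANGEME
```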

va/nvidia-vfio-passthrough/edpm/nodeset/kustomization.yaml

Lines changed: 154 additions & 0 deletions
@@ -23,6 +23,26 @@ components:
 resources:
   - baremetalset-password-secret.yaml
   - nova_gpu.yaml
+  - baremetalhost.yaml
+
+patches:
+  - target:
+      kind: BareMetalHost
+    path: baremetalhost_template.yaml
+  - target:
+      kind: BareMetalHost
+      name: edpm-compute-0
+    patch: |
+      - op: replace
+        path: /spec/preprovisioningNetworkDataName
+        value: edpm-compute-0-preprovision-network-data
+  - target:
+      kind: BareMetalHost
+      name: edpm-compute-1
+    patch: |
+      - op: replace
+        path: /spec/preprovisioningNetworkDataName
+        value: edpm-compute-1-preprovision-network-data

 replacements:
   - source:
@@ -88,3 +108,137 @@ replacements:
           - spec.baremetalSetTemplate
         options:
           create: true
+  # BareMetalHost
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.metal3_inspection
+    targets:
+      - select:
+          kind: BareMetalHost
+        fieldPaths:
+          - metadata.annotations.inspect\.metal3\.io
+        options:
+          create: true
+  # edpm-compute-0
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-0.labels
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-0
+        fieldPaths:
+          - metadata.labels
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-0.bmc.address
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-0
+        fieldPaths:
+          - spec.bmc.address
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-0.bootMACAddress
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-0
+        fieldPaths:
+          - spec.bootMACAddress
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-0.rootDeviceHints
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-0
+        fieldPaths:
+          - spec.rootDeviceHints
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-0.preprovisioningNetworkData
+    targets:
+      - select:
+          kind: Secret
+          name: edpm-compute-0-preprovision-network-data
+        fieldPaths:
+          - stringData
+        options:
+          create: true
+  # edpm-compute-1
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-1.labels
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-1
+        fieldPaths:
+          - metadata.labels
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-1.bmc.address
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-1
+        fieldPaths:
+          - spec.bmc.address
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-1.bootMACAddress
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-1
+        fieldPaths:
+          - spec.bootMACAddress
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-1.rootDeviceHints
+    targets:
+      - select:
+          kind: BareMetalHost
+          name: edpm-compute-1
+        fieldPaths:
+          - spec.rootDeviceHints
+        options:
+          create: true
+  - source:
+      kind: ConfigMap
+      name: edpm-nodeset-values
+      fieldPath: data.baremetalhosts.edpm-compute-1.preprovisioningNetworkData
+    targets:
+      - select:
+          kind: Secret
+          name: edpm-compute-1-preprovision-network-data
+        fieldPaths:
+          - stringData
+        options:
+          create: true
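
To show how the template patch, the per-host JSON patch and the replacements are meant to combine, the manifest below sketches the `BareMetalHost` one would expect for `edpm-compute-0` with the unedited example values. It was derived by hand from the files in this commit, not captured from a `kustomize build` run, and other components in the kustomization may add or change fields.

```yaml
# Hand-derived expectation, not real build output.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: edpm-compute-0
  namespace: openstack
  annotations:
    inspect.metal3.io: disabled        # from data.metal3_inspection
  labels:
    nodeName: edpm-compute-0           # from data.baremetalhosts.edpm-compute-0.labels
spec:
  architecture: x86_64
  automatedCleaningMode: metadata
  bmc:
    address: CHANGEME                  # from ...edpm-compute-0.bmc.address
    credentialsName: bmc-secret
  bootMACAddress: CHANGEME             # from ...edpm-compute-0.bootMACAddress
  bootMode: UEFI
  rootDeviceHints:
    deviceName: /dev/vda               # from ...edpm-compute-0.rootDeviceHints
  online: false
  preprovisioningNetworkDataName: edpm-compute-0-preprovision-network-data
```

Because every replacement is keyed by host name, adding a third host would mean repeating the resource entries, the JSON patch and the five replacement blocks for that host.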
