@@ -174,6 +174,120 @@ For example, to take |hypervisor_hostname| out of maintenance:
     seed# sudo docker exec -it bifrost_deploy /bin/bash
     (bifrost-deploy)[root@seed bifrost-base]# OS_CLOUD=bifrost openstack baremetal node maintenance unset |hypervisor_hostname|

+ Detect hardware differences with cardiff
+ ----------------------------------------
+
+ Hardware information captured during the Ironic introspection process can be
+ analysed to detect hardware differences, such as mismatches in firmware
+ versions or missing storage devices. The cardiff tool can be used for this
+ purpose. It was developed as part of the `Python hardware package
+ <https://pypi.org/project/hardware/>`__, but was removed in release 0.25. The
+ `mungetout utility <https://github.com/stackhpc/mungetout/>`__ can be used to
+ convert Ironic introspection data into a format that can be fed to cardiff.
+
+ Install cardiff and mungetout in a dedicated virtual environment. The
+ ``hardware`` package is pinned to 0.24 because cardiff was removed in 0.25:
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# virtualenv |base_path|/venvs/cardiff
+    kayobe# source |base_path|/venvs/cardiff/bin/activate
+    kayobe# pip install -U pip
+    kayobe# pip install git+https://github.com/stackhpc/mungetout.git@feature/kayobe-introspection-save
+    kayobe# pip install 'hardware==0.24'
+
+ Extract introspection data from Bifrost with Kayobe. JSON files will be created
+ in ``${KAYOBE_CONFIG_PATH}/overcloud-introspection-data``:
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# source |base_path|/venvs/kayobe/bin/activate
+    kayobe# source |base_path|/src/kayobe-config/kayobe-env
+    kayobe# kayobe overcloud introspection data save
+
+ The cardiff utility can only work if the ``extra-hardware`` collector was used,
+ which populates a ``data`` key in each node JSON file. Remove any files that
+ are missing this key:
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# for file in |base_path|/src/kayobe-config/overcloud-introspection-data/*; do if [[ $(jq .data "$file") == 'null' ]]; then rm "$file"; fi; done
+
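The same filter can be sketched in Python, for example where ``jq`` is not installed. This is a minimal sketch and not part of Kayobe or mungetout: the function names ``has_extra_hardware`` and ``prune_nodes`` are hypothetical, and it assumes that each file has a ``.json`` extension and holds a single JSON object.

```python
import json
from pathlib import Path


def has_extra_hardware(node: dict) -> bool:
    """Return True if the node JSON carries a non-null ``data`` key."""
    # Mirrors the jq test above: a missing key and an explicit null both fail.
    return node.get("data") is not None


def prune_nodes(directory: str) -> list:
    """Delete JSON files lacking ``data``; return the names of removed files."""
    removed = []
    # Assumption: introspection files are named *.json.
    for path in sorted(Path(directory).glob("*.json")):
        with path.open() as f:
            node = json.load(f)
        if not has_extra_hardware(node):
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling ``prune_nodes`` with the path to ``overcloud-introspection-data`` returns the list of deleted file names, which makes it easy to see which nodes were introspected without the ``extra-hardware`` collector.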
+ Cardiff identifies each unique system by its serial number. However, some
+ high-density multi-node systems may report the same serial number for multiple
+ systems (this has been seen on Supermicro hardware). The following script
+ replaces the serial number used by cardiff with the node name captured by LLDP
+ on the first network interface. If this node name is missing, it instead
+ appends a short UUID string to the end of the serial number.
+
+ .. code-block:: python
+
+    import json
+    import sys
+    import uuid
+
+    # Usage: python update-serial.py <node introspection JSON file>
+    with open(sys.argv[1], "r+") as f:
+        node = json.load(f)
+
+        serial = node["inventory"]["system_vendor"]["serial_number"]
+        try:
+            # Prefer the node name advertised over LLDP on the first interface.
+            new_serial = node["all_interfaces"]["eth0"]["lldp_processed"]["switch_port_description"]
+        except KeyError:
+            # Fall back to the original serial plus a short random suffix.
+            new_serial = serial + "-" + str(uuid.uuid4())[:8]
+
+        # Rewrite the serial entry in the extra-hardware data.
+        new_data = []
+        for e in node["data"]:
+            if e[0] == "system" and e[1] == "product" and e[2] == "serial":
+                new_data.append(["system", "product", "serial", new_serial])
+            else:
+                new_data.append(e)
+        node["data"] = new_data
+
+        # Overwrite the file in place with the updated JSON.
+        f.seek(0)
+        f.write(json.dumps(node))
+        f.truncate()
+
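To make the effect of the script concrete, the sketch below applies the same substitution to an in-memory structure. The helper name ``replace_serial`` and the sample values are hypothetical; the four-element ``[category, subcategory, key, value]`` layout is inferred from the script above.

```python
def replace_serial(data: list, new_serial: str) -> list:
    """Return a copy of ``data`` with the system product serial replaced."""
    return [
        ["system", "product", "serial", new_serial]
        if entry[:3] == ["system", "product", "serial"]
        else entry
        for entry in data
    ]


# Illustrative sample only; real files contain many more entries.
sample = [
    ["system", "product", "serial", "S123"],
    ["system", "product", "name", "ExampleServer"],
]
updated = replace_serial(sample, "node-01")
print(updated[0])
```

Only the serial entry changes; every other entry is passed through untouched, and the input list is left unmodified.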
+ Save the script as ``update-serial.py`` and apply it to all generated JSON
+ files:
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# for file in |base_path|/src/kayobe-config/overcloud-introspection-data/*; do python update-serial.py "$file"; done
+
+ Convert files into the format supported by cardiff:
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# source |base_path|/venvs/cardiff/bin/activate
+    kayobe# mkdir -p |base_path|/cardiff-workspace
+    kayobe# rm -rf |base_path|/cardiff-workspace/extra*
+    kayobe# cd |base_path|/cardiff-workspace/
+    kayobe# m2-extract |base_path|/src/kayobe-config/overcloud-introspection-data/*.json
+
+ .. note::
+
+    The ``m2-extract`` utility needs to work in an empty folder. Delete the
+    ``extra-hardware``, ``extra-hardware-filtered`` and ``extra-hardware-json``
+    folders before executing it again.
+
+ We are now ready to compare node hardware. The following command will compare
+ all known nodes, which may include multiple generations of hardware. To compare
+ a smaller group, replace ``*.eval`` with a stricter globbing expression or with
+ a list of files.
+
+ .. code-block:: console
+
+    kayobe# hardware-cardiff -I ipmi -p 'extra-hardware/*.eval'
+
+ Since the output can be verbose, it is recommended to pipe it to a terminal
+ pager or redirect it to a file. Cardiff will display groups of identical nodes
+ based on various hardware characteristics, such as system model, BIOS version,
+ CPU or network interface information, or benchmark results gathered by the
+ ``extra-hardware`` collector during the initial introspection process.
+
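The grouping idea can be illustrated with a short sketch. This is not cardiff's implementation: it only groups nodes whose values agree for a chosen set of keys. The function name ``group_by_hardware`` and the sample data are hypothetical, and the ``[category, subcategory, key, value]`` layout is assumed from the serial number script earlier in this section.

```python
from collections import defaultdict


def group_by_hardware(nodes: dict, keys: list) -> dict:
    """Group node names by the values they report for ``keys``.

    ``nodes`` maps a node name to its ``data`` list; ``keys`` is a list of
    (category, subcategory, key) tuples to compare.
    """
    groups = defaultdict(list)
    for name, data in nodes.items():
        # Index each entry by its first three fields for direct lookup.
        lookup = {tuple(e[:3]): e[3] for e in data}
        # Nodes with the same fingerprint are considered identical.
        fingerprint = tuple(lookup.get(k) for k in keys)
        groups[fingerprint].append(name)
    return dict(groups)


# Illustrative sample only.
nodes = {
    "node-01": [["firmware", "bios", "version", "2.1"]],
    "node-02": [["firmware", "bios", "version", "2.1"]],
    "node-03": [["firmware", "bios", "version", "1.9"]],
}
result = group_by_hardware(nodes, [("firmware", "bios", "version")])
for values, names in result.items():
    print(values, names)
```

A node that lands in a group of its own, such as ``node-03`` in this sample, is the kind of outlier worth investigating for a firmware mismatch or missing device.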
  .. ifconfig:: deployment['ceph_managed']

     .. include:: hardware_inventory_management_ceph.rst