@@ -174,6 +174,120 @@ For example, to take |hypervisor_hostname| out of maintenance:
174174 seed# sudo docker exec -it bifrost_deploy /bin/bash
175175 (bifrost-deploy)[root@seed bifrost-base]# OS_CLOUD=bifrost openstack baremetal node maintenance unset |hypervisor_hostname|
176176
177+ Detect hardware differences with cardiff
178+ ----------------------------------------
179+
180+ Hardware information captured during the Ironic introspection process can be
181+ analysed to detect hardware differences, such as mismatches in firmware
182+ versions or missing storage devices. The cardiff tool can be used for this
183+ purpose. It was developed as part of the `Python hardware package
184+ <https://pypi.org/project/hardware/>`__, but was removed in release 0.25. The
185+ `mungetout utility <https://github.com/stackhpc/mungetout/>`__ can be used to
186+ convert Ironic introspection data into a format that can be fed to cardiff.
187+
188+ The following steps are used to install cardiff and mungetout:
189+
190+ .. code-block:: console
191+    :substitutions:
192+
193+    kayobe# virtualenv |base_path|/venvs/cardiff
194+    kayobe# source |base_path|/venvs/cardiff/bin/activate
195+    kayobe# pip install -U pip
196+    kayobe# pip install git+https://github.com/stackhpc/mungetout.git@feature/kayobe-introspection-save
197+    kayobe# pip install 'hardware==0.24'
198+
199+ Extract introspection data from Bifrost with Kayobe. JSON files are created
200+ in ``${KAYOBE_CONFIG_PATH}/overcloud-introspection-data``:
201+
202+ .. code-block:: console
203+    :substitutions:
204+
205+    kayobe# source |base_path|/venvs/kayobe/bin/activate
206+    kayobe# source |base_path|/src/kayobe-config/kayobe-env
207+    kayobe# kayobe overcloud introspection data save
208+
209+ The cardiff utility only works if the ``extra-hardware`` collector was used,
210+ which populates a ``data`` key in each node JSON file. Remove any files that
211+ are missing this key:
212+
213+ .. code-block:: console
214+    :substitutions:
215+
216+    kayobe# for file in |base_path|/src/kayobe-config/overcloud-introspection-data/*; do if [[ $(jq .data "$file") == 'null' ]]; then rm "$file"; fi; done
217+
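Before deleting anything, it can be useful to list the files that would be removed. The following is a minimal sketch of the same check in Python, assuming the files end in ``.json`` and each contains a single JSON object; the function name ``nodes_missing_extra_hardware`` is hypothetical and is not part of Kayobe, mungetout or cardiff.

```python
import json
from pathlib import Path


def nodes_missing_extra_hardware(directory):
    """Return introspection files whose top-level "data" key is absent or null.

    Hypothetical helper: it mirrors the jq test above but only reports
    the files, it does not delete them.
    """
    missing = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open() as f:
            node = json.load(f)
        # jq prints 'null' both for a missing key and for an explicit null.
        if node.get("data") is None:
            missing.append(path)
    return missing
```

Calling the function on the ``overcloud-introspection-data`` directory and printing the result gives a dry run of the removal step.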
218+ Cardiff identifies each unique system by its serial number. However, some
219+ high-density multi-node systems may report the same serial number for multiple
220+ systems (this has been seen on Supermicro hardware). The following script
221+ replaces the serial number used by cardiff with the node name captured by LLDP
222+ on the first network interface. If this node name is missing, it appends a
223+ short UUID string to the end of the serial number.
224+
225+ .. code-block:: python
226+
227+    import json
228+    import sys
229+    import uuid
230+
231+    with open(sys.argv[1], "r+") as f:
232+        node = json.loads(f.read())
233+
234+        serial = node["inventory"]["system_vendor"]["serial_number"]
235+        try:
236+            new_serial = node["all_interfaces"]["eth0"]["lldp_processed"]["switch_port_description"]
237+        except KeyError:
238+            new_serial = serial + "-" + str(uuid.uuid4())[:8]
239+
240+        new_data = []
241+        for e in node["data"]:
242+            if e[0] == "system" and e[1] == "product" and e[2] == "serial":
243+                new_data.append(["system", "product", "serial", new_serial])
244+            else:
245+                new_data.append(e)
246+        node["data"] = new_data
247+
248+        f.seek(0)
249+        f.write(json.dumps(node))
250+        f.truncate()
251+
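To check the rewrite logic without touching any files, the same transformation can be expressed as a function over an in-memory dictionary. This is only a sketch: the function name ``replace_serial``, its ``interface`` parameter and the sample node are made up for illustration, and real introspection data contains many more keys.

```python
import uuid


def replace_serial(node, interface="eth0"):
    """Return node["data"] with the system serial entry rewritten.

    Hypothetical helper that follows the same rules as the script above.
    """
    serial = node["inventory"]["system_vendor"]["serial_number"]
    try:
        new_serial = node["all_interfaces"][interface]["lldp_processed"]["switch_port_description"]
    except KeyError:
        # No LLDP name available: make the serial unique with a short UUID.
        new_serial = serial + "-" + str(uuid.uuid4())[:8]
    return [
        ["system", "product", "serial", new_serial]
        if entry[:3] == ["system", "product", "serial"]
        else entry
        for entry in node["data"]
    ]


# Made-up sample node, for illustration only.
sample = {
    "inventory": {"system_vendor": {"serial_number": "S123"}},
    "all_interfaces": {
        "eth0": {"lldp_processed": {"switch_port_description": "node-01"}}
    },
    "data": [
        ["system", "product", "serial", "S123"],
        ["cpu", "physical", "number", "2"],
    ],
}
print(replace_serial(sample))
```

With the sample above, the serial entry takes the LLDP value and every other entry is left unchanged. Removing ``eth0`` from ``all_interfaces`` exercises the fallback branch instead.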
252+ Save the script as ``update-serial.py`` and apply it to all generated JSON files:
253+
254+ .. code-block:: console
255+    :substitutions:
256+
257+    kayobe# for file in |base_path|/src/kayobe-config/overcloud-introspection-data/*; do python update-serial.py "$file"; done
258+
259+ Convert files into the format supported by cardiff:
260+
261+ .. code-block:: console
262+    :substitutions:
263+
264+    kayobe# source |base_path|/venvs/cardiff/bin/activate
265+    kayobe# mkdir -p |base_path|/cardiff-workspace
266+    kayobe# rm -rf |base_path|/cardiff-workspace/extra*
267+    kayobe# cd |base_path|/cardiff-workspace/
268+    kayobe# m2-extract |base_path|/src/kayobe-config/overcloud-introspection-data/*.json
269+
270+ .. note::
271+
272+    The ``m2-extract`` utility needs to work in an empty folder. Delete the
273+    ``extra-hardware``, ``extra-hardware-filtered`` and ``extra-hardware-json``
274+    folders before executing it again.
275+
276+ We are now ready to compare node hardware. The following command compares
277+ all known nodes, which may include multiple generations of hardware. Replace
278+ ``*.eval`` with a stricter globbing expression or with a list of files to
279+ compare a smaller group.
280+
281+ .. code-block:: console
282+
283+    kayobe# hardware-cardiff -I ipmi -p 'extra-hardware/*.eval'
284+
285+ Since the output can be verbose, it is recommended to pipe it to a terminal
286+ pager or redirect it to a file. Cardiff will display groups of identical nodes
287+ based on various hardware characteristics, such as system model, BIOS version,
288+ CPU or network interface information, or benchmark results gathered by the
289+ ``extra-hardware`` collector during the initial introspection process.
290+
177291 .. ifconfig:: deployment['ceph_managed']
178292
179293    .. include:: hardware_inventory_management_ceph.rst