The nvme-cli output parsing in nvme_metrics.py was changed several months ago, and it no longer appears to be compatible with nvme-cli v2.8, which is the current version on Ubuntu 24.04 and derivatives.
Running nvme_metrics.py as-is gives:
ERROR: Expecting value: line 1 column 1 (char 0)
Commenting out the try/except around the call to main() gives more output:
Traceback (most recent call last):
File "/root/./nvme_metrics.py", line 246, in <module>
main()
File "/root/./nvme_metrics.py", line 190, in main
smart_log = exec_nvme_json("smart-log", os.path.join("/dev", device_name))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/./nvme_metrics.py", line 153, in exec_nvme_json
return json.loads(output)
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
It looks like adding the --verbose flag to nvme-cli v2.8 produces output that is not entirely valid JSON. This is what I get from nvme without --verbose:
# nvme smart-log /dev/nvme0 --output-format json
{
"critical_warning":0,
"temperature":307,
"avail_spare":100,
"spare_thresh":5,
"percent_used":1,
"endurance_grp_critical_warning_summary":0,
"data_units_read":9252303,
"data_units_written":16773859,
"host_read_commands":58365724,
"host_write_commands":243774474,
"controller_busy_time":679,
"power_cycles":38,
"power_on_hours":14220,
"unsafe_shutdowns":11,
"media_errors":0,
"num_err_log_entries":0,
"warning_temp_time":0,
"critical_comp_time":0,
"temperature_sensor_1":307,
"thm_temp1_trans_count":0,
"thm_temp2_trans_count":0,
"thm_temp1_total_time":0,
"thm_temp2_total_time":0
}
And with --verbose:
# nvme smart-log /dev/nvme0 --output-format json --verbose
opcode : 02
flags : 00
rsvd1 : 0000
nsid : ffffffff
cdw2 : 00000000
cdw3 : 00000000
data_len : 00000200
metadata_len : 00000000
addr : 61bd5e8b7000
metadata : 0
cdw10 : 007f0002
cdw11 : 00000000
cdw12 : 00000000
cdw13 : 00000000
cdw14 : 00000000
cdw15 : 00000000
timeout_ms : 00000000
result : 00000000
err : 0
latency : 11778 us
{
"critical_warning":0,
"temperature":307,
"avail_spare":100,
"spare_thresh":5,
"percent_used":1,
"endurance_grp_critical_warning_summary":0,
"data_units_read":9252303,
"data_units_written":16773860,
"host_read_commands":58365724,
"host_write_commands":243774547,
"controller_busy_time":679,
"power_cycles":38,
"power_on_hours":14220,
"unsafe_shutdowns":11,
"media_errors":0,
"num_err_log_entries":0,
"warning_temp_time":0,
"critical_comp_time":0,
"temperature_sensor_1":307,
"thm_temp1_trans_count":0,
"thm_temp2_trans_count":0,
"thm_temp1_total_time":0,
"thm_temp2_total_time":0
}
The lines before the opening brace are not JSON, so json.loads() fails on the very first character. Unfortunately, simply removing --verbose does not fix it either, as that throws a different error:
Traceback (most recent call last):
File "/root/./nvme_metrics.py", line 246, in <module>
main()
File "/root/./nvme_metrics.py", line 167, in main
for subsys in device["Subsystems"]:
~~~~~~^^^^^^^^^^^^^^
KeyError: 'Subsystems'
I'm not sure what the current version of nvme_metrics.py is looking for from nvme-cli, but the previous version of nvme_metrics.py worked without issue with v2.8 of nvme-cli.
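For what it's worth, here is a minimal sketch of one possible workaround for the smart-log failure: skip whatever text precedes the JSON document before decoding it. The helper name parse_json_with_preamble is hypothetical and not part of nvme_metrics.py, and this only addresses the non-JSON preamble seen with --verbose, not the 'Subsystems' KeyError.

```python
import json


def parse_json_with_preamble(output: str):
    """Return the first JSON object or array found in output.

    Hypothetical helper (not from nvme_metrics.py): it ignores any
    non-JSON text, such as the verbose command dump, that appears
    before the JSON document.
    """
    decoder = json.JSONDecoder()
    for start, char in enumerate(output):
        # Only attempt a decode where a JSON object or array could begin.
        if char not in "{[":
            continue
        try:
            # raw_decode parses one document starting at index `start`
            # and tolerates trailing text after it.
            value, _end = decoder.raw_decode(output, start)
        except json.JSONDecodeError:
            continue
        return value
    raise ValueError("no JSON document found in output")


if __name__ == "__main__":
    # Abbreviated sample modelled on the verbose output quoted above.
    sample = 'opcode : 02\nlatency : 11778 us\n{\n"critical_warning":0,\n"temperature":307\n}\n'
    print(parse_json_with_preamble(sample))
```

Something like this could stand in for the plain json.loads(output) call in exec_nvme_json, assuming the preamble never contains a complete JSON document of its own.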