Latest version of nvme_metrics.py breaks on Ubuntu 24.04 #248

@suicidaleggroll

Description

The nvme-cli output parsing in nvme_metrics.py was changed several months ago, and it no longer appears to be compatible with nvme-cli v2.8, which is the current version on Ubuntu 24.04 and derivatives.

Running nvme_metrics.py as-is gives:
ERROR: Expecting value: line 1 column 1 (char 0)

Commenting out the try/except around the call to main() gives more output:

Traceback (most recent call last):
  File "/root/./nvme_metrics.py", line 246, in <module>
    main()
  File "/root/./nvme_metrics.py", line 190, in main
    smart_log = exec_nvme_json("smart-log", os.path.join("/dev", device_name))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/./nvme_metrics.py", line 153, in exec_nvme_json
    return json.loads(output)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

It looks like nvme-cli v2.8 prints non-JSON text ahead of the JSON when the --verbose flag is added, so the output as a whole is not valid JSON. This is what I get from nvme without --verbose:

# nvme smart-log /dev/nvme0 --output-format json
{
  "critical_warning":0,
  "temperature":307,
  "avail_spare":100,
  "spare_thresh":5,
  "percent_used":1,
  "endurance_grp_critical_warning_summary":0,
  "data_units_read":9252303,
  "data_units_written":16773859,
  "host_read_commands":58365724,
  "host_write_commands":243774474,
  "controller_busy_time":679,
  "power_cycles":38,
  "power_on_hours":14220,
  "unsafe_shutdowns":11,
  "media_errors":0,
  "num_err_log_entries":0,
  "warning_temp_time":0,
  "critical_comp_time":0,
  "temperature_sensor_1":307,
  "thm_temp1_trans_count":0,
  "thm_temp2_trans_count":0,
  "thm_temp1_total_time":0,
  "thm_temp2_total_time":0
}

And with --verbose:

# nvme smart-log /dev/nvme0 --output-format json --verbose
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : ffffffff
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000200
metadata_len : 00000000
addr         : 61bd5e8b7000
metadata     : 0
cdw10        : 007f0002
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 11778 us
{
  "critical_warning":0,
  "temperature":307,
  "avail_spare":100,
  "spare_thresh":5,
  "percent_used":1,
  "endurance_grp_critical_warning_summary":0,
  "data_units_read":9252303,
  "data_units_written":16773860,
  "host_read_commands":58365724,
  "host_write_commands":243774547,
  "controller_busy_time":679,
  "power_cycles":38,
  "power_on_hours":14220,
  "unsafe_shutdowns":11,
  "media_errors":0,
  "num_err_log_entries":0,
  "warning_temp_time":0,
  "critical_comp_time":0,
  "temperature_sensor_1":307,
  "thm_temp1_trans_count":0,
  "thm_temp2_trans_count":0,
  "thm_temp1_total_time":0,
  "thm_temp2_total_time":0
}

The 20 plain-text lines before the opening brace are what make the JSON parsing barf (hence "line 1 column 1"). Unfortunately it's not as easy as just removing --verbose either, as that throws a different error:

Traceback (most recent call last):
  File "/root/./nvme_metrics.py", line 246, in <module>
    main()
  File "/root/./nvme_metrics.py", line 167, in main
    for subsys in device["Subsystems"]:
                  ~~~~~~^^^^^^^^^^^^^^
KeyError: 'Subsystems'
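
To see what the non-verbose output actually contains at that point, the bare `device["Subsystems"]` lookup could be swapped for something that reports which keys are present. This is only a debugging sketch: `require_key` is a hypothetical helper, not something in nvme_metrics.py, and it makes no assumption about what nvme-cli returns beyond it being a parsed JSON object.

```python
def require_key(mapping, key):
    """Return mapping[key], or raise a KeyError that lists the keys actually present.

    Hypothetical debugging helper; not part of nvme_metrics.py.
    """
    try:
        return mapping[key]
    except KeyError:
        # JSON object keys are always strings, so sorting them is safe.
        present = ", ".join(sorted(mapping)) or "(none)"
        raise KeyError(f"{key!r} missing; keys present: {present}") from None
```

Used as `for subsys in require_key(device, "Subsystems"):`, the error would then show which fields nvme-cli v2.8 returns without --verbose.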

I'm not sure what the current version of nvme_metrics.py is looking for from nvme-cli, but the previous version of nvme_metrics.py worked without issue with v2.8 of nvme-cli.
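
One possible workaround, assuming --verbose has to stay, is to skip everything before the JSON document before decoding. The sketch below is not the script's actual code: `parse_json_after_preamble` is a hypothetical name, and it assumes the JSON starts on a line beginning with `{` or `[`, as in the v2.8 output above.

```python
import json


def parse_json_after_preamble(output):
    """Decode the first JSON value in `output`, ignoring plain text before it.

    Hypothetical helper; assumes the JSON value starts at the beginning of a
    line (optionally indented) with "{" or "[".
    """
    decoder = json.JSONDecoder()
    pos = 0  # offset of the current line within `output`
    for line in output.splitlines(keepends=True):
        stripped = line.lstrip()
        if stripped.startswith(("{", "[")):
            # raw_decode needs the index of the first character of the value.
            start = pos + len(line) - len(stripped)
            try:
                obj, _end = decoder.raw_decode(output, start)
                return obj
            except json.JSONDecodeError:
                pass  # not valid JSON from here; keep scanning
        pos += len(line)
    raise ValueError("no JSON value found in output")
```

In `exec_nvme_json`, this would stand in for the `json.loads(output)` call. Output that is already pure JSON goes through unchanged, so it should behave the same on nvme-cli versions that do not print the command dump.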
