pod5 inspect reads failures #175

@granek

I am having problems with pod5 inspect reads (although what I ultimately want to do is run pod5 subset).

Problem Part 1

When I run pod5 inspect reads with version 0.3.33 on my POD5 files, I get the following error message:

POD5 has encountered an error: ''pyarrow.lib.ExtensionScalar' object has no attribute 'as_buffer''

For detailed information set POD5_DEBUG=1'

If I set POD5_DEBUG=1, it crashes with the following message:

Traceback (most recent call last):
  File "/usr/local/bin/pod5", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/main.py", line 61, in main
    return run_tool(parser)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 41, in run_tool
    raise exc
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 38, in run_tool
    return tool_func(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 318, in run
    return inspect_pod5(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/pod5_inspect.py", line 221, in inspect_pod5
    commands[command](**kwargs)
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/pod5_inspect.py", line 64, in do_reads_command
    "byte_count": read.byte_count,
                  ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/reader.py", line 323, in byte_count
    return sum(r.byte_count for r in self.signal_rows)
                                     ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/reader.py", line 423, in signal_rows
    return [map_signal_row(r) for r in self._batch.columns.signal[self._row]]
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/reader.py", line 415, in map_signal_row
    batch_length = len(batch.signal[batch_row_index].as_buffer())
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'pyarrow.lib.ExtensionScalar' object has no attribute 'as_buffer'

I get the same problem with versions 0.3.27 and 0.3.23.

The command works normally with version 0.3.15 on most of my POD5 files.
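
Because the traceback ends inside a pyarrow object, it may help to record which pyarrow version each container ships alongside pod5. The following is only a diagnostic sketch: whether the pyarrow version is actually involved is an assumption on my part, and the distribution names `lib_pod5` and `pyarrow` are guesses at what is installed in the image.

```python
from importlib import metadata


def installed_versions(packages=("pod5", "lib_pod5", "pyarrow")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not present in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Saved under a hypothetical name such as check_versions.py, it could be run with the Python interpreter inside each container so the outputs for 0.3.15 and 0.3.33 can be compared side by side.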

Problem Part 2

However, on one of my POD5 files, version 0.3.15 fails with the following error after successfully processing over 6000 reads (a few other POD5 files give the same error, but after a different number of reads):

POD5 has encountered an error: ''ANALYSIS_CONFIG_CHANGE''

For detailed information set POD5_DEBUG=1'

If I set POD5_DEBUG=1, it crashes with the following error (at the same read):

Traceback (most recent call last):
  File "/usr/local/bin/pod5", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/main.py", line 60, in main
    return run_tool(parser)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 41, in run_tool
    raise exc
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 38, in run_tool
    return tool_func(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/parsers.py", line 318, in run
    return inspect_pod5(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/pod5_inspect.py", line 218, in inspect_pod5
    commands[command](**kwargs)
  File "/usr/local/lib/python3.12/site-packages/pod5/tools/pod5_inspect.py", line 59, in do_reads_command
    "end_reason": read.end_reason.name,
                  ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pod5/reader.py", line 222, in end_reason
    reason=EndReasonEnum[
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/enum.py", line 814, in __getitem__
    return cls._member_map_[name]
           ~~~~~~~~~~~~~~~~^^^^^^
KeyError: 'ANALYSIS_CONFIG_CHANGE'
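
The last two frames of this traceback show a lookup by name on an Enum, which raises KeyError when no member has that name. The sketch below reproduces that behaviour with a toy enum. `ToyEndReason` and `lookup` are hypothetical names made up for illustration; this is not the real pod5 `EndReasonEnum`, and the fallback to `UNKNOWN` is just one possible way to handle an unrecognised name, not what pod5 does.

```python
from enum import Enum


class ToyEndReason(Enum):
    """Hypothetical stand-in enum; NOT the real pod5 EndReasonEnum."""

    UNKNOWN = 0
    SIGNAL_POSITIVE = 1


def lookup(name):
    """Look up a member by name, falling back to UNKNOWN if it is missing."""
    try:
        # Enum name lookup raises KeyError for names it does not define.
        return ToyEndReason[name]
    except KeyError:
        return ToyEndReason.UNKNOWN


if __name__ == "__main__":
    print(lookup("SIGNAL_POSITIVE"))
    print(lookup("ANALYSIS_CONFIG_CHANGE"))
```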

This error seems to be caused by reads with an end_reason of analysis_config_change based on the following observations:

  1. pod5 view version 0.3.15 works on the problematic POD5 files (the ones that cause pod5 inspect reads to fail). The problematic reads seem to have an end_reason of analysis_config_change.
  2. pod5 subset works on these problematic POD5 files when the reads with an end_reason of analysis_config_change are removed from the Target Mapping file, as follows:
apptainer exec \
   docker://quay.io/biocontainers/pod5:0.3.15--pyhdfd78af_0 \
   pod5 view  $POD5_DIR \
   --include "read_id, channel, end_reason" | \
       grep -v analysis_config_change | \
       cut -f1,2 > read_channel_map.tsv
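
The same filtering can be sketched in Python. Unlike grep -v, which drops any line containing the string, this version compares only the end_reason column. It assumes the pod5 view output is tab-separated with a header row naming read_id, channel and end_reason; the function name `drop_end_reason` and the sample rows are made up for illustration.

```python
import csv
import io


def drop_end_reason(view_tsv, excluded="analysis_config_change"):
    """Return (read_id, channel) pairs for rows whose end_reason != excluded.

    view_tsv is any iterable of text lines (for example an open file)
    holding tab-separated output with a header row.
    """
    reader = csv.DictReader(view_tsv, delimiter="\t")
    kept = []
    for row in reader:
        if row["end_reason"] != excluded:
            kept.append((row["read_id"], row["channel"]))
    return kept


if __name__ == "__main__":
    # Made-up sample rows, not taken from a real POD5 file.
    sample = io.StringIO(
        "read_id\tchannel\tend_reason\n"
        "read-a\t12\tsignal_positive\n"
        "read-b\t40\tanalysis_config_change\n"
    )
    for read_id, channel in drop_end_reason(sample):
        print(f"{read_id}\t{channel}")
```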

pod5 inspect run commands

In all cases I am running pod5 from biocontainers, like this:

apptainer exec docker://quay.io/biocontainers/pod5:0.3.33--pyhdfd78af_0 pod5 inspect reads AYK942_skip_594fdcd9_6a4a1c50_1.pod5
apptainer exec docker://quay.io/biocontainers/pod5:0.3.27--pyhdfd78af_0 pod5 inspect reads AYK942_skip_594fdcd9_6a4a1c50_1.pod5
apptainer exec docker://quay.io/biocontainers/pod5:0.3.23--pyhdfd78af_0 pod5 inspect reads AYK942_skip_594fdcd9_6a4a1c50_1.pod5
apptainer exec docker://quay.io/biocontainers/pod5:0.3.15--pyhdfd78af_0 pod5 inspect reads AYK942_skip_594fdcd9_6a4a1c50_1.pod5

Sequencing Run Software Versions

Here is the software version information for the sequencing run that generated these POD5 files.

MinKNOW: 25.03.9
Bream: 8.4.4
Configuration: 6.4.11
Dorado: 7.8.3
MinKNOW Core: 6.4.9
