
Segmentation fault + offsetalignment issue with predictmatch #16

@shaman-narayanasamy

Dear authors/developers,

It is me again! (Sorry)

I am attempting to run predictmatch on a set of spacers against a virome database. Briefly, the spacers were extracted directly from metagenomic reads using a tool called crass (github, publication). Similarly, the virome is a subset of contigs predicted as viruses/phages from a de novo metagenomic assembly. Both the spacers and the virome sequences are in FASTA format and were converted to databases (see the commands in the addendum). Nothing in particular stands out in the stdout of the database creation process.

After creating the databases, I executed predictmatch and ran into a segmentation fault and an offsetalignment failure. Here are the command and the part of the stdout where the error occurred:

spacepharer predictmatch spacepharer/dbs/spacers/setDbs/querySetDb spacepharer/dbs/test_virome/setDbs/targetSetDb spacepharer/dbs/test_virome/setDbs/targetSetDb_rev spacepharer/predictions.tsv --threads 40 /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions

<stdout removed for brevity>

Time for merging to aggregate_truncated: 0h 0m 0s 17ms
Time for processing: 0h 0m 0s 98ms
offsetalignment spacepharer/dbs/spacers/setDbs/querySetDb_nucl spacepharer/dbs/spacers/setDbs/querySetDb spacepharer/dbs/test_virome/setDbs/targetSetDb_nucl spacepharer/dbs/test_virome/setDbs/targetSetDb /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_truncated /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_offset --search-type 4 --threads 40 -v 3

Computing ORF lookup
Computing contig offsets
Computing contig lookup
Time for contig lookup: 0h 0m 0s 1ms
Writing results to: /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_offset
Thread index 0 > maximum thread number 0
Thread index 36 > maximum thread number 0
Thread index 37 > maximum thread number 0
Thread index 5 > maximum thread number 0
/ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/predictmatch.sh: line 137: 3024379 Segmentation fault      "${MMSEQS}" offsetalignment "${QUERY}_nucl" "${QUERY}" "${TARGET}_nucl" "${TARGET}" "${TMP_PATH}/aggregate_truncated" "${TMP_PATH}/aggregate_offset" --search-type 4 ${THREADS_PAR}
[Error: offsetalignment failed
[Wed Apr 16 10:59:20 2025]

Another observation is that a second run of predictmatch, using the exact same command, results in a slightly different stdout:

Time for merging to aggregate_truncated: 0h 0m 0s 20ms
Time for processing: 0h 0m 0s 102ms
offsetalignment spacepharer/dbs/spacers/setDbs/querySetDb_nucl spacepharer/dbs/spacers/setDbs/querySetDb spacepharer/dbs/test_virome/setDbs/targetSetDb_nucl spacepharer/dbs/test_virome/setDbs/targetSetDb /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_truncated /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_offset --search-type 4 --threads 40 -v 3

Computing ORF lookup
Computing contig offsets
Computing contig lookup
Time for contig lookup: 0h 0m 0s 1ms
Writing results to: /ibex/tmp/naras0c/spacepharer/tmpFolder_predictions/4579051267390253550/aggregate_offset
Thread index 0 > maximum thread number 0
Thread index 37 > maximum thread number 0
Thread index 23 > maximum thread number 0
Thread index 6 > maximum thread number 0
Thread index 32 > maximum thread number 0
[Error: offsetalignment failed

Note that the segmentation fault message is missing in the repeated execution. The only way I have found to reproduce the segmentation fault is to delete all the temporary directories, output, etc. and execute a "fresh" run.
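For reference, the "fresh" run can be sketched roughly as follows. This is a minimal sketch, assuming that the temporary folder and the output TSV from the command above are the only leftovers that matter; `clean_run_state` and `fresh_predictmatch` are hypothetical helper names and not part of spacepharer.

```shell
# Hypothetical helpers for reproducing the "fresh" run; not part of spacepharer.

# Remove the temporary folder and the output file left behind by an earlier run.
clean_run_state() {
    tmp_dir="$1"
    out_file="$2"
    rm -rf -- "$tmp_dir"
    rm -f -- "$out_file"
}

# Clean up, then rerun predictmatch with the same arguments as in this report.
fresh_predictmatch() {
    tmp_dir="/ibex/tmp/naras0c/spacepharer/tmpFolder_predictions"
    out_file="spacepharer/predictions.tsv"
    clean_run_state "$tmp_dir" "$out_file"
    spacepharer predictmatch \
        spacepharer/dbs/spacers/setDbs/querySetDb \
        spacepharer/dbs/test_virome/setDbs/targetSetDb \
        spacepharer/dbs/test_virome/setDbs/targetSetDb_rev \
        "$out_file" \
        --threads 40 \
        "$tmp_dir"
}
```

Calling `fresh_predictmatch` should then go through the same steps as the first run; skipping the cleanup corresponds to the repeated execution shown above.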

For further information, the dataset that I am using is made up of 1000 spacers and 10000 phages (from a virome). It is a subset of a larger dataset, which I am using to test the workflow. I am also providing plenty of memory (250 GB, to be precise) along with 40 threads, so I doubt resources are the issue here. Referring to issue #8, perhaps my input databases are not generating any hits at all?
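As a sanity check on the input sizes, the number of records in each FASTA file can be counted before building the databases. This is only a sketch, assuming one header line starting with `>` per sequence; `count_fasta_records` is a hypothetical helper name.

```shell
# Hypothetical helper: count sequences in a FASTA file by counting header lines.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the exit status clean.
count_fasta_records() {
    grep -c '^>' "$1" || true
}
```

Called on the two FASTA files listed in the addendum, it should print 1000 for the spacers and 10000 for the virome if the subsets are as described. It says nothing about whether the databases produce hits.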

Do not hesitate to ask me any further questions. Looking forward to your input.

Best regards,
Shaman

Addendum

Commands for creating the relevant databases

spacepharer createsetdb /ibex/project/e3018/airport_surveillance/output/viromics/sequence_clustering/vOTUs_minlen1000-test_10000_seqs.fasta spacepharer/dbs/test_virome/setDbs/targetSetDb /ibex/tmp/naras0c/test_virome/tmpFolder --threads 40

spacepharer createsetdb /ibex/project/e3018/airport_surveillance/output/viromics/sequence_clustering/vOTUs_minlen1000-test_10000_seqs.fasta spacepharer/dbs/test_virome/setDbs/targetSetDb_rev /ibex/tmp/naras0c/test_virome/tmpFolder_rev --reverse-fragments 1 --threads 40

spacepharer createsetdb /ibex/project/e3018/airport_surveillance/output/crispr_identification/spacers_rep_seq-test.fasta spacepharer/dbs/spacers/setDbs/querySetDb /ibex/tmp/naras0c/spacers/tmpFolder --threads 40
