PanGIA is broken: throws an internal KeyError when running test command #11

@xapple

I installed PanGIA by cloning this repository and then downloading these two files:

$ curl -O https://edge-dl.lanl.gov/PanGIA/database/PanGIA_20190830_taxonomy.tar.gz
$ curl -O https://edge-dl.lanl.gov/PanGIA/database/PanGIA_20190830_NCBI_genomes_refseq89_BAV.fa.mmi.tar.gz

$ tar xzf PanGIA_20190830_taxonomy.tar.gz
$ tar xzf PanGIA_20190830_NCBI_genomes_refseq89_BAV.fa.mmi.tar.gz

Next, I ran the following command to test whether PanGIA could classify a set of artificially generated reads:

(pangia) xapple@server ~ $ ~/programs/pangia/bin/pangia.py --threads 4 --database ~/databases/pangia/PanGIA/NCBI_genomes_refseq89_BAV.fa --mode report --outdir ~/runs/pangia_test/ --readmapper minimap2 --prefix sample --input ~/runs/pangia_test/reads_fwd.fastq.gz ~/runs/pangia_test/reads_rev.fastq.gz

However, the run fails with a KeyError in taxonomyRollUp, after read mapping and SAM processing have completed, so PanGIA appears to be non-functional:

[00:00:00] Starting PanGIA 1.0.0-RC6.1
[00:00:00] Temporary directory '~/runs/pangia_test//sample_tmp' found. Deleting directory...
[00:04:53] Arguments and dependencies checked:
[00:04:53]     Input reads       : ['~/runs/pangia_test/reads_fwd.fastq.gz', '~/runs/pangia_test/reads_rev.fastq.gz']
[00:04:53]     Input SAM file    : ~/runs/pangia_test//sample.pangia.sam
[00:04:53]     Input background  : None
[00:04:53]     Save background   : None
[00:04:53]     Scoring method    : standalone
[00:04:53]     Scoring parameter : 0.5:0.99
[00:04:53]     Database          : ['~/databases/pangia/PanGIA/NCBI_genomes_refseq89_BAV.fa.mmi']
[00:04:53]     Abundance         : DEPTH_COV
[00:04:53]     Output path       : ~/runs/pangia_test/
[00:04:53]     Prefix            : sample
[00:04:53]     Mode              : report
[00:04:53]     Specific taxid    : None
[00:04:53]     Threads           : 4
[00:04:53]     First #refs in XA : 30
[00:04:53]     Extra NM in XA    : 1
[00:04:53]     Minimal score     : 0
[00:04:53]     Minimal RSNB      : 2.5
[00:04:53]     Minimal reads     : 10
[00:04:53]     Minimal linear len: 200
[00:04:53]     Minimal genome cov: 0.004
[00:04:53]     Minimal depth (DC): 0.01
[00:04:53]     Minimal RSDCnr    : 0.0009
[00:04:53]     Aligner option    : -A1 -B2 -k 40 -m 60 -x sr -p 1 -N 30
[00:04:53]     Aligner seed len  : 40
[00:04:53]     Aligner min score : 60
[00:04:53]     Aligner path      : ~/mambaforge/envs/pangia/bin/minimap2
[00:04:53]     Samtools path     : ~/mambaforge/envs/pangia/bin/samtools
[00:04:53] Loading taxonomy information...
[00:05:00] Done.
[00:05:00] Loading pathogen information...
[00:05:00] Done. 2817 pathogens loaded.
[00:05:00] Loading taxonomic uniqueness information...
[00:05:00] Done. 31177 taxonomic uniqueness loaded.
[00:05:00] Loading sizes of genomes...
[00:05:55] Done. 1061 target and 0 host genome(s) loaded.
[00:05:55] Running read-mapping...
[00:05:55] Mapping to ~/databases/pangia/PanGIA/NCBI_genomes_refseq89_BAV.fa.mmi...
[00:06:53] Done mapping reads to the database(s).
[00:06:53] Merging SAM files...
[00:06:55] Logfile saved to ~/runs/pangia_test//sample.pangia.log.
[00:06:55] Done. Mapped SAM file saved to ~/runs/pangia_test//sample.pangia.sam.
[00:06:55] Total number of input reads: 400013
[00:06:55] Total number of mapped reads: 186478
[00:06:55] Total number of host reads: 0 (0.00%)
[00:06:55] Total number of ignored reads (cross superkingdom): 349 (0.19%)
[00:06:55] Processing SAM file...
[00:06:55] Parsing SAM files with 4 subprocesses...
[00:06:59] Merging results...
[00:06:59] Done.
[00:06:59] Calculating linear length...
[00:07:02] Done processing SAM file, 184670 alignment(s).
[00:07:02] Rolling up taxonomies...
[00:07:02] 17 strain(s) mapped.
Traceback (most recent call last):
  File "~/programs/pangia/bin/pangia.py", line 2320, in <module>
    res_rollup = taxonomyRollUp(res, patho_meta, mapped_r_cnt, argvs.minRsnb, argvs.minReads, argvs.minLen, argvs.minCov, argvs.minDc)
  File "~/programs/pangia/bin/pangia.py", line 1199, in taxonomyRollUp
    genome_size[taxid]
KeyError: '1582156.1'
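
The traceback suggests that `genome_size` is a mapping keyed by strain-level IDs such as `'1582156.1'`, and that at least one mapped strain has no entry in it. As a rough way to narrow this down, the check below lists every mapped ID that is missing from such a mapping. This is only a sketch under that assumption: `find_missing_taxids` is a hypothetical helper, not part of PanGIA, and the sample values are toy data rather than anything read from the PanGIA database.

```python
def find_missing_taxids(mapped_taxids, genome_size):
    """Return the mapped IDs that have no entry in genome_size, sorted."""
    return sorted(t for t in set(mapped_taxids) if t not in genome_size)


# Toy data for illustration only (assumed shape: ID string -> genome length).
genome_size = {"100.1": 1000, "200.1": 2000}
mapped_taxids = ["100.1", "1582156.1", "200.1", "1582156.1"]

missing = find_missing_taxids(mapped_taxids, genome_size)
print(missing)  # IDs that would raise KeyError on genome_size[taxid]
```

If a check like this reports IDs against the real data, it would point to a mismatch between the sequences in the `.mmi` index and the genome-size information loaded from the taxonomy files, rather than a problem with the input reads.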
