Hi,
I am trying to compare a set of VCF files against a set of confirmed SNPs from a Genome in a Bottle database. I do not have access to the raw FASTQ files, so I am unsure which filters were applied during mapping. I only have a set of BAM files, VCF files and a BED region file, so I also do not know what post-mapping alterations have been performed.
I have tried to run:
java -jar ~/Downloads/bcbio.variation-0.2.1-standalone.jar variant-compare ref-grading.yaml
where my ref-grading.yaml file contains the following:
dir:
  out: grading
  prep: grading/prep
experiments:
  - sample: NA00001
    ref: /export/home/pjones/bcbio/genomes/Hsapiens/hg19/seq/hg19.fa
    intervals: ref.bed
    summary-level: quick
    approach: grade
    calls:
      - name: reference
        file: ref.vcf
        remove-refcalls: true
      - name: case1
        prep: true
        preclean: true
        remove-refcalls: true
        file: case1.vcf
        intervals: ref.bed
      - name: reference
I get the following error (I am not familiar with Java, though):
2015-01-12 16:48:18,299 [INFO ] MLog clients using log4j logging.
2015-01-12 16:48:18,760 [INFO ] State :begin :: {:desc "Starting variation analysis"}
2015-01-12 16:48:18,788 [INFO ] State :clean :: {:desc "Cleaning input VCF: reference"}
2015-01-12 16:48:18,789 [INFO ] State :merge :: {:desc "Merging multiple input files: reference"}
2015-01-12 16:48:18,790 [INFO ] State :prep :: {:desc "Prepare VCF, resorting to genome build: reference"}
"ava.lang.NumberFormatException: For input string: "14596
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at bcbio.align.ref$prep_bedline_sort$fn__1333.invoke(ref.clj:85)
at bcbio.align.ref$sort_bed_file$fn__1338$fn__1339$fn__1344.invoke(ref.clj:98)
at clojure.core$sort_by$fn__4299.invoke(core.clj:2769)
at clojure.lang.AFunction.compare(AFunction.java:49)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
at java.util.TimSort.sort(TimSort.java:203)
at java.util.TimSort.sort(TimSort.java:173)
at java.util.Arrays.sort(Arrays.java:659)
at clojure.core$sort.invoke(core.clj:2754)
at clojure.core$sort_by.invoke(core.clj:2769)
at clojure.core$sort_by.invoke(core.clj:2767)
at bcbio.align.ref$sort_bed_file$fn__1338$fn__1339.invoke(ref.clj:99)
at bcbio.align.ref$sort_bed_file$fn__1338.invoke(ref.clj:97)
at bcbio.align.ref$sort_bed_file.invoke(ref.clj:96)
at bcbio.run.broad$gatk_cl_intersect_intervals$fn__1816.invoke(broad.clj:56)
at clojure.core$map$fn__4207.invoke(core.clj:2487)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$map$fn__4207.invoke(core.clj:2479)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$tree_seq$walk__4647$fn__4648.invoke(core.clj:4475)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.LazySeq.more(LazySeq.java:96)
at clojure.lang.RT.more(RT.java:607)
at clojure.core$rest.invoke(core.clj:73)
at clojure.core$flatten.invoke(core.clj:6478)
at bcbio.run.broad$gatk_cl_intersect_intervals.doInvoke(broad.clj:56)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at bcbio.variation.filter.intervals$select_by_sample.doInvoke(intervals.clj:56)
at clojure.lang.RestFn.invoke(RestFn.java:846)
at bcbio.variation.combine$dirty_prep_work$run_sample_select__1157.invoke(combine.clj:140)
at bcbio.variation.combine$dirty_prep_work.invoke(combine.clj:155)
at bcbio.variation.combine$gatk_normalize.invoke(combine.clj:187)
at bcbio.variation.compare$prepare_vcf_calls$fn__7526.invoke(compare.clj:120)
at clojure.core$map$fn__4207.invoke(core.clj:2487)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.lang.LazilyPersistentVector.create(LazilyPersistentVector.java:31)
at clojure.core$vec.invoke(core.clj:354)
at bcbio.variation.compare$prepare_vcf_calls.invoke(compare.clj:121)
at bcbio.variation.compare$variant_comparison_from_config$iter__7582__7586$fn__7587.invoke(compare.clj:255)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$tree_seq$walk__4647$fn__4648.invoke(core.clj:4475)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.LazySeq.more(LazySeq.java:96)
at clojure.lang.RT.more(RT.java:607)
at clojure.core$rest.invoke(core.clj:73)
at clojure.core$flatten.invoke(core.clj:6478)
at bcbio.variation.compare$variant_comparison_from_config.invoke(compare.clj:254)
at bcbio.variation.compare$_main.invoke(compare.clj:274)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:617)
at bcbio.variation.core$_main.doInvoke(core.clj:35)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at bcbio.variation.core.main(Unknown Source)
I have no idea how to start debugging this. Is there some input file format requirement that I am not aware of? Must my reference.fa be truncated to the same chromosomes as indicated in the BED file?
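Since the trace passes through `sort_bed_file` and `Integer.parseInt`, one place to start might be the coordinate columns of the BED file. Below is a minimal Python sketch, not part of bcbio.variation, that reports lines whose start or end field is not a plain integer, for example because of a trailing carriage return from Windows line endings. The function name `find_bad_bed_lines` is hypothetical, and the idea that the BED file is the cause is only an assumption drawn from the trace.

```python
import re

# Plain ASCII digits only; Python's int() would silently accept "14596\r".
_INT = re.compile(r"[0-9]+")


def find_bad_bed_lines(lines):
    """Return (line_number, repr_of_line) for BED lines whose
    start/end columns are missing or not plain integers."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Strip only the newline so a stray carriage return stays visible.
        text = line.rstrip("\n")
        if not text or text.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        fields = text.split("\t")
        if len(fields) < 3 or not all(_INT.fullmatch(f) for f in fields[1:3]):
            bad.append((number, repr(text)))
    return bad


# Small in-memory demonstration; the second line ends in "\r\n".
sample = ["chr1\t100\t200\n", "chr1\t300\t14596\r\n"]
for number, shown in find_bad_bed_lines(sample):
    print(number, shown)
```

To run this on a real file, it could be opened with `open("ref.bed", newline="")` so that Python does not translate `\r\n` endings before the check sees them.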
My aim: to get a good estimate of the false positive/negative rate, as well as possible factors influencing it (such as coverage, entropy of neighbouring regions, mapping quality, etc.).
Additional information:
From the header of the VCF file, the reference appears to be UCSC hg19 (which is what I used). It also appears that the additional chromosomes have been removed from the header and the call list in the VCF file (i.e. only chr1-22 plus X and Y). The ref.vcf and BED file were downloaded and appear to use the same UCSC naming convention. My reference is indexed and a GATK dictionary file exists. Java version: JDK 1.7.0_45. OS: CentOS, on a cluster with a Lustre file system.
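To double-check the naming convention rather than judging it by eye, a short Python sketch like the one below could list the chromosome names that occur in the BED file but not in the reference index. It assumes the index is a samtools-style `.fai` file (tab-separated, sequence name in the first column) next to the reference, which I have not verified here; `contigs_missing_from_reference` is a hypothetical helper name.

```python
def contigs_missing_from_reference(bed_lines, fai_lines):
    """Return sorted chromosome names used in the BED lines
    that do not appear in the .fai index lines."""
    # First tab-separated column of each .fai line is the sequence name.
    reference = {line.split("\t")[0].strip() for line in fai_lines if line.strip()}
    seen = set()
    for line in bed_lines:
        text = line.strip()
        if not text or text.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        seen.add(text.split("\t")[0])
    return sorted(seen - reference)


# In-memory demonstration: "1" does not match the UCSC-style "chr1".
bed = ["chr1\t0\t10\n", "1\t0\t10\n"]
fai = ["chr1\t249250621\t6\t50\t51\n"]
print(contigs_missing_from_reference(bed, fai))
```

An empty result would suggest that the reference does not need to be truncated just to match the names in the BED file, although it says nothing about how bcbio.variation itself treats extra chromosomes.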
Kind Regards,
Piet Jones