fix: add sd_locs_vcf_index input to SDtoBAF #644
simojoe wants to merge 2 commits into broadinstitute:main
|
@simojoe, thank you for fixing this! Optional index files can be confusing, and your suggested fix fits perfectly with what we discussed in #578. I want to suggest an improvement on your approach; can you please add an optional input |
mwalker174 left a comment
Thanks @simojoe. I was surprised at first, since we haven't seen this before, but then noticed you are using the gzipped version of the dbSNP VCF. We currently reference the uncompressed version, which is why it didn't require an index. Unfortunately, if we ran your change with the uncompressed version, it would fail because its index does not have a .tbi extension:
> gsutil ls "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf*"
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz.tbi
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.idx
I think we should be using the compressed one instead, but this will affect the code base in 2 other places:
- Needs index file input: gatk-sv/wdl/CollectSVEvidence.wdl (line 57 in 0cad166)
- Update to the .gz file: gatk-sv/inputs/values/resources_hg38.json (line 15 in 0cad166)
Fixes #643
Adding the index in-place from the main file name.
I have not found any other reference to SDtoBAF that needs to be updated.