Skip to content

Comments

Replace womtool w/ miniwdl for wdl validation#232

Open
kachulis wants to merge 13 commits intomainfrom
ck_miniwdl_check
Open

Replace womtool w/ miniwdl for wdl validation#232
kachulis wants to merge 13 commits intomainfrom
ck_miniwdl_check

Conversation

@kachulis
Copy link
Collaborator

@kachulis kachulis commented Dec 2, 2025

No description provided.

@kachulis kachulis marked this pull request as ready for review January 7, 2026 13:58
@kachulis kachulis requested a review from kockan January 7, 2026 13:58
input {
File input_vcf
File input_vcf_index
String? input_sample_name # Required if ignore_read_groups false && multi-sample VCF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this input_sample_name optional variable was never used in the task, so I'm fine with removing it completely, but then I look at the comment next to this line and it says "Required if ignore_read_groups false and multi-sample VCF". I suppose if the scenario mentioned ever happens then the command-line can be updated easily and the variable added back. Just wanted to make sure you're aware.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think the comment must just be wrong/outdated. given as you said, the variable isn't actually doing anything. So either it's needed, in which case the task is broken, in which case the comment is wrong. either way, removing it isn't changing anything, and I don't think testing this wdl to differentiate is in scope of this pr

GenotypeSelector genotype_selector_list = {"bcf_genotype_label": selection.left, "bcf_genotype": selection.right}
}

if (check_fingerprint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not doing fingerprint checking for this workflow anymore? I don't believe I've ever used it with that option enabled so this is also fine with me, just wanted to confirm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was probably overreaching on my part, I'm gonna role that back.

# Fingerprint arguments
File haplotype_map
Boolean check_all_file_pairs = true
Boolean fail_on_mismatch = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't fail on mismatch?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fail_on_mismatch will now default to false. In this workflow we don't want to fail on mismatch, because that would result in a fail anytime any of the comparisons are not matches, which most won't be because we're comparing all the evals to all the truths.


File ref_fasta
File ref_index
File haplotype_map
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to the previous BenchmarkVCFs comment. We are getting rid of check_fingerprint and haplotype_map?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rolling this back as well

input:
input_bam=input_bam,
input_bam_index=input_bam_index,
ref_fasta=ref_fasta,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is fine since the reference is not needed for BAM inputs, but as the GATK manual for PrintReads states: "The reference is strictly required when handling CRAM files."

Since this WDL itself is called "ComputeIntervalBamStats", I think it's okay to get rid of these as they are unused at the moment, but we can also make the WDL work on CRAMs by either always requiring the reference or branching based on the ".cram" extension, etc. if it's worth doing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree we would need to include the fasta (and also probably fasta index) if we want the wdl to support crams. but as it was written I don't think it would have worked for crams anyway, so would have needed a change to successfully support crams. Can do that in the future if we decide we need it, but for now just going to remove the fake support.

Copy link
Contributor

@kockan kockan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, there is so much detail I haven't noticed before and learned some new WDL-related syntax as well.

Just added a few comment on parts that were a bit unclear to me and I wanted to make sure those were changes we intended to make in the first place.

I believe this should be good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants