Contributor

Smeds commented Nov 21, 2025

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

Smeds marked this pull request as draft November 21, 2025 19:35
Contributor Author

Smeds commented Nov 21, 2025

This will stay in draft mode until I can add the tool for building reference files.

Smeds closed this Nov 21, 2025
Smeds reopened this Nov 21, 2025
&& mv output_dir/violin_atl.png '${output_violin_plot}'
]]></command>
<inputs>
<param name="input_reads" type="data" format="fasta,fasta.gz,fastqsanger,fastqsanger.gz,fastq,fastq.gz,bam,cram" multiple="true" label="Input reads" help="Long-read sequencing data in FASTA, FASTQ, BAM, or CRAM format (gzipped supported). Multiple files can be selected."/>
Contributor

Please remove fastq and fastq.gz.
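
For illustration, the parameter with the two generic datatypes dropped might look like the sketch below; every other attribute is copied unchanged from the line under review. The usual reason for this request, as far as I know, is that the generic `fastq` datatype leaves the quality encoding unspecified, whereas `fastqsanger` pins it down.

```xml
<!-- Sketch: same param as above, minus the generic fastq and fastq.gz datatypes -->
<param name="input_reads" type="data" format="fasta,fasta.gz,fastqsanger,fastqsanger.gz,bam,cram" multiple="true" label="Input reads" help="Long-read sequencing data in FASTA, FASTQ, BAM, or CRAM format (gzipped supported). Multiple files can be selected."/>
```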

Contributor Author

updated!

<option value="median">Median</option>
<option value="max">Maximum</option>
</param>
<param name="downsample" argument="-d" type="integer" optional="true" label="Downsample telomere reads" help="Downsample to N telomere reads (optional)"/>
Contributor

As far as I remember, optional parameters need value="".
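
A minimal sketch of what that could look like. The `#if` guard and the flat `$downsample` reference are assumptions on my side: inside a section the variable would be `$section_name.downsample`, and the rest of the command line is elided here.

```xml
<!-- Sketch: optional integer with an explicit empty default -->
<param name="downsample" argument="-d" type="integer" optional="true" value="" label="Downsample telomere reads" help="Downsample to N telomere reads (optional)"/>

<!-- Command fragment (assumed layout): only emit -d when a value was set -->
<command><![CDATA[
    ## ... rest of the command line ...
    #if str($downsample) != ''
        -d $downsample
    #end if
]]></command>
```

Comparing the stringified value against the empty string keeps the flag off the command line when the field is left blank.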

Contributor Author

updated!

<option value="max">Maximum</option>
</param>
<param name="downsample" argument="-d" type="integer" optional="true" label="Downsample telomere reads" help="Downsample to N telomere reads (optional)"/>
<param name="random_seed" argument="--rng" type="integer" optional="true" label="Random seed" help="Random seed value for reproducibility (optional)"/>
Contributor

Same

Contributor Author

updated!

</section>

<section name="reference_opts" title="Reference Options" expanded="false">
<param name="custom_reference" argument="-t" type="data" format="fasta" optional="true" label="Custom reference FASTA" help="Optional custom telogator reference FASTA file. If not provided, built-in human T2T reference will be used."/>
Contributor

human T2T

Is this in the container?

Contributor Author

Yes! A default human T2T reference is included (along with maize and mouse references). I'm also adding the reference-building tool, which should be in Bioconda this week.

Contributor Author

Would it maybe be wiser to require the user to provide a reference rather than rely on the included ones? Right now you lose a bit of data traceability, since the included references ship in the source package and it's not very clear which reference version was used to build them.


<section name="reference_opts" title="Reference Options" expanded="false">
<param name="custom_reference" argument="-t" type="data" format="fasta" optional="true" label="Custom reference FASTA" help="Optional custom telogator reference FASTA file. If not provided, built-in human T2T reference will be used."/>
<param name="kmer_file" argument="-k" type="data" format="txt" optional="true" label="Telomere kmers file" help="Optional telomere kmers file (required for non-human organisms like mouse or maize)"/>
Contributor

Any more info on this?

Contributor Author

Information about the kmer file is currently very sparse in the repo. I have asked them to add more information about it.

<when value="pbmm2"/>
</conditional>

<param name="ref_fasta" type="data" format="fasta" optional="true" label="Reference FASTA for CRAM input" help="Required only if input is CRAM format"/>
Contributor

Is there a connection to the custom reference?

Contributor Author

No, they are separate and could be pointing to different files!

I'm considering whether I should update the wrapper and drop support for CRAM files, or do you believe that CRAM input is something users need? Since the parameter is optional right now, I guess the tool would fail if the user forgets to set it for CRAM input.

<param argument="--filt-nontel" type="integer" value="100" min="0" label="Maximum terminating non-telomere" help="Maximum terminating non-telomere length in bp"/>
<param argument="--filt-sub" type="integer" value="1000" min="0" label="Minimum terminating subtelomere" help="Minimum terminating subtelomere length in bp"/>
<param argument="--collapse-hom" type="integer" value="1000" min="0" label="Collapse homologous alleles" help="Merge alleles within this distance in bp"/>
<param argument="--fast-aln" type="boolean" truevalue="true" falsevalue="false" checked="false" label="Use fast alignment" help="Use faster but less accurate pairwise alignment"/>
Contributor

Use truevalue="--fast-aln" falsevalue=""; then you can just use $fast_aln in the command section and don't need the if statement.
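
A minimal sketch of that pattern, assuming the parameter sits at the top level of `<inputs>` (inside a section it would be referenced as `$section_name.fast_aln`); the rest of the command line is not part of this hunk and is elided:

```xml
<!-- Sketch only: the flag itself is the true value, the false value is empty -->
<param argument="--fast-aln" type="boolean" truevalue="--fast-aln" falsevalue="" checked="false" label="Use fast alignment" help="Use faster but less accurate pairwise alignment"/>

<!-- Command fragment: the variable expands to "--fast-aln" or to nothing -->
<command><![CDATA[
    ## ... rest of the command line ...
    $fast_aln
]]></command>
```

Because `argument="--fast-aln"` is given without a `name`, Galaxy derives the variable name `fast_aln` from it.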

Contributor Author

updated!


**Reference**

Stephens, Z., & Kocher, J. P. (2024). Characterization of telomere variant repeats using long reads enables allele-specific telomere length estimation. BMC Bioinformatics, 25(1), 194.
Contributor

This is already in the citations, isn't it?

Contributor Author

True! I will remove the duplicated information.
