Add the ComputeMSA module #230

Merged
rfm-targa merged 10 commits into master from ComputeMSA
Dec 4, 2025
@rfm-targa
Contributor

This PR adds the ComputeMSA module.
The ComputeMSA module computes Multiple Sequence Alignments (MSAs) from allele calling results. It imports a matrix of allelic profiles, creates one FASTA file per schema locus containing the alleles identified in each sample, and uses MAFFT to compute a MSA for each locus. The per-locus MSAs are then merged to create wg/cgMLST MSAs. The module can create MSAs at the protein and DNA levels (the DNA MSAs are obtained by converting the protein MSAs back to DNA). Additionally, it can create MSAs that contain only the variable positions (SNPs). The default mode expects a TSV file containing allelic profiles, but the module also accepts a path to a directory of FASTA files, in which case it computes an MSA for the sequences in each file.
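
For readers unfamiliar with the protein-to-DNA conversion step, here is a minimal sketch of the general technique. It is not the module's actual code: the function name `backtranslate`, the `-` gap character, and the handling of a trailing stop codon are all assumptions made for illustration. Each gap in the aligned protein becomes three gap characters, and each residue is replaced by the next codon from the unaligned coding sequence.

```python
def backtranslate(aligned_protein: str, dna: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its aligned protein.

    Hypothetical helper for illustration; not part of ComputeMSA.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)

    # Assumption: the DNA may carry a stop codon that the protein lacks.
    if len(dna) == 3 * (residues + 1):
        dna = dna[:-3]
    if len(dna) != 3 * residues:
        raise ValueError("DNA length does not match the protein sequence")

    # Consume one codon per residue; emit three gaps per protein gap.
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(gap * 3 if aa == gap else next(codons)
                   for aa in aligned_protein)


if __name__ == "__main__":
    print(backtranslate("M-K", "ATGAAATAA"))
```

Because every protein column maps to exactly three DNA columns, the resulting DNA alignment stays in frame and all sequences of a locus keep the same length.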

…positions if there are gaps or ambiguous bases. Remove $ from template commands in docs to allow users to copy and run commands without having to delete the $.
… if any of the sequences has gaps or ambiguous characters. Add option to provide custom options to run MAFFT.
…O.index_db to store record information as a file on disk and to allow using the index with multiprocessing by efficiently loading the index in each process.
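
The commit notes above mention gaps and ambiguous bases in connection with alignment positions, but the messages are truncated, so the exact rule the module applies is not visible here. The sketch below shows one common way to build a SNP-only alignment, under the assumption that a column is kept only when every sequence has an unambiguous base (A, C, G or T) and at least two different bases occur. The function names `variable_positions` and `snp_alignment` are hypothetical.

```python
def variable_positions(msa: dict[str, str], valid: str = "ACGT") -> list[int]:
    """Return the indices of variable columns that have no gaps or ambiguity.

    Hypothetical helper; the filtering rule is an assumption, not the
    behaviour confirmed for ComputeMSA.
    """
    seqs = [seq.upper() for seq in msa.values()]
    if len({len(seq) for seq in seqs}) > 1:
        raise ValueError("all aligned sequences must have the same length")

    allowed = set(valid)
    keep = []
    for index, column in enumerate(zip(*seqs)):
        chars = set(column)
        # Skip columns containing gaps or ambiguous characters.
        if not chars <= allowed:
            continue
        # Keep the column only if more than one base is observed.
        if len(chars) > 1:
            keep.append(index)
    return keep


def snp_alignment(msa: dict[str, str]) -> dict[str, str]:
    """Reduce an alignment to its variable positions."""
    positions = variable_positions(msa)
    return {name: "".join(seq[i] for i in positions)
            for name, seq in msa.items()}


if __name__ == "__main__":
    example = {"s1": "ACGT-A", "s2": "ATGTAA", "s3": "ATGNCA"}
    print(variable_positions(example))
    print(snp_alignment(example))
```

Dropping such columns entirely is the conservative choice; an alternative is to keep them and let downstream tools treat gaps and ambiguity codes as missing data.
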
@rfm-targa self-assigned this Dec 4, 2025
@rfm-targa added the Type: Enhancement and Status: In Progress labels Dec 4, 2025
@rfm-targa merged commit 99542c5 into master Dec 4, 2025
4 checks passed
@rfm-targa deleted the ComputeMSA branch December 5, 2025 10:48