New tool addition: amas tool #7443

jchchiu · 2025-11-06T18:29:00Z

FOR CONTRIBUTOR:

I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
License permits unrestricted use (educational + commercial)
This PR adds a new tool or tool collection
This PR updates an existing tool or tool collection
This PR does something else (explain below)

Regarding issue #7442

Implemented the amas alignment concatenation action needed for the Biohackathon workflow
Added a simple test case with corresponding expected outputs

The AMAS commands are:
  concat      Concatenate input alignments
  convert     Convert to other file format
  replicate   Create replicate data sets for phylogenetic jackknife
  split       Split alignment according to a partitions file
  summary     Write alignment summary
  remove      Remove taxa from alignment
  translate   Translate DNA alignment into protein alignment

Command flags can be seen here
Note: the tool's standard output for concat will say "Wrote concatenated sequences to fasta file 'concatenated.out'"; however, the resulting file is renamed to match the output format you have chosen (e.g. 'concatenated.fasta' for fasta, 'concatenated.nex' for nexus and nexus-int).

To do:

Add more test coverage
Will also try to implement replicate and split (need to work out how to handle the outputs); the others are already covered by existing Galaxy tools (unless there is something you think looks interesting to implement)
Clean up code and help text
Fix --check-align

For now, is it possible to review the code to see if it's on the right track, and if there are any better ways to structure it?

jchchiu · 2025-11-07T05:31:19Z

See updated changes from George at jchchiu#1


bernt-matthias · 2025-11-07T09:19:51Z

tools/amas/amas.xml

+        <param name="in_format" type="select" label="Format of the input file">
+            <option value="fasta">fasta</option>
+            <option value="phylip">phylip</option>
+            <option value="phylip-int">phylip-int</option>
+            <option value="nexus">nexus(sequential)</option>
+            <option value="nexus-int">nexus(interleaved)</option>
+        </param>


fasta, phylip and nexus can be distinguished automatically, e.g. $input_file.ext gives the Galaxy datatype. Is the interleaved/sequential information needed? Can it be determined automatically?

https://github.com/marekborowiec/AMAS/blob/2e93d31638625135aa48a68251c363ac23a47c4a/amas/AMAS.py#L692

AMAS doesn't seem to have a function that detects interleaved input automatically; instead it relies on the format you pass in. Can Galaxy distinguish interleaved files automatically?
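As a rough illustration of the first half of this suggestion, the base format could be derived from the Galaxy datatype extension alone. The sketch below is not part of the PR: the function name `amas_base_format` and the mapping are assumptions, based on the `fasta`, `phylip` and `nex` datatypes the tool accepts and the AMAS format names listed in the select above. Interleaved layout is deliberately left out, since the extension does not encode it.

```python
# Hypothetical mapping from Galaxy datatype extension to the sequential
# AMAS --in-format value; not taken from the PR itself.
GALAXY_EXT_TO_AMAS = {
    "fasta": "fasta",
    "phylip": "phylip",
    "nex": "nexus",
}


def amas_base_format(galaxy_ext):
    """Return the sequential AMAS format name for a Galaxy extension."""
    try:
        return GALAXY_EXT_TO_AMAS[galaxy_ext]
    except KeyError:
        # Fail loudly instead of guessing a format for unknown datatypes.
        raise ValueError(f"unsupported datatype: {galaxy_ext!r}") from None


print(amas_base_format("nex"))  # prints: nexus
```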


bernt-matthias · 2025-11-07T09:21:43Z

tools/amas/amas.xml

+        </collection>
+
+        <collection name="converted_alignments" type="list" label="Converted alignments">
+            <discover_datasets directory="run_dir/convert" pattern="(?P&lt;name&gt;.+)-out\..+" format="data" />


We should set the format instead of format="data".

Could you have a look at the amas_split.xml at L55; is this what you were thinking?
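For reference, the `pattern` attribute is a Python regular expression (the `&lt;`/`&gt;` entities are just XML escaping). Below is a small sketch of what the pattern above captures, using made-up file names. The `(?P<ext>...)` variant is only one possible way to recover the extension and is shown here as a plain regex, not as tested tool XML; the captured extension would still need to correspond to a Galaxy datatype name.

```python
import re

# Pattern from the <discover_datasets> element, with XML entities unescaped.
NAME_ONLY = re.compile(r"(?P<name>.+)-out\..+")
# Hypothetical variant that also captures the file extension.
NAME_AND_EXT = re.compile(r"(?P<name>.+)-out\.(?P<ext>.+)")

# File names are invented for illustration.
for filename in ["gene1.fasta-out.phy", "gene2.fasta-out.nex"]:
    match = NAME_AND_EXT.match(filename)
    print(match.group("name"), match.group("ext"))
```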

jchchiu · 2025-11-11T00:42:47Z

Hey @bernt-matthias, could you have a look at amas_concat.xml and see if this is on the right track? If so, I'll update the rest of the subcommands with your suggestions.
Cheers for all the thorough comments.


jchchiu · 2025-11-19T06:44:36Z

I've been testing the split subcommand again and it seems like AMAS doesn't work when you use a RAxML or NEXUS formatted partitions file as an input.

The regex only works for partitions in the 'unspecified' format:
matches = re.finditer(r"^(\s+)?([^ =]+)[ =]+([\0-9, -]+)", self.in_file_lines, re.MULTILINE)

I've updated the subcommand accordingly with some more info.
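To make the behaviour concrete, here is a minimal sketch that runs the quoted regex in isolation (not AMAS itself) against one line in each partition style. The helper name `parse_partitions` and the sample lines are made up for illustration.

```python
import re

# Regex as quoted above from the AMAS source.
PARTITION_RE = re.compile(r"^(\s+)?([^ =]+)[ =]+([\0-9, -]+)", re.MULTILINE)


def parse_partitions(text):
    """Return (name, positions) pairs found by the regex."""
    return [(m.group(2), m.group(3)) for m in PARTITION_RE.finditer(text)]


print(parse_partitions("gene1 = 1-10"))           # plain "name = range" style
print(parse_partitions("DNA, gene1 = 1-10"))      # RAxML style
print(parse_partitions("charset gene1 = 1-10;"))  # NEXUS charset style
```

Only the first call yields a match; in the other two the token after the first space is a name rather than a position range, so the regex finds nothing.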


bernt-matthias · 2025-11-28T08:10:09Z

tools/amas/check_interleaved.py

+    # NOTE: Do we need to check all files?
+    if all(interleaved_status):
+        return 0  # Exit code 0 = interleaved
+    else:
+        return 1  # Exit code 1 = sequential


Maybe check that they are all the same. Also, I would just print the result and use a non-zero exit code for the error case. Something like this:

Suggested change

-    # NOTE: Do we need to check all files?
-    if all(interleaved_status):
-        return 0  # Exit code 0 = interleaved
-    else:
-        return 1  # Exit code 1 = sequential
+    interleaved_status = list(set(interleaved_status))
+    if len(interleaved_status) > 1:
+        raise Exception("mixed interleaved")
+    print(interleaved_status[0])

Or make the script output args.format + "-int" or args.format. Then you can set a bash variable in the command block: IN_FORMAT=\$(python '$__tool_directory__/check_interleaved.py' ...)
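A minimal sketch of this second variant, assuming the per-file interleaved checks have already been collected into a list of booleans. The names `resolve_format` and `main` are hypothetical, and how interleaving is detected is out of scope here. The script prints the format string for the shell to capture and signals mixed inputs with a non-zero exit code.

```python
import sys


def resolve_format(base_format, interleaved_status):
    """Return base_format, or base_format + '-int' if all inputs are interleaved.

    interleaved_status holds one boolean per input file.
    """
    distinct = set(interleaved_status)
    if not distinct:
        raise ValueError("no input files given")
    if len(distinct) > 1:
        raise ValueError("mixed interleaved and sequential inputs")
    return base_format + "-int" if distinct.pop() else base_format


def main(base_format, interleaved_status):
    """Print the resolved format; return the process exit code."""
    try:
        print(resolve_format(base_format, interleaved_status))
    except ValueError as err:
        print(f"error: {err}", file=sys.stderr)
        return 1  # non-zero exit code for the error case
    return 0


if __name__ == "__main__":
    # Example call with made-up values: two interleaved NEXUS inputs.
    sys.exit(main("nexus", [True, True]))
```

Keeping the decision in a pure function makes the mixed-input case easy to test without touching any files.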

bernt-matthias

Looks good. Nearly there.

bernt-matthias · 2025-12-01T08:35:17Z

tools/amas/amas_concat.xml

+    <inputs>
+        <param name="input_files" type="data" format="fasta,phylip,nex" label="Sequences to concatenate" multiple="true" 
+               help="Provide pre-aligned FASTA/PHYLIP/NEXUS files (DNA or protein); mixes of unaligned reads or contigs will produce meaningless results." />
+        <expand macro="input_format" />


Suggested change

-        <expand macro="input_format" />

Analogous in all commands.

bernt-matthias · 2025-12-01T08:35:47Z

tools/amas/amas_concat.xml

+            <param name="input_files" value="inputs/concat_1.fasta,inputs/concat_2.fasta" />
+            <param name="out_format" value="phylip" />
+            <param name="part_format" value="nexus" />
+            <param name="in_format" value="fasta" />


Also remove in_format from tests.

bernt-matthias · 2025-12-01T08:36:19Z

tools/amas/amas_remove.xml

+        --out-format $out_format
+        --in-files
+            @INPUT_FILENAMES@
+        --in-format $in_format


Use \$IN_FORMAT

jchchiu added 6 commits November 7, 2025 05:04

feat: add amas and macros

917c7aa

test: add simple working test case

bb629f6

fix: change category to Multiple Alignments

be4db1f

fix: change category to Sequence Analysis

932783d

update from george

793a6b1

update from george; add tests

af75ec9

jchchiu added 4 commits November 7, 2025 16:35

update from george; add info.xml

a45c8b5

fix lint

e816d9c

add split test; update .shed; add comment to xml command

9967a62

update .shed owners

8e937d7

bernt-matthias reviewed Nov 7, 2025


jchchiu added 5 commits November 7, 2025 22:28

remove translate

a6ff62e

docs: update .shed

c354605

refactor: split concat into separate tool

a4fc62f

refactor: add input and output format as shared macro

6a56045

refactor: add macro for changing output format

426a577

jchchiu added 8 commits November 11, 2025 17:38

refactor: move info to macros

c757008

refactor: change tool id/name; remove info macro

1509d85

docs: update categories; reduce actions

6872743

refactor: rename output format

c77e246

refactor: move 'split' subcommand into separate tool

582d254

refactor: change output pattern

bc9bebd

refactor: move 'replicate' subcommand into separate tool

dc15ac1

docs: add more help to explain what partitions are

a279552

SaimMomin12 reviewed Nov 11, 2025


SaimMomin12 changed the title ~~Add amas(1.0) tool~~ New tool addition: amas tool Nov 11, 2025

jchchiu added 2 commits November 12, 2025 11:10

refactor: move 'summary' subcommand into separate tool

1d901f5

temp: move 'remove' subcommand into separate tool

77241c3

jchchiu requested a review from bernt-matthias November 17, 2025 04:25

bernt-matthias reviewed Nov 18, 2025


jchchiu added 9 commits November 19, 2025 12:21

refactor: set format depending on part_format

2d2349b

style: changed formatting of output files

0e62561

fix: updated version command

4af9562

tests: changed concat test from sim size to exact

cfcfca9

refactor: simplified change_format

d4b84ac

fix: updated/fixed concat test

51bb36e

fix: added nex format to allowed inputs for partitions

ff762fb

docs: updated help

3d9424b

style: fix lint

18a8396

jchchiu added 3 commits November 19, 2025 17:45

fix: split subcommand does not work with RAxML or NEXUS formatted partitions; removed with note and more info

bd9a818

docs: added some comments for future

0aae4cb

style: cleaned up indenting

96395ca

jchchiu requested a review from bernt-matthias November 20, 2025 03:50

jchchiu added 9 commits November 27, 2025 23:08

draft: added small helper script to check interleave

0318c99

draft: cleaner but less informative

6d18f6b

draft: removed interleave from input formats

7b84c28

draft: changed python script to iterate line by line instead of loading whole file

71a5be4

draft: added test data for nexus interleave check

e40a22e

draft: refactor to make more clean/efficient

ae8d825

draft: fix python flake8 lint

2ea8f01

draft: fix python flake8 w504 lint

9db5d83

draft: removed io usage and added utf-8 encoding

ce8812f

bernt-matthias reviewed Nov 28, 2025


jchchiu added 2 commits December 1, 2025 17:21

feat: added check for interleaved files

db929eb

feat: added interleaved check to all subcommands

3a108f1

jchchiu requested a review from bernt-matthias December 1, 2025 06:38

bernt-matthias reviewed Dec 1, 2025

