Description
Is your feature request related to an existing issue or bug?
no
Is your new feature related to a general problem?
Annotation of large secretion systems (T3SS, T4SS, T6SS, etc.) is often spotty, and gene names differ a lot between organisms. It's also not clear whether these are predicted to be complete structures. Likewise, it's not always clear where integron systems are located and which genes are in cassettes.
Describe the solution you'd like
It would be really nice if Bakta could run MacSyFinder (https://github.com/gem-pasteur/macsyfinder) with the TXSScan models (https://github.com/macsy-models/TXSScan) and incorporate the results into the annotation. Running IntegronFinder (https://github.com/gem-pasteur/Integron_Finder) and marking cassette borders in the annotations would be helpful too. Similarly, IS element boundaries could be marked with ISEScan (https://github.com/xiezhq/ISEScan), and prophage regions with DBSCAN-SWA (https://github.com/HIT-ImmunologyLab/DBSCAN-SWA/).
Describe alternatives you've considered
I totally understand that not every analysis tool or pipeline could or even should be added to Bakta. Alternatively, if adding these as options is too time-consuming or complex, some way to run them separately and then use a Bakta script to update the annotations with this information would be great. This would also be nice for custom annotations of features from an input table file, like integrated mobile element repeats, Agrobacterium T-DNA borders, etc.
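To make the "input table" idea a bit more concrete, here is a rough sketch of what such an update step might look like. It is not an existing Bakta script or API. The table columns (`seqid`, `start`, `end`, `strand`, `type`, `name`) and both function names are assumptions made up for illustration, and it only touches the GFF3 output as plain text.

```python
import csv
import io


def table_to_gff3_lines(table_text, source="custom"):
    """Convert tab-separated feature rows into GFF3 feature lines.

    Assumed (hypothetical) header: seqid, start, end, strand, type, name.
    Names are assumed to contain no GFF3-reserved characters (; = & ,).
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    lines = []
    for row in reader:
        attributes = f"Name={row['name']}"
        lines.append("\t".join([
            row["seqid"], source, row["type"], row["start"], row["end"],
            ".", row["strand"], ".", attributes,
        ]))
    return lines


def merge_into_gff3(gff3_text, extra_lines):
    """Insert extra feature lines before a ##FASTA section, else append them."""
    out = []
    inserted = False
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA") and not inserted:
            out.extend(extra_lines)
            inserted = True
        out.append(line)
    if not inserted:
        out.extend(extra_lines)
    return "\n".join(out) + "\n"


# Example: one custom feature taken from a user-supplied table.
table = (
    "seqid\tstart\tend\tstrand\ttype\tname\n"
    "contig_1\t100\t124\t+\trepeat_region\tT-DNA right border\n"
)
gff3 = (
    "##gff-version 3\n"
    "contig_1\tBakta\tCDS\t1\t90\t.\t+\t0\tID=x\n"
    "##FASTA\n>contig_1\nACGT\n"
)
print(merge_into_gff3(gff3, table_to_gff3_lines(table)), end="")
```

This sketch does not re-sort features by position or update the other output formats (GenBank, EMBL, JSON), which a real implementation would presumably need to handle.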
Thanks!