Description
Hi Jim, thanks for the amazing tool. We're starting to use it more and more within our team and really like it so far.
We use GTDB as the standard database and although the database is great, it's collapsed a couple of clinically relevant species into comprehensive species clusters. We end up having to supplement GTDB to still be able to differentiate several species of interest.
The new consolidated database functionality in v0.3 is great for reducing our inode footprint, but it means we now have to rebuild the whole GTDB set of representative genomes even if we only want to add a handful of new genomes.
Would it be possible to have some sort of middle-ground approach, where a user can specify multiple consolidated databases? This would really facilitate using skani in a more modular way, e.g. swapping out databases for different analyses or easily adding new genomes without rebuilding the whole set.
I have little understanding of the internal workings of skani so I have no idea whether this is possible or how much work this would require. Feel free to close the issue if this doesn't make sense!