This repository stores and explains the work for my master's thesis, which can be found [here].
Briefly: I used a tool called USTAR to compress genome datasets and evaluated how well this compression works in tandem with various tools.
I ran the tools using Singularity on a computing cluster. For more information, see the dedicated section: Singularity.
USTAR: GitHub page here
Fulgor: GitHub page here
GGCAT: GitHub page here
Kmdiff: GitHub page here
Mash: GitHub page here
REINDEER2: GitHub page here
A summary of all the tools' results is reported in spreadsheet format inside the tools folder.
You can find info about the datasets in the datasets section.
I also modified USTAR to add functionality it previously lacked; all the details are in this folder (I suggest reading the Singularity section first).