Releases: steineggerlab/Metabuli
Metabuli v1.1.1
- Version packaged in Metabuli App
- Import the latest MMseqs2 as a git submodule
- Added FASTA/Q format validators: `fastq_utils` and `fasta_validator`
- Added database validation function: `validatedb`
- Added `classifiedRefiner` for filtering or manipulating per-read classification result file.
- Improved `createnewtaxalist`.
- Improved thread-safety of the database creation process.
Metabuli v1.1.0
- Fix errors in v1.0.9
- Custom DB creation became easier
- Improve `updateDB` command
Metabuli v1.0.9
DB creation process improved
- Added `updateDB` module for adding new sequences to an existing database.
- Added `--cds-info` parameter in the `build` module. Users can provide CDS information to skip Prodigal's gene prediction.
  - Currently, only NCBI RefSeq or GenBank CDS files (*cds_from_genomic.fna) are supported.
  - For the accessions included in the files, the provided CDS info will be used, skipping Prodigal's gene prediction.
- Added `--max-ram` parameter to the `build` module.
- Added compatibility with taxdump files generated using taxonkit.
- 1.0.9-2: Fixes for Bioconda
Metabuli v1.0.8
- Added `extract` module: It extracts reads classified under a specific taxon at any rank. It can be used after running `classify`.
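The idea behind extracting reads "under a specific taxon at any rank" can be illustrated with a minimal Python sketch. It is not Metabuli's implementation: the toy taxonomy, the read names, the `classifications` dictionary standing in for `classify` output, and the helper names `is_under` and `extract_reads` are all assumptions made for illustration.

```python
# Toy taxonomy (child -> parent). All names are hypothetical.
PARENT = {
    "species_A": "genus_X",
    "species_B": "genus_X",
    "species_C": "genus_Y",
    "genus_X": "family_F",
    "genus_Y": "family_F",
}


def is_under(taxon, target):
    """Return True if `taxon` is `target` or one of its descendants."""
    node = taxon
    while node is not None:
        if node == target:
            return True
        node = PARENT.get(node)  # None once we walk past the top of the tree
    return False


def extract_reads(classifications, target):
    """Keep the reads whose assigned taxon falls under `target`."""
    return [read for read, taxon in classifications.items() if is_under(taxon, target)]


# Hypothetical per-read assignments; None marks an unclassified read.
classifications = {
    "read1": "species_A",
    "read2": "species_C",
    "read3": "genus_X",
    "read4": None,
}

print(extract_reads(classifications, "genus_X"))
print(extract_reads(classifications, "family_F"))
```

Because the check walks up the lineage, the target can sit at any rank: a read assigned to a species is kept when the target is its genus or family, and a read assigned directly to the target taxon is kept as well.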
Metabuli v1.0.7
Metabuli became faster than v1.0.6
- Dataset
  - Query: SRR24315757_1.fastq, SRR24315757_2.fastq
    - 22,107,398 paired-end reads
    - 6,632,219,400 nt in total
  - DB: GTDB
    - Complete Genome or Chromosome level assemblies
    - CheckM completeness > 90 and contamination < 5
    - 36,203 genomes from 8,465 species
- Windows: ~8.3 times faster
  - Machine: Intel(R) Core(TM) i9-9900 CPU, 32GB RAM
  - `--max-ram`: 32
  - `--threads`: 8
    - v1.0.6: 825s for the first 587,593 reads (2.7% of all). Total time not measured
    - v1.0.7: 100s for the first 587,593 reads. 1h 7m 22s in total
- MacOS: ~1.7 times faster
  - Machine: MacBook Pro 14-inch 2023, M2 Pro chip, 32GB RAM
  - `--max-ram`: 32
  - `--threads`: 8
    - v1.0.6: 71m 34s
    - v1.0.7: 42m 58s
- Linux: ~1.3 times faster
  - Machine: A server with 64-core AMD EPYC 7742 CPU and 1 TB of RAM
  - `--max-ram`: 128
  - `--threads`: 32
    - v1.0.6: 13m 34s
    - v1.0.7: 9m 58s
  - `--threads`: 64
    - v1.0.6: 9m 36s
    - v1.0.7: 7m 19s
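The speedup factors quoted above can be rechecked from the listed timings. The following Python sketch only redoes that arithmetic; the helper names `seconds` and `speedup` are illustrative, and the numbers are copied from the benchmark notes above.

```python
def seconds(hours=0, minutes=0, secs=0):
    """Convert an h/m/s wall-clock time to seconds."""
    return hours * 3600 + minutes * 60 + secs


def speedup(old, new):
    """How many times faster the new run is than the old one."""
    return old / new


# (v1.0.6 time, v1.0.7 time) in seconds, taken from the figures above.
timings = {
    "Windows, 8 threads (first 587,593 reads)": (825, 100),
    "MacOS, 8 threads": (seconds(minutes=71, secs=34), seconds(minutes=42, secs=58)),
    "Linux, 32 threads": (seconds(minutes=13, secs=34), seconds(minutes=9, secs=58)),
    "Linux, 64 threads": (seconds(minutes=9, secs=36), seconds(minutes=7, secs=19)),
}

for label, (old, new) in timings.items():
    print(f"{label}: {speedup(old, new):.2f}x")

# Share of the dataset covered by the partial Windows measurement.
fraction = 587_593 / 22_107_398
print(f"Partial Windows run covers {fraction:.1%} of all reads")
```

Note that the Windows factor compares only the first 587,593 reads, since the total v1.0.6 time was not measured on that machine.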
Metabuli v1.0.6
Windows OS is supported
Metabuli v1.0.5
The CMake file was edited to pass the Bioconda PR test.
Other than that it is the same as v1.0.4.
Metabuli v1.0.4
- Fixed a minor reproducibility issue.
- Fixed a performance-harming bug occurring with sequences containing lowercased bases.
- Auto adjustment of `--match-per-kmer` parameter. Issue #20 solved.
- Record version info. in `db.parameter`
Metabuli v1.0.3
- New parameter: `--tie-ratio` in `classify` module. [default 0.95]
  When the best matching species has a score MAX, any species with `score >= (MAX * --tie-ratio)` is considered tied with the best score. When tied species occur for a read, the read is classified into their LCA.
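The tie rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Metabuli's actual code: the toy taxonomy, the species scores, and the function names `lineage`, `lca`, and `classify_read` are all hypothetical.

```python
# Toy taxonomy (child -> parent). All names are hypothetical.
PARENT = {
    "species_A": "genus_X",
    "species_B": "genus_X",
    "species_C": "genus_Y",
    "genus_X": "family_F",
    "genus_Y": "family_F",
    "family_F": "root",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, inclusive."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lca(taxa):
    """Lowest common ancestor of a non-empty list of taxa in the toy tree."""
    common = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        common &= set(lineage(taxon))
    # Walking up one lineage, the first shared node is the lowest one.
    for node in lineage(taxa[0]):
        if node in common:
            return node
    raise ValueError("taxa share no ancestor")


def classify_read(species_scores, tie_ratio=0.95):
    """Return the best species, or the LCA of all species tied with it."""
    best = max(species_scores.values())
    tied = [s for s, score in species_scores.items() if score >= best * tie_ratio]
    return tied[0] if len(tied) == 1 else lca(tied)


# species_B is within 95% of the best score, so the read goes to the shared genus.
print(classify_read({"species_A": 0.90, "species_B": 0.87, "species_C": 0.50}))
# species_B falls below the threshold, so the best species wins outright.
print(classify_read({"species_A": 0.90, "species_B": 0.80, "species_C": 0.50}))
```

With the default ratio of 0.95 and a best score of 0.90, the threshold is 0.855, so a score of 0.87 counts as a tie and 0.80 does not.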
Metabuli v1.0.2
- `--accession-level` option for `build` and `classify` workflow: It reports not only the taxon but also the accession of the best match.
- Fix minor bugs in `build` and `classify` workflow.
- Generate `taxonomyDB` during `build` and load it during `classify` workflow for faster loading of taxonomy information.
- Support gzipped FASTA/FASTQ files in `add-to-library` and `classify` workflows.
- Low-complexity filtering in `build` workflow as default with `--mask-prob 0.9`.