Commit b79cb21

Merge pull request #87 from jaebeom-kim/windows
Metabuli v1.0.7
2 parents cc0493c + 6bfce54 commit b79cb21

1 file changed

+37
-15
lines changed

README.md

Lines changed: 37 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -1,12 +1,20 @@
11
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/metabuli/README.html)
22
# Metabuli
3-
Metabuli is metagenomic classifier that jointly analyze both DNA and amino acid (AA) sequences.
4-
DNA-based classifiers can make specific classifications, exploiting point mutations to distinguish close taxa.
5-
AA-based classifiers have higher sensitivity in detecting homology between query and reference sequences, leverageing higher conservation of AA sequences.
6-
Metabuli combines the information of both sequence types using a novel k-mer structure, _metamer_, to enable both specific and sensitive characterization of metagenomic samples.
7-
In addition, it can classify reads against a database of any size as long as it fits in the hard disk.
3+
***Metabuli*** classifies metagenomic reads by comparing them to reference genomes. You can use Metabuli to profile the taxonomic composition of your samples or to detect specific (pathogenic) species.
84

9-
For more details of Metabuli, please see
5+
***Sensitive and Specific.*** Metabuli uses a novel k-mer structure, called *metamer*, to analyze both amino acid (AA) and DNA sequences. It leverages AA conservation for sensitive homology detection and DNA mutations for specific differentiation between closely related taxa.
6+
7+
***A laptop is enough.*** Metabuli operates within user-specified RAM limits, allowing it to search any database that fits in storage. A PC with 8 GiB of RAM is sufficient for most analyses.
8+
9+
***A few clicks are enough.*** A GUI is available [here](https://github.com/steineggerlab/Metabuli-App). You can run Metabuli and browse the results with just a few clicks on your PC.
10+
11+
***Short reads, long reads, and contigs.*** Metabuli can classify all types of sequences.
12+
13+
14+
---
15+
16+
17+
For more details, please see
1018
[Nature Methods](https://www.nature.com/articles/s41592-024-02273-y),
1119
[PDF](https://www.nature.com/articles/s41592-024-02273-y.epdf?sharing_token=je_2D5Su0-xVOSjuKSAXF9RgN0jAjWel9jnR3ZoTv0M7gE7NDF_xi_3sW8QdRiwfSJNwqaXItSoeCvr7cvcoQxKLt0oROgWc6urmki9tP80cXEuHPN0D7b4y9y3i8Yv7sZw8MxxhAj7W6p9eZE2zaK3eozdOkXvwADVfso9cXIM%3D),
1220
[bioRxiv](https://www.biorxiv.org/content/10.1101/2023.05.31.543018v2), or [ISMB 2023 talk](https://www.youtube.com/watch?v=vz2fuRcVwyk).
@@ -15,16 +23,28 @@ Please cite: [Kim J, Steinegger M. Metabuli: sensitive and specific metagenomic
1523

1624
<p align="center"><img src="https://raw.githubusercontent.com/steineggerlab/Metabuli/master/.github/marv_metabuli_small.png" height="350" /></p>
1725

18-
## Update in v1.0.6
26+
---
27+
### 🖥️ GUI apps for Windows, MacOS, and Linux are [here](https://github.com/steineggerlab/Metabuli-App).
28+
---
29+
### Update in v1.0.7
30+
- **Metabuli became faster 🚀**
31+
- Windows: *8.3* times faster
32+
- MacOS: *1.7* times faster
33+
- Linux: *1.3* times faster
34+
  - Test details are in the release note.
35+
- Fixed a bug in score calculation that could affect classification results.
36+
### Update in v1.0.6
1937
- Windows OS is supported.
20-
> We found Metabuli is too slow with Windows OS. Currently making it faster.
38+
> Metabuli v1.0.6 is too slow on Windows OS. Please use v1.0.7 or later.
39+
2140

22-
## Update in v1.0.4
41+
### Update in v1.0.4
2342
- Fixed a minor reproducibility issue.
2443
- Fixed a performance-harming bug occurring with sequences containing lowercased bases.
2544
- Auto adjustment of `--match-per-kmer` parameter. Issue #20 solved.
2645
- Record version info. in `db.parameter`
2746

47+
---
2848
## Installation
2949
### Precompiled binaries
3050
```
@@ -40,8 +60,9 @@ wget https://mmseqs.com/metabuli/metabuli-linux-sse2.tar.gz; tar xvzf metabuli-l
4060
4161
# MacOS (Universal, works on Apple Silicon and Intel Macs)
4262
wget https://mmseqs.com/metabuli/metabuli-osx-universal.tar.gz; tar xvzf metabuli-osx-universal.tar.gz; export PATH=$(pwd)/metabuli/bin/:$PATH
63+
4364
```
44-
Metabuli also works on Linux ARM64 systems. Please check [https://mmseqs.com/metabuli](https://mmseqs.com/metabuli) for static builds for other architectures.
65+
Metabuli also works on Linux ARM64 and Windows systems. Please check [https://mmseqs.com/metabuli](https://mmseqs.com/metabuli) for static builds for other architectures.
4566

4667
### Compile from source code
4768
To compile Metabuli from source code use the following commands:
@@ -123,7 +144,9 @@ metabuli classify --seq-mode 3 read.fna dbdir outdir jobid
123144
- PacBio Sequel II reads: `--min-score 0.005`
124145
- ONT reads: `--min-score 0.008`
125146

126-
This will generate two result files: `JobID_classifications.tsv`, `JobID_report.tsv`, and `JobID_krona.html`.
147+
This will generate three result files: `JobID_classifications.tsv`, `JobID_report.tsv`, and `JobID_krona.html`.
148+
> Sankey diagram is available in the [GUI app](https://github.com/steineggerlab/Metabuli-App).
149+
127150
#### JobID_classifications.tsv
128151
1. Classified or not
129152
2. Read ID
@@ -134,7 +157,6 @@ This will generate two result files: `JobID_classifications.tsv`, `JobID_report.
134157
7. List of "taxID : k-mer match count"
135158

136159
```
137-
#Example
138160
1 read_1 2688 294 0.627551 subspecies 2688:65
139161
1 read_2 2688 294 0.816327 subspecies 2688:78
140162
0 read_3 0 294 0 no rank
@@ -143,7 +165,6 @@ This will generate two result files: `JobID_classifications.tsv`, `JobID_report.
143165
#### JobID_report.tsv
144166
The proportion of reads that are assigned to each taxon.
145167
```
146-
#Example
147168
33.73 77571 77571 0 no rank unclassified
148169
66.27 152429 132 1 no rank root
149170
64.05 147319 2021 8034 superkingdom d__Bacteria
@@ -164,9 +185,10 @@ The proportion of reads that are assigned to each taxon.
164185
It is for an interactive taxonomy report (Krona). You can use any modern web browser to open `JobID_krona.html`.
165186
<p align="left"><img src="https://raw.githubusercontent.com/steineggerlab/Metabuli/master/.github/image.png" height="350" /></p>
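
For readers who want to post-process the results, the `JobID_classifications.tsv` rows shown earlier could be read with a small script. This is only a sketch, not part of Metabuli: it assumes the file is tab-delimited, that the "taxID : k-mer match count" pairs in the last column are separated by whitespace, and the dictionary keys used for columns 3 to 6 (`taxid`, `length`, `score`, `rank`) are hypothetical names inferred from the example rows, since this diff does not show those column descriptions.

```python
import csv
import io

# Rows copied from the README example and re-joined with tabs.
# Assumption: the real file is tab-delimited, as the .tsv suffix suggests.
SAMPLE = (
    "1\tread_1\t2688\t294\t0.627551\tsubspecies\t2688:65\n"
    "1\tread_2\t2688\t294\t0.816327\tsubspecies\t2688:78\n"
    "0\tread_3\t0\t294\t0\tno rank\t\n"
)


def parse_classifications(handle):
    """Yield one dict per read from a JobID_classifications.tsv-like stream."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Unclassified rows may have no seventh column, so pad to 7 fields.
        row = row + [""] * (7 - len(row))
        kmer_matches = {}
        # Assumption: "taxID:count" pairs are separated by whitespace.
        for item in row[6].split():
            taxid, count = item.split(":")
            kmer_matches[int(taxid)] = int(count)
        yield {
            "classified": row[0] == "1",   # column 1: classified or not
            "read_id": row[1],             # column 2: read ID
            "taxid": int(row[2]),          # hypothetical name for column 3
            "length": int(row[3]),         # hypothetical name for column 4
            "score": float(row[4]),        # hypothetical name for column 5
            "rank": row[5],                # hypothetical name for column 6
            "kmer_matches": kmer_matches,  # column 7: taxID -> match count
        }


if __name__ == "__main__":
    records = list(parse_classifications(io.StringIO(SAMPLE)))
    classified = sum(r["classified"] for r in records)
    print(f"{classified}/{len(records)} reads classified")
```

To use it on real output, replace `io.StringIO(SAMPLE)` with an opened file handle, for example `open("JobID_classifications.tsv", newline="")`.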
166187

167-
#### Resource requirements
188+
189+
### Resource requirements
168190
Metabuli can classify reads against a database of any size as long as the database fits on the hard disk, regardless of the machine's RAM size.
169-
We tested it with a MacBook Air (2020, M1, 8 GiB), where we classified about 1.5 M paired-end 150 bp reads (~5 GiB in size) against a database built with ~23K prokaryotic genomes (~69 GiB in size)
191+
We tested it with a MacBook Air (2020, M1, 8 GiB), where we classified about 15 M paired-end 150 bp reads (~5 GiB in size) against a database built with ~23K prokaryotic genomes (~69 GiB in size).
170192

171193
## Custom database
172194
To build a custom database, you need three things:
