Command line tool to query the Global Microbial smORFs Catalog (GMSC)

## Installation
+
### Source
- Clone GMSC-mapper repository
+
+ Clone the GMSC-mapper repository and run the installation script.

``` bash
git clone https://github.com/BigDataBiology/GMSC-mapper.git
+ cd GMSC-mapper
+ ./install.sh
```

- Create conda environment (only support python v3.8/v3.9)
+ The script creates a conda environment (Python 3.9) called **gmscmapper**
+ in the `envs/` folder inside the GMSC-mapper main directory.
+ To activate this environment:

``` bash
- conda create -n gmscmapper python=3.8
- conda activate gmscmapper
- or
- conda create -n gmscmapper python=3.9
- conda activate gmscmapper
+ conda activate /path/to/GMSC-mapper/envs/gmscmapper
```

- You will need the following dependencies:
+ During installation, the script also installs the following dependencies:

- [MMseqs2](https://github.com/soedinglab/MMseqs2)
- [Diamond](https://github.com/bbuchfink/diamond)
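
Once the environment is activated, it can be useful to confirm that the expected executables are actually on `PATH`. The following is a minimal sketch, not part of the installation script: `check_tools` is a hypothetical helper, and the binary names `mmseqs` and `diamond` are assumed to be the executables that MMseqs2 and Diamond provide.

```bash
# check_tools NAME...: print "missing: NAME" for every executable that is not
# on PATH, and return non-zero if at least one is missing.
# (Hypothetical helper for illustration; not shipped with GMSC-mapper.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names: gmsc-mapper (used throughout this README),
# mmseqs (MMseqs2) and diamond (Diamond).
if check_tools gmsc-mapper mmseqs diamond; then
    echo "all tools found"
else
    echo "activate the gmscmapper environment and try again"
fi
```

If anything is reported as missing, the most likely cause is that the conda environment created by the installer has not been activated in the current shell.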

- The easiest way to install the dependencies is with [conda](https://conda.io):
+ The installation script also runs a series of tests on mock datasets to check that the installation works:
+
+ 1. Input is genome contig sequences.
+
+ ``` bash
+ gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
+ ```
+
+ 2. Input is amino acid sequences.

``` bash
- conda install -c conda-forge -c bioconda mmseqs2
- conda install -c bioconda -c conda-forge diamond=2.0.13
+ gmsc-mapper --aa-genes ../examples/example.faa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
```

- Once the dependencies are installed, you can install GMSC-mapper by running:
+ 3. Input is nucleotide gene sequences.

``` bash
- cd GMSC-mapper
- python setup.py install
+ gmsc-mapper --nt-genes ../examples/example.fna --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
+ ```
+
+ 4. Select the alignment tool (Diamond or MMseqs2) with `--tool`; this flag is optional.
+
+ ``` bash
+ gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool mmseqs
+
+ gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool diamond
+ ```
+
+ 5. Disable habitat/taxonomy/quality annotation with the corresponding flags.
+
+ ``` bash
+ gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --nohabitat --notaxonomy --noquality
```
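
The three input-mode checks above differ only in the input flag and file, so they can be rerun by hand with a small wrapper. This is a sketch under assumptions, not the installer's own test code: `run_case` and the `DRY_RUN` variable are hypothetical names, and the relative `../examples` paths assume the working directory is `GMSC-mapper/gmsc_mapper`. By default it only prints each command.

```bash
# Reference files for the mock datasets (paths copied from the checks above).
EXAMPLES=../examples
COMMON="--db $EXAMPLES/targetdb.dmnd --habitat $EXAMPLES/ref_habitat.txt --quality $EXAMPLES/ref_quality.txt --taxonomy $EXAMPLES/ref_taxonomy.txt"

# run_case INPUT_FLAG INPUT_FILE: print the full command; execute it only when
# DRY_RUN=0. DRY_RUN is a hypothetical switch introduced for this sketch.
run_case() {
    cmd="gmsc-mapper $1 $2 $COMMON"
    echo "$cmd"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Unquoted on purpose: relies on word splitting, so paths must not
        # contain spaces.
        $cmd || return 1
    fi
}

run_case -i "$EXAMPLES/example.fa"          # genome contigs
run_case --aa-genes "$EXAMPLES/example.faa" # amino acid sequences
run_case --nt-genes "$EXAMPLES/example.fna" # nucleotide gene sequences
```

Setting `DRY_RUN=0` before calling `run_case` executes the commands instead of only printing them.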

## Usage
@@ -58,41 +80,6 @@ cd gmsc_mapper
gmsc-mapper createdb -i ../examples/target.faa -o ../examples -m mmseqs
```

- #### Default
-
- Please make `GMSC-mapper/gmsc_mapper` as your work directory.
-
- GMSC database/habitat/taxonomy/quality file path and output directory path can be assigned on your own. Default is `GMSC-mapper/db` and `GMSC-mapper/output`.
-
- 1. Input is genome contig sequences.
-
- ``` bash
- gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
- ```
-
- 2. Input is amino acid sequences.
-
- ``` bash
- gmsc-mapper --aa-genes ../examples/example.faa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
- ```
-
- 3. Input is nucleotide gene sequences.
-
- ``` bash
- gmsc-mapper --nt-genes ../examples/example.fna --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
- ```
-
- #### Alignment tool: Diamond/MMseqs2 is optional
- If you want to change alignment tool (Diamond/MMseqs2), you can use `--tool`.
- ``` bash
- gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool mmseqs
- ```
-
- #### Habitat/taxonomy/quality annotation is optional
- If you don't want to annotate habitat/taxonomy/quality you can use `--nohabitat`/`--notaxonomy`/`--noquality`.
- ``` bash
- gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --nohabitat --notaxonomy --noquality
- ```
### Real data Usage
#### Create GMSC database index
`-o`: Path to database output directory (default: `GMSC-mapper/db`).
@@ -202,4 +189,4 @@ Subcommands: `gmsc-mapper createdb`

* `-o/--output`: Path to database output directory.

- * `-m/--mode`: Alignment tool (Diamond/MMseqs2)
+ * `-m/--mode`: Alignment tool (Diamond/MMseqs2)
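
As an illustration of how `-m/--mode` might be used to prepare an index for each alignment tool, the sketch below prints one `createdb` command per mode. Only `-m mmseqs` appears in this README; `-m diamond` is an assumption that mirrors the `--tool diamond` value, and `build_index` is a hypothetical helper. The commands are printed rather than executed.

```bash
# build_index MODE: validate MODE, then print the matching createdb command.
# "diamond" as a value for -m is an assumption (the README only shows
# "-m mmseqs"); build_index itself is a hypothetical helper.
build_index() {
    case "$1" in
        diamond|mmseqs) ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    echo "gmsc-mapper createdb -i ../examples/target.faa -o ../examples -m $1"
}

for mode in diamond mmseqs; do
    build_index "$mode"
done
```

Validating the mode up front keeps a typo from reaching the tool; the accepted values should be checked against `gmsc-mapper createdb --help` before relying on them.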