Commit 0477159

simplify installation procedures
1 parent 55fd9b4 commit 0477159

1 file changed: +37 -50 lines changed

README.md

Lines changed: 37 additions & 50 deletions
@@ -3,40 +3,62 @@
 Command line tool to query the Global Microbial smORFs Catalog (GMSC)
 
 ## Installation
+
 ### Source
-Clone GMSC-mapper repository
+
+Clone GMSC-mapper repository and execute our installation script.
 
 ```bash
 git clone https://github.com/BigDataBiology/GMSC-mapper.git
+cd GMSC-mapper
+./install.sh
 ```
 
-Create conda environment(only support python v3.8/v3.9)
+It should create a conda environment (python vv3.9) called **gmscmapper**
+inserted in the folder `envs/` located in the GMSC-mapper main location.
+To call this environment:
 
 ```bash
-conda create -n gmscmapper python=3.8
-conda activate gmscmapper
-or
-conda create -n gmscmapper python=3.9
-conda activate gmscmapper
+conda activate /path/to/GMSC-mapper/envs/gmscmapper
 ```
 
-You will need the following dependencies:
+During the process, we install also the following dependencies:
 
 - [MMseqs2](https://github.com/soedinglab/MMseqs2)
 - [Diamond](https://github.com/bbuchfink/diamond)
 
-The easiest way to install the dependencies is with [conda](https://conda.io):
+And perform a series of tests using mock datasets to check if the installation works well:
+
+1. Input is genome contig sequences.
+
+```bash
+gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
+```
+
+2. Input is amino acid sequences.
 
 ```bash
-conda install -c conda-forge -c bioconda mmseqs2
-conda install -c bioconda -c conda-forge diamond=2.0.13
+gmsc-mapper --aa-genes ../examples/example.faa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
 ```
 
-Once the dependencies are installed, you can install GMSC-mapper by running:
+3. Input is nucleotide gene sequences.
 
 ```bash
-cd GMSC-mapper
-python setup.py install
+gmsc-mapper --nt-genes ../examples/example.fna --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
+```
+
+4. Check the Alignment tool: Diamond/MMseqs2 is optional
+
+```bash
+gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool mmseqs
+
+gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool diamond
+```
+
+5. Flags to disable results from Habitat/taxonomy/quality annotation
+
+```bash
+gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --nohabitat --notaxonomy --noquality
 ```
 
 ## Usage
@@ -58,41 +80,6 @@ cd gmsc_mapper
 gmsc-mapper createdb -i ../examples/target.faa -o ../examples -m mmseqs
 ```
 
-#### Default
-
-Please make `GMSC-mapper/gmsc_mapper` as your work directory.
-
-GMSC database/habitat/taxonomy/quality file path and output directory path can be assigned on your own.Default is `GMSC-mapper/db` and `GMSC-mapper/output`.
-
-1. Input is genome contig sequences.
-
-```bash
-gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
-```
-
-2. Input is amino acid sequences.
-
-```bash
-gmsc-mapper --aa-genes ../examples/example.faa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
-```
-
-3. Input is nucleotide gene sequences.
-
-```bash
-gmsc-mapper --nt-genes ../examples/example.fna --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt
-```
-
-#### Alignment tool: Diamond/MMseqs2 is optional
-If you want to change alignment tool(Diamond/MMseqs2), you can use `--tool`.
-```bash
-gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --tool mmseqs
-```
-
-#### Habitat/taxonomy/quality annotation is optional
-If you don't want to annotate habitat/taxonomy/quality you can use `--nohabitat`/`--notaxonomy`/`--noquality`.
-```bash
-gmsc-mapper -i ../examples/example.fa --db ../examples/targetdb.dmnd --habitat ../examples/ref_habitat.txt --quality ../examples/ref_quality.txt --taxonomy ../examples/ref_taxonomy.txt --nohabitat --notaxonomy --noquality
-```
 ### Real data Usage
 #### Create GMSC database index
 `-o`: Path to database output directory.(default: `GMSC-mapper/db`)
@@ -202,4 +189,4 @@ Subcommands: `gmsc-mapper createdb`
 
 * `-o/--output`: Path to database output directory.
 
-* `-m/--mode`: Alignment tool(Diamond/MMseqs2)
+* `-m/--mode`: Alignment tool(Diamond/MMseqs2)
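
The revised README says `install.sh` creates the `gmscmapper` environment and installs MMseqs2 and Diamond into it. A minimal sketch of how one might confirm that after activating the environment is shown below. The helper `check_tools` is hypothetical and not part of GMSC-mapper, and the executable names `mmseqs` and `diamond` in the commented example are assumptions about what the environment puts on `PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of GMSC-mapper): report which of the given
# commands are available on PATH. Returns non-zero if any of them is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (assumed executable names), to be run after
#   conda activate /path/to/GMSC-mapper/envs/gmscmapper
# check_tools gmsc-mapper mmseqs diamond
```

If the function reports a missing tool, the environment was probably not activated by its full path, or the installation script did not finish.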
