Commit f2591a8

Add results and code of mapper (#17)
* Create README.md
* Add sihumix results
* Add codes for generating all the results
* Create README.md
* Add results of fecal sample
1 parent a340ad7 commit f2591a8

20 files changed: +585452 -0 lines changed

mapper/fecal/README.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Results for fecal data

mapper/fecal/Readme.txt

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+Introduction:
+The global microbial smORF catalog (GMSC) is constructed from GMGCv2 and proGenomes2.
+GMGC is an integrated and consistently processed gene catalog of the microbial world, combining metagenomics and sequenced isolates.
+(Coelho, L.P., Alves, R., del Río, Á.R. et al. Towards the biogeography of prokaryotic genes. Nature 601, 252–256 (2022). https://doi.org/10.1038/s41586-021-04233-4)
+proGenomes2 is a prokaryotic genome resource that provides consistent taxonomic and functional annotations as well as habitat-specific representative genomes.
+(Daniel R Mende, Ivica Letunic, Oleksandr M Maistrenko, Thomas S B Schmidt, Alessio Milanese, Lucas Paoli, Ana Hernández-Plaza, Askarbek N Orakov, Sofia K Forslund, Shinichi Sunagawa, Georg Zeller, Jaime Huerta-Cepas, Luis Pedro Coelho, Peer Bork, proGenomes2: an improved database for accurate and consistent habitat, taxonomic and functional annotations of prokaryotic genomes, Nucleic Acids Research, Volume 48, Issue D1, 08 January 2020, Pages D621–D625, https://doi.org/10.1093/nar/gkz1002)
+Habitats in GMSC are annotated using GMGCv2 metadata.
+Taxonomy in GMSC is annotated using GTDB.
+Quality in GMSC is assessed by computational checks (RNAcode, AntiFam) and experimental checks (metatranscriptomic, metaproteomic, and Ribo-seq data). High quality means that a smORF passes the computational checks and has at least one line of experimental evidence.
+
+Methods:
+We use Macrel to predict smORFs from contigs.
+(Santos-Júnior CD, Pan S, Zhao XM, Coelho LP. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ. 2020 Dec 18;8:e10555. doi: 10.7717/peerj.10555.)
+We use Diamond to map the predicted smORFs against our non-redundant 90AA smORF catalog. (90AA: we clustered our raw predicted smORFs at 90% identity and 90% coverage.)
+
+Results:
+macrel.out.smorfs.faa : All smORFs predicted by Macrel
+diamond.result.filtered.habitat.taxa.quality.tsv : Diamond results annotated with habitat, taxonomy, and quality. (Cutoffs: identity 0.9, E-value 1e-5)
+diamond.result.filtered.faa : All mapped and filtered smORFs
+habitat.tsv : Habitat annotation for each mapped smORF
+taxonomy.tsv : Taxonomy annotation for each mapped smORF
+quality.tsv : Quality annotation for each mapped smORF
+summary.txt : Basic statistics of the results
+geo.png : Geographical and habitat distribution map
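
Note on the quality rule in Readme.txt: a smORF counts as high quality when it passes the computational checks and has at least one kind of experimental support. A minimal sketch of that rule is given here for illustration only. The function name and the boolean inputs are hypothetical, it assumes that both RNAcode and AntiFam must pass, and it says nothing about how quality.tsv actually encodes the result.

```python
def is_high_quality(passes_rnacode: bool,
                    passes_antifam: bool,
                    in_metatranscriptome: bool,
                    in_metaproteome: bool,
                    in_riboseq: bool) -> bool:
    """Return True when a smORF meets the Readme's high-quality rule.

    Assumption: "passes computational checking" means passing BOTH
    RNAcode and AntiFam; the Readme does not spell this out.
    """
    # Computational checks: both must pass (assumed).
    computational_ok = passes_rnacode and passes_antifam
    # Experimental checks: any one source of evidence is enough.
    experimental_ok = any((in_metatranscriptome, in_metaproteome, in_riboseq))
    return computational_ok and experimental_ok


# Passes both computational checks, supported only by metaproteomic data.
example_high = is_high_quality(True, True, False, True, False)
# Passes the computational checks but has no experimental support.
example_no_evidence = is_high_quality(True, True, False, False, False)
```

Under these assumptions the first example qualifies as high quality and the second does not.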

mapper/fecal/diamond.result.filtered.faa

Lines changed: 156628 additions & 0 deletions
Large diffs are not rendered by default.
21.1 MB
Binary file not shown.
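
Note on the filtering step: Readme.txt lists the cutoffs for the filtered Diamond results as identity 0.9 and E-value 1e-5. The sketch below shows one way such a filter could be applied. It assumes the hits are in Diamond's default 12-column tabular output (qseqid, sseqid, pident, ..., evalue, bitscore), where identity is reported as a percentage, so 0.9 becomes 90.0. It also assumes both cutoffs are inclusive. The function name and the sequence identifiers in the example are hypothetical, and this is not the script used in the commit.

```python
import csv
import io

MIN_IDENTITY = 90.0  # Readme cutoff "identity:0.9", expressed as a percentage
MAX_EVALUE = 1e-5    # Readme cutoff "E-value:1e-5"


def filter_hits(handle, min_identity=MIN_IDENTITY, max_evalue=MAX_EVALUE):
    """Yield tabular hit rows that satisfy both cutoffs.

    Assumes the default 12-column layout: column 3 (index 2) is the
    percent identity and column 11 (index 10) is the E-value.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident = float(row[2])
        evalue = float(row[10])
        if pident >= min_identity and evalue <= max_evalue:
            yield row


# Three made-up hits: only the first meets both cutoffs.
example = io.StringIO(
    "smorf_1\tGMSC_A\t95.0\t40\t2\t0\t1\t40\t1\t40\t1e-20\t80.1\n"
    "smorf_2\tGMSC_B\t85.0\t40\t6\t0\t1\t40\t1\t40\t1e-10\t60.5\n"
    "smorf_3\tGMSC_C\t92.0\t40\t3\t0\t1\t40\t1\t40\t1e-3\t30.2\n"
)
kept = [row[0] for row in filter_hits(example)]
print(kept)
```

With a real file, the same function can be called on an open file handle in place of the `io.StringIO` example.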

mapper/fecal/geo.png

56.8 KB

mapper/fecal/habitat.tsv

Lines changed: 78314 additions & 0 deletions
Large diffs are not rendered by default.

mapper/fecal/macrel.out.smorfs.faa

Lines changed: 163510 additions & 0 deletions
Large diffs are not rendered by default.
