Commit 28e8615

Update README.md
1 parent 4812071 commit 28e8615

1 file changed

+55
-2
lines changed

README.md

Lines changed: 55 additions & 2 deletions
@@ -1,2 +1,55 @@
1-
# gnomad_python_api
2-
🧬 That gnomAD Python API script can be used to retrieve data from gnomAD "genome aggregation database".
1+
# gnomAD Python API (Batch Script)
2+
3+
## :hash: What is *gnomAD* and what is the purpose of this script?
4+
[gnomAD (The Genome Aggregation Database)](http://gnomad.broadinstitute.org/) is an aggregation of thousands of exomes and genomes from human sequencing studies. The gnomAD consortium also annotates the variants with their allele frequencies in genomes and exomes.
5+
This batch script searches for the genes or transcripts of interest and retrieves their variant data from the database via the [gnomAD backend API](https://gnomad.broadinstitute.org/api), which is based on the GraphQL query language.
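
To make the GraphQL part more concrete, here is a minimal sketch of how a request body for such a query could be assembled in Python. The helper name `build_query_payload`, the mapping from `-filter_by` values to query arguments, and the field names inside the query (`gene_symbol`, `reference_genome`, `variants`, `variant_id`) are assumptions made for illustration and are not taken from the script; please check them against the API schema before relying on them.

```python
import json

# Assumed mapping from the script's -filter_by values to GraphQL root
# fields and arguments (hypothetical, verify against the API schema).
FILTER_TO_ARGUMENT = {
    "gene_name": ("gene", "gene_symbol"),
    "gene_id": ("gene", "gene_id"),
    "transcript_id": ("transcript", "transcript_id"),
}


def build_query_payload(filter_by, search_by, dataset):
    """Return a GraphQL-over-HTTP request body as a dict (hypothetical helper)."""
    root_field, argument = FILTER_TO_ARGUMENT[filter_by]
    # gnomAD v3 is aligned to GRCh38; ExAC and gnomAD v2.1 use GRCh37.
    reference_genome = "GRCh38" if dataset == "gnomad_r3" else "GRCh37"
    query = (
        "{ "
        f"{root_field}({argument}: {json.dumps(search_by)}, "
        f"reference_genome: {reference_genome}) "
        f"{{ variants(dataset: {dataset}) {{ variant_id }} }} "
        "}"
    )
    return {"query": query}


if __name__ == "__main__":
    payload = build_query_payload("gene_name", "TP53", "gnomad_r2_1")
    print(json.dumps(payload, indent=2))
```

The resulting dictionary would typically be sent as the JSON body of a POST request to the API endpoint linked above.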
6+
7+
## :hash: Requirements and Installation
8+
- Create a directory and download the "**gnomad_python_api.py**" and "**requirements.txt**" files, or clone the repository via Git using the following command:
9+
`git clone https://github.com/furkanmtorun/gnomad_python_api.git`
10+
11+
- Install the required packages if you have not done so already:
12+
`pip3 install -r requirements.txt`
13+
14+
- It's ready to use now!
15+
16+
> If you have not installed **pip** yet, please follow the instructions [here](https://pip.pypa.io/en/stable/installing/).
17+
18+
## :hash: Usage & Options
19+
| Options in the script | Description | Parameters |
20+
|--|--|--|
21+
| -filter_by | *It defines the input type* |gene_name, gene_id, transcript_id |
22+
| -search_by | *It defines the input* | Type a gene/transcript identifier <br> *e.g.: TP53, ENSG00000169174, ENST00000544455* <br> or type the name of a file containing your inputs <br> *e.g.: myGenes.txt* |
23+
| -dataset | *It defines the dataset* | exac, gnomad_r2_1, gnomad_r3, gnomad_r2_1_controls, gnomad_r2_1_non_neuro, gnomad_r2_1_non_cancer, gnomad_r2_1_non_topmed |
24+
| -h | *It displays the parameters* | *To get help via the script:* `python gnomad_python_api.py -h` |
25+
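
As an illustration of how the options in the table above could be declared, here is a small `argparse` sketch. It is not the script's actual parser: the function name `build_parser`, the help texts, and the choice to mark every option as required are assumptions.

```python
import argparse

# Dataset names as listed in the options table.
DATASETS = [
    "exac", "gnomad_r2_1", "gnomad_r3", "gnomad_r2_1_controls",
    "gnomad_r2_1_non_neuro", "gnomad_r2_1_non_cancer", "gnomad_r2_1_non_topmed",
]


def build_parser():
    """Build a parser mirroring the options table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Retrieve variant data from gnomAD.")
    parser.add_argument("-filter_by", required=True,
                        choices=["gene_name", "gene_id", "transcript_id"],
                        help="type of the input identifier")
    parser.add_argument("-search_by", required=True,
                        help="an identifier, or the name of a file listing identifiers")
    parser.add_argument("-dataset", required=True, choices=DATASETS,
                        help="dataset to query")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["-filter_by", "gene_name", "-search_by", "TP53", "-dataset", "gnomad_r2_1"]
)
print(args.filter_by, args.search_by, args.dataset)
```

Using `choices` lets the parser reject an unsupported input type or dataset name before any request is made.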
26+
### Example Usages
27+
- **How to list the variants by gene name or gene id?**
28+
`python gnomad_python_api.py -filter_by="gene_name" -search_by="TP53" -dataset="gnomad_r2_1"`
29+
30+
> "**gene_id**" can be used instead of "**gene_name**" if you supply an **Ensembl gene ID** rather than a gene name.
31+
32+
- **How to list the variants by transcripts?**
33+
`python gnomad_python_api.py -filter_by="transcript_id" -search_by="ENST00000544455" -dataset="gnomad_r3"`
34+
35+
- **How to list the variants by using a file containing genes/transcripts?**
36+
37+
- Prepare a file that contains gene names, Ensembl gene IDs, or Ensembl transcript IDs, one per line.
38+
> ENSG00000169174 <br> ENSG00000171862 <br> ENSG00000170445
39+
40+
- Then, run the following command:
41+
42+
`python gnomad_python_api.py -filter_by="gene_id" -search_by="myFavoriteGenes.txt" -dataset="exac"`
43+
44+
> Please use only one type of identifier per file.
45+
46+
- The variants will then be listed in the "**outputs**" folder, in separate files according to their identifier (gene name, gene ID, or transcript ID).
47+
- That's all!
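
The file-based workflow above can be sketched as follows. This is only an assumption about how a `-search_by` value might be interpreted, not the script's actual code; the helper name `load_identifiers` is hypothetical.

```python
import os


def load_identifiers(search_by):
    """Return the identifiers to query (hypothetical helper).

    If `search_by` names an existing file, read one identifier per line and
    skip blank lines; otherwise treat the value itself as a single identifier.
    """
    if os.path.isfile(search_by):
        with open(search_by, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    return [search_by]
```

With this approach, a single identifier and a file of identifiers both end up as a list, so the rest of a script could loop over them in the same way.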
48+
49+
## :hash: Contributing & Feedback
50+
I would be very happy to receive any feedback on the script and contributions to it.
51+
52+
**Furkan Torun | [[email protected]](mailto:[email protected]) | Web site: [furkanmtorun.github.io](https://furkanmtorun.github.io/)**
53+
54+
55+
