Commit 6ebddef

ADD dev branch
1 parent 9204030 commit 6ebddef

12 files changed: +1880 -814 lines changed

.github/workflows/actions.yml

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.5, 3.6, 3.7, 3.8]
+        python-version: [3.6, 3.7, 3.8]
     steps:
     - uses: actions/checkout@v2
     - name: Set up Python ${{ matrix.python-version }}
@@ -19,4 +19,4 @@ jobs:
     - name: Test a single transcript
       run: |
         # Test the script by retrieving a transcript data
-        python gnomad_python_api.py -filter_by="gene_name" -search_by="TP53" -dataset="gnomad_r2_1"
+        python gnomad_python_cli.py -filter_by=gene_name -search_by="BRCA1" -dataset="gnomad_r2_1" -sv_dataset="gnomad_sv_r2_1"
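
The flags exercised by this test step (`-filter_by`, `-search_by`, `-dataset`, `-sv_dataset`) are documented in the README.md changes of this commit. As a rough illustration only, this is how such single-dash flags could be declared with Python's `argparse`. The parser below is an assumption made for illustration, not code from `gnomad_python_cli.py`, whose actual argument handling is not shown in this diff. The allowed values and defaults mirror the README option table; `build_parser` is a hypothetical name.

```python
import argparse

# Allowed values copied from the README option table in this commit.
FILTERS = ["gene_name", "gene_id", "transcript_id", "rs_id"]
DATASETS = [
    "exac", "gnomad_r2_1", "gnomad_r3", "gnomad_r2_1_controls",
    "gnomad_r2_1_non_neuro", "gnomad_r2_1_non_cancer", "gnomad_r2_1_non_topmed",
]
SV_DATASETS = [
    "gnomad_sv_r2_1", "gnomad_sv_r2_1_controls", "gnomad_sv_r2_1_non_neuro",
]


def build_parser():
    """Hypothetical parser; the real script may be structured differently."""
    parser = argparse.ArgumentParser(description="Sketch of the CLI options")
    parser.add_argument("-filter_by", required=True, choices=FILTERS)
    parser.add_argument("-search_by", required=True)
    # Defaults follow the README note: gnomad_r2_1 and gnomad_sv_r2_1.
    parser.add_argument("-dataset", choices=DATASETS, default="gnomad_r2_1")
    parser.add_argument("-sv_dataset", choices=SV_DATASETS,
                        default="gnomad_sv_r2_1")
    return parser


# Parse an explicit argument list, in the same form the workflow step uses.
args = build_parser().parse_args(["-filter_by=gene_name", "-search_by=BRCA1"])
print(args.filter_by, args.search_by, args.dataset, args.sv_dataset)
```

With `choices` set, an unsupported dataset name is rejected at parse time, which is one way to keep the documented value lists and the accepted values in sync.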

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+.ipynb_checkpoints
+outputs/
+outputs/*

LICENSE

Lines changed: 21 additions & 674 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 87 additions & 19 deletions
@@ -1,61 +1,129 @@
-# 🧬 gnomAD Python API (Batch Script)
+# 🧬 gnomAD Python API

 ![Actions for gnomad_python_api](https://github.com/furkanmtorun/gnomad_python_api/workflows/Actions%20for%20gnomad_python_api/badge.svg)
+![Python Badges](https://img.shields.io/badge/Tested_with_Python-3.6%20%7C%203.7%20%7C%203.8-blue)
+![gnomAD Python API License](https://img.shields.io/badge/License-%20GPL--3.0-green)
+
+- [🧬 gnomAD Python API](#-gnomad-python-api)
+- [:hash: What is *gnomAD* and the purpose of this script?](#hash-what-is-gnomad-and-the-purpose-of-this-script)
+- [:hash: Requirements and Installation](#hash-requirements-and-installation)
+- [:hash: GUI | Usage](#hash-gui--usage)
+- [:hash: CLI | Usage & Options](#hash-cli--usage--options)
+- [:hash: CLI | Example Usages](#hash-cli--example-usages)
+- [:hash: Disclaimer](#hash-disclaimer)
+- [:hash: Contributing & Feedback](#hash-contributing--feedback)
+- [:hash: Citation](#hash-citation)
+- [:hash: Developer](#hash-developer)

 ## :hash: What is *gnomAD* and the purpose of this script?
 [gnomAD (The Genome Aggregation Database)](http://gnomad.broadinstitute.org/) is aggregation of thousands of exomes and genomes human sequencing studies. Also, gnomAD consortium annotates the variants with allelic frequency in genomes and exomes.
-**Here**, this batch script is able to search the genes or transcripts of your interest and retrieve variant data from the database via [gnomAD backend API](https://gnomad.broadinstitute.org/api) that based on GraphQL query language.
+
+**Here**, this API with both CLI and GUI versions is able to search the genes or transcripts of your interest and retrieve variant data from the database via [gnomAD backend API](https://gnomad.broadinstitute.org/api) that based on GraphQL query language.

 ## :hash: Requirements and Installation
-- Create a directory and download the "**gnomad_python_api.py**" and "**requirements.txt**" files or clone the repository via Git using following command:
+- Create a directory and download the "**gnomad_python_cli.py**" and "**requirements.txt**" files or clone the repository via Git using following command:

 `git clone https://github.com/furkanmtorun/gnomad_python_api.git`

 - Install the required packages if you do not already:

-` pip3 install -r requirements.txt `
+` pip3 install -r requirements.txt`
+
+> The `requirements.txt` contains required libraries for both GUI (graphical user interface) and CLI (command-line interface) versions.

 - It's ready to use now!

 > If you did not install **pip** yet, please follow the instruction [here](https://pip.pypa.io/en/stable/installing/).

-## :hash: Usage & Options
+## :hash: GUI | Usage
+
+In the GUI version of gnomAD Python API, [Streamlit](https://www.streamlit.io/) has been used.
+
+> **Note:** In GUI version, it is possible to generate plots from the data retrieved.
+> This option is not available in CLI version since it is still under development.
+>
+> **So, it is recommended to use GUI version.**
+
+Here are the screenshots for the GUI version:
+
+![gnomAD Python API GUI](img/main_screen.png)
+
+_gnomAD Python API GUI - Main Screen_
+
+![gnomAD Python API GUI](img/results.png)
+
+_gnomAD Python API GUI - Outputs_
+
+![gnomAD Python API GUI](img/results_2.png)
+
+_gnomAD Python API GUI - Outputs and Plots_
+
+> The outputs are also saved into `outputs/` folder in the GUI version.
+
+## :hash: CLI | Usage & Options
 | Options in the script | Description | Parameters |
 |--|--|--|
-| -filter_by | *It defines the input type* |gene_name, gene_id, transcript_id |
-| -search_by | *It defines the input* | Type a gene/transcript identifier <br> *e.g.: TP53, ENSG00000169174, ENST00000544455* <br> Type the name of file containig your inputs <br> *e.g: myGenes.txt*
-| -dataset | *It defines the dataset* | exac, gnomad_r2_1, gnomad_r3, gnomad_r2_1_controls, gnomad_r2_1_non_neuro, gnomad_r2_1_non_cancer, gnomad_r2_1_non_topmed
-| -h | It displays the parameters | *To get help via script:* `python gnomad_python_api.py -h`
+| -filter_by | *It defines the input type.* |`gene_name`, `gene_id`, `transcript_id`, or `rs_id` |
+| -search_by | *It defines the input.* | Type a gene/transcript identifier <br> *e.g.: TP53, ENSG00000169174, ENST00000544455* <br> Type the name of file containig your inputs <br> *e.g: myGenes.txt*
+| -dataset | *It defines the dataset.* | `exac`, `gnomad_r2_1`, `gnomad_r3`, `gnomad_r2_1_controls`, `gnomad_r2_1_non_neuro`, `gnomad_r2_1_non_cancer`, or `gnomad_r2_1_non_topmed`
+| -sv_dataset | *It defines structural variants dataset* | `gnomad_sv_r2_1`, `gnomad_sv_r2_1_controls`, or `gnomad_sv_r2_1_non_neuro`
+| -h | *It displays the parameters.* | *To get help via script:* `python gnomad_python_cli.py -h`
+
+
+> ❗ Here, for getting variants, `gnomad_r2_1` and `gnomad_sv_r2_1` are defined as default values for these two `-dataset` and `-sv_dataset` options, respectively.
+>

-## :hash: Example Usages
+## :hash: CLI | Example Usages
 - **How to list the variants by gene name or gene id?**

-`python gnomad_python_api.py -filter_by="gene_name" -search_by="TP53" -dataset="gnomad_r2_1"`
+*For gene name:*

-> Here, "**gene_id**" can also be used instead of "**gene_name**" after stating an **Ensembl Gene ID** instead of a gene name.
+`python gnomad_python_cli.py -filter_by=gene_name -search_by="BRCA1" -dataset="gnomad_r2_1" -sv_dataset="gnomad_sv_r2_1"`
+
+*For Ensembl gene ID*
+
+`python gnomad_python_cli.py -filter_by=gene_id -search_by="ENSG00000169174" -dataset="gnomad_r2_1" -sv_dataset="gnomad_sv_r2_1"`

 - **How to list the variants by transcript ID?**

-`python gnomad_python_api.py -filter_by="transcript_id" -search_by="ENST00000544455" -dataset="gnomad_r3"`
+`python gnomad_python_cli.py -filter_by=transcript_id -search_by="ENST00000407236" -dataset="gnomad_r2_1"`
+
+- **How to get variant info by RS ID (rsId)?**
+
+`python gnomad_python_cli.py -filter_by=rs_id -search_by="rs201857604" -dataset="gnomad_r2_1"`

 - **How to list the variants using a file containing genes/transcripts?**

-- Prepare your file that contains gene name, Ensembl gene IDs or Ensembl transcript IDs line-by-line.
+- Prepare your file that contains gene name, Ensembl gene IDs, Ensembl transcript IDs or RS IDs line-by-line.
 > ENSG00000169174 <br> ENSG00000171862 <br> ENSG00000170445

 - Then, run the following command:

-`python gnomad_python_api.py -filter_by="gene_id" -search_by="myFavoriteGenes.txt" -dataset="exac"`
+`python gnomad_python_cli.py -filter_by="gene_id" -search_by="myFavoriteGenes.txt" -dataset="gnomad_r2_1" -sv_dataset="gnomad_sv_r2_1"`

-> Please, use only one type of identifier in the file.
+> Please, use only one type of identifier in the file.

-- Then, the variants will be listed in "**outputs**" folder in the files according to their identifier (gene name, gene id or transcript id).
+- Then, the variants will be listed in "**outputs**" folder in the folders according to their identifier (gene name, gene id, transcript id or rsId).
+
 - That's all!

+## :hash: Disclaimer
+All the outputs provided by this tool are for informational purposes only.
+
+The information is not intended to replace any consultation, diagnosis, and/or medical treatment offered by physicians or healthcare providers.
+
+The author of the app will not be liable for any direct, indirect, consequential, special, exemplary, or other damages arising therefrom.
+
 ## :hash: Contributing & Feedback
-I would be very happy to see any feedbacks and contributions on the script.
+I would be very happy to see any feedback or contributions to the project.
+
+For problems and enhancement requests, please `open an issue` above.

-**Furkan Torun | [[email protected]](mailto:[email protected]) | Website: [furkanmtorun.github.io](https://furkanmtorun.github.io/)**
+## :hash: Citation
+Upcoming !

+## :hash: Developer
+**Furkan M. Torun ([@furkanmtorun](http://github.com/furkanmtorun)) | [[email protected]](mailto:[email protected]) |
+Academia: [Google Scholar Profile](https://scholar.google.com/citations?user=d5ZyOZ4AAAAJ)**

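
The batch workflow described in the README.md changes (a file with one identifier per line, and only one identifier type per file) can be illustrated with a small helper. This is a sketch under stated assumptions, not code from the repository: the function names `classify_identifier` and `load_identifiers` are hypothetical, the prefix rules (`ENSG`, `ENST`, `rs`) are a simplification based on the example identifiers shown in the README, and treating `-search_by` as a file whenever a file of that name exists is also an assumption.

```python
import re
from pathlib import Path

# Simplified patterns based on the example identifiers in the README.
# Anything matching none of them is treated as a gene name (assumption).
PATTERNS = {
    "gene_id": re.compile(r"^ENSG\d+$"),
    "transcript_id": re.compile(r"^ENST\d+$"),
    "rs_id": re.compile(r"^rs\d+$"),
}


def classify_identifier(identifier):
    """Return the -filter_by value that best matches one identifier."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(identifier):
            return kind
    return "gene_name"


def load_identifiers(search_by):
    """Treat search_by as a file if it exists, else as a single identifier.

    Blank lines are skipped. Raises ValueError when identifier types are
    mixed, mirroring the README advice to use one type per file.
    """
    path = Path(search_by)
    if path.is_file():
        lines = path.read_text().splitlines()
        identifiers = [line.strip() for line in lines if line.strip()]
    else:
        identifiers = [search_by]
    kinds = {classify_identifier(item) for item in identifiers}
    if len(kinds) > 1:
        raise ValueError(f"mixed identifier types: {sorted(kinds)}")
    return identifiers
```

Checking the identifier type before any request is sent would let a mixed file fail early, rather than partway through a batch of lookups.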

0 commit comments
