@@ -52,6 +52,7 @@ other flavor of SQL).
 | Hetionet | =hetionet= | Many - see =populate-ontology.py= | [[https://github.com/hetio/hetionet/tree/master/hetnet/tsv][GitHub]] | [[Hetionet]] |
 | NCBI Gene | =ncbigene= | Genes | [[https://ftp.ncbi.nlm.nih.gov/gene/DATA/GENE_INFO/Mammalia/Homo_sapiens.gene_info.gz][Homo_sapiens.gene_info.gz]] | [[NCBI Gene]] |
 | Drugbank | =drugbank= | Drugs / drug candidates | [[https://go.drugbank.com/releases/latest#open-data][DrugBank website]] | [[Drugbank]] |
+| DisGeNET | =disgenet= | Diseases and disease-gene edges | [[https://www.disgenet.org/][DisGeNET]] | [[DisGeNET]] |
 | | | | | |
 
 *** Hetionet
@@ -84,6 +85,34 @@ Drug Links", click the "Download" button on the row labeled
 file, and make sure it is named =drug_links.csv= (some versions use a
 space instead of an underscore in the filename).
 
+*** DisGeNET
+Although DisGeNET is available under a Creative Commons license,
+the database requires a free user account to download the
+tab-delimited data files. Create an account and log in, then
+navigate to the Downloads page on the DisGeNET website. From
+there, download the two required files by clicking the
+corresponding links:
+- "UMLS CUI to several disease vocabularies" (listed under the section
+  heading of the same name; the resulting file will be named
+  =disease_mappings.tsv.gz=)
+- "UMLS CUI to top disease classes" (the resulting file will be named
+  =disease_mappings_to_attributes.tsv.gz=)
+Both files are gzip-compressed, so extract them into the =disgenet/=
+directory using whichever tool you prefer (for example, =gunzip= on the
+command line or 7-Zip on Windows).
+
+Once you have both data files, run the AlzKB filtering script, which
+keeps only the rows in those files that correspond to Alzheimer's
+disease. This Python script lives in the =scripts/= directory of the
+AlzKB repository. If you already have a local copy of the
+repository, use the script from there; otherwise, get it from the
+AlzKB repository page on GitHub.
+
+Run the script from within the =disgenet/= directory. It should
+write two filtered data files to the =disgenet/CUSTOM/=
+subdirectory. These files are detected and used automatically when
+you run the ontology population script.
+
 ** SQL data sources
 If you don't already have MySQL installed, install it. We recommend
 using either a package manager (if one is available on your OS), or
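
The extraction step described in the DisGeNET section could also be scripted instead of done by hand. The following is a minimal sketch using only the Python standard library; it is not part of AlzKB. The helper name `extract_gzip_files` is hypothetical, and the sketch assumes the downloads are plain gzip-compressed files that have already been placed in the `disgenet/` directory.

```python
import gzip
import shutil
from pathlib import Path


def extract_gzip_files(directory):
    """Decompress every *.gz file in `directory`, keeping the originals.

    Hypothetical helper, not part of AlzKB. Returns the paths of the
    decompressed files.
    """
    extracted = []
    for archive in sorted(Path(directory).glob("*.gz")):
        # Drop only the final ".gz": "disease_mappings.tsv.gz" becomes
        # "disease_mappings.tsv" in the same directory.
        target = archive.with_suffix("")
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream, so large files are fine
        extracted.append(target)
    return extracted
```

Calling `extract_gzip_files("disgenet")` from the directory that contains `disgenet/` would leave the decompressed `.tsv` files next to the downloaded archives.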
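
To make the filtering step in the DisGeNET section more concrete, here is a rough sketch of what such a filter could look like. It is not the AlzKB script from `scripts/`, whose name and logic are not shown in this diff. Everything here is an assumption for illustration: the function `filter_rows`, a tab-delimited input with a header row, a disease-name column called `name`, and a case-insensitive substring match on "alzheimer".

```python
import csv
from pathlib import Path


def filter_rows(src, dst, column="name", keyword="alzheimer"):
    """Copy rows of a tab-delimited file whose `column` contains `keyword`.

    Hypothetical illustration, not the AlzKB script. The column name and
    keyword are assumptions; adjust them to the real file layout.
    Returns the number of data rows written.
    """
    dst = Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)  # e.g. create CUSTOM/
    kept = 0
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin, delimiter="\t")
        writer = csv.DictWriter(
            fout,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
            extrasaction="ignore",  # tolerate rows with surplus fields
        )
        writer.writeheader()
        for row in reader:
            # Missing values come back as None, so fall back to "".
            if keyword.lower() in (row.get(column) or "").lower():
                writer.writerow(row)
                kept += 1
    return kept
```

Run from within `disgenet/`, a call such as `filter_rows("disease_mappings.tsv", "CUSTOM/disease_mappings.tsv")` would write the matching rows into the `CUSTOM/` subdirectory; the output filename here is likewise a placeholder, since the names the real script produces are not stated.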