@@ -97,21 +97,27 @@ corresponding links:
  name will be =disease_mappings.tsv.gz=)
 - "UMLS CUI to top disease classes" (the resulting file will be named
  =disease_mappings_to_attributes.tar.gz=)
- Both files are gzipped, so extract them into the =disgenet/= directory
- using your favorite method (e.g., gunzip from the command line, 7zip
- from within Windows, etc.).
-
- Now that you have the two data files, you should run the AlzKB script
- we wrote to filter for rows in those files corresponding to
- Alzheimer's Disease, named =alzkb_parse_disgenet.py=. This script is
- in the =scripts/= directory of the AlzKB repository, so either find it
- on your local filesystem if you already have a copy of the repository,
- or find it on the AlzKB GitHub repository in your web browser.
+ Next, download =curated_disease_gene_associations.tsv.gz= directly by
+ copying the following URL into your web browser:
+ https://www.disgenet.org/static/disgenet_ap1/files/downloads/curated_disease_gene_associations.tsv.gz
+
+ All three files are gzipped, so extract them into the =disgenet/=
+ directory using your favorite method (e.g., gunzip from the command
+ line, 7zip from within Windows, etc.).
+
+ Now that you have the three necessary data files, you should run the
+ AlzKB script we wrote to filter for rows in those files corresponding
+ to Alzheimer's Disease, named =alzkb_parse_disgenet.py=. This script
+ is in the =scripts/= directory of the AlzKB repository, so either find
+ it on your local filesystem if you already have a copy of the
+ repository, or find it on the AlzKB GitHub repository in your web
+ browser.
 
 You can then run the Python script from within the =disgenet/=
 directory, which should deposit two filtered data files in the
 =disgenet/CUSTOM/= subdirectory. These will be automatically detected
- and used when you run the ontology population script.
+ and used when you run the ontology population script, along with the
+ unmodified =curated_disease_gene_associations.tsv= file.
 
 ** SQL data sources
 If you don't already have MySQL installed, install it. We recommend
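
The extraction step described in the hunk above ("extract them into the =disgenet/= directory using your favorite method") can also be done from Python. The following is a minimal sketch, not part of the AlzKB repository: the helper name =extract_gzipped= is hypothetical, and it assumes the downloaded files already sit in the target directory and end in =.tsv.gz=. Archives with other suffixes (for example a =.tar.gz=) are deliberately left untouched.

```python
import gzip
import shutil
from pathlib import Path


def extract_gzipped(directory):
    """Decompress every *.tsv.gz file in `directory`, writing each result
    next to its source file. Hypothetical helper, not part of AlzKB."""
    extracted = []
    for src in sorted(Path(directory).glob("*.tsv.gz")):
        # with_suffix("") drops only the trailing ".gz", leaving "name.tsv"
        dest = src.with_suffix("")
        with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
            # Stream the data so large files are not loaded into memory
            shutil.copyfileobj(fin, fout)
        extracted.append(dest)
    return extracted
```

Calling =extract_gzipped("disgenet")= from the directory that contains =disgenet/= would leave the original =.gz= files in place and return the paths of the decompressed =.tsv= files.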