BUILD.org: 125 additions & 3 deletions
@@ -78,8 +78,10 @@ should be needed). You'll notice that it creates two output files
 that are used while populating the ontology.
 
 *** Drugbank
-Navigate to the Download page on the Drugbank website (linked
-above). Select the "External Links" tab. In the table titled "External
+To download the Academic DrugBank datasets, you first need to create a free DrugBank account and verify your email address. After you verify your email address, DrugBank may ask for more information about your account, such as how you plan to use DrugBank, a description of your organization, who is sponsoring the research, and what the end goal of the research is. In our experience, account approval can take from several business days to weeks.
+
+After your access has been approved, navigate to the Academic Download page on the DrugBank website (linked
+above) by selecting the "Download" tab and then "Academic Download". Select the "External Links" tab. In the table titled "External
 Drug Links", click the "Download" button on the row labeled
 "All". This will download a zip file. Extract the contents of that zip
 file, and make sure it is named =drug_links.csv= (some versions use a
@@ -119,6 +121,8 @@ directory, which should deposit two filtered data files in the
 and used when you run the ontology population script, along with the
+Next, create a directory to hold all of the raw data files. It can be =D:\data\= or any other location you prefer. Inside it, create one folder for each third-party database, and place that database's individual CSV/TSV/TXT files in its folder.
+
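The directory layout described above can be sketched in Python. This is only an illustration: the helper name `create_raw_data_dirs` and the relative root `data` are assumptions, and the folder names are taken from the example paths in this guide, so substitute your own.

```python
from pathlib import Path


def create_raw_data_dirs(root, databases):
    """Create one subdirectory per third-party database under `root`."""
    root = Path(root)
    created = []
    for name in databases:
        db_dir = root / name
        # parents=True also creates `root`; exist_ok=True makes reruns harmless.
        db_dir.mkdir(parents=True, exist_ok=True)
        created.append(db_dir)
    return created


if __name__ == "__main__":
    # Hypothetical root directory; on Windows this could be "D:/data" instead.
    for path in create_raw_data_dirs("data", ["hetionet", "disgenet"]):
        print(path)
```

Each downloaded CSV/TSV/TXT file then goes into the folder of the database it came from.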
 ** SQL data sources
 If you don't already have MySQL installed, install it. We recommend
 using either a package manager (if one is available on your OS), or
@@ -252,8 +256,71 @@ define mappings using these parser objects. We won't replicate every
 mapping in this guide for brevity, but you can see all of them in the
 full AlzKB build script.
 *** Configuration for 'flat file' (e.g., CSV) data sources
+#+begin_src python
+hetionet.parse_node_type(
+    node_type="Symptom",
+    source_filename="hetionet-v1.0-nodes.tsv",
+    fmt="tsv",
+    parse_config={
+        "iri_column_name": "name",
+        "headers": True,
+        "filter_column": "kind",
+        "filter_value": "Symptom",
+        "data_transforms": {
+            "id": lambda x: x.split("::")[-1]
+        },
+        "data_property_map": {
+            "id": onto.xrefMeSH,
+            "name": onto.commonName
+        }
+    },
+    merge=False,
+    skip=False
+)
+#+end_src
+This block indicates that the third-party database is Hetionet and that the source file is =hetionet-v1.0-nodes.tsv=.
+
+If your raw data directory is =D:\data\=, the file it will look for is =D:\data\hetionet\hetionet-v1.0-nodes.tsv=.
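To make the `filter_column`, `filter_value`, and `data_transforms` entries more concrete, here is a minimal sketch of what such settings plausibly do to each row. It does not use the real parser: the helper `filter_and_transform` and the sample rows are hypothetical, and the actual behavior of `parse_node_type` may differ.

```python
import csv
import io

# Hypothetical sample with the id, name, and kind columns used in the config.
SAMPLE_TSV = (
    "id\tname\tkind\n"
    "Symptom::D000001\tExample symptom\tSymptom\n"
    "Gene::0000\tExample gene\tGene\n"
)


def filter_and_transform(handle, filter_column, filter_value, data_transforms):
    """Yield rows where filter_column equals filter_value, with transforms applied."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row[filter_column] != filter_value:
            continue  # Skip rows of other node kinds.
        for column, transform in data_transforms.items():
            row[column] = transform(row[column])
        yield row


rows = list(
    filter_and_transform(
        io.StringIO(SAMPLE_TSV),
        filter_column="kind",
        filter_value="Symptom",
        # Same transform as in the config: keep the part after the last "::".
        data_transforms={"id": lambda x: x.split("::")[-1]},
    )
)
print(rows)
```

Under these assumptions, only the row whose `kind` is `Symptom` survives, and its `id` loses the `Symptom::` prefix before being mapped to a data property.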
+
+Some of the configuration blocks have a =CUSTOM/= prefix in the filename. This means that we created the file manually and that it must be stored in a =CUSTOM= subdirectory of the database folder. For example:
+#+begin_src python
+disgenet.parse_node_type(
+    node_type="Disease",
+    source_filename="CUSTOM/disease_mappings_to_attributes_alzheimer.tsv",  # Filtered for just Alzheimer disease
+    fmt="tsv-pandas",
+    parse_config={
+        "iri_column_name": "diseaseId",
+        "headers": True,
+        "data_property_map": {
+            "diseaseId": onto.xrefUmlsCUI,
+            "name": onto.commonName,
+        }
+    },
+    merge=False,
+    skip=False
+)
+#+end_src
+With =D:\data\= as the raw data directory, this file will be =D:\data\disgenet\CUSTOM\disease_mappings_to_attributes_alzheimer.tsv=.
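The lookup rule illustrated by the two examples (raw data directory, then the database folder, then `source_filename`) can be sketched as follows. `resolve_source_path` is a hypothetical helper, not part of the build script, and the build script may assemble paths differently. `PureWindowsPath` is used only so that the result is rendered with backslashes on any operating system.

```python
from pathlib import PureWindowsPath


def resolve_source_path(data_root, database, source_filename):
    """Join the raw data directory, the database folder, and the source filename."""
    return PureWindowsPath(data_root) / database / source_filename


# A plain filename resolves directly inside the database folder.
print(resolve_source_path("D:\\data", "hetionet", "hetionet-v1.0-nodes.tsv"))

# A "CUSTOM/" prefix resolves into the CUSTOM subdirectory of that folder.
print(
    resolve_source_path(
        "D:\\data",
        "disgenet",
        "CUSTOM/disease_mappings_to_attributes_alzheimer.tsv",
    )
)
```

Keeping the forward slash in `source_filename` is harmless in this sketch, because `PureWindowsPath` accepts either separator.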