CALL db.relationshipTypes() YIELD relationshipType as type
CALL apoc.cypher.run('MATCH ()-[:`'+type+'`]->() RETURN count(*) as count',{}) YIELD value
RETURN type, value.count ORDER BY type
#+end_src

* 4.: Adding new data resources, nodes, relationships, and properties.
In version 2.0, we added "TranscriptionFactor" nodes, "TRANSCRIPTIONFACTORINTERACTSWITHGENE" relationships, the node properties "chromosome" (chromosome number) and "sourcedatabase", and the relationship properties "correlation", "score", "p_fisher", "z_score", "affinity_nm", "confidence", "sourcedatabase", and "unbiased".
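
Once the graph has been built and imported into Memgraph (section 5 below), a query along the following lines can be used to spot-check that the new entities and relationship properties are present. This is a minimal sketch: it assumes the labels, relationship type, and property names listed above, and that the interaction relationships point from "TranscriptionFactor" to "Gene" nodes.

#+begin_src cypher
// Spot-check the new transcription factor data after import (sketch).
// Counts interactions and how many of them carry the new relationship properties.
MATCH (:TranscriptionFactor)-[r:TRANSCRIPTIONFACTORINTERACTSWITHGENE]->(:Gene)
RETURN count(r) AS interactions,
       count(r.confidence) AS withConfidence,
       count(r.sourcedatabase) AS withSourceDatabase;
#+end_src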
To achieve this, we added the above entities to the ontology RDF, now named =alzkb_v2.rdf=, in the =alzkb\data= directory. Then, collect the additional source data files as detailed in the table below.
| Source | Directory name | Entity type(s) | URL | Extra instructions |

Download =trrust_rawdata.human.tsv= from the TRRUST download page. Install DoRothEA in R by following the DoRothEA installation instructions. Place =trrust_rawdata.human.tsv= and =alzkb_parse_dorothea.py= inside the =Dorothea/= subdirectory, which should be within your raw data directory (e.g., =D:\data=). Run =alzkb_parse_dorothea.py=; it creates a =tf.tsv= file that is used while populating the ontology.
** Replicate Hetionet Resources
Since Hetionet is no longer regularly updated, we replicated its resources using the rephetio paper and source code to ensure AlzKB has current data. Follow the steps in the [[https://github.com/EpistasisLab/AlzKB-updates][AlzKB-updates]] GitHub repository to create =hetionet-custom-nodes.tsv= and =hetionet-custom-edges.tsv=, then place these files in the =hetionet/= subdirectory.
** Process Data Files
Place the updated =alzkb_parse_ncbigene.py=, =alzkb_parse_drugbank.py=, and =alzkb_parse_disgenet.py= scripts from the =scripts/= directory in their respective raw data subdirectories. Run each script to process the data for the next step.
** Populate Ontology
Now that we have the updated ontology and data files, run the updated =alzkb/populate_ontology.py= to populate the ontology with records. It creates an =alzkb_v2-populated.rdf= file that will be used in the next step.
* 5.: Converting the ontology into a Memgraph graph database
** Installing Memgraph
If you haven't done so already, download Memgraph from the [[https://memgraph.com/docs/getting-started/install-memgraph][Install Memgraph]] page. Most users install Memgraph using a pre-prepared =docker-compose.yml= file by executing:
- for Linux and macOS:
=curl https://install.memgraph.com | sh=
- for Windows:
=iwr https://windows.memgraph.com | iex=
More details are in [[https://memgraph.com/docs/getting-started/install-memgraph/docker][Install Memgraph with Docker]].
** Generating the CSV File
Before uploading the file to Memgraph, run =alzkb/rdf_to_memgraph_csv.py= with the =alzkb_v2-populated.rdf= file to generate =alzkb-populated.csv=.
** Starting Memgraph with Docker
Follow Step 1 ("Starting Memgraph with Docker") of the instructions in [[https://memgraph.com/docs/data-migration/migrate-from-neo4j#importing-data-into-memgraph][Importing data into Memgraph]] to upload the =alzkb-populated.csv= file to the container.
Open Memgraph Lab, which is available at =http://localhost:3000=. Click =Query Execution= in the menu on the left bar; you can then type Cypher queries in the =Cypher Editor=.
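
As a quick check that the editor is connected to the database, you can run a simple query such as the one below; on a fresh instance, before the import steps that follow, it should return 0.

#+begin_src cypher
// Count all nodes currently in the database (0 before the import).
MATCH (n) RETURN count(n) AS nodeCount;
#+end_src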
** Gaining speed with indexes and analytical storage mode
- To create indexes, run the following Cypher queries:
#+begin_src cypher
CREATE INDEX ON :Drug(nodeID);
CREATE INDEX ON :Gene(nodeID);
CREATE INDEX ON :BiologicalProcess(nodeID);
CREATE INDEX ON :Pathway(nodeID);
CREATE INDEX ON :MolecularFunction(nodeID);
CREATE INDEX ON :CellularComponent(nodeID);
CREATE INDEX ON :Symptom(nodeID);
CREATE INDEX ON :BodyPart(nodeID);
CREATE INDEX ON :DrugClass(nodeID);
CREATE INDEX ON :Disease(nodeID);
CREATE INDEX ON :TranscriptionFactor(nodeID);
#+end_src
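
To confirm that the indexes were created, you can list them; in Memgraph this is done with =SHOW INDEX INFO;=, which should show one label+property index per node label above.

#+begin_src cypher
// List all indexes currently defined in the database.
SHOW INDEX INFO;
#+end_src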
- To check the current storage mode, run:
#+begin_src cypher
SHOW STORAGE INFO;
#+end_src
- Change the storage mode to analytical before import:
#+begin_src cypher
STORAGE MODE IN_MEMORY_ANALYTICAL;
#+end_src
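
Analytical mode speeds up bulk imports but relaxes transactional guarantees, so once the import below has finished you may want to switch back to the default transactional mode. A sketch of that final step:

#+begin_src cypher
// Return to the default storage mode after the bulk import completes.
STORAGE MODE IN_MEMORY_TRANSACTIONAL;
#+end_src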
** Importing data into Memgraph
- Drug nodes
#+begin_src cypher
LOAD CSV FROM "/usr/lib/memgraph/alzkb-populated.csv" WITH HEADER AS row
WITH row WHERE row._labels = ':Drug' AND row.commonName <> ''