Code for the paper "Beyond Completion: A Foundation Model for General Knowledge Graph Reasoning." It provides implementations for two representative tasks in the field of Knowledge Graphs (KGs): Knowledge Graph Completion (KGC) and Knowledge Graph Question Answering (KGQA).
- CUDA version: 12.x
The code is designed to run in a Docker container on an internal enterprise platform. Many of the dependencies are custom-built for this platform. If you encounter any issues, please feel free to provide feedback.
- KGC Task: Code located in the `kg` directory.
- KGQA Task: Code located in the `qa` directory.
For environment setup, refer to the `env.sh` script located in each directory.
- Knowledge Graph (KG) data for various datasets can be downloaded by running the `kg/script/prepare_emb.py` script.
- For textual description datasets, use the Wikidata API and the original KG dataset repositories to fetch the data.
- To initialize node embeddings in the KG, run the `kg/gen_emb.sh` script.
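As a rough illustration of the Wikidata step mentioned above, the sketch below fetches English labels and descriptions through the public `wbgetentities` endpoint. It is not the repository's own fetching code. It assumes the entity IDs are already Wikidata QIDs (datasets keyed by other identifiers, such as Freebase MIDs, would need a mapping first), and the helper names `build_url`, `extract_text`, and `fetch_text`, the batch size, and the User-Agent string are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def build_url(qids, lang="en"):
    """Build a wbgetentities request URL for a batch of QIDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "labels|descriptions",
        "languages": lang,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def extract_text(payload, lang="en"):
    """Map each QID in a wbgetentities response to (label, description)."""
    out = {}
    for qid, entity in payload.get("entities", {}).items():
        # Missing entities or languages fall back to empty strings.
        label = entity.get("labels", {}).get(lang, {}).get("value", "")
        desc = entity.get("descriptions", {}).get(lang, {}).get("value", "")
        out[qid] = (label, desc)
    return out


def fetch_text(qids, lang="en", batch_size=50):
    """Fetch labels and descriptions in batches (batch size is an assumption)."""
    results = {}
    for start in range(0, len(qids), batch_size):
        batch = qids[start:start + batch_size]
        # Hypothetical User-Agent; replace with one identifying your project.
        request = Request(build_url(batch, lang),
                          headers={"User-Agent": "kg-text-fetch-example/0.1"})
        with urlopen(request, timeout=30) as response:
            results.update(extract_text(json.load(response), lang))
    return results
```

Calling `fetch_text(["Q42"])` would return a dictionary keyed by QID. The parsing is kept separate from the network call so it can be checked offline against a saved response.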
- Download the datasets used in QAGNN (or GreaseLM). Due to version compatibility issues, you will need to convert the dataset format by running the `qa/raw_data/pyg_trans.ipynb` notebook.
- For each question sample, use similarity search to obtain a few-shot set. The `qa/sim.ipynb` notebook handles this task.
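The few-shot retrieval idea can be sketched as follows. This README does not describe the encoder or index used in the notebook, so this example substitutes a simple bag-of-words cosine similarity purely for illustration. The function names `bow`, `cosine`, and `few_shot` are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased whitespace-token counts (a stand-in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def few_shot(query, pool, k=2):
    """Return the k pool questions most similar to the query, excluding itself."""
    q = bow(query)
    scored = [(cosine(q, bow(p)), i) for i, p in enumerate(pool) if p != query]
    # Sort by descending similarity, breaking ties by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pool[i] for _, i in scored[:k]]


pool = [
    "where do you buy bread",
    "where can you buy fresh bread",
    "what do cats like to chase",
]
examples = few_shot("where would you buy bread", pool, k=2)
```

In practice a dense sentence encoder would likely replace `bow`, while the ranking logic stays the same.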
- Pre-training configuration: `kg/config/pretrain/lp_retain.yaml`
- To start pre-training, run: `kg/train.sh`
- To evaluate the model, run: `kg/eval.sh`
- Training configuration: `qa/config/qa/csqa.yaml`
- To train the model, run: `qa/train.sh`
- To evaluate the model, run: `qa/eval.sh`
