In comparison, GraphVite uses 4 GPUs and takes 14 minutes. Thus, DGL-KE trains TransE on FB15k twice as fast as GraphVite while using far fewer resources. More performance information on GraphVite can be found [here](https://github.com/DeepGraphLearning/graphvite).
+ In comparison, GraphVite uses 4 GPUs and takes 14 minutes. Thus, with 8 GPUs, DGL-KE trains TransE on FB15k 9.5X as fast as GraphVite. More performance information on GraphVite can be found [here](https://github.com/DeepGraphLearning/graphvite).
The configuration for reproducing the performance results can be found [here](https://github.com/dmlc/dgl/blob/master/apps/kg/config/best_config.sh).
@@ -129,24 +188,20 @@ when given (?, rel, tail).
### Input formats:

- DGL-KE supports two knowledge graph input formats. A knowledge graph is stored
- using five files.
-
- Format 1:
+ DGL-KE supports two knowledge graph input formats for user-defined datasets:

- - entities.dict contains pairs of (entity Id, entity name). The number of rows is the number of entities (nodes).
- - relations.dict contains pairs of (relation Id, relation name). The number of rows is the number of relations.
- - train.txt stores edges in the training set. They are stored as triples of (head, rel, tail).
- - valid.txt stores edges in the validation set. They are stored as triples of (head, rel, tail).
- - test.txt stores edges in the test set. They are stored as triples of (head, rel, tail).
+ - raw_udd_[h|r|t], raw user-defined dataset. In this format, the user only needs to provide triples and lets the dataloader generate and manage the ID mapping. The dataloader generates two files: entities.tsv for the entity ID mapping and relations.tsv for the relation ID mapping. The order of the head, relation, and tail entities is given by [h|r|t]; for example, raw_udd_trh means the triples are stored in the order tail, relation, head. The dataset should contain three files:
+   - *train* stores the triples in the training set. Each line is a triple, e.g., [src_name, rel_name, dst_name], and must follow the order specified in [h|r|t].
+   - *valid* stores the triples in the validation set. Each line is a triple, e.g., [src_name, rel_name, dst_name], and must follow the order specified in [h|r|t].
+   - *test* stores the triples in the test set. Each line is a triple, e.g., [src_name, rel_name, dst_name], and must follow the order specified in [h|r|t].
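
For illustration, the ID mapping that the raw_udd format leaves to the dataloader could look roughly like the sketch below. This is not DGL-KE's actual loader: the helper names `parse_order` and `build_id_maps` are hypothetical, triples are passed in memory rather than read from files, and assigning IDs in order of first appearance is an assumption.

```python
def parse_order(fmt):
    """Extract the column order from a format name, e.g. 'raw_udd_trh' -> 'trh'."""
    order = fmt.rsplit("_", 1)[-1]
    if sorted(order) != ["h", "r", "t"]:
        raise ValueError(f"format {fmt!r} must end in a permutation of 'hrt'")
    return order


def build_id_maps(rows, order):
    """Map entity and relation names to integer IDs (hypothetical helper).

    rows  : iterable of 3-tuples of names, in the column order given by `order`
    order : permutation of 'hrt' describing which column holds which role
    Returns (entity2id, relation2id, triples), where triples are (h, r, t) IDs.
    """
    entity2id, relation2id = {}, {}
    triples = []
    for row in rows:
        fields = dict(zip(order, row))  # role letter -> name
        # Assumption: a new name receives the next unused ID.
        h = entity2id.setdefault(fields["h"], len(entity2id))
        r = relation2id.setdefault(fields["r"], len(relation2id))
        t = entity2id.setdefault(fields["t"], len(entity2id))
        triples.append((h, r, t))
    return entity2id, relation2id, triples


# A raw_udd_trh row lists the tail first, then the relation, then the head.
order = parse_order("raw_udd_trh")
entity2id, relation2id, triples = build_id_maps(
    [("France", "capital_of", "Paris")], order
)
```

Here `triples` is always returned in (head, relation, tail) order regardless of how the input columns were arranged, which keeps downstream code independent of the on-disk layout.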
Format 2:
-
- - entity2id.txt contains pairs of (entity name, entity Id). The number of rows is the number of entities (nodes).
- - relation2id.txt contains pairs of (relation name, relation Id). The number of rows is the number of relations.
- -train.txt stores edges in the training set. They are stored as triples of (head, tail, rel).
- -valid.txt stores edges in the validation set. They are stored as a triple of (head, tail, rel).
- -test.txt stores edges in the test set. They are stored as a triple of (head, tail, rel).
+ - udd_[h|r|t], user-defined dataset. In this format, the user must provide the ID mapping for entities and relations. The order of the head, relation, and tail entities is given by [h|r|t]; for example, udd_trh means the triples are stored in the order tail, relation, head. The dataset should contain five files:
+   - *entities* stores the mapping between entity name and entity Id.
+   - *relations* stores the mapping between relation name and relation Id.
+   - *train* stores the triples in the training set. Each line is a triple, e.g., [src_id, rel_id, dst_id], and must follow the order specified in [h|r|t].
+   - *valid* stores the triples in the validation set. Each line is a triple, e.g., [src_id, rel_id, dst_id], and must follow the order specified in [h|r|t].
+   - *test* stores the triples in the test set. Each line is a triple, e.g., [src_id, rel_id, dst_id], and must follow the order specified in [h|r|t].
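
As a rough illustration of the udd format, the sketch below reorders ID triples into (head, relation, tail) and checks them against the sizes of the user-supplied mappings. The function name `to_hrt` is hypothetical, and the check assumes IDs are contiguous integers starting at 0, which the text above does not state.

```python
def to_hrt(rows, order, num_entities, num_relations):
    """Reorder ID triples to (h, r, t) and range-check them (hypothetical helper).

    rows          : iterable of 3-tuples of IDs (ints or numeric strings)
    order         : permutation of 'hrt', e.g. 'trh' for the udd_trh format
    num_entities  : number of rows in the user-provided entities file
    num_relations : number of rows in the user-provided relations file
    """
    triples = []
    for row in rows:
        fields = dict(zip(order, (int(x) for x in row)))  # role letter -> ID
        h, r, t = fields["h"], fields["r"], fields["t"]
        # Assumption: valid IDs lie in [0, count).
        if not (0 <= h < num_entities and 0 <= t < num_entities):
            raise ValueError(f"entity ID out of range in row {row!r}")
        if not 0 <= r < num_relations:
            raise ValueError(f"relation ID out of range in row {row!r}")
        triples.append((h, r, t))
    return triples


# A udd_trh row stores tail, relation, head: tail=1, relation=0, head=0.
hrt = to_hrt([("1", "0", "0")], "trh", num_entities=2, num_relations=1)
```

Validating against the mapping sizes up front surfaces a mismatched *entities* or *relations* file before training starts rather than as an indexing error later.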
### Output formats:
@@ -166,34 +221,36 @@ Here are some examples of using the training script.