Commit 25413b5

Merge pull request #1 from Ontotext-AD/GDB-6148-Add-GraphDB-Interactive-Workload-Implementation
GDB-6148 Add GraphDB interactive workload implementation
2 parents 10dbe71 + 18167da commit 25413b5

91 files changed, with 235434 additions and 0 deletions.
README.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ We provide two reference implementations:

 * [Neo4j (Cypher) implementation](cypher/README.md)
 * [PostgreSQL (SQL) implementation](postgres/README.md)
+* [GraphDB (SPARQL) implementation](graphdb/README.md)

 Additional implementations:

graphdb/README.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
# LDBC SNB GraphDB/SPARQL implementation

This directory contains the [GraphDB](https://www.ontotext.com/products/graphdb/) implementation of the Interactive workload of the [LDBC SNB benchmark](https://github.com/ldbc/ldbc_snb_docs).

## Setup

The recommended environment for executing this benchmark is as follows: the benchmark scripts (Bash) and the LDBC driver (Java 8) run on the host machine, while the GraphDB database runs in a Docker container. The requirements are therefore:

* Bash
* Java 8
* Docker 19+
* enough free space in the directory `GRAPHDB_CONTAINER_ROOT` (its default value is specified in `scripts/vars.sh`)
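
The tool requirements above can be checked before starting. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only checks that each tool is on the `PATH`, not that Java is version 8 or Docker is 19 or later.

```bash
# Print each required tool that is not found on PATH.
# Prints nothing when all tools are present.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Example call for the tools listed above:
#   missing_tools bash java docker
```

An empty result means all three tools were found; versions and free disk space still need to be checked by hand.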

## Generating and loading the data set

### Using pre-generated data sets

From the pre-generated data sets in the [SURF/CWI data repository](https://hdl.handle.net/11112/e6e00558-a2c3-9214-473e-04a16de09bf8), use the ones named `social_network_ttl_sf*`.

### Generating the data set

The data sets need to be generated and preprocessed before they are loaded into the database. To generate them, use the Turtle serializer classes of the [Hadoop-based Datagen](https://github.com/ldbc/ldbc_snb_datagen_hadoop):

```ini
ldbc.snb.datagen.serializer.dynamicActivitySerializer:ldbc.snb.datagen.serializer.snb.turtle.TurtleDynamicActivitySerializer
ldbc.snb.datagen.serializer.dynamicPersonSerializer:ldbc.snb.datagen.serializer.snb.turtle.TurtleDynamicPersonSerializer
ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.turtle.TurtleStaticSerializer
```

An example configuration for scale factor 1 is given in the [`params-ttl.ini`](https://github.com/ldbc/ldbc_snb_datagen_hadoop/blob/main/params-ttl.ini) file of the Datagen repository. For small loading experiments, you can use scale factor 0.1, i.e. `snb.interactive.0.1`.

> The generation produces three `.ttl` files: `social_network_activity_0_0.ttl`, `social_network_person_0_0.ttl` and `social_network_static_0_0.ttl`.
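
Before loading, it can help to confirm that all three files are present. This is a minimal sketch under the assumption that the files carry exactly the `_0_0.ttl` names listed above; the helper name `missing_ttl_files` is hypothetical.

```bash
# Print the expected Datagen output files that are missing from directory $1.
# Prints nothing when all three files exist.
missing_ttl_files() (
  dir=$1
  for part in activity person static; do
    f="$dir/social_network_${part}_0_0.ttl"
    [ -f "$f" ] || printf '%s\n' "$f"
  done
)

# Example call, with the directory that holds the generated data:
#   missing_ttl_files "$GRAPHDB_IMPORT_TTL_DIR"
```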

### Preprocessing and loading

Next, set the following environment variables according to your data source.

1. Set the `GRAPHDB_IMPORT_TTL_DIR` environment variable to point to the generated data set. Its default value points to the example data set under the `test-data` directory:

```bash
export GRAPHDB_IMPORT_TTL_DIR=`pwd`/test-data/
```

2. Optionally, change the GraphDB repository configuration that the `GRAPHDB_REPOSITORY_CONFIG_FILE` environment variable points to. By default, it uses the example configuration in the `config` directory:

```bash
export GRAPHDB_REPOSITORY_CONFIG_FILE=`pwd`/config/graphdb-repo-config.ttl
```

### Loading the data set

3. To start GraphDB and load the data, run the following scripts:

:warning: Note that this will stop the currently running (containerized) GraphDB and delete all of its data.

```bash
scripts/stop-graphdb.sh
scripts/delete-graphdb-database.sh
scripts/graphdb-importrdf.sh
scripts/start-graphdb.sh
```

> Or run all these scripts with a single command:
>
> ```bash
> scripts/one-step-load.sh
> ```
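
The order of the four load scripts matters, and a failed step should stop the sequence. The sketch below illustrates that idea only; it is not the content of `scripts/one-step-load.sh`, which is not shown in this commit view. The function name `load_graphdb` and the `DRY_RUN` variable are hypothetical.

```bash
# Run the four load steps in order and stop at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
load_graphdb() (
  for step in stop-graphdb delete-graphdb-database graphdb-importrdf start-graphdb; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scripts/$step.sh"
    else
      "scripts/$step.sh" || exit 1
    fi
  done
)

# Example: preview the sequence without touching the database.
#   DRY_RUN=1 load_graphdb
```

Run it from the `graphdb` directory so that the relative `scripts/` paths resolve.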

## Running the benchmark

4. To run the benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run the corresponding script, one of:

```bash
driver/create-validation-parameters.sh
driver/validate.sh
driver/benchmark.sh
```

:warning: *Note that the default workload contains updates which are persisted in the database. Therefore, the database needs to be re-loaded between steps – otherwise repeated updates would insert duplicate entries.*
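
One way to honor the re-load requirement is to reload the database before every driver step. The following is a minimal sketch under that assumption; reloading before the first step is a conservative choice, and the names `run_all_steps` and `DRY_RUN` are hypothetical.

```bash
# Reload the database before each driver step, stopping at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
run_all_steps() (
  run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$*"
    else
      "$@" || exit 1
    fi
  }
  for step in create-validation-parameters validate benchmark; do
    run scripts/one-step-load.sh
    run "driver/$step.sh"
  done
)

# Example: preview the six commands.
#   DRY_RUN=1 run_all_steps
```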
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
version: "3"

services:
  graphdb:
    container_name: ${GRAPHDB_IMPORTRDF_CONTAINER_NAME}
    image: ontotext/graphdb:${GRAPHDB_VERSION}
    # Load all files from GRAPHDB_IMPORT_TTL_DIR in repo with configuration defined in config/graphdb-repo-config.ttl
    #entrypoint: [ "/opt/graphdb/dist/bin/importrdf", "load", "-c", "/opt/graphdb/graphdb-repo-config.ttl", "-m", "parallel", "/opt/graphdb/home/graphdb-import"]
    entrypoint: [ "/opt/graphdb/dist/bin/importrdf", "load", "-c", "/opt/graphdb/graphdb-repo-config.ttl", "-m", "parallel", "/opt/graphdb/home/graphdb-import/social_network_activity${GRAPHDB_TTL_POSTFIX}
      /opt/graphdb/home/graphdb-import/social_network_static${GRAPHDB_TTL_POSTFIX} /opt/graphdb/home/graphdb-import/social_network_person${GRAPHDB_TTL_POSTFIX}"]
    environment:
      GDB_JAVA_OPTS: >-
        -Xmx${GRAPHDB_HEAP_SIZE} -Xms${GRAPHDB_HEAP_SIZE}
        -Dgraphdb.home=/opt/graphdb/home
        -Dgraphdb.workbench.importDirectory=/opt/graphdb/home/graphdb-import
    volumes:
      # Change folders in the vars.sh file or directly here
      - ${GRAPHDB_CONTAINER_ROOT}:/opt/graphdb/home
      - ${GRAPHDB_IMPORT_TTL_DIR}:/opt/graphdb/home/graphdb-import
      - ${GRAPHDB_REPOSITORY_CONFIG_FILE}:/opt/graphdb/graphdb-repo-config.ttl
      - ${GRAPHDB_REPOSITORY_RULESET_FILE}:/opt/graphdb/rdfsPlus-snb-bidir.pie
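
The entrypoint above builds three container-side file paths from `GRAPHDB_TTL_POSTFIX`. The sketch below prints those paths for a given postfix, in the same order as the entrypoint. The helper name `import_paths` is hypothetical, and the example value `_0_0.ttl` is an assumption inferred from the file names in the README; the actual default is not shown in this view.

```bash
# Container directory that GRAPHDB_IMPORT_TTL_DIR is mounted to in the Compose file.
IMPORT_DIR=/opt/graphdb/home/graphdb-import

# Print the three paths passed to importrdf for the postfix given as $1.
import_paths() (
  postfix=$1
  for part in activity static person; do
    printf '%s/social_network_%s%s\n' "$IMPORT_DIR" "$part" "$postfix"
  done
)

# Example call (assumed postfix value):
#   import_paths _0_0.ttl
```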
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
#
# Configuration template for a GraphDB repository
#
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix graphdb: <http://www.ontotext.com/trree/graphdb#>.

[] a rep:Repository ;
    rep:repositoryID "ldbc-snb-interactive" ;
    rdfs:label "LDBC SNB Interactive benchmark repo" ;
    rep:repositoryImpl [
        rep:repositoryType "graphdb:SailRepository" ;
        sr:sailImpl [
            sail:sailType "graphdb:Sail" ;

            # ruleset to use
            graphdb:ruleset "/opt/graphdb/rdfsPlus-snb-bidir.pie" ;

            # disable the context index (the data does not use contexts)
            graphdb:enable-context-index "false" ;

            # indexes to speed up the read queries
            graphdb:enablePredicateList "true" ;
            graphdb:enable-literal-index "true" ;
            graphdb:in-memory-literal-properties "true" ;
        ]
    ].
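
The `rep:repositoryID` above determines the SPARQL endpoint URL that the driver uses (`http://localhost:7200/repositories/ldbc-snb-interactive` in `benchmark.properties`). The following sketch derives that URL and, as a sanity check after loading, counts the statements in the repository. The helper names and the `GRAPHDB_URL` variable are hypothetical, and the count query needs `curl` and a running GraphDB.

```bash
# Build the SPARQL endpoint URL for the repository ID given as $1.
# GRAPHDB_URL is a hypothetical override; the default matches benchmark.properties.
repo_endpoint() {
  printf '%s/repositories/%s\n' "${GRAPHDB_URL:-http://localhost:7200}" "$1"
}

# Ask the repository how many statements it holds (SPARQL query over HTTP GET).
count_statements() {
  curl -sS -G "$(repo_endpoint "$1")" \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode 'query=SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }'
}

# Example call:
#   count_statements ldbc-snb-interactive
```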

graphdb/config/rdfsPlus-snb-bidir.pie

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
Prefices
{
rdf : http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs : http://www.w3.org/2000/01/rdf-schema#
owl : http://www.w3.org/2002/07/owl#
onto : http://www.ontotext.com/
xsd : http://www.w3.org/2001/XMLSchema#
psys : http://proton.semanticweb.org/protonsys#
pext : http://proton.semanticweb.org/protonext#
snvoc: http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/
}

Axioms
{
<rdf:type> <rdf:type> <rdf:Property>
<rdf:subject> <rdf:type> <rdf:Property>
<rdf:predicate> <rdf:type> <rdf:Property>
<rdf:object> <rdf:type> <rdf:Property>
<rdf:first> <rdf:type> <rdf:Property>
<rdf:rest> <rdf:type> <rdf:Property>
<rdf:value> <rdf:type> <rdf:Property>
<rdf:nil> <rdf:type> <rdf:List>
<rdfs:subClassOf> <rdfs:domain> <rdfs:Class>
<rdf:subject> <rdfs:domain> <rdf:Statement>
<rdf:predicate> <rdfs:domain> <rdf:Statement>
<rdf:object> <rdfs:domain> <rdf:Statement>
<rdf:first> <rdfs:domain> <rdf:List>
<rdf:rest> <rdfs:domain> <rdf:List>
<rdfs:domain> <rdfs:range> <rdfs:Class>
<rdfs:range> <rdfs:range> <rdfs:Class>
<rdfs:subClassOf> <rdfs:range> <rdfs:Class>
<rdf:rest> <rdfs:range> <rdf:List>
<rdfs:comment> <rdfs:range> <rdfs:Literal>
<rdfs:label> <rdfs:range> <rdfs:Literal>
<rdf:Alt> <rdfs:subClassOf> <rdfs:Container>
<rdf:Bag> <rdfs:subClassOf> <rdfs:Container>
<rdf:Seq> <rdfs:subClassOf> <rdfs:Container>
<rdfs:ContainerMembershipProperty> <rdfs:subClassOf> <rdf:Property>
<rdfs:isDefinedBy> <rdfs:subPropertyOf> <rdfs:seeAlso>
<rdf:XMLLiteral> <rdf:type> <rdfs:Datatype>
<rdf:XMLLiteral> <rdfs:subClassOf> <rdfs:Literal>
<rdfs:Datatype> <rdfs:subClassOf> <rdfs:Class>
<owl:equivalentClass> <rdf:type> <owl:TransitiveProperty>
<owl:equivalentClass> <rdf:type> <owl:SymmetricProperty>
<owl:equivalentClass> <rdfs:subPropertyOf> <rdfs:subClassOf>
<owl:equivalentProperty> <rdf:type> <owl:TransitiveProperty>
<owl:equivalentProperty> <rdf:type> <owl:SymmetricProperty>
<owl:equivalentProperty> <rdfs:subPropertyOf> <rdfs:subPropertyOf>
<owl:inverseOf> <rdf:type> <owl:SymmetricProperty>
<rdfs:subClassOf> <rdf:type> <owl:TransitiveProperty>
<rdfs:subPropertyOf> <rdf:type> <owl:TransitiveProperty>
<rdf:type> <psys:transitiveOver> <rdfs:subClassOf>
<owl:differentFrom> <rdf:type> <owl:SymmetricProperty>
<xsd:nonNegativeInteger> <rdf:type> <rdfs:Datatype>
<xsd:string> <rdf:type> <rdfs:Datatype>
<rdf:_1> <rdf:type> <rdf:Property>
<rdf:_1> <rdf:type> <rdfs:ContainerMembershipProperty>
}

Rules
{

Id: rdfs7

a b c
b <rdfs:subPropertyOf> d [Constraint b != d]
------------------------------------
a d c


Id: rdfs8_10

a <rdf:type> <rdfs:Class>
------------------------------------
a <rdfs:subClassOf> a


Id: proton_TransitiveOver

a <psys:transitiveOver> b
c a d
d b e
------------------------------------
c a e


Id: proton_TransProp

a <rdf:type> <owl:TransitiveProperty>
------------------------------------
a <psys:transitiveOver> a


Id: proton_TransPropInduct

a <psys:transitiveOver> a
------------------------------------
a <rdf:type> <owl:TransitiveProperty>


Id: owl_invOf

a b c
b <owl:inverseOf> d
------------------------------------
c d a


Id: owl_invOfBySymProp

a <rdf:type> <owl:SymmetricProperty>
------------------------------------
a <owl:inverseOf> a


Id: owl_SymPropByInverse

a <owl:inverseOf> a
------------------------------------
a <rdf:type> <owl:SymmetricProperty>


Id: owl_EquivClassBySubClass

a <rdfs:subClassOf> b [Constraint b != a]
b <rdfs:subClassOf> a [Cut]
------------------------------------
a <owl:equivalentClass> b


Id: owl_EquivPropBySubProp

a <rdfs:subPropertyOf> b [Constraint b != a]
b <rdfs:subPropertyOf> a [Cut]
------------------------------------
a <owl:equivalentProperty> b


Id: rule_snb_knows_bidirectional
p <snvoc:knows> rel [Constraint p != fr]
rel <snvoc:hasPerson> fr [Constraint p != fr]
---------------------------
fr <snvoc:directKnows> p
p <snvoc:directKnows> fr

}
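
The custom rule `rule_snb_knows_bidirectional` turns each `knows` node (`p snvoc:knows rel`, `rel snvoc:hasPerson fr`) into a pair of direct `snvoc:directKnows` edges in both directions. The toy sketch below reproduces that effect on whitespace-separated triples read from standard input. It is only an illustration of the rule's outcome, not how GraphDB evaluates rule files; the function name and the sample identifiers are hypothetical, and it assumes each `rel` node links exactly one pair of persons.

```bash
# Read "subject predicate object" lines and print the directKnows triples
# that the bidirectional knows rule would derive from them.
infer_direct_knows() {
  awk '
    $2 == "snvoc:knows"     { knower[$3] = $1 }   # rel node -> person p
    $2 == "snvoc:hasPerson" { friend[$1] = $3 }   # rel node -> person fr
    END {
      for (rel in knower) {
        if (!(rel in friend)) continue            # second premise not matched
        p = knower[rel]; fr = friend[rel]
        if (p == fr) continue                     # [Constraint p != fr]
        print fr, "snvoc:directKnows", p
        print p, "snvoc:directKnows", fr
      }
    }' | sort -u
}

# Example call with two input triples:
#   printf '%s\n' 'alice snvoc:knows k1' 'k1 snvoc:hasPerson bob' | infer_direct_knows
```

Materializing both directions at load time lets queries follow a single `snvoc:directKnows` edge instead of traversing the intermediate `knows` node.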

graphdb/driver/benchmark.properties

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
endpoint=http://localhost:7200/repositories/ldbc-snb-interactive
queryDir=queries/

printQueryNames=false
printQueryStrings=false
printQueryResults=false

status=1
thread_count=1
name=LDBC-SNB
mode=execute_benchmark
results_log=true
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
workload_statistics=false
spinner_wait_duration=1
help=false
ignore_scheduled_start_times=false

workload=org.ldbcouncil.snb.driver.workloads.interactive.LdbcSnbInteractiveWorkload
db=com.ldbc.impls.workloads.ldbc.snb.graphdb.interactive.GraphDBInteractive

warmup=100
operation_count=250

ldbc.snb.interactive.updates_dir=test-data/update_streams/
ldbc.snb.interactive.parameters_dir=test-data/substitution_parameters/
ldbc.snb.interactive.short_read_dissipation=0.2

# Supported scale factors are 0.1, 0.3, 1, 3, 10, 30, 100, 300, 1000
ldbc.snb.interactive.scale_factor=0.1

# *** For debugging purposes ***

ldbc.snb.interactive.LdbcQuery1_enable=true
ldbc.snb.interactive.LdbcQuery2_enable=true
ldbc.snb.interactive.LdbcQuery3_enable=true
ldbc.snb.interactive.LdbcQuery4_enable=true
ldbc.snb.interactive.LdbcQuery5_enable=true
ldbc.snb.interactive.LdbcQuery6_enable=true
ldbc.snb.interactive.LdbcQuery7_enable=true
ldbc.snb.interactive.LdbcQuery8_enable=true
ldbc.snb.interactive.LdbcQuery9_enable=true
ldbc.snb.interactive.LdbcQuery10_enable=true
ldbc.snb.interactive.LdbcQuery11_enable=true
ldbc.snb.interactive.LdbcQuery12_enable=true
ldbc.snb.interactive.LdbcQuery13_enable=true
ldbc.snb.interactive.LdbcQuery14_enable=true

ldbc.snb.interactive.LdbcShortQuery1PersonProfile_enable=true
ldbc.snb.interactive.LdbcShortQuery2PersonPosts_enable=true
ldbc.snb.interactive.LdbcShortQuery3PersonFriends_enable=true
ldbc.snb.interactive.LdbcShortQuery4MessageContent_enable=true
ldbc.snb.interactive.LdbcShortQuery5MessageCreator_enable=true
ldbc.snb.interactive.LdbcShortQuery6MessageForum_enable=true
ldbc.snb.interactive.LdbcShortQuery7MessageReplies_enable=true

ldbc.snb.interactive.LdbcUpdate1AddPerson_enable=true
ldbc.snb.interactive.LdbcUpdate2AddPostLike_enable=true
ldbc.snb.interactive.LdbcUpdate3AddCommentLike_enable=true
ldbc.snb.interactive.LdbcUpdate4AddForum_enable=true
ldbc.snb.interactive.LdbcUpdate5AddForumMembership_enable=true
ldbc.snb.interactive.LdbcUpdate6AddPost_enable=true
ldbc.snb.interactive.LdbcUpdate7AddComment_enable=true
ldbc.snb.interactive.LdbcUpdate8AddFriendship_enable=true
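
The `*_enable` flags above make it possible to switch individual operations off while debugging. The sketch below rewrites a single `key=value` line in a properties file; the helper name `set_property` is hypothetical, and it assumes keys appear at the start of a line with no surrounding whitespace, as in the file above.

```bash
# Set key $2 to value $3 in properties file $1, leaving all other lines untouched.
# Only lines that start with "<key>=" are rewritten.
set_property() (
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || exit 1
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; next }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Example: disable one complex read in a copy of the properties file.
#   cp driver/benchmark.properties /tmp/debug.properties
#   set_property /tmp/debug.properties ldbc.snb.interactive.LdbcQuery14_enable false
```

Working on a copy keeps the committed defaults intact.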

graphdb/driver/benchmark.sh

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
#!/bin/bash

set -eu
set -o pipefail

# Change to the parent of the directory that contains this script.
cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd ..

# Use the first argument as the properties file, or fall back to the default.
BENCHMARK_PROPERTIES_FILE=${1:-driver/benchmark.properties}

java -cp target/graphdb-1.2.0-SNAPSHOT.jar org.ldbcouncil.snb.driver.Client -P "${BENCHMARK_PROPERTIES_FILE}"
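
The script takes an optional properties file as its first argument and falls back to `driver/benchmark.properties`. The sketch below shows the same `${1:-default}` expansion with an added existence check, so that a mistyped path fails before the JVM starts. The function name `resolve_properties_file` is hypothetical and is not part of the script above.

```bash
# Print the properties file to use: $1 if given, otherwise the default path.
# Fails with a message on stderr if the file does not exist.
resolve_properties_file() (
  file=${1:-driver/benchmark.properties}
  if [ ! -f "$file" ]; then
    echo "properties file not found: $file" >&2
    exit 1
  fi
  printf '%s\n' "$file"
)

# Example call with a custom (hypothetical) file:
#   resolve_properties_file driver/my-benchmark.properties
```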
