
Commit ee896bd

Update README instructions for all tools
1 parent 76f3ea7 commit ee896bd

7 files changed: +204 -52 lines changed

README.md

Lines changed: 5 additions & 5 deletions

@@ -59,8 +59,8 @@ The benchmark framework relies on the following inputs produced by the [SNB Data

### Driver modes

-For each implementation, it is possible to perform to perform the run in one of the [SNB driver's](https://github.com/ldbc/ldbc_snb_interactive_driver) three modes.
-All three should be started withe the initial data set loaded to the database.
+For each implementation, it is possible to perform the run in one of the [SNB driver's](https://github.com/ldbc/ldbc_snb_interactive_driver) three modes: create validation parameters, validate, and benchmark.
+The execution in all three modes should be started after the initial data set was loaded into the system under test.

1. Create validation parameters with the `driver/create-validation-parameters.sh` script.

@@ -71,7 +71,7 @@ All three should be started withe the initial data set loaded to the database.
* **Output:** The results will be stored in the validation parameters file (e.g. `validation_params.csv`) set in the `create_validation_parameters` configuration property.
* **Parallelism:** The execution must be single-threaded to ensure a deterministic order of operations.

-2. Validate against existing validation parameters with the `driver/validate.sh` script.
+2. Validate against an existing reference output (called "validation parameters") with the `driver/validate.sh` script.

* **Input:**
* The query substitution parameters are taken from the validation parameters file (e.g. `validation_params.csv`) set in the `validate_database` configuration property.

@@ -82,7 +82,7 @@ All three should be started withe the initial data set loaded to the database.
* If the validation failed, the results are saved to the `validation_params-failed-expected.json` and `validation_params-failed-actual.json` files.
* **Parallelism:** The execution must be single-threaded to ensure a deterministic order of operations.

-Pre-generated [validation data sets for SF0.1 to SF10](https://pub-383410a98aef4cb686f0c7601eddd25f.r2.dev/interactive-v1/validation_params-sf0.1-sf10.tar.zst) are available.
+Pre-generated [validation parameters for SF0.1 to SF10](https://pub-383410a98aef4cb686f0c7601eddd25f.r2.dev/interactive-v1/validation_params-sf0.1-sf10.tar.zst) are available.

3. Run the benchmark with the `driver/benchmark.sh` script.

@@ -100,7 +100,7 @@ All three should be started withe the initial data set loaded to the database.
* The detailed results of the benchmark are printed to the console and saved in the `results/` directory.
* **Parallelism:** Multi-threaded execution is recommended to achieve the best result.

-For more details on validating and benchmarking, visit the [driver wiki](https://github.com/ldbc/ldbc_snb_interactive_driver/wiki).
+For more details on validating and benchmarking, visit the [driver's documentation](https://github.com/ldbc/ldbc_snb_interactive_driver/tree/v1-dev/docs).

## Developer's guide

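Taken together, the updated instructions imply a fixed order for the three driver modes. The sketch below is illustrative only (it is not part of the commit) and assumes the initial data set is already loaded into the system under test and that the implementation's `driver/*.properties` files have been edited accordingly:

```bash
# Illustrative sequence of the three driver modes; assumes the data set is loaded
# and the driver/*.properties files are configured for the chosen implementation.
driver/create-validation-parameters.sh   # single-threaded; writes e.g. validation_params.csv
driver/validate.sh                       # single-threaded; checks results against the validation parameters
driver/benchmark.sh                      # multi-threaded run; detailed results go to the results/ directory
```
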
cypher/README.md

Lines changed: 36 additions & 10 deletions

@@ -35,7 +35,7 @@ ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv

An example configuration for scale factor 1 is given in the [`params-csv-composite-longdateformatter.ini`](https://github.com/ldbc/ldbc_snb_datagen_hadoop/blob/main/params-csv-composite-longdateformatter.ini) file of the Datagen repository.

-### Preprocessing and loading
+## Running the benchmark

Set the following environment variables based on your data source and where you would like to store the converted CSVs:

@@ -44,24 +44,50 @@ export NEO4J_VANILLA_CSV_DIR=`pwd`/test-data/vanilla
export NEO4J_CONVERTED_CSV_DIR=`pwd`/test-data/converted
```

-#### Loading the data set
+### Loading the data set

-To load the data sets, run the following script:
+To load the data set, run the following script:

```bash
scripts/load-in-one-step.sh
```

This preprocesses the CSVs in `${NEO4J_VANILLA_CSV_DIR}` and places the resulting CSVs in `${NEO4J_CONVERTED_CSV_DIR}`, stops any running Neo4j database instances, loads the database and starts it.

-## Running the benchmark
+### Running the benchmark driver

-To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

-```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
-```
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/benchmark.sh
+```
+
+#### Reload between runs

:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this. Alternatively, e.g. if you lack sudo rights, use Neo4j's built-in dump and load features through the `scripts/backup-neo4j.sh` and `scripts/restore-neo4j.sh` scripts.

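For the Cypher (Neo4j) implementation, the scripts referenced in this diff can be chained into a full cycle roughly as follows. This is an illustrative sketch, not part of the commit; the environment variable values are the example ones from the README and should point at the actual data directories:

```bash
# Rough Neo4j benchmark cycle assembled from the scripts referenced above (illustrative).
export NEO4J_VANILLA_CSV_DIR=`pwd`/test-data/vanilla
export NEO4J_CONVERTED_CSV_DIR=`pwd`/test-data/converted
scripts/load-in-one-step.sh    # preprocess the CSVs, stop Neo4j, load the database, start it
scripts/backup-database.sh     # snapshot the freshly loaded state (scripts/backup-neo4j.sh without sudo)
driver/benchmark.sh            # the workload persists updates in the database
scripts/restore-database.sh    # restore the backup before the next run (or scripts/restore-neo4j.sh)
```
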
duckdb/README.md

Lines changed: 43 additions & 10 deletions

@@ -10,27 +10,60 @@ Grab DuckDB:
scripts/get.sh
```

-## Generating and loading the data set
-
-### Generating the data set
+## Generating the data set

The data sets need to be generated before loading it to the database. No preprocessing is required. To generate the data sets for DuckDB, use the same settings as for PostgreSQL, i.e. the [Hadoop-based Datagen](https://github.com/ldbc/ldbc_snb_datagen_hadoop)'s `CsvMergeForeign` serializer classes.

-### Loading the data set
+## Running the benchmark
+
+Set the following environment variable based on your data source:

```bash
export DUCKDB_CSV_DIR=`pwd`/../postgres/test-data
-scripts/load.sh
```

-### Running the benchmark
+### Loading the data set

-To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+Load the data set as follows:

```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
+scripts/load.sh
```

+### Running the benchmark driver
+
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).
+
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/benchmark.sh
+```
+
+#### Reload between runs
+
:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this.

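The DuckDB flow is analogous and can be assembled from the scripts named in this diff. Again, this is only a sketch, not part of the commit; `DUCKDB_CSV_DIR` uses the example value from the README:

```bash
# Rough DuckDB benchmark cycle assembled from the scripts referenced above (illustrative).
scripts/get.sh                   # grab DuckDB
export DUCKDB_CSV_DIR=`pwd`/../postgres/test-data
scripts/load.sh                  # load the CSVs generated with the CsvMergeForeign serializers
scripts/backup-database.sh       # snapshot before the run
driver/benchmark.sh
scripts/restore-database.sh      # restore before the next run
```
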
graphdb/README.md

Lines changed: 34 additions & 9 deletions

@@ -31,9 +31,11 @@ An example configuration for scale factor 1 is given in the [`params-ttl.ini`](h

> The result of the execution will generate three .ttl files `social_network_activity_0_0.ttl`, `social_network_person_0_0.ttl` and `social_network_static_0_0.ttl`

+## Running the benchmark
+
### Preprocessing and loading

-After that you need to change the following environment variables based on your data source.
+Change the following environment variables based on your data source.

1. Set the `GRAPHDB_IMPORT_TTL_DIR` environment variable to point to the generated data set. Its default value points to the example data set under the `test-data` directory:

@@ -66,17 +68,40 @@ scripts/start-graphdb.sh
> scripts/one-step-load.sh
> ```

-## Running the benchmark
+### Running the benchmark driver

-4. To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

-```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
-```
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.

-:warning: *Note that the default workload contains updates which are persisted in the database. Therefore, the database needs to be re-loaded between steps – otherwise repeated updates would insert duplicate entries.*
+2. Run the script:

+```bash
+driver/benchmark.sh
+```

+#### Reload between runs

+:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this.

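The GraphDB instructions follow the same pattern. The sketch below is illustrative and not part of the commit; the `GRAPHDB_IMPORT_TTL_DIR` value is a placeholder for the directory that holds the generated .ttl files:

```bash
# Rough GraphDB benchmark cycle assembled from the scripts referenced above (illustrative).
export GRAPHDB_IMPORT_TTL_DIR=`pwd`/test-data   # placeholder; point this at the generated .ttl files
scripts/start-graphdb.sh         # start the GraphDB instance
scripts/one-step-load.sh         # one-step load of the data set
scripts/backup-database.sh       # snapshot the loaded database
driver/benchmark.sh
scripts/restore-database.sh      # restore before the next run
```
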
postgres/README.md

Lines changed: 4 additions & 2 deletions

@@ -51,6 +51,8 @@ ldbc.snb.datagen.serializer.dynamicPersonSerializer:ldbc.snb.datagen.serializer.
ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv.staticserializer.CsvMergeForeignStaticSerializer
```

+## Running the benchmark
+
### Configuration

The default configuration of the database (e.g. database name, user, password) is set in the `scripts/vars.sh` file.

@@ -71,7 +73,7 @@ The default configuration of the database (e.g. database name, user, password) i

### Running the benchmark driver

-Run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark).
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

#### Create validation parameters

@@ -85,7 +87,7 @@ Run the benchmark driver in one of the three modes (create validation parameters

#### Validate

-1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the input CSV file.
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.

2. Run the script:

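For PostgreSQL, where the database defaults (name, user, password) come from `scripts/vars.sh`, the driver is invoked the same way once the properties files are edited. An illustrative sketch, not part of the commit:

```bash
# Illustrative PostgreSQL driver invocation; assumes the database is loaded and
# the driver/*.properties files are edited as described above.
driver/create-validation-parameters.sh
driver/validate.sh      # validate_database must point at the file to validate against
driver/benchmark.sh     # the workload persists updates; reload or restore before repeated runs
```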