
Commit ee896bd

Update README instructions for all tools
1 parent 76f3ea7 commit ee896bd

7 files changed: +204 -52 lines changed

README.md

Lines changed: 5 additions & 5 deletions

@@ -59,8 +59,8 @@ The benchmark framework relies on the following inputs produced by the [SNB Data

### Driver modes

-For each implementation, it is possible to perform to perform the run in one of the [SNB driver's](https://github.com/ldbc/ldbc_snb_interactive_driver) three modes.
-All three should be started withe the initial data set loaded to the database.
+For each implementation, it is possible to perform the run in one of the [SNB driver's](https://github.com/ldbc/ldbc_snb_interactive_driver) three modes: create validation parameters, validate, and benchmark.
+The execution in all three modes should be started after the initial data set was loaded into the system under test.

1. Create validation parameters with the `driver/create-validation-parameters.sh` script.

@@ -71,7 +71,7 @@ All three should be started withe the initial data set loaded to the database.
* **Output:** The results will be stored in the validation parameters file (e.g. `validation_params.csv`) set in the `create_validation_parameters` configuration property.
* **Parallelism:** The execution must be single-threaded to ensure a deterministic order of operations.

-2. Validate against existing validation parameters with the `driver/validate.sh` script.
+2. Validate against an existing reference output (called "validation parameters") with the `driver/validate.sh` script.

* **Input:**
* The query substitution parameters are taken from the validation parameters file (e.g. `validation_params.csv`) set in the `validate_database` configuration property.

@@ -82,7 +82,7 @@ All three should be started withe the initial data set loaded to the database.
* If the validation failed, the results are saved to the `validation_params-failed-expected.json` and `validation_params-failed-actual.json` files.
* **Parallelism:** The execution must be single-threaded to ensure a deterministic order of operations.

-Pre-generated [validation data sets for SF0.1 to SF10](https://pub-383410a98aef4cb686f0c7601eddd25f.r2.dev/interactive-v1/validation_params-sf0.1-sf10.tar.zst) are available.
+Pre-generated [validation parameters for SF0.1 to SF10](https://pub-383410a98aef4cb686f0c7601eddd25f.r2.dev/interactive-v1/validation_params-sf0.1-sf10.tar.zst) are available.

3. Run the benchmark with the `driver/benchmark.sh` script.

@@ -100,7 +100,7 @@ All three should be started withe the initial data set loaded to the database.
* The detailed results of the benchmark are printed to the console and saved in the `results/` directory.
* **Parallelism:** Multi-threaded execution is recommended to achieve the best result.

-For more details on validating and benchmarking, visit the [driver wiki](https://github.com/ldbc/ldbc_snb_interactive_driver/wiki).
+For more details on validating and benchmarking, visit the [driver's documentation](https://github.com/ldbc/ldbc_snb_interactive_driver/tree/v1-dev/docs).

## Developer's guide

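Taken together, the updated instructions imply a fixed order for the three driver modes. The sketch below is illustrative only (it is not part of the commit) and assumes the initial data set is already loaded into the system under test and that the implementation's `driver/*.properties` files have been edited accordingly:

```bash
# Illustrative sequence of the three driver modes; assumes the data set is loaded
# and the driver/*.properties files are configured for the chosen implementation.
driver/create-validation-parameters.sh   # single-threaded; writes e.g. validation_params.csv
driver/validate.sh                       # single-threaded; checks results against the validation parameters
driver/benchmark.sh                      # multi-threaded run; detailed results go to the results/ directory
```
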
cypher/README.md

Lines changed: 36 additions & 10 deletions

@@ -35,7 +35,7 @@ ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv

An example configuration for scale factor 1 is given in the [`params-csv-composite-longdateformatter.ini`](https://github.com/ldbc/ldbc_snb_datagen_hadoop/blob/main/params-csv-composite-longdateformatter.ini) file of the Datagen repository.

-### Preprocessing and loading
+## Running the benchmark

Set the following environment variables based on your data source and where you would like to store the converted CSVs:

@@ -44,24 +44,50 @@ export NEO4J_VANILLA_CSV_DIR=`pwd`/test-data/vanilla
export NEO4J_CONVERTED_CSV_DIR=`pwd`/test-data/converted
```

-#### Loading the data set
+### Loading the data set

-To load the data sets, run the following script:
+To load the data set, run the following script:

```bash
scripts/load-in-one-step.sh
```

This preprocesses the CSVs in `${NEO4J_VANILLA_CSV_DIR}` and places the resulting CSVs in `${NEO4J_CONVERTED_CSV_DIR}`, stops any running Neo4j database instances, loads the database and starts it.

-## Running the benchmark
+### Running the benchmark driver

-To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

-```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
-```
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/benchmark.sh
+```
+
+#### Reload between runs

:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this. Alternatively, e.g. if you lack sudo rights, use Neo4j's built-in dump and load features through the `scripts/backup-neo4j.sh` and `scripts/restore-neo4j.sh` scripts.

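For the Cypher (Neo4j) implementation, the scripts referenced in this diff can be chained into a full cycle roughly as follows. This is an illustrative sketch, not part of the commit; the environment variable values are the example ones from the README and should point at the actual data directories:

```bash
# Rough Neo4j benchmark cycle assembled from the scripts referenced above (illustrative).
export NEO4J_VANILLA_CSV_DIR=`pwd`/test-data/vanilla
export NEO4J_CONVERTED_CSV_DIR=`pwd`/test-data/converted
scripts/load-in-one-step.sh    # preprocess the CSVs, stop Neo4j, load the database, start it
scripts/backup-database.sh     # snapshot the freshly loaded state (scripts/backup-neo4j.sh without sudo)
driver/benchmark.sh            # the workload persists updates in the database
scripts/restore-database.sh    # restore the backup before the next run (or scripts/restore-neo4j.sh)
```
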
duckdb/README.md

Lines changed: 43 additions & 10 deletions

@@ -10,27 +10,60 @@ Grab DuckDB:
scripts/get.sh
```

-## Generating and loading the data set
-
-### Generating the data set
+## Generating the data set

The data sets need to be generated before loading it to the database. No preprocessing is required. To generate the data sets for DuckDB, use the same settings as for PostgreSQL, i.e. the [Hadoop-based Datagen](https://github.com/ldbc/ldbc_snb_datagen_hadoop)'s `CsvMergeForeign` serializer classes.

-### Loading the data set
+## Running the benchmark
+
+Set the following environment variable based on your data source:

```bash
export DUCKDB_CSV_DIR=`pwd`/../postgres/test-data
-scripts/load.sh
```

-### Running the benchmark
+### Loading the data set

-To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+Load the data set as follows:

```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
+scripts/load.sh
```

+### Running the benchmark driver
+
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).
+
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/benchmark.sh
+```
+
+#### Reload between runs
+
:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this.

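The DuckDB flow is analogous and can be assembled from the scripts named in this diff. Again, this is only a sketch, not part of the commit; `DUCKDB_CSV_DIR` uses the example value from the README:

```bash
# Rough DuckDB benchmark cycle assembled from the scripts referenced above (illustrative).
scripts/get.sh                   # grab DuckDB
export DUCKDB_CSV_DIR=`pwd`/../postgres/test-data
scripts/load.sh                  # load the CSVs generated with the CsvMergeForeign serializers
scripts/backup-database.sh       # snapshot before the run
driver/benchmark.sh
scripts/restore-database.sh      # restore before the next run
```
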
graphdb/README.md

Lines changed: 34 additions & 9 deletions

@@ -31,9 +31,11 @@ An example configuration for scale factor 1 is given in the [`params-ttl.ini`](h

> The result of the execution will generate three .ttl files `social_network_activity_0_0.ttl`, `social_network_person_0_0.ttl` and `social_network_static_0_0.ttl`

+## Running the benchmark
+
### Preprocessing and loading

-After that you need to change the following environment variables based on your data source.
+Change the following environment variables based on your data source.

1. Set the `GRAPHDB_IMPORT_TTL_DIR` environment variable to point to the generated data set. Its default value points to the example data set under the `test-data` directory:

@@ -66,17 +68,40 @@ scripts/start-graphdb.sh
> scripts/one-step-load.sh
> ```

-## Running the benchmark
+### Running the benchmark driver

-4. To run the scripts of benchmark framework, edit the `driver/{create-validation-parameters,validate,benchmark}.properties` files, then run their script, one of:
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

-```bash
-driver/create-validation-parameters.sh
-driver/validate.sh
-driver/benchmark.sh
-```
+#### Create validation parameters
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.
+
+2. Run the script:
+
+```bash
+driver/create-validation-parameters.sh
+```
+
+#### Validate
+
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.
+
+2. Run the script:
+
+```bash
+driver/validate.sh
+```
+
+#### Benchmark
+
+1. Edit the `driver/benchmark.properties` file. Make sure that the `ldbc.snb.interactive.scale_factor`, `ldbc.snb.interactive.updates_dir`, and `ldbc.snb.interactive.parameters_dir` properties are set correctly and are in sync.

-:warning: *Note that the default workload contains updates which are persisted in the database. Therefore, the database needs to be re-loaded between steps – otherwise repeated updates would insert duplicate entries.*
+2. Run the script:

+```bash
+driver/benchmark.sh
+```

+#### Reload between runs

+:warning: The default workload contains updates which are persisted in the database. Therefore, **the database needs to be reloaded or restored from backup before each run**. Use the provided `scripts/backup-database.sh` and `scripts/restore-database.sh` scripts to achieve this.

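The GraphDB instructions follow the same pattern. The sketch below is illustrative and not part of the commit; the `GRAPHDB_IMPORT_TTL_DIR` value is a placeholder for the directory that holds the generated .ttl files:

```bash
# Rough GraphDB benchmark cycle assembled from the scripts referenced above (illustrative).
export GRAPHDB_IMPORT_TTL_DIR=`pwd`/test-data   # placeholder; point this at the generated .ttl files
scripts/start-graphdb.sh         # start the GraphDB instance
scripts/one-step-load.sh         # one-step load of the data set
scripts/backup-database.sh       # snapshot the loaded database
driver/benchmark.sh
scripts/restore-database.sh      # restore before the next run
```
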
postgres/README.md

Lines changed: 4 additions & 2 deletions

@@ -51,6 +51,8 @@ ldbc.snb.datagen.serializer.dynamicPersonSerializer:ldbc.snb.datagen.serializer.
ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv.staticserializer.CsvMergeForeignStaticSerializer
```

+## Running the benchmark
+
### Configuration

The default configuration of the database (e.g. database name, user, password) is set in the `scripts/vars.sh` file.

@@ -71,7 +73,7 @@ The default configuration of the database (e.g. database name, user, password) i

### Running the benchmark driver

-Run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark).
+The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the ["Driver modes" section of the main README](../README.md#driver-modes).

#### Create validation parameters

@@ -85,7 +87,7 @@ Run the benchmark driver in one of the three modes (create validation parameters

#### Validate

-1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the input CSV file.
+1. Edit the `driver/validate.properties` file. Make sure that the `validate_database` property points to the file you would like to validate against.

2. Run the script:

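For PostgreSQL, where the database defaults (name, user, password) come from `scripts/vars.sh`, the driver is invoked the same way once the properties files are edited. An illustrative sketch, not part of the commit:

```bash
# Illustrative PostgreSQL driver invocation; assumes the database is loaded and
# the driver/*.properties files are edited as described above.
driver/create-validation-parameters.sh
driver/validate.sh      # validate_database must point at the file to validate against
driver/benchmark.sh     # the workload persists updates; reload or restore before repeated runs
```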