88 changes: 88 additions & 0 deletions data-loader/performance-test/README.md
@@ -0,0 +1,88 @@
# Performance test execution for ScalarDB Data Loader

## Instructions to run the script

Execute the `e2e_test.sh` script from the `performance-test` folder of the repository:

```shell
./e2e_test.sh [options]
```

## Available command-line arguments

```shell
./e2e_test.sh [--memory=mem1,mem2,...] [--cpu=cpu1,cpu2,...] [--data-size=size] [--image-tag=tag] [--import-args=args] [--export-args=args] [--network=network-name] [--skip-data-gen] [--disable-import] [--disable-export] [--no-clean-data] [--database-dir=path] [--use-jar] [--jar-path=path]
```

Options:

- `--memory=mem1,mem2,...`: Comma-separated list of memory limits for the Docker containers (e.g., `1g,2g,4g`)
- `--cpu=cpu1,cpu2,...`: Comma-separated list of CPU limits for the Docker containers (e.g., `1,2,4`)
- `--data-size=size`: Size of the data to generate (e.g., `1MB`, `2GB`)
- `--num-rows=number`: Number of rows to generate (e.g., `1000`, `10000`)
- `--image-tag=tag`: Docker image tag to use (default: `4.0.0-SNAPSHOT`)
- `--import-args=args`: Arguments passed to the import command
- `--export-args=args`: Arguments passed to the export command
- `--network=network-name`: Docker network name (default: `my-network`)
- `--skip-data-gen`: Skip the data generation step
- `--disable-import`: Skip the import test
- `--disable-export`: Skip the export test
- `--no-clean-data`: Don't clean up generated files after the test
- `--database-dir=path`: Path to the database directory
- `--use-jar`: Use a JAR file instead of a Docker container
- `--jar-path=path`: Path to the JAR file (when using `--use-jar`)

Note: Either `--data-size` or `--num-rows` must be provided, but not both.
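The "either `--data-size` or `--num-rows`, but not both" rule can be expressed as a small check. The sketch below is hypothetical (the `validate_size_args` function name and messages are illustrative, not taken from `e2e_test.sh`):

```shell
# Hypothetical sketch of the --data-size / --num-rows exclusivity check.
# Pass the parsed values of the two flags; empty string means "not given".
validate_size_args() {
  data_size="$1"
  num_rows="$2"
  if [ -n "$data_size" ] && [ -n "$num_rows" ]; then
    echo "error: --data-size and --num-rows are mutually exclusive" >&2
    return 1
  fi
  if [ -z "$data_size" ] && [ -z "$num_rows" ]; then
    echo "error: either --data-size or --num-rows is required" >&2
    return 1
  fi
}
```

For example, `validate_size_args "1MB" ""` succeeds, while passing both values (or neither) fails with an error message.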

Examples:

```shell
# Using data size
./e2e_test.sh --memory=1g,2g,4g --cpu=1,2,4 --data-size=2MB --image-tag=4.0.0-SNAPSHOT
```

```shell
# Using number of rows
./e2e_test.sh --memory=1g,2g,4g --cpu=1,2,4 --num-rows=10000 --image-tag=4.0.0-SNAPSHOT
```

Example with JAR:

```shell
./e2e_test.sh --use-jar --jar-path=./scalardb-data-loader-cli.jar --import-args="--format csv --import-mode insert --mode transaction --transaction-size 10 --data-chunk-size 500 --max-threads 16" --export-args="--format csv --max-threads 8 --data-chunk-size 500"
```

### Import-Only Examples

To run only the import test (skipping export):

```shell
# Using data size
./e2e_test.sh --disable-export --memory=2g --cpu=2 --data-size=1MB --import-args="--format csv --import-mode insert --mode transaction --transaction-size 10 --max-threads 16"
```

```shell
# Using number of rows
./e2e_test.sh --disable-export --memory=2g --cpu=2 --num-rows=10000 --import-args="--format csv --import-mode insert --mode transaction --transaction-size 10 --max-threads 16"
```

With JAR:

```shell
./e2e_test.sh --disable-export --use-jar --jar-path=./scalardb-data-loader-cli.jar --import-args="--format csv --import-mode insert --mode transaction --transaction-size 10 --max-threads 16"
```

### Export-Only Examples

To run only the export test (skipping import):

```shell
./e2e_test.sh --disable-import --memory=2g --cpu=2 --export-args="--format csv --max-threads 8 --data-chunk-size 500"
```

With JAR:

```shell
./e2e_test.sh --disable-import --use-jar --jar-path=./scalardb-data-loader-cli.jar --export-args="--format csv --max-threads 8 --data-chunk-size 500"
```
40 changes: 40 additions & 0 deletions data-loader/performance-test/database/db_setup.sh
@@ -0,0 +1,40 @@
#!/bin/bash

# Set variables
NETWORK_NAME="my-network"
POSTGRES_CONTAINER="postgres-db"
SCALARDB_PROPERTIES="$(pwd)/scalardb.properties"
SCHEMA_JSON="$(pwd)/schema.json"

# Step 1: Create a Docker network (if not exists)
docker network inspect $NETWORK_NAME >/dev/null 2>&1 || \
docker network create $NETWORK_NAME

# Step 2: Start PostgreSQL container
docker run -d --name $POSTGRES_CONTAINER \
--network $NETWORK_NAME \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mypassword \
-e POSTGRES_DB=mydatabase \
-p 5432:5432 \
postgres:16

# Wait for PostgreSQL to be ready
echo "Waiting for PostgreSQL to start..."
sleep 10

# Step 3: Create 'test' schema
docker exec -i $POSTGRES_CONTAINER psql -U myuser -d mydatabase -c "CREATE SCHEMA IF NOT EXISTS test;"

# Step 4: Run ScalarDB Schema Loader
docker run --rm --network $NETWORK_NAME \
-v "$SCHEMA_JSON:/schema.json" \
-v "$SCALARDB_PROPERTIES:/scalardb.properties" \
ghcr.io/scalar-labs/scalardb-schema-loader:3.15.2-SNAPSHOT \
-f /schema.json --config /scalardb.properties --coordinator

# Step 5: Verify schema creation
docker exec -i $POSTGRES_CONTAINER psql -U myuser -d mydatabase -c "\dn"

echo "✅ Schema Loader execution completed."
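The fixed `sleep 10` in the script above can be flaky on slow machines. One alternative is a generic polling helper; this is a minimal sketch (the `wait_for` name and retry count are illustrative, not part of the script):

```shell
# Minimal sketch: retry a command once per second until it succeeds
# or the retry budget runs out.
# Usage: wait_for <retries> <command> [args...]
wait_for() {
  retries="$1"
  shift
  until "$@" >/dev/null 2>&1; do
    retries=$((retries - 1))
    if [ "$retries" -le 0 ]; then
      echo "command did not succeed in time: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Possible replacement for the fixed sleep, using the container above:
# wait_for 30 docker exec "$POSTGRES_CONTAINER" pg_isready -U myuser -d mydatabase
```

`pg_isready` exits 0 only once the server accepts connections, so the script would proceed as soon as PostgreSQL is actually ready rather than after an arbitrary delay.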

6 changes: 6 additions & 0 deletions data-loader/performance-test/database/scalardb.properties
@@ -0,0 +1,6 @@
scalar.db.storage=jdbc
scalar.db.contact_points=jdbc:postgresql://postgres-db:5432/mydatabase
scalar.db.username=myuser
scalar.db.password=mypassword
scalar.db.cross_partition_scan.enabled=true
scalar.db.transaction_manager=single-crud-operation
25 changes: 25 additions & 0 deletions data-loader/performance-test/database/schema.json
@@ -0,0 +1,25 @@
{
"test.all_columns": {
"transaction": true,
"partition-key": [
"col1"
],
"clustering-key": [
"col2",
"col3"
],
"columns": {
"col1": "BIGINT",
"col2": "INT",
"col3": "BOOLEAN",
"col4": "FLOAT",
"col5": "DOUBLE",
"col6": "TEXT",
"col7": "BLOB",
"col8": "DATE",
"col9": "TIME",
"col10": "TIMESTAMP",
"col11": "TIMESTAMPTZ"
}
}
}
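A quick sanity check on a schema file like the one above is to confirm that every partition-key and clustering-key column is also declared under `columns`. This optional helper is a sketch (the `check_schema_keys` name is illustrative; it requires `jq`):

```shell
# Sketch: verify that each table's partition-key and clustering-key
# entries in a Schema Loader JSON file also appear in its "columns" map.
# Exits 0 when the file is consistent, non-zero otherwise.
check_schema_keys() {
  jq -e '
    to_entries
    | all(
        .value as $t
        | ($t["partition-key"] + ($t["clustering-key"] // []))
        | all(. as $k | $t.columns | has($k))
      )
  ' "$1" >/dev/null
}
```

Running it against the `schema.json` above before invoking the Schema Loader catches typos in key names early, instead of surfacing them as a failed container run.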