Commit c7ad740

Improve readme
1 parent 7539874 commit c7ad740

1 file changed: +49 -8 lines changed

benchmarks/cdk/README.md

Lines changed: 49 additions & 8 deletions
@@ -2,6 +2,8 @@
 
 Creates automatically the appropriate infrastructure in AWS for running benchmarks.
 
+---
+
 # Deploy
 
 ## Prerequisites
@@ -38,6 +40,8 @@ npm run cdk deploy
 npm run sync-bucket
 ```
 
+---
+
 # Connect to instances
 
 ## Prerequisites
@@ -59,24 +63,61 @@ sudo ln -s /usr/local/sessionmanagerplugin/bin/session-manager-plugin /usr/local
 
 ## Port Forward
 
-After performing a CDK deploy, a CNF output will be printed to stdout with instructions for port-forwarding
-to all the machines, something like this:
+After performing a CDK deploy, a CNF output will be printed to stdout with instructions for port-forwarding to them.
 
 ```shell
-# instance-0 (forward port 8000 to localhost:8000)
-aws ssm start-session --target i-04ed9f331dcfae4b6 --document-name AWS-StartPortForwardingSession --parameters "portNumber=8000,localPortNumber=8000"
+export INSTANCE_ID=i-0000000000000000
+
+aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=9000,localPortNumber=9000"
 ```
 
-Just port-forwarding the first instance is enough for making queries.
+Just port-forwarding the first instance is enough for issuing queries.
 
 ## Connect
 
 After performing a CDK deploy, a CNF output will be printed to stdout with instructions for connecting
 to all the machines, something like this:
 
 ```shell
-# instance-0
-aws ssm start-session --target i-00000000000000000
+export INSTANCE_ID=i-0000000000000000
+
+aws ssm start-session --target $INSTANCE_ID
+```
+
+The logs can be streamed with:
+
+```shell
+sudo journalctl -u worker.service -f -o cat
+```
+
+---
+
+# Running benchmarks
+
+There's a script that will run the TPCH benchmarks against the remote cluster:
+
+In one terminal, perform a port-forward of one machine in the cluster, something like this:
+
+```shell
+export INSTANCE_ID=i-0000000000000000
+aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=9000,localPortNumber=9000"
+```
+
+In another terminal, navigate to the benchmarks/cdk folder:
+
+```shell
+cd benchmarks/cdk
 ```
 
-Just running one of those commands in the terminal will connect you to the EC2 instance
+And run the benchmarking script
+
+```shell
+npm run datafusion-bench
+```
+
+Several arguments can be passed for running the benchmarks against different scale factors and with different configs,
+for example:
+
+```shell
+npm run datafusion-bench -- --sf 10 --files-per-task 4 --query 7
+```
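
The last example in the diff passes `--sf`, `--files-per-task` and `--query` to the benchmarking script. A minimal sketch of how those flags might be swept across several runs follows. It assumes the port-forward is already active in another terminal and that it is run from `benchmarks/cdk`. The scale factors and query numbers are illustrative only, and the `run_bench` helper and `DRY_RUN` variable are hypothetical names that are not part of the repository. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
set -eu

# Scale factors and queries to sweep; these values are illustrative assumptions.
SCALE_FACTORS="1 10"
QUERIES="1 7"

# run_bench SF QUERY (hypothetical helper): print the command when DRY_RUN=1
# (the default), otherwise execute it. Only flags that appear in the README
# example are used.
run_bench() {
  set -- npm run datafusion-bench -- --sf "$1" --files-per-task 4 --query "$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for sf in $SCALE_FACTORS; do
  for query in $QUERIES; do
    run_bench "$sf" "$query"
  done
done
```

Setting `DRY_RUN=0` would execute each command in turn instead of printing it; keeping the dry run as the default avoids starting long benchmark runs by accident.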
