Our cluster configuration uses Docker host networking. A series of scripts brings up the Docker containers that make up our cluster. You will likely need to tailor these scripts to the needs of your configuration.
We have several scripts:
spark/docker/start_master_host.sh - Brings up the Spark master container using host networking.
spark/docker/start_worker_host.sh - Brings up the Spark worker container using host networking.
spark/docker/start_launcher_host.sh - Brings up the Spark launcher container using host networking. This is the container from which our run_tpch.sh launches the benchmark.
dikeHDFS/start_server_host.sh - Brings up the container running HDFS and NDP.
There is a config file, spark/spark.config, which holds the addresses and hostnames needed by the scripts above. You need to modify it for your configuration. There is an example in our repo.
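As a rough sketch, spark.config might look like the following. The key names below are illustrative assumptions, not the actual keys; consult the example file in the repo for the real format.

```shell
# Hypothetical contents of spark/spark.config -- the variable names
# here are placeholders for illustration only; see the example file
# in the repo for the format the start_*_host.sh scripts expect.
MASTER_HOSTNAME=sparkmaster     # hostname of the Spark master node
MASTER_IP=192.168.1.10          # IP address of the Spark master node
STORAGE_IP=192.168.1.20         # IP address of the HDFS/NDP storage server
```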
You also need to configure dikeHDFS/start_server_host.sh with your IP address: change the line containing --add-host=dikehdfs to include your storage server's IP address.
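For instance, if your storage server's IP address were 192.168.1.20 (an address chosen purely for illustration), the edited option inside the script's docker run invocation would read:

```shell
# Fragment of dikeHDFS/start_server_host.sh -- maps the dikehdfs
# hostname to your storage server's IP inside the container.
# 192.168.1.20 is an example address; substitute your own.
--add-host=dikehdfs:192.168.1.20
```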
As an example, in our configuration we typically follow this sequence:
1) From the master node, run start_master_host.sh and start_launcher_host.sh.
2) Next, on each worker node, run start_worker_host.sh 1 8.
3) Note that the 1 8 above is the number of workers followed by the number of cores to use.
4) Launch the NDP server via dikeHDFS/start_server_host.sh.
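The sequence above can be sketched as shell sessions on the three kinds of nodes (paths are relative to the repo root; which host each command runs on is taken from the steps above):

```shell
# On the master node: bring up the Spark master and the launcher
# container (the launcher is where run_tpch.sh is invoked from).
cd spark/docker
./start_master_host.sh
./start_launcher_host.sh

# On each worker node: start 1 worker container using 8 cores.
cd spark/docker
./start_worker_host.sh 1 8

# On the storage server: bring up the container with HDFS and NDP.
cd dikeHDFS
./start_server_host.sh
```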