The code deploys CockroachDB across zones. The deployment in each zone is independent of the other zones' deployments, to ensure that failures or slowdowns in one zone do not directly affect other zones.
Each zone contains several nodes, and a node may be part of multiple zones. Consider the following example: there are 5 nodes, node_0,...,node_4, and four zones. Zone_0 contains node_4, Zone_1 contains node_1 and node_3, Zone_2 contains node_0, node_1 and node_2, and Zone_3 contains all nodes. Notice that Zone_0 is disjoint from Zone_1 and Zone_2, Zone_1 and Zone_2 overlap partially, whereas Zone_3 subsumes all the other zones.
Deployment-wise, every node is a container and runs as many CockroachDB instances as the number of zones it participates in. In the example above, node_1 runs 3 CockroachDB deployments, one each for Zone_1, Zone_2 and Zone_3. The deployment of each zone is distributed across all of the zone's nodes; for example, in the Zone_1 deployment, node_3 is a participant.
The code uses DEDIS simnet to orchestrate the nodes and create a container for each node. The code also uses DEDIS onet v2, but only to disseminate information among zone nodes, for example which ports they should start a CockroachDB deployment on and which other nodes are part of a zone.
The client supports two strongly-consistent operations: read(key) and write(key, value). The operations are strongly consistent at a per-key granularity across all zones. The client is aware of the zone structure and interacts only with the zones that cover the client's area. By default, the client sends its request to all the zones that cover its area. Please refer to the paper for more details.
Requires Go 1.13.
The code supports three operating modes, provided by DEDIS simnet:
- local containers using Docker. Warning: this mode does not support network partitioning, so the results of the partitioning experiment are not meaningful (everything succeeds 100% :) ). But it lets you check that the system deploys and runs;
- local containers orchestrated with Minikube on Docker;
- remote containers orchestrated with Kubernetes.
To make sure you have the necessary dependencies for the modes above, please refer to the simnet documentation. For the Minikube mode, follow the instructions below.
This example explains a local code run from topology and workload generation to obtaining results. For simplicity, we describe the partitioning workload. Other workloads will be added to the documentation soon.
We can use three types of topologies: a randomly generated synthetic topology, a realistic topology that uses CAIDA Ark pings, and a real topology using AWS regions. Topology generation produces the first two types; the third type is a direct deployment on AWS.
Let's generate a network of 5 nodes, organized on 2 levels (the number of levels influences the zone generation using graph approximation, please refer to the paper), with a network width of 20 ms.
Go to the directory ./nodeGen and execute the following command (K represents the number of levels).
go build nodeGenRndCoordGraph.go
./nodeGenRndCoordGraph -N 5 -R 20 -SpaceMax 20 -K 2 -D ../genTests/5_2_20
This generates a random topology with the constraints above. It uses a Cartesian coordinate system and assigns coordinates to the nodes; the distance between the nodes represents their latency. The command creates four output files in the directory ./genTests/5_2_20, which will be read by the experiment:
- coords.txt contains the node coordinates
- coords_simnet.txt contains the same node coordinates, but in a format accepted by simnet
- out.txt contains the node names, their level, and again their coordinates
- pings.txt contains the inter-node latency (ping time); it must be copied to ./simulation/basic before building the Docker image
For the realistic topology, go to the directory ./nodeGen and execute the following commands (K represents the number of levels).
go build nodeGenRndCoordGraph.go
./nodeGenRndCoordGraph -N=5 -K=2 -real=true -D=../genTests/5_2_real
This generates a realistic topology using CAIDA Ark latencies.
The command creates two output files in the directory ./genTests/5_2_real, which will be read by the experiment:
- out.txt contains the node names, their level, and empty coordinates
- pings.txt contains the inter-node latency (ping time); it must be copied to ./simulation/basic before building the Docker image
Let's generate a partition workload. The workload creates two groups: a collaborators group C and a partitioned group P, with C a subset of P. The member nodes of C issue read/write operations on the same key, and the member nodes of P are cut off (all communication down) from the nodes outside the group. The goal is to measure how many pairs of operations among the members of C succeed, in other words, are not affected by failures outside the group P.
We generate the groups C and P by choosing a random node. All nodes at a radius R1 around the node form the group C, and all nodes at a radius R2 around the node form the group P, with R2 > R1, which means that P includes C.
To generate the partition workload, go to the directory ./nodeGen and execute the following command:
go build part.go
./part -dir ../genTests/5_2_20
This generates one output file, part.txt, in the directory ./genTests/5_2_20. The file contains several lines. R1 takes values that are powers of 3, starting at 3 ms.
For every R1, we choose 10 random nodes as centers, and for each center we generate a group C and three groups P, corresponding to R2=R1, R2=2*R1, and R2=5*R1.
Each combination R1 and R2 has a workload of 100 operations among members of C.
One can also create a partition workload for the realistic topology, in the same way:
./part -dir ../genTests/5_2_real
Let's generate a latency workload, which is simply pairs of operations (the first a write, the other a read on the same key) issued by different nodes. The goal is to measure the latency of these operations.
Go to nodeGen, then run:
go build latency.go
./latency -dir=../genTests/5_2_20 -bucketOpPairs=20 -bucketMsWidth=2
This generates one output file, latency.txt, in the directory ./genTests/5_2_20. The file contains several lines.
On the first line it contains parameters bucketOpPairs and bucketMsWidth.
On the second line it contains the workload. The command above generates a workload of 20 operation pairs per bucket, where buckets are 2 ms wide.
The resulting workload length is networkMsWidth / bucketMsWidth * bucketOpPairs.
To generate a latency workload weighted with a wan distribution:
./latency -dir=../genTests/5_2_20 -wan=true -wan_file="princeton-2020.txt" -bucketOpPairs=20 -bucketMsWidth=2
The command reads from the given wan file the bucket size and the per-bucket distributions. The generator outputs pairs of operations just like before.
The available wan files are: waikato-200706.txt, waikato-200707.txt, waikato-200708.txt, waikato-200709.txt and princeton-2020.txt.
If the network diameter is smaller than the span covered by the buckets, the workload generator rescales the buckets to that width. For example,
if the width is 25 ms and buckets are 10 ms wide, then the generator uses the distributions of the first 3 buckets and rescales them to sum to 1.
On large deployments, pair-wise ping measurements between nodes take time. To cut the running time, we recommend having the experiment read the pings from a file, as follows.
Copy the pings.txt file from the topology directory (e.g., ./genTests/5_2_20/pings.txt) to ./simulation/basic.
Go to the directory ./simulation/basic and run make. This builds and uploads a Docker image with the code in the dedis Docker repo.
Important note. All deployments, both local and remote, pull the Docker image from the repository. This means that if you don't rebuild the image, the code changes you make won't be picked up.
Minor note / not recommended. If you'd like the experiment to measure the pings itself, even though the pings file is uploaded, set the constant READ_PINGS=false in service/service.go.
This mode requires Docker and openvpn.
Attention! This run mode does not support partitioning, so when running the partitioning workload, all operations will succeed 100%.
- Install VirtualBox (brew install virtualbox)
- Stop and delete all the Minikube setup (minikube stop + minikube delete --all)
- Configure VirtualBox as the default driver: minikube config set driver virtualbox
- Run again: minikube delete --all
- Re-start Minikube from scratch: minikube start --docker-opt bip=172.18.0.1/16
- It's also good to have the minikube dashboard running
Important: make sure no openvpn process is hanging around, or simply kill them all if you're not running openvpn for something else: sudo killall openvpn.
Also, you can cleanup all pods if the previous run didn't finish correctly:
GO111MODULE=on go run mod.go config.go -do-clean
One can set the resource requirements. At the moment this is a manual process and requires editing the constants in the source code, in simulation/basic/sim/config.go.
- To change the CPU and memory requirements when running a synthetic topology, edit SYNTHETIC_CPU and SYNTHETIC_MEMORY.
- To change the CPU and memory requirements when running a realistic topology, edit REALISTIC_CPU and REALISTIC_MEMORY.
- To change the CPU and memory requirements when running on AWS, edit AWS_CPU and AWS_MEMORY.
We will run the partition workload in three modes: vanilla, autozoned and jurisdiction.
Be warned: these experiments take a long time! That is why we also describe how to run a single sub-workload (one R1,R2 combination). This is useful to check that the experiment works, to rerun a particular workload, etc.
Go to the directory ./simulation/basic/sim and execute the following command:
Vanilla
The following runs the system on Kubernetes, for example on Minikube if the kubectl context points to Minikube.
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalPartitionsFile=../../../genTests/5_2_20/part.txt -pistachioMode=false -workloadFlag=true -i=5
To run locally on Docker, one should add the flag -useDocker=true (this holds for all commands):
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalPartitionsFile=../../../genTests/5_2_20/part.txt -pistachioMode=false -workloadFlag=true -useDocker=true -i=5
The command reads the topology config files, reads sub-workload 5 from the file part.txt (parameter -i=5), and runs the experiment.
It outputs under the topology directory ./genTests/5_2_20 a file ft_vanilla_$i$, where i represents the workload number. The output is of the form
R1=3.000000 R2=6.000000 succeeded 100 infinity 0 specifying how many operation-pairs succeeded.
To run a realistic topology, simply change the width to width=real and the directory in the command. The rest stays the same:
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=real -evalPartitionsFile=../../../genTests/5_2_real/part.txt -pistachioMode=false -workloadFlag=true [-useDocker=true] -i=5
Autozoned
There are two differences from the commands above:
- Change the parameter -pistachioMode=false to -pistachioMode=true.
- Add the parameter -minNodesPerZone=1 (the default is 3). For small local deployments, unless we add this, the code generates zones of at least three nodes.
For completeness, we write the commands below:
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalPartitionsFile=../../../genTests/5_2_20/part.txt -pistachioMode=true -workloadFlag=true [-useDocker=true] -minNodesPerZone=1 -i=5
This generates under the topology directory ./genTests/5_2_20 a file ft_$i$, where i represents the sub-workload number. The output is of the form
R1=3.000000 R2=6.000000 succeeded 100 infinity 0 specifying how many operation-pairs succeeded.
To run a realistic topology, simply change the width to width=real and the directory in the command. The rest stays the same:
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=real -evalPartitionsFile=../../../genTests/5_2_real/part.txt -pistachioMode=true -workloadFlag=true [-useDocker=true] -minNodesPerZone=1 -i=5
Jurisdiction
The two flags relevant here are jurisdictions="2_5_7" and jurisdictionsCenter="node_2" (these are both examples).
jurisdictions specifies we'll create three overlapping (embedded) regions: of 2 ms, 5 ms, and 7 ms diameter, respectively.
The code also adds a default global zone, unless the last radius already comprises the global zone.
The jurisdictionsCenter specifies where to center these zones. If this parameter is missing, the experiment centers them around a random node.
We can run jurisdictions in core CockroachDB mode, which runs CockroachDB as a private cloud on the smallest jurisdiction. Tested only with two jurisdictions, one small and one global, for example:
GO111MODULE=on go run mod.go config.go - -nodes=90 -levels=3 -width=real -evalLatencyFile=../../../genTests/90_3_real/latency.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_32" -i=3 -privateCloud=true
We can run jurisdictions in Limix mode, which runs Limix in the given jurisdictions. Tested only with two jurisdictions, one small and one global, for example:
GO111MODULE=on go run mod.go config.go - -nodes=90 -levels=3 -width=real -evalLatencyFile=../../../genTests/90_3_real/latency.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_32" -i=3
If we don't pass -jurisdictionsCenter, the code chooses a random center for the jurisdictions.
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalPartitionsFile=../../../genTests/5_2_20/part.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=1 -jurisdictions="2_5_7" -i=5
This generates under the topology directory ./genTests/5_2_20 a file ft_$i$_$jurisdictions_$jurisdictionsCenter,
where i represents the sub-workload number, with the same content as above.
Running all partitioning workloads
One can easily automate the experiment run in a script and save the output logs:
GO111MODULE=on go run mod.go config.go -do-clean
for i in {0..30}
do
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalPartitionsFile=../../../genTests/5_2_20/part.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=1 -i=$i | tee logs_partition_$i.txt
done
Vanilla
It's similar to running the partition workload; we just change the input workload by replacing the flag evalPartitionsFile with the flag evalLatencyFile:
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalLatencyFile=../../../genTests/5_2_20/latency.txt -pistachioMode=false -workloadFlag=true
The command reads the topology config files, and runs the workload in latency.txt.
It outputs under the topology directory ./genTests/5_2_20 a file latency_vanilla$i_$j, where i represents the bucketOpPairs
and j represents the bucketMsWidth. The output contains one line per operation pair, the first a write and the second a read, where each line is:
optime-node_2-node_0 41.007681
Autozoned
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalLatencyFile=../../../genTests/5_2_20/latency.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=1
The command outputs under the topology directory ./genTests/5_2_20 a file latency_$i_$j, with the same meaning and structure as above.
Jurisdiction
Probably intuitive by now:
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=20 -evalLatencyFile=../../../genTests/5_2_20/latency.txt -pistachioMode=true -workloadFlag=true -jurisdictions="2_5_7" -jurisdictionsCenter="node_1" -minNodesPerZone=1
The command outputs under the topology directory ./genTests/5_2_20 a file latency_$i_$j_$jurisdictions_$jurisdictionsCenter.txt,
with the same meaning and structure as above.
All
./nodeGenRndCoordGraph -K=2 -N=40 -R=40 -real=true -D="../genTests/40_2_real_1"
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalLatencyFile=../../../genTests/40_2_real/latency.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11"
./part -dir="../genTests/40_2_real" -centerNode="node_11" -jurisdictions="50" -wan=true -wan_file="princeton-2020.txt" -workloadLen=1000
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalLatencyFile=../../../genTests/40_2_real/latency.txt -pistachioMode=false -workloadFlag=true -minNodesPerZone=3
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part.txt -pistachioMode=false -workloadFlag=true -minNodesPerZone=3
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalLatencyFile=../../../genTests/40_2_real/latency.txt -pistachioMode=false -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11" -privateCloud=true
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalLatencyFile=../../../genTests/40_2_real/latency.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11"
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11"
JURISDICTIONS
private:
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=false -workloadFlag=false -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11" -privateCloud=true
geo:
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=false -workloadFlag=true -minNodesPerZone=3
limix:
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11"
limix multi-jurisdiction: GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=1 -jurisdictions="physalia"
Physalia multi: GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=false -workloadFlag=true -minNodesPerZone=1 -physalia=true
Deployment on AWS using microk8s
Configure Docker to accept the endpoint by adding:
"insecure-registries" : ["3.70.69.67:32000"]
to
~/.docker/daemon.json
Then build the image:
docker build . -t 3.70.69.67:32000/dedis/cruxsim
and push it
docker push 3.70.69.67:32000/dedis/cruxsim
Note to self: should make a Makefile.
GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=real -evalPartitionsFile=../../../genTests/5_2_real/part.txt -pistachioMode=true -workloadFlag=false -minNodesPerZone=3 -awsIR=true
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11" | tee limix_jur.txt
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=false -workloadFlag=false -minNodesPerZone=3 -jurisdictions="50" -jurisdictionsCenter="node_11" -privateCloud=true
GO111MODULE=on go run mod.go config.go - -nodes=40 -levels=2 -width=real -evalPartitionsFile=../../../genTests/40_2_real/part_aux.txt -pistachioMode=false -workloadFlag=false -minNodesPerZone=3 -physalia=true
This works! GO111MODULE=on go run mod.go config.go - -nodes=5 -levels=2 -width=real -evalPartitionsFile=../../../genTests/5_2_real/part.txt -pistachioMode=true -workloadFlag=true -minNodesPerZone=1 -awsIR=true