| 1 | +# ArangoDB Starter Architecture |
| 2 | + |
| 3 | +## What does the Starter do |
| 4 | + |
| 5 | +The ArangoDB Starter is a program used to create ArangoDB database deployments |
| 6 | +on bare-metal (or virtual machines) with ease. |
| 7 | +It enables you to create everything from a simple Single server instance |
| 8 | +to a full-blown Cluster with datacenter-to-datacenter replication in under 5 minutes. |
| 9 | + |
| 10 | +The Starter is intended to be used in environments where there is no higher-level |
| 11 | +orchestration system (e.g. Kubernetes or DC/OS) available. |
| 12 | + |
| 13 | +## Starter versions |
| 14 | + |
| 15 | +The Starter is a separate process in a binary called `arangodb` (or `arangodb.exe` on Windows). |
| 16 | +This binary has its own version number that is independent of the ArangoDB (database) |
| 17 | +version. |
| 18 | + |
| 19 | +This means that Starter version `a.b.c` can be used to run deployments |
| 20 | +of ArangoDB databases with different versions. |
| 21 | +For example, the Starter with version `0.11.2` can be used to create |
| 22 | +ArangoDB deployments with ArangoDB version `3.2.<something>` as well |
| 23 | +as deployments with ArangoDB version `3.3.<something>`. |
| 24 | + |
| 25 | +It also means that you can update the Starter independently from the ArangoDB |
| 26 | +database. |
| 27 | + |
| 28 | +Note that the Starter is also included in all binary ArangoDB packages. |
| 29 | + |
| 30 | +To find the versions of your Starter and ArangoDB database, run the following commands: |
| 31 | + |
| 32 | +```bash |
| 33 | +# To get the Starter version |
| 34 | +arangodb --version |
| 35 | +# To get the ArangoDB database version |
| 36 | +arangod --version |
| 37 | +``` |
| 38 | + |
| 39 | +## Starter deployment modes |
| 40 | + |
| 41 | +The Starter supports 3 different modes of ArangoDB deployments: |
| 42 | + |
| 43 | +1. Single server |
| 44 | +1. Active failover |
| 45 | +1. Cluster |
| 46 | + |
| 47 | +Note: Datacenter replication is an option for the `cluster` deployment mode. |
| 48 | + |
| 49 | +You select one of these modes using the `--starter.mode` command line option. |
| 50 | + |
| 51 | +Depending on the mode you've selected, the Starter launches one or more |
| 52 | +(`arangod` / `arangosync`) server processes. |
| 53 | + |
| 54 | +No matter which mode you select, the Starter always provides you with |
| 55 | +a common directory structure for storing the servers' data, configuration & log files. |
| 56 | + |
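As a rough illustration of the mode selection described above, the sketch below validates a mode name and prints the resulting command line instead of running it. The helper name `starter_command` is hypothetical and not part of the Starter; the values `single`, `activefailover` and `cluster` are assumed to be the ones `--starter.mode` accepts.

```bash
# Hypothetical dry-run helper (not part of the Starter): prints the command
# line for a given deployment mode instead of executing it.
starter_command() {
  local mode="$1"
  case "$mode" in
    single|activefailover|cluster) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "arangodb --starter.mode=$mode"
}

starter_command single    # prints: arangodb --starter.mode=single
starter_command cluster   # prints: arangodb --starter.mode=cluster
```

Printing the command first makes it easy to review what would be launched before starting any server processes.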
| 57 | +## Starter operating modes |
| 58 | + |
| 59 | +The Starter can run as a normal process directly on the host operating system, |
| 60 | +or as a container in a Docker runtime. |
| 61 | + |
| 62 | +When running as a normal process directly on the host operating system, |
| 63 | +the Starter launches the servers as child processes and monitors those. |
| 64 | +If one of the server processes terminates, a new one is started automatically. |
| 65 | + |
| 66 | +When running in a Docker container, the Starter launches the servers |
| 67 | +as separate Docker containers that share the volume namespace with |
| 68 | +the container that runs the Starter. It monitors those containers |
| 69 | +and if one terminates, a new container is launched automatically. |
| 70 | + |
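The restart behaviour for child processes can be pictured as a small supervision loop. This is only a sketch of the idea, not the Starter's actual implementation: the function name `supervise` and the `max_starts` limit (added so the example terminates) are assumptions.

```bash
# Sketch of a restart loop: run a command, and start it again whenever it exits.
# max_starts exists only so this example terminates; a real supervisor
# would typically keep going until it is told to stop.
supervise() {
  local max_starts="$1"
  shift
  local starts=0
  local rc
  while [ "$starts" -lt "$max_starts" ]; do
    starts=$((starts + 1))
    rc=0
    "$@" || rc=$?   # blocks until the child process terminates
    echo "child exited with status $rc (start $starts of $max_starts)"
  done
}

# Example: a "server" that stops right away is started three times.
supervise 3 sh -c 'exit 1'
```

The same pattern would apply to containers, with the blocking call replaced by whatever waits for the container to stop.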
| 71 | +## Starter data-directory |
| 72 | + |
| 73 | +The Starter uses a single directory with a well-known structure to store |
| 74 | +all data for its own configuration & logs, as well as the configuration, |
| 75 | +data & logs of all servers it starts. |
| 76 | + |
| 77 | +This data directory is set using the `--starter.data-dir` command line option. |
| 78 | +It contains the following files & sub-directories. |
| 79 | + |
| 80 | +- `setup.json` The configuration of the "cluster of Starters". |
| 81 | + For details see below. DO NOT edit this file. |
| 82 | +- `arangodb.log` The log file of the Starter |
| 83 | +- `single<port>`, `agent<port>`, `coordinator<port>`, `dbserver<port>`: directories for |
| 84 | + launched servers. These directories contain, among others, the following files: |
| 85 | + - `apps`: A directory with Foxx applications |
| 86 | + - `data`: A directory with database data |
| 87 | + - `arangod.conf`: The configuration file for the server. Editing this file is possible, but not recommended. |
| 88 | + - `arangod.log`: The log file of the server |
| 89 | + - `arangod_command.txt`: File containing the exact command line of the started server (for debugging purposes only) |
| 90 | + |
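To get a quick overview of such a data directory, a small helper can list the per-server sub-directories. This is a sketch that relies only on the naming scheme described above; the function name `list_server_dirs` and the port numbers in the example are made up for illustration.

```bash
# List the per-server sub-directories inside a Starter data directory.
# Relies only on the naming scheme described above (<role><port>).
list_server_dirs() {
  local data_dir="$1"
  local dir
  for dir in "$data_dir"/single* "$data_dir"/agent* \
             "$data_dir"/coordinator* "$data_dir"/dbserver*; do
    # Unmatched globs stay literal, so keep only real directories.
    [ -d "$dir" ] && basename "$dir"
  done
  return 0
}

# Example with a throwaway directory standing in for --starter.data-dir.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/agent8531/data" "$demo_dir/coordinator8529" "$demo_dir/dbserver8530"
touch "$demo_dir/setup.json" "$demo_dir/arangodb.log"
list_server_dirs "$demo_dir"
rm -rf "$demo_dir"
```

The helper only reads directory names, so it never touches `setup.json` or any server data.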
| 91 | +## Running on multiple machines |
| 92 | + |
| 93 | +For the `activefailover` & `cluster` modes, it is required to run multiple |
| 94 | +Starters, as every Starter will only launch a subset of all servers needed |
| 95 | +to form the entire deployment. |
| 96 | +For example, in `cluster` mode, a Starter will launch a single agent, a single dbserver |
| 97 | +and a single coordinator. |
| 98 | + |
| 99 | +It is the responsibility of the user to run the Starter on multiple machines such |
| 100 | +that enough servers are started to form the entire deployment. |
| 101 | +The minimum number of Starters needed is 3. |
| 102 | + |
| 103 | +The Starters running on those machines need to know about each other's existence. |
| 104 | +In order to do so, the Starters form a "cluster" of their own (not to be confused |
| 105 | +with the ArangoDB database cluster). |
| 106 | +This cluster of Starters is formed from the values given to the `--starter.join` |
| 107 | +command line option. You should pass the addresses (`<host>:<port>`) of all Starters. |
| 108 | + |
| 109 | +For example, a typical command line for a cluster deployment looks like this: |
| 110 | + |
| 111 | +```bash |
| 112 | +arangodb --starter.mode=cluster --starter.join=hostA:8528,hostB:8528,hostC:8528 |
| 113 | +# this command is run on hostA, hostB and hostC. |
| 114 | +``` |
| 115 | + |
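When the list of machines comes from a script or an inventory, the value for `--starter.join` can be assembled rather than typed by hand. The helper `join_list` below is a hypothetical convenience function, and it only prints the command line; port `8528` is simply the one used in the example above.

```bash
# Build the value for --starter.join from a port and a list of host names.
join_list() {
  local port="$1"
  shift
  local result=""
  local host
  for host in "$@"; do
    # Prepend a comma only when result already has an entry.
    result="${result:+$result,}$host:$port"
  done
  echo "$result"
}

# Dry run: print the command that would be run on every host.
echo "arangodb --starter.mode=cluster --starter.join=$(join_list 8528 hostA hostB hostC)"
```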
| 116 | +The state of the cluster (of Starters) is stored in a configuration file called |
| 117 | +`setup.json` in the data directory of every Starter. The ArangoDB |
| 118 | +agency is used to elect a master among all Starters. |
| 119 | + |
| 120 | +The master Starter is responsible for maintaining the list of all Starters |
| 121 | +involved in the cluster and their addresses. The slave Starters (all Starters |
| 122 | +except the elected master) fetch this list from the master Starter on a regular |
| 123 | +basis and store it in their own `setup.json` config file. |
| 124 | + |
| 125 | +Note: The `setup.json` config file MUST NOT be edited manually. |
| 126 | + |
| 127 | +## Running on multiple machines (under the hood) |
| 128 | + |
| 129 | +As mentioned above, when the Starter is used to create an `activefailover` |
| 130 | +or `cluster` deployment, it first creates a "cluster" of Starters. |
| 131 | + |
| 132 | +These are the steps taken by the Starters to bootstrap such a deployment |
| 133 | +from scratch. |
| 134 | + |
| 135 | +1. All Starters are started (either manually or by some supervisor). |
| 136 | +1. All Starters try to read their config from `setup.json`. |
| 137 | + If that file exists and is valid, this bootstrap-from-scratch process |
| 138 | + is aborted and all Starters go directly to the `running` phase described below. |
| 139 | +1. All Starters create a unique ID. |
| 140 | +1. The list of `--starter.join` arguments is sorted. |
| 141 | +1. Each Starter requests the unique ID from the first server in the sorted `--starter.join` list |
| 142 | + and compares the result with its own unique ID. |
| 143 | +1. The Starter that finds its own unique ID continues as the `bootstrap master`; |
| 144 | + the other Starters continue as `bootstrap slaves`. |
| 145 | +1. The `bootstrap master` waits for at least 2 `bootstrap slaves` to join it. |
| 146 | +1. The `bootstrap slaves` contact the `bootstrap master` to join its cluster of Starters. |
| 147 | +1. Once the `bootstrap master` has received enough (at least 2) requests |
| 148 | + to join its cluster of Starters, it continues with the `running` phase. |
| 149 | +1. The `bootstrap slaves` keep asking the `bootstrap master` about its state. |
| 150 | + As soon as they receive confirmation to do so, they also continue with the `running` phase. |
| 151 | + |
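The role decision in the bootstrap steps above can be sketched as a small simulation. Everything here is illustrative: the lookup table `STARTERS` stands in for the network request between Starters, the IDs are invented, and byte-wise sorting of the join list is an assumption, since the text only says the list is sorted.

```bash
# Simulated "cluster of Starters": one address and unique ID per line.
# In reality each Starter would ask the other over the network; here a
# lookup table stands in for that request.
STARTERS="hostC:8528 id-c
hostA:8528 id-a
hostB:8528 id-b"

# The first address of the sorted join list (byte-wise order is an assumption).
first_in_join_list() {
  echo "$1" | tr ',' '\n' | LC_ALL=C sort | head -n 1
}

# Stand-in for "request the unique ID from a Starter".
id_of() {
  echo "$STARTERS" | awk -v addr="$1" '$1 == addr { print $2 }'
}

# Decide the bootstrap role of the Starter with the given unique ID.
bootstrap_role() {
  local my_id="$1"
  local join="$2"
  local first
  first="$(first_in_join_list "$join")"
  if [ "$(id_of "$first")" = "$my_id" ]; then
    echo "bootstrap master"
  else
    echo "bootstrap slave"
  fi
}

JOIN="hostC:8528,hostA:8528,hostB:8528"
bootstrap_role id-a "$JOIN"   # hostA sorts first, so id-a is the bootstrap master
bootstrap_role id-b "$JOIN"   # every other ID ends up as a bootstrap slave
```

Because every Starter sorts the same list, they all query the same address, so exactly one of them recognises its own ID in the answer.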
| 152 | +In the `running` phase all Starters launch the desired servers and keep monitoring those |
| 153 | +servers. Once a functional agency is detected, all Starters will try to become the |
| 154 | +`running master` by trying to write their ID in a well-known location in the agency. |
| 155 | +The first Starter to succeed in doing so wins this master election. |
| 156 | + |
| 157 | +The `running master` will keep writing its ID in the agency in order to remain |
| 158 | +the `running master`. Since this ID is written with a short time-to-live, |
| 159 | +other Starters are able to detect when the current `running master` has been stopped |
| 160 | +or is no longer responsive. In that case the remaining Starters will perform |
| 161 | +another master election to decide who will be the next `running master`. |
| 162 | + |
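The election with a short time-to-live can be pictured with a minimal simulation. This is a sketch under stated assumptions, not the Starter's code or the agency API: two shell variables stand in for the agency key, time is passed in explicitly as a number, and the names `try_become_master`, `LEASE_HOLDER` and `LEASE_EXPIRES` are invented for the example.

```bash
# Simulated agency key: who holds it and when the entry expires.
# These variables stand in for a key with a time-to-live in the agency.
LEASE_HOLDER=""
LEASE_EXPIRES=0

# try_become_master ID NOW TTL
# Succeeds if the key is free, expired, or already held by ID (renewal).
try_become_master() {
  local id="$1"
  local now="$2"
  local ttl="$3"
  if [ -z "$LEASE_HOLDER" ] || [ "$now" -ge "$LEASE_EXPIRES" ] || [ "$LEASE_HOLDER" = "$id" ]; then
    LEASE_HOLDER="$id"
    LEASE_EXPIRES=$((now + ttl))
    return 0
  fi
  return 1
}

try_become_master starter-1 0 5 && echo "t=0: starter-1 is running master"
try_become_master starter-2 1 5 || echo "t=1: starter-2 lost, master is $LEASE_HOLDER"
try_become_master starter-1 4 5 && echo "t=4: starter-1 renewed until t=$LEASE_EXPIRES"
# starter-1 stops renewing; its entry expires at t=9.
try_become_master starter-2 9 5 && echo "t=9: starter-2 is running master"
```

A real agency would have to make the check and the write a single atomic operation; the simulation skips that because it runs in one process.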
| 163 | +API requests that involve the state of the cluster of Starters are always answered |
| 164 | +by the current `running master`. All other Starters will refer the request to |
| 165 | +the current `running master`. |