Commit ddb27eb

Merge pull request #180 from arangodb-helper/design/upgrade-procedure
Documented changed upgrade procedure design
2 parents 7836d87 + 3128653 commit ddb27eb

1 file changed (+54, -23 lines changed)

docs/upgrade_spec.md

Lines changed: 54 additions & 23 deletions
@@ -21,33 +21,64 @@ If the deployment mode is `single`, the Starter will:
2121
The server will perform the auto-upgrade and then stop.
2222
After that the Starter will automatically restart it with its normal arguments.
2323

24-
If the deployment mode is `activefailover`, the Starters will:
24+
If the deployment mode is `activefailover` or `cluster`, the Starters will:
2525

2626
- Perform a version check on all servers, ensuring it supports the upgrade procedure.
2727
TODO: Specify minimal patch version for 3.2, 3.3 & 3.4,
2828
- Turning off supervision in the Agency and wait for it to be confirmed.
29-
- Restarting one agent at a time with an additional `--database.auto-upgrade=true` argument.
30-
The agent will perform the auto-upgrade and then stop.
31-
After that the Starter will automatically restart it with its normal arguments.
32-
- Restarting one resilient single server at a time with an additional `--database.auto-upgrade=true` argument.
33-
This server will perform the auto-upgrade and then stop.
34-
After that the Starter will automatically restart it with its normal arguments.
35-
- Turning on supervision in the Agency and wait for it to be confirmed.
29+
- Create an upgrade plan and store it in the agency.
30+
The plan consists of upgrade entries for all agents,
31+
followed by upgrade entries for all single servers (in case of `activefailover`),
32+
followed by upgrade entries for all dbservers (in case of `cluster`),
33+
followed by upgrade entries for all coordinators (in case of `cluster`),
34+
followed by upgrade entries for all sync masters (in case of `cluster` & sync),
35+
followed by upgrade entries for all sync workers (in case of `cluster` & sync),
36+
followed by an entry to re-enable agency supervision.
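
The ordering of the upgrade plan described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Starter's implementation: the function name `build_upgrade_plan`, the entry dictionaries, and their `type`/`server` keys are hypothetical, since the spec does not define how entries are stored in the agency.

```python
from typing import Dict, List, Sequence


def build_upgrade_plan(
    mode: str,
    agents: Sequence[str],
    singles: Sequence[str] = (),
    dbservers: Sequence[str] = (),
    coordinators: Sequence[str] = (),
    sync_masters: Sequence[str] = (),
    sync_workers: Sequence[str] = (),
) -> List[Dict[str, str]]:
    """Return plan entries in the order given by the spec.

    The entry format ({"type": ..., "server": ...}) is an assumption
    made for this sketch only.
    """
    if mode not in ("activefailover", "cluster"):
        raise ValueError(f"no upgrade plan is used for mode {mode!r}")

    # Agents always come first.
    groups = [("agent", agents)]
    if mode == "activefailover":
        groups.append(("single", singles))
    else:
        groups += [
            ("dbserver", dbservers),
            ("coordinator", coordinators),
            ("syncmaster", sync_masters),
            ("syncworker", sync_workers),
        ]

    plan = [
        {"type": kind, "server": server}
        for kind, servers in groups
        for server in servers
    ]
    # The final entry re-enables agency supervision.
    plan.append({"type": "enable-supervision"})
    return plan
```

Keeping the plan as one flat, ordered list means each Starter only ever has to look at the first entry.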
3637

37-
If the deployment mode is `cluster`, the Starters will:
38+
Every Starter will monitor the agency for an upgrade plan.
39+
As soon as it detects an upgrade plan, it will inspect the first entry
40+
of that plan.
3841

39-
- Perform a version check on all servers, ensuring it supports the upgrade procedure.
40-
TODO: Specify minimal patch version for 3.2, 3.3 & 3.4,
41-
- Turning off supervision in the Agency and wait for it to be confirmed.
42-
- Restarting one agent at a time with an additional `--database.auto-upgrade=true` argument.
43-
The agent will perform the auto-upgrade and then stop.
44-
After that the Starter will automatically restart it with its normal arguments.
45-
- Restarting one dbserver at a time with an additional `--database.auto-upgrade=true` argument.
46-
This dbserver will perform the auto-upgrade and then stop.
47-
After that the Starter will automatically restart it with its normal arguments.
48-
- Restarting one coordinator at a time with an additional `--database.auto-upgrade=true` argument.
49-
This coordinator will perform the auto-upgrade and then stop.
50-
After that the Starter will automatically restart it with its normal arguments.
51-
- Turning on supervision in the Agency and wait for it to be confirmed.
42+
If the first entry involves a server that is under control of this Starter,
43+
it will restart the server once with an additional
44+
`--database.auto-upgrade=true` argument.
45+
This server will perform the auto-upgrade and then stop.
46+
The Starter will wait for this server to terminate and restart it with all
47+
the usual arguments.
48+
49+
Once the Starter has confirmed that the newly started server is up and running,
50+
it will remove the first entry from the upgrade plan.
51+
52+
If the first entry of the upgrade plan is a re-enable agency supervision
53+
item, the leader Starter will re-enable agency supervision and mark
54+
the upgrade plan as ready.
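
A single monitoring step of one Starter, as described above, might look like the following sketch. It models the plan as an in-memory list and assumes the hypothetical entry format `{"type": ..., "server": ...}`; the callbacks `upgrade_server` and `enable_supervision` are placeholders for the real work (restarting with `--database.auto-upgrade=true`, waiting for termination, restarting normally, and talking to the agency).

```python
from typing import Callable, Dict, List, Set


def process_first_entry(
    plan: List[Dict[str, str]],
    own_servers: Set[str],
    is_leader: bool,
    upgrade_server: Callable[[str], None],
    enable_supervision: Callable[[], None],
) -> str:
    """Run one step of a Starter against the shared plan.

    Returns a short label for the action taken. All names here are
    assumptions made for this sketch.
    """
    if not plan:
        return "no-plan"

    entry = plan[0]
    if entry["type"] == "enable-supervision":
        # Only the leader Starter handles the final entry.
        if not is_leader:
            return "waiting"
        enable_supervision()
        plan.pop(0)
        return "ready"

    if entry["server"] not in own_servers:
        # Another Starter controls this server.
        return "waiting"

    # Placeholder for: restart with --database.auto-upgrade=true, wait for
    # the server to stop, restart with the usual arguments, confirm it is up.
    upgrade_server(entry["server"])
    plan.pop(0)
    return "upgraded"
```

Because only the Starter that controls the server named in the first entry acts on it, servers are upgraded one at a time without any further coordination between Starters.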
55+
56+
## Upgrade state inspection
57+
58+
A new API will be added to inspect the current state of the upgrade process.
59+
60+
This API will be a `GET` request to `/database-auto-upgrade`, resulting in
61+
a JSON object with the following fields:
62+
63+
- `ready` a boolean that is `true` when the plan has been finished successfully,
64+
or `false` otherwise.
65+
- `failed` a boolean that is `true` when the plan has resulted in an
66+
upgrade failure, or `false` otherwise.
67+
- `reason` a string that describes the state of the upgrade plan in a
68+
human-readable form.
69+
- `servers_upgraded` an array containing objects describing the servers that have
70+
been upgraded.
71+
- `servers_remaining` an array containing objects describing the servers that
72+
have not yet been upgraded.
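
A client could interpret the fields listed above roughly as in this sketch. The function name `summarize_upgrade_status` is hypothetical, and because the spec does not define the contents of the objects in `servers_upgraded` and `servers_remaining`, the code only counts them.

```python
import json
from typing import Any, Dict


def summarize_upgrade_status(body: str) -> str:
    """Turn the JSON body of GET /database-auto-upgrade into one line.

    Only the field names come from the spec; the wording of the summary
    is an assumption made for this sketch.
    """
    status: Dict[str, Any] = json.loads(body)
    done = len(status.get("servers_upgraded", []))
    left = len(status.get("servers_remaining", []))

    # A failure takes precedence over any other state.
    if status.get("failed"):
        return f"failed after {done} server(s): {status.get('reason', '')}"
    if status.get("ready"):
        return f"finished, {done} server(s) upgraded"
    return f"in progress, {done} upgraded, {left} remaining"
```

Checking `failed` before `ready` keeps a failed plan from being reported as still in progress.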
73+
74+
## Failures
75+
76+
If the upgrade procedure of one of the servers fails (for whatever reason),
77+
the Starter that performs that part of the procedure will mark the
78+
first entry of the upgrade plan as failed.
79+
No Starter will act on an upgrade plan entry that is marked failed.
80+
81+
A new API will be added to remove the failure flag from the first entry
82+
of an upgrade plan, such that the procedure can be retried.
5283

53-
Once all servers in the starter have upgraded, repeat the procedure for the next starter.
84+
This API will be a `POST` request to `/database-auto-upgrade/retry`.
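
As an illustration, the retry call could be prepared with Python's standard library as below. The helper name `build_retry_request` is hypothetical, and the Starter address passed to it is an assumption; only the method and path come from the spec.

```python
import urllib.request


def build_retry_request(starter_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that clears the failure flag.

    `starter_url` is the base URL of a Starter; its value is an assumption
    of the caller and is not defined by the spec.
    """
    url = starter_url.rstrip("/") + "/database-auto-upgrade/retry"
    # An empty body is sent; the spec does not describe any payload.
    return urllib.request.Request(url, data=b"", method="POST")
```

Passing the result to `urllib.request.urlopen` would send the request, for example `urllib.request.urlopen(build_retry_request("http://localhost:8528"))`, where the host and port are placeholders for a real Starter endpoint.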
