Using Patroni in multisite mode
===============================

.. _multisite_introduction:

Introduction
++++++++++++

The multisite mode has been developed to increase the resilience of Patroni setups spanning multiple sites against temporary outages. In multisite mode each site runs a separate Patroni cluster with its own DCS and is able to perform leader switches (switchovers and failovers) just like a regular Patroni cluster. On top of this, there is a global DCS for leader site election, which coordinates which site is the primary and which are the standbys. In each site the local leader instance is responsible for the global leader site election. The site that acquires the leader lock runs Patroni normally; the other sites configure themselves as standby clusters.

.. _multisite_when_to_use:

When to use multisite mode
--------------------------

If network reliability and bandwidth between sites are good and latency is low (<10ms), multisite mode is most likely not useful. Instead, a simple Patroni cluster that spans the two sites will be a simpler and more robust solution.

Multisite mode is useful when automatic cross-site failover is needed, but the failover mechanism must be much more resilient against temporary outages. It is also useful when cluster member IP addresses are not globally routable and cross-site communication needs to pass through an externally visible proxy address.

.. _multisite_dcs_considerations:

DCS considerations
------------------

There are multiple possible ways of setting up DCS for multisite mode, but in every case two separate concerns are covered. One is the local DCS, which backs the site-local actions of Patroni. In addition, there is the global DCS, which is responsible for keeping track of the site state.

.. _multisite_global_dcs:

Global DCS
~~~~~~~~~~

Here is a typical deployment architecture for using multisite mode:

.. image:: _static/multisite-architecture.png


.. _multisite_cross_site_latency:

Cross-site latency
##################

If the network latencies between sites are very high, the DCS might require special tuning. For example, etcd uses a heartbeat interval of 100 ms and an election timeout of 1 s by default. If the round-trip time between sites is more than 100 ms, these values should be increased.
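
A minimal sketch of such tuning (the values below are illustrative assumptions for a round-trip time of roughly 100-200 ms, not recommendations; etcd suggests keeping the election timeout several times larger than the heartbeat interval):

.. code-block:: yaml

   # etcd configuration file; values are in milliseconds.
   # The equivalent command-line flags are --heartbeat-interval and --election-timeout.
   heartbeat-interval: 500
   election-timeout: 5000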

.. _multisite_local_dcs:

Local DCS
~~~~~~~~~

This is no different from a usual Patroni setup.


.. _multisite_op_howto:

Operational how-tos
+++++++++++++++++++

.. _multisite_installation:

Installation
------------

.. _multisite_installation_linux:

Linux
~~~~~

.. _multisite_installation_linux_prerequisites:

Prerequisites
#############

Before starting the installation, Python 3 and the matching pip binary have to be installed on the system.

Patroni stores its state and some of its configuration in a distributed configuration store (DCS). You have to install one of the possible solutions, e.g. etcd 3.5 (https://etcd.io/docs/v3.5/install/).
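
A quick way to verify the prerequisites (a sketch, assuming etcd was chosen as the DCS):

.. code-block:: bash

   # Python 3 and the matching pip must be available
   python3 --version
   python3 -m pip --version

   # the chosen DCS, e.g. etcd, must be installed
   etcd --version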


.. _multisite_installation_linux_steps:

Installation steps
##################

As systemd is now the de facto init system across Linux distributions, we use it in the steps below.

#. Download and unpack the source from https://github.com/cybertec-postgresql/patroni/archive/refs/heads/multisite.zip
#. ``cd`` to the resulting ``patroni`` directory
#. ``pip install -r requirements.txt``
#. ``pip install psycopg``
#. create the Patroni config (see Configuration below)
#. to run Patroni as a systemd service, create a systemd unit config based on the linked example (a minimal sketch is shown after this list): https://github.com/patroni/patroni/blob/master/extras/startup-scripts/patroni.service
#. start Patroni with ``[sudo] systemctl start patroni``
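
A minimal sketch of such a unit file (the paths ``/usr/local/bin/patroni`` and ``/etc/patroni/patroni.yml`` are assumptions for a pip-based install; see the linked example for a complete version):

.. code-block:: ini

   [Unit]
   Description=Patroni PostgreSQL high-availability manager
   After=network.target

   [Service]
   Type=simple
   User=postgres
   Group=postgres
   # adjust the binary and config paths to your installation
   ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
   # reload the configuration without restarting Patroni
   ExecReload=/bin/kill -s HUP $MAINPID
   # only kill the patroni process itself, not its children (PostgreSQL)
   KillMode=process
   Restart=no

   [Install]
   WantedBy=multi-user.target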


.. _multisite_installation_windows:

Windows
~~~~~~~

You can use Cybertec's packaged versions from https://github.com/cybertec-postgresql/…

If you need, for example, a different PostgreSQL version from what's provided, open a GitHub issue there, and a new release will soon be prepared.


.. _multisite_configuration:

Configuration
-------------

Configuring multisite mode is done using a top-level ``multisite`` section in the Patroni configuration file.

The configuration is very similar to the usual Patroni config. In fact, the keys and their respective values under ``multisite`` obey the same rules as those in a conventional configuration.

An example configuration for two Patroni sites:

    ttl: 90
    retry_timeout: 40


.. _multisite_config_parameters:

Details of the configuration parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``retry_timeout``
    How long the global etcd cluster can be inaccessible before the cluster is demoted. Must be a few times longer than the usual ``retry_timeout`` value in order to prevent unnecessary site failovers.


.. _multisite_config_passwords:

Passwords in the YAML configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As all standby sites replicate from the leader, users and their passwords are the same on each Postgres node. Therefore the YAML configuration should specify the same password for each user under ``postgresql.authentication``.
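
A minimal sketch of the relevant configuration (user names and passwords below are placeholders; what matters is that every site uses identical values):

.. code-block:: yaml

   postgresql:
     authentication:
       superuser:
         username: postgres
         password: same-superuser-password-on-every-site
       replication:
         username: replicator
         password: same-replication-password-on-every-site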


.. _multisite_site_failover:

Site failover
-------------

In case the multisite leader lock is not updated for at least the time specified by the multisite TTL, the standby leader(s) of the other site(s) will try to acquire the lock. If successful, that standby leader is promoted to a proper leader. As a result, the Postgres primary instance will now be found in a new site.


.. _multisite_restore_order_after_failover:

Restoring the old leader site after site failover
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once the problems leading to the site failover are resolved, the old leader site will be able to join the multisite cluster as a standby leader. No automatic attempt is made to restore the original order - that is, if desired, switching back to the old leader site must be done manually, via a site switchover.

Applications should be ready to connect to the new primary. See :ref:`multisite_connection_to_cluster` for more details.


.. _multisite_site_switchover:

Site switchover
---------------

When circumstances make it necessary to switch the location of the Postgres primary from one site to another, this can be done by performing a site switchover. Just like a normal switchover, a site switchover can be initiated using ``patronictl`` (or, alternatively, a call to the REST API). The CTL command is as simple as::

    patronictl site-switchover

Answer the prompts as you would with other ``patronictl`` commands.

The API call could look like the following (replace 'dc2' with the desired site name):

Applications should be ready to connect to the new primary. See :ref:`multisite_connection_to_cluster` for more details.


.. _multisite_connection_to_cluster:

Connecting to a multisite cluster
---------------------------------

There are multiple ways one could set up application connections to a multisite cluster:

1. Single IP address using HAProxy

This is the simplest from the application standpoint, but setting it up is the most complex of all the listed solutions (extra node(s) for HAProxy itself, and Keepalived for ensuring HAProxy's availability). Unless you need the load balancing features HAProxy provides, you should probably choose one of the other methods.

2. Multi-host connection strings

With this solution, all potential primary instances are listed in the connection string. To ensure connections land on the primary, the connection failover feature of the DB driver should be used (``targetServerType=primary`` for `JDBC <https://jdbc.postgresql.org/documentation/use/#connection-fail-over>`_, ``target_session_attrs="read-write"`` for `libpq <https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-MULTIPLE-HOSTS>`_, ``TargetSessionAttributes.Primary`` for .NET's `Npgsql <https://www.npgsql.org/doc/failover-and-load-balancing.html?tabs=7>`_). The big advantage of this solution is that it doesn't require any extra setup on the DB side (an example connection string is sketched after this list). A disadvantage can be that with many nodes (e.g. two sites with three nodes each) it can take a while to open a connection. This is less of a problem when using connection poolers.

3. Per-site endpoint IP combined with multi-host connection strings

`vip-manager <https://github.com/cybertec-postgresql/vip-manager/>`_ provides a relatively easy way of maintaining a single IP address that always points to the leader of a single site. One could set it up for each site, and then use the endpoint IPs in a multi-host connection string as described above. As the number of addresses to check is smaller than in (2), establishing a connection is faster on average. The downside is the added complexity (vip-manager has to be installed on the Patroni nodes, and configured to pull the necessary information from DCS).
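
As an illustration of option 2 (host names, ports and the database name are hypothetical), a libpq-style multi-host connection URI that always targets the current primary could look like this:

.. code-block:: text

   postgresql://app_user@pg1.dc1:5432,pg2.dc1:5432,pg1.dc2:5432,pg2.dc2:5432/appdb?target_session_attrs=read-write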


.. _multisite_transforming_standby_to_multisite:

Transforming an existing setup into multisite
---------------------------------------------

If the present setup consists of a standby cluster replicating from a leader site:

1.1 if a separate DCS cluster is going to be used, set up the new cluster as usual (one node in both Patroni sites, and a third node in a third site)
2. Enable multisite on the leader site's Patroni cluster
2.1 apply the multisite config to all nodes' Patroni config files
2.2 reload the local configuration on the leader site cluster's nodes (``patronictl reload``)
2.3 check if ``patronictl list`` shows an extra line saying 'Multisite <leader-site> is leader'
3. Enable multisite on the standby cluster
3.1 repeat the steps from 2. on the standby cluster
3.2 after reloading the config, you should see ``patronictl list`` saying 'Multisite <standby-site> is standby, replicating from <leader-site>'
4. Remove the ``standby_cluster`` specification from the dynamic config
4.1 use ``patronictl edit-config`` to remove all lines belonging to the standby cluster definition

If the present setup is one Patroni cluster spanning two sites, first turn that setup into a standby cluster setup, and then perform the above steps to enable multisite.

Moving from an existing Postgres setup to multisite can be achieved by setting up a full multisite cluster that is still replicating from the original primary, using the usual standby cluster specification, this time on the leader site's cluster. On cutover, simply remove the standby cluster specification, thus promoting the leader site.
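
For reference, the ``standby_cluster`` section that is removed at cutover might look roughly like the following in the dynamic configuration (host, port and slot name are placeholders):

.. code-block:: yaml

   standby_cluster:
     host: primary.dc1.example.com
     port: 5432
     primary_slot_name: standby_site_dc2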