Commit d6be415

Jonathan S. Katz (jkatz) authored and committed

Add documentation for the multi-cluster Kubernetes deployments

This is a pretty neat feature, so it is good to have it documented on how it can be used.

1 parent 20f608f commit d6be415

4 files changed (+435, -10 lines)

---
title: "Kubernetes Multi-Cluster Deployments"
date:
draft: false
weight: 300
---

![PostgreSQL Operator High-Availability Overview](/images/postgresql-ha-multi-data-center.png)

Advanced [high-availability]({{< relref "/architecture/high-availability/_index.md" >}})
and [disaster recovery]({{< relref "/architecture/disaster-recovery.md" >}})
strategies involve spreading your database clusters across multiple data centers
to help maximize uptime. In Kubernetes, this technique is known as "[federation](https://en.wikipedia.org/wiki/Federation_(information_technology))".
Federated Kubernetes clusters are able to communicate with each other,
coordinate changes, and provide resiliency for applications that have high
uptime requirements.

As of this writing, federation in Kubernetes is still an area of ongoing
development and is something we monitor with intense interest. As Kubernetes
federation continues to mature, we wanted to provide a way to deploy PostgreSQL
clusters managed by the [PostgreSQL Operator](https://www.crunchydata.com/developers/download-postgres/containers/postgres-operator)
that can span multiple Kubernetes clusters. This can be accomplished with a
few environmental prerequisites:

- Two Kubernetes clusters
- S3, or an external storage system that uses the S3 protocol

At a high level, the PostgreSQL Operator follows the "active-standby" data
center deployment model for managing PostgreSQL clusters across Kubernetes
clusters. In one Kubernetes cluster, the PostgreSQL Operator deploys PostgreSQL
as an "active" PostgreSQL cluster, which means it has one primary and one or
more replicas. In another Kubernetes cluster, the PostgreSQL cluster is deployed
as a "standby" cluster: every PostgreSQL instance is a replica.

A side effect of this is that in each of the Kubernetes clusters, the PostgreSQL
Operator can be used to deploy both active and standby PostgreSQL clusters,
allowing you to mix and match! While the mixing and matching may not be ideal for
how you deploy your PostgreSQL clusters, it does allow you to perform online
moves of your PostgreSQL data to different Kubernetes clusters as well as manual
online upgrades.

Lastly, while this feature does extend high-availability, promoting a standby
cluster to an active cluster is **not** automatic. While the PostgreSQL clusters
within a Kubernetes cluster do support self-managed high-availability, a
cross-cluster deployment requires someone to specifically promote the cluster
from standby to active.

## Standby Cluster Overview

Standby PostgreSQL clusters are managed just like any other PostgreSQL cluster
that is managed by the PostgreSQL Operator. For example, adding replicas to a
standby cluster works the same way as for any other cluster: you can use
[`pgo scale`]({{< relref "/pgo-client/reference/pgo_scale.md" >}}), as in the
sketch below.

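As a minimal sketch, adding one more cascading replica to a standby cluster
(assuming it is named `hippo-standby`, the name used in the example later in
this guide) might look like:

```
pgo scale hippo-standby --replica-count=1
```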

As the architecture diagram above shows, the main difference is that there is
no primary instance: one PostgreSQL instance is reading in the database changes
from the S3 repository, while the other instances are replicas of that instance.
This is known as [cascading replication](https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION):
the replicas are cascading replicas, i.e. replicas replicating from a database
server that itself is replicating from another database server.

Because standby clusters are effectively read-only, certain functionality
that involves making changes to a database, e.g. PostgreSQL user changes, is
blocked while a cluster is in standby mode. Additionally, backups and restores
are blocked. While [pgBackRest](https://pgbackrest.org/) does support
backups from standbys, this requires direct access to the primary database,
which cannot be done until the PostgreSQL Operator supports Kubernetes
federation. If a blocked function is called on a standby cluster via the
[`pgo` client]({{< relref "/pgo-client/_index.md">}}) or a direct call to the
API server, the call will return an error.

### Key Commands

#### [`pgo create cluster`]({{< relref "/pgo-client/reference/pgo_create_cluster.md" >}})

The first step to creating a standby PostgreSQL cluster is...to create a
PostgreSQL standby cluster. We will cover how to set this up in the example
below, but first we want to highlight the standby-specific flags that need to be
used when creating a standby cluster. These include:

- `--standby`: Creates a cluster as a PostgreSQL standby cluster
- `--password-superuser`: The password for the `postgres` superuser account,
which performs a variety of administrative actions.
- `--password-replication`: The password for the replication account
(`primaryuser`), used to maintain high-availability.
- `--password`: The password for the standard user account created during
PostgreSQL cluster initialization.
- `--pgbackrest-repo-path`: The specific pgBackRest repository path that should
be utilized by the standby cluster. Allows a standby cluster to specify a path
that matches that of the active cluster it is replicating.
- `--pgbackrest-storage-type`: Must be set to `s3`
- `--pgbackrest-s3-key`: The S3 key to use
- `--pgbackrest-s3-key-secret`: The S3 key secret to use
- `--pgbackrest-s3-bucket`: The S3 bucket to use
- `--pgbackrest-s3-endpoint`: The S3 endpoint to use
- `--pgbackrest-s3-region`: The S3 region to use

With respect to the credentials, it should be noted that when the standby
cluster is being created within the same Kubernetes cluster AND it has access to
the Kubernetes Secret created for the active cluster, one can use the
`--secret-from` flag to set up the credentials.

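As a rough sketch (the cluster names match the walkthrough below; depending on
your environment you may still need the `--pgbackrest-s3-*` flags listed above),
creating a standby cluster that reuses the credentials of the active `hippo`
cluster might look like:

```
pgo create cluster hippo-standby --standby \
--secret-from=hippo \
--pgbackrest-storage-type=s3 \
--pgbackrest-repo-path=/backrestrepo/hippo-backrest-shared-repo
```
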
#### [`pgo update cluster`]({{< relref "/pgo-client/reference/pgo_update_cluster.md" >}})

[`pgo update cluster`]({{< relref "/pgo-client/reference/pgo_update_cluster.md" >}})
is responsible for the promotion and disabling of a standby cluster, and
contains several flags to help with this process:

- `--enable-standby`: Enables standby mode in a cluster. This will bootstrap the
PostgreSQL cluster to become aligned with the current active cluster and begin
to follow its changes. This is a destructive action that results in the deletion
of all PVCs for the cluster (data will be retained according to Storage Class
and/or Persistent Volume reclaim policies). In order to allow the proper
deletion of PVCs, the cluster must also be shut down.
- `--promote-standby`: Disables standby mode in a cluster, i.e. promotes the
standby cluster. This removes the standby configuration from the cluster's DCS,
which triggers the promotion of the current standby leader to a primary
PostgreSQL instance (see the sketch after this list).
- `--shutdown`: Scales all deployments for the cluster to 0, resulting in a full
shutdown of the PG cluster. This includes the primary, any replicas, as well as
any supporting services ([pgBackRest](https://www.pgbackrest.org) and
[pgBouncer]({{< relref "/pgo-client/common-tasks.md" >}}#connection-pooling-via-pgbouncer)
if enabled).
- `--startup`: Scales all deployments for the cluster to 1, effectively starting
a PG cluster that was previously shut down. This includes the primary, any
replicas, as well as any supporting services (pgBackRest and pgBouncer if
enabled). The primary is brought online first in order to maintain a
consistent primary/replica architecture across startups and shutdowns.

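Taken together, the promotion workflow walked through in the sections below
boils down to a sketch like the following (cluster names are from the example
that follows):

```
# stop the current active cluster to avoid a split-brain scenario
pgo update cluster hippo --shutdown

# promote the standby cluster to active
pgo update cluster hippo-standby --promote-standby
```
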
## Creating a Standby PostgreSQL Cluster

Let's create a PostgreSQL deployment that has both an active and a standby
cluster! You can try this example either within a single Kubernetes cluster or
across multiple Kubernetes clusters.

First, deploy a new active PostgreSQL cluster that is configured to use S3 with
pgBackRest. For example:

```
pgo create cluster hippo --pgbouncer --replica-count=2 \
--pgbackrest-storage-type=local,s3 \
--pgbackrest-s3-key=<redacted> \
--pgbackrest-s3-key-secret=<redacted> \
--pgbackrest-s3-bucket=watering-hole \
--pgbackrest-s3-endpoint=s3.amazonaws.com \
--pgbackrest-s3-region=us-east-1 \
--password-superuser=supersecrethippo \
--password-replication=somewhatsecrethippo \
--password=opensourcehippo
```

(Replace the placeholder values with your actual values. We are explicitly
setting all of the passwords for the primary cluster to make it easier to run
the example as-is.)

The above command creates an active PostgreSQL cluster with two replicas and a
pgBouncer deployment. Wait a few moments for this cluster to become live before
proceeding.

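While you wait, a couple of quick ways to check on the cluster (a sketch:
`pgo test` reports whether the cluster's services are reachable, and the
`pg-cluster` label selector is the same one used later in this guide):

```
# check that the primary and replica services accept connections
pgo test hippo

# watch the cluster's Pods come up
kubectl get pods --selector pg-cluster=hippo
```
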
Once the cluster has been created, you can then create the standby cluster. This
can either be in another Kubernetes cluster or within the same Kubernetes
cluster. If using a separate Kubernetes cluster, you will need to provide the
proper passwords for the superuser and replication accounts. You can also
provide a password for the regular PostgreSQL database user created during
cluster initialization to ensure the passwords and associated secrets across
both clusters are consistent.

(If the standby cluster is being created using the same PostgreSQL Operator
deployment (and therefore the same Kubernetes cluster), the `--secret-from` flag
can also be used in lieu of these passwords. You would specify the name of the
cluster [e.g. `hippo`] as the value of the `--secret-from` flag.)

With this in mind, create a standby cluster similar to the one below:

```
pgo create cluster hippo-standby --standby --pgbouncer --replica-count=2 \
--pgbackrest-storage-type=s3 \
--pgbackrest-s3-key=<redacted> \
--pgbackrest-s3-key-secret=<redacted> \
--pgbackrest-s3-bucket=watering-hole \
--pgbackrest-s3-endpoint=s3.amazonaws.com \
--pgbackrest-s3-region=us-east-1 \
--pgbackrest-repo-path=/backrestrepo/hippo-backrest-shared-repo \
--password-superuser=supersecrethippo \
--password-replication=somewhatsecrethippo \
--password=opensourcehippo
```

Note the use of the `--pgbackrest-repo-path` flag as it points to the name of
the pgBackRest repository that is used for the original `hippo` cluster.

At this point, the standby cluster will bootstrap as a standby along with two
cascading replicas. pgBouncer will be deployed at this time as well, but will
remain non-functional until `hippo-standby` is promoted. To see that the Pod is
indeed a standby, you can check the logs.

```
kubectl logs hippo-standby-dcff544d6-s6d58

Thu Mar 19 18:16:54 UTC 2020 INFO: Node standby-dcff544d6-s6d58 fully initialized for cluster standby and is ready for use
2020-03-19 18:17:03,390 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:03,454 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:03,598 INFO: no action. i am the standby leader with the lock
2020-03-19 18:17:13,389 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:13,466 INFO: no action. i am the standby leader with the lock
```

You can also see that this is a standby cluster from the
[`pgo show cluster`]({{< relref "/pgo-client/reference/pgo_show_cluster.md" >}})
command.

```
pgo show cluster hippo-standby

cluster : standby (crunchy-postgres-ha:centos7-12.2-4.3.0)
standby : true
```

## Promoting a Standby Cluster

There comes a time when a standby cluster needs to be promoted to an active
cluster. Promoting a standby cluster means that a PostgreSQL instance within
it will become a primary and start accepting both reads and writes. This has the
net effect of pushing WAL (transaction archives) to the pgBackRest repository,
so we need to take a few steps first to ensure we don't accidentally create a
split-brain scenario.

First, if this is not a disaster scenario, you will want to "shutdown" the
active PostgreSQL cluster. This can be done with the `--shutdown` flag:

```
pgo update cluster hippo --shutdown
```

The effect of this is that all the Kubernetes Deployments for this cluster are
scaled to 0. You can verify this with the following command:

```
kubectl get deployments --selector pg-cluster=hippo

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
hippo                        0/0     0            0           32m
hippo-backrest-shared-repo   0/0     0            0           32m
hippo-kvfo                   0/0     0            0           27m
hippo-lkge                   0/0     0            0           27m
hippo-pgbouncer              0/0     0            0           31m
```

We can then promote the standby cluster using the `--promote-standby` flag:

```
pgo update cluster hippo-standby --promote-standby
```

This command essentially removes the standby configuration from the Kubernetes
cluster's DCS, which triggers the promotion of the current standby leader to a
primary PostgreSQL instance. You can view this promotion in the PostgreSQL
standby leader's (soon to be active leader's) logs:

```
kubectl logs hippo-standby-dcff544d6-s6d58

2020-03-19 18:28:11,919 INFO: Reloading PostgreSQL configuration.
server signaled
2020-03-19 18:28:16,792 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:28:16,850 INFO: Reaped pid=5377, exit status=0
2020-03-19 18:28:17,024 INFO: no action. i am the leader with the lock
2020-03-19 18:28:26,792 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:28:26,924 INFO: no action. i am the leader with the lock
```

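You can also confirm that the standby configuration has been removed from the
promoted cluster's DCS. This is a sketch that assumes the `<cluster-name>-config`
ConfigMap naming shown later in this guide; once promotion has completed, the
`standby_cluster` section should no longer appear:

```
kubectl get cm hippo-standby-config -o yaml | grep standby_cluster
```
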
As pgBouncer was enabled for the cluster, the `pgbouncer` user's password is
rotated, which will bring pgBouncer online with the newly promoted active
cluster. If pgBouncer is still having trouble connecting, you can explicitly
rotate the password with the following command:

```
pgo update pgbouncer --rotate-password hippo-standby
```

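If you want to confirm that the promoted cluster is accepting writes, one rough
sketch is to connect to it with `psql` and issue a write. This assumes the
primary Service follows the `<cluster-name>` naming pattern (here,
`hippo-standby`), that `kubectl` and `psql` are available locally, and it reuses
the `postgres` superuser password set earlier:

```
# port-forward the promoted cluster's primary Service to your local machine
kubectl port-forward svc/hippo-standby 5432:5432 &

# issue a simple write as the postgres superuser (password from the example above)
PGPASSWORD=supersecrethippo psql -h localhost -p 5432 -U postgres \
  -c 'CREATE TABLE IF NOT EXISTS promotion_check (id int)'
```
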
With the standby cluster now promoted, the original active PostgreSQL cluster
can now be turned into a standby PostgreSQL cluster. This is done by deleting
and recreating all PVCs for the cluster and re-initializing it as a standby
using the S3 repository. Because this is a destructive action (i.e. data will
only be retained if any Storage Classes and/or Persistent Volumes have the
appropriate reclaim policy configured), a warning is shown when attempting to
enable standby.

```
pgo update cluster hippo --enable-standby
Enabling standby mode will result in the deletion of all PVCs for this cluster!
Data will only be retained if the proper retention policy is configured for any associated storage classes and/or persistent volumes.
Please proceed with caution.
WARNING: Are you sure? (yes/no): yes
updated pgcluster hippo
```

To verify that standby has been enabled, you can check the DCS configuration for
the cluster to see that the proper standby settings are present.

```
kubectl get cm hippo-config -o yaml | grep standby
%f \"%p\""},"use_pg_rewind":true,"use_slots":false},"standby_cluster":{"create_replica_methods":["pgbackrest_standby"],"restore_command":"source
```

Also, the PVCs for the cluster should now only be a few seconds old, since they
were recreated.

```
kubectl get pvc --selector pg-cluster=hippo
NAME              STATUS   VOLUME          CAPACITY   AGE
hippo             Bound    crunchy-pv251   1Gi        33s
hippo-kvfo        Bound    crunchy-pv174   1Gi        29s
hippo-lkge        Bound    crunchy-pv228   1Gi        26s
hippo-pgbr-repo   Bound    crunchy-pv295   1Gi        22s
```

And finally, the cluster can be restarted:

```
pgo update cluster hippo --startup
```

At this point, the cluster will reinitialize from scratch as a standby, just
like the original standby that was created above. Therefore, any transactions
written to the newly promoted active cluster (the original standby) should now
replicate back to this cluster.
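
If you would like to confirm this, a quick sketch (reusing the label selector
and the Patroni log pattern shown earlier; the Pod name will differ in your
environment) is to check that the re-initialized cluster's leader reports itself
as the standby leader again:

```
# list the re-created Pods for the hippo cluster
kubectl get pods --selector pg-cluster=hippo

# the leader Pod should again log that it is the standby leader
kubectl logs <hippo-leader-pod> | grep "standby leader"
```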
