---
title: "Kubernetes Multi-Cluster Deployments"
date:
draft: false
weight: 300
---

Advanced [high-availability]({{< relref "/architecture/high-availability/_index.md" >}})
and [disaster recovery]({{< relref "/architecture/disaster-recovery.md" >}})
strategies involve spreading your database clusters across multiple data centers
to help maximize uptime. In Kubernetes, this technique is known as "[federation](https://en.wikipedia.org/wiki/Federation_(information_technology))".
Federated Kubernetes clusters are able to communicate with each other,
coordinate changes, and provide resiliency for applications that have high
uptime requirements.

As of this writing, federation in Kubernetes is still under active development
and is something we monitor with intense interest. As Kubernetes federation
continues to mature, we wanted to provide a way to deploy PostgreSQL clusters
managed by the [PostgreSQL Operator](https://www.crunchydata.com/developers/download-postgres/containers/postgres-operator)
that can span multiple Kubernetes clusters. This can be accomplished with a
few environmental prerequisites:

- Two Kubernetes clusters
- S3, or an external storage system that uses the S3 protocol

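If you are using two separate Kubernetes clusters, you will be switching between
them while following the example below. A minimal sketch of doing so with
`kubectl` contexts, assuming hypothetical context names `active-k8s` and
`standby-k8s` for the Kubernetes clusters that will host the active and standby
PostgreSQL clusters:

```
# list the contexts available in your kubeconfig
kubectl config get-contexts

# point kubectl at the Kubernetes cluster that will host the active PostgreSQL cluster
kubectl config use-context active-k8s

# ...and later, at the Kubernetes cluster that will host the standby
kubectl config use-context standby-k8s
```
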
At a high level, the PostgreSQL Operator follows the "active-standby" data
center deployment model for managing PostgreSQL clusters across Kubernetes
clusters. In one Kubernetes cluster, the PostgreSQL Operator deploys PostgreSQL as an
"active" PostgreSQL cluster, which means it has one primary and one or more
replicas. In another Kubernetes cluster, the PostgreSQL cluster is deployed as
a "standby" cluster: every PostgreSQL instance is a replica.

A side effect of this is that in each of the Kubernetes clusters, the PostgreSQL
Operator can be used to deploy both active and standby PostgreSQL clusters,
allowing you to mix and match! While the mixing and matching may not be ideal for
how you deploy your PostgreSQL clusters, it does allow you to perform online
moves of your PostgreSQL data to different Kubernetes clusters as well as manual
online upgrades.

Lastly, while this feature does extend high availability, promoting a standby
cluster to an active cluster is **not** automatic. While the PostgreSQL clusters
within a Kubernetes cluster do support self-managed high availability, a
cross-cluster deployment requires someone to specifically promote the cluster
from standby to active.

## Standby Cluster Overview

Standby PostgreSQL clusters are managed just like any other PostgreSQL cluster
that is managed by the PostgreSQL Operator. For example, adding replicas to a
standby cluster is identical to before: you can use [`pgo scale`]({{< relref "/pgo-client/reference/pgo_scale.md" >}}).
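
For instance, a quick sketch of adding one more cascading replica to the
`hippo-standby` cluster created later in this guide (this assumes the default
behavior of `pgo scale`, which adds a single replica per invocation):

```
# add one more cascading replica to the standby cluster
pgo scale hippo-standby
```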

As the architecture diagram above shows, the main difference is that there is
no primary instance: one PostgreSQL instance is reading in the database changes
from the S3 repository, while the other instances are replicas of it.
This is known as [cascading replication](https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION):
the replicas are cascading replicas, i.e. replicas replicating from a database
server that itself is replicating from another database server.

Because standby clusters are effectively read-only, certain functionality
that involves making changes to a database, e.g. PostgreSQL user changes, is
blocked while a cluster is in standby mode. Additionally, backups and restores
are blocked as well. While [pgBackRest](https://pgbackrest.org/) does support
backups from standbys, this requires direct access to the primary database,
which cannot be done until the PostgreSQL Operator supports Kubernetes
federation. If a blocked function is called on a standby cluster via the
[`pgo` client]({{< relref "/pgo-client/_index.md">}}) or a direct call to the
API server, the call will return an error.
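
As an illustration, attempting a backup against a standby cluster should be
rejected rather than executed (the exact error message may vary by version):

```
# backups are blocked while the cluster is in standby mode,
# so this call is expected to return an error instead of starting a backup
pgo backup hippo-standby
```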

### Key Commands

#### [`pgo create cluster`]({{< relref "/pgo-client/reference/pgo_create_cluster.md" >}})

The first step to creating a standby PostgreSQL cluster is...to create a
PostgreSQL standby cluster. We will cover how to set this up in the example
below, but wanted to provide some of the standby-specific flags that need to be
used when creating a standby cluster. These include:

- `--standby`: Creates a cluster as a PostgreSQL standby cluster
- `--password-superuser`: The password for the `postgres` superuser account,
which performs a variety of administrative actions.
- `--password-replication`: The password for the replication account
(`primaryuser`), used to maintain high availability.
- `--password`: The password for the standard user account created during
PostgreSQL cluster initialization.
- `--pgbackrest-repo-path`: The specific pgBackRest repository path that should
be utilized by the standby cluster. This allows a standby cluster to specify a
path that matches that of the active cluster it is replicating.
- `--pgbackrest-storage-type`: Must be set to `s3`
- `--pgbackrest-s3-key`: The S3 key to use
- `--pgbackrest-s3-key-secret`: The S3 key secret to use
- `--pgbackrest-s3-bucket`: The S3 bucket to use
- `--pgbackrest-s3-endpoint`: The S3 endpoint to use
- `--pgbackrest-s3-region`: The S3 region to use

With respect to the credentials, it should be noted that when the standby
cluster is being created within the same Kubernetes cluster AND it has access to
the Kubernetes Secrets created for the active cluster, one can use the
`--secret-from` flag to set up the credentials instead.
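
For that same-Kubernetes-cluster case, a minimal sketch (assuming an existing
active cluster named `hippo`, as in the full example later in this guide) looks
like the following; the standby cluster copies its credentials from the active
cluster's Secrets instead of having passwords passed in explicitly:

```
# the --pgbackrest-* S3 flags shown in the full example below are still required;
# they are elided here for brevity
pgo create cluster hippo-standby --standby --secret-from=hippo \
  --pgbackrest-storage-type=s3 \
  --pgbackrest-repo-path=/backrestrepo/hippo-backrest-shared-repo
```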

#### [`pgo update cluster`]({{< relref "/pgo-client/reference/pgo_update_cluster.md" >}})

[`pgo update cluster`]({{< relref "/pgo-client/reference/pgo_update_cluster.md" >}})
is responsible for the promotion and disabling of a standby cluster, and
contains several flags to help with this process (a sketch of how they fit
together follows this list):

- `--enable-standby`: Enables standby mode in a cluster. This is a destructive
action that results in the deletion of all PVCs for the cluster (data will be
retained only according to the reclaim policies of the associated Storage
Classes and/or Persistent Volumes). In order to allow the proper deletion of
PVCs, the cluster must also be shut down. Once re-initialized, the cluster
bootstraps from the pgBackRest repository of the current active cluster and
begins to follow its changes.
- `--promote-standby`: Promotes a standby cluster to an active cluster. This
removes the standby configuration from the cluster's DCS, which triggers the
promotion of the current standby leader to a primary PostgreSQL instance.
- `--shutdown`: Scales all deployments for the cluster to 0, resulting in a full
shutdown of the PostgreSQL cluster. This includes the primary, any replicas, as
well as any supporting services ([pgBackRest](https://www.pgbackrest.org) and
[pgBouncer]({{< relref "/pgo-client/common-tasks.md" >}}#connection-pooling-via-pgbouncer)
if enabled).
- `--startup`: Scales all deployments for the cluster to 1, effectively starting
a PostgreSQL cluster that was previously shut down. This includes the primary,
any replicas, as well as any supporting services (pgBackRest and pgBouncer if
enabled). The primary is brought online first in order to maintain a
consistent primary/replica architecture across startups and shutdowns.
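
Putting these flags together, the controlled switchover walked through in the
rest of this guide boils down to the following sequence (a sketch only; each
step is explained in detail below, and the `hippo`/`hippo-standby` names match
the example clusters):

```
# 1. stop writes on the current active cluster
pgo update cluster hippo --shutdown

# 2. promote the standby cluster to become the new active cluster
pgo update cluster hippo-standby --promote-standby

# 3. re-initialize the old active cluster as a standby (destructive: deletes its PVCs)
pgo update cluster hippo --enable-standby

# 4. bring the re-initialized standby back online
pgo update cluster hippo --startup
```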

## Creating a Standby PostgreSQL Cluster

Let's create a PostgreSQL deployment that has both an active and a standby
cluster! You can try this example either within a single Kubernetes cluster or
across multiple Kubernetes clusters.

First, deploy a new active PostgreSQL cluster that is configured to use S3 with
pgBackRest. For example:

```
pgo create cluster hippo --pgbouncer --replica-count=2 \
  --pgbackrest-storage-type=local,s3 \
  --pgbackrest-s3-key=<redacted> \
  --pgbackrest-s3-key-secret=<redacted> \
  --pgbackrest-s3-bucket=watering-hole \
  --pgbackrest-s3-endpoint=s3.amazonaws.com \
  --pgbackrest-s3-region=us-east-1 \
  --password-superuser=supersecrethippo \
  --password-replication=somewhatsecrethippo \
  --password=opensourcehippo
```

(Replace the placeholder values with your actual values. We are explicitly
setting all of the passwords for the primary cluster to make it easier to run
the example as is.)

The above command creates an active PostgreSQL cluster with two replicas and a
pgBouncer deployment. Wait a few moments for this cluster to become live before
proceeding.
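
One way to watch for the cluster to come up (the `pg-cluster` label selector is
the same one used later in this guide):

```
# all of the hippo Deployments should eventually report as ready
kubectl get deployments --selector pg-cluster=hippo

# pgo test checks that the cluster's endpoints are accepting connections
pgo test hippo
```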

Once the cluster has been created, you can then create the standby cluster. This
can either be in another Kubernetes cluster or within the same Kubernetes
cluster. If using a separate Kubernetes cluster, you will need to provide the
proper passwords for the superuser and replication accounts. You can also
provide a password for the regular PostgreSQL database user created during cluster
initialization to ensure the passwords and associated Secrets across both
clusters are consistent.

(If the standby cluster is being created using the same PostgreSQL Operator
deployment (and therefore the same Kubernetes cluster), the `--secret-from` flag
can also be used in lieu of these passwords. You would specify the name of the
cluster [e.g. `hippo`] as the value of the `--secret-from` variable.)

With this in mind, create a standby cluster similar to the one below:

```
pgo create cluster hippo-standby --standby --pgbouncer --replica-count=2 \
  --pgbackrest-storage-type=s3 \
  --pgbackrest-s3-key=<redacted> \
  --pgbackrest-s3-key-secret=<redacted> \
  --pgbackrest-s3-bucket=watering-hole \
  --pgbackrest-s3-endpoint=s3.amazonaws.com \
  --pgbackrest-s3-region=us-east-1 \
  --pgbackrest-repo-path=/backrestrepo/hippo-backrest-shared-repo \
  --password-superuser=supersecrethippo \
  --password-replication=somewhatsecrethippo \
  --password=opensourcehippo
```

Note the use of the `--pgbackrest-repo-path` flag: it points to the pgBackRest
repository used by the original `hippo` cluster.
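
If you are unsure of the active cluster's repository path, one way to check it
from the Kubernetes cluster hosting `hippo` is to inspect the pgBackRest
repository Deployment (assuming the `<cluster>-backrest-shared-repo` naming
shown later in this guide) and look for repository-related settings:

```
kubectl get deployment hippo-backrest-shared-repo -o yaml | grep -i repo
```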

At this point, the standby cluster will bootstrap as a standby along with two
cascading replicas. pgBouncer will be deployed at this time as well, but will
remain non-functional until `hippo-standby` is promoted. To see that the Pod is
indeed a standby, you can check the logs:

```
kubectl logs hippo-standby-dcff544d6-s6d58
…
Thu Mar 19 18:16:54 UTC 2020 INFO: Node standby-dcff544d6-s6d58 fully initialized for cluster standby and is ready for use
2020-03-19 18:17:03,390 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:03,454 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:03,598 INFO: no action. i am the standby leader with the lock
2020-03-19 18:17:13,389 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:17:13,466 INFO: no action. i am the standby leader with the lock
```

You can also see that this is a standby cluster from the
[`pgo show cluster`]({{< relref "/pgo-client/reference/pgo_show_cluster.md" >}})
command:

```
pgo show cluster hippo-standby

cluster : hippo-standby (crunchy-postgres-ha:centos7-12.2-4.3.0)
    standby : true
```

## Promoting a Standby Cluster

There comes a time when a standby cluster needs to be promoted to an active
cluster. Promoting a standby cluster means that a PostgreSQL instance within
it will become a primary and start accepting both reads and writes. This has the
net effect of pushing WAL (transaction archives) to the pgBackRest repository,
so we need to take a few steps first to ensure we don't accidentally create a
split-brain scenario.

First, if this is not a disaster scenario, you will want to shut down the
active PostgreSQL cluster. This can be done with the `--shutdown` flag:

```
pgo update cluster hippo --shutdown
```

The effect of this is that all the Kubernetes Deployments for this cluster are
scaled to 0. You can verify this with the following command:

```
kubectl get deployments --selector pg-cluster=hippo

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
hippo                        0/0     0            0           32m
hippo-backrest-shared-repo   0/0     0            0           32m
hippo-kvfo                   0/0     0            0           27m
hippo-lkge                   0/0     0            0           27m
hippo-pgbouncer              0/0     0            0           31m
```

We can then promote the standby cluster using the `--promote-standby` flag:

```
pgo update cluster hippo-standby --promote-standby
```

This command essentially removes the standby configuration from the Kubernetes
cluster's DCS, which triggers the promotion of the current standby leader to a
primary PostgreSQL instance. You can view this promotion in the PostgreSQL
standby leader's (soon to be active leader's) logs:

```
kubectl logs hippo-standby-dcff544d6-s6d58
…
2020-03-19 18:28:11,919 INFO: Reloading PostgreSQL configuration.
server signaled
2020-03-19 18:28:16,792 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:28:16,850 INFO: Reaped pid=5377, exit status=0
2020-03-19 18:28:17,024 INFO: no action. i am the leader with the lock
2020-03-19 18:28:26,792 INFO: Lock owner: standby-dcff544d6-s6d58; I am standby-dcff544d6-s6d58
2020-03-19 18:28:26,924 INFO: no action. i am the leader with the lock
```

As pgBouncer was enabled for the cluster, the `pgbouncer` user's password is
rotated, which will bring pgBouncer online with the newly promoted active
cluster. If pgBouncer is still having trouble connecting, you can explicitly
rotate the password with the following command:

```
pgo update pgbouncer --rotate-password hippo-standby
```

With the standby cluster now promoted, the original active PostgreSQL cluster
can now be turned into a standby PostgreSQL cluster. This is done by deleting
and recreating all PVCs for the cluster and re-initializing it as a standby
using the S3 repository. Because this is a destructive action (i.e. data will
only be retained if the Storage Classes and/or Persistent Volumes have the
appropriate reclaim policy configured), a warning is shown when attempting to
enable standby.

```
pgo update cluster hippo --enable-standby
Enabling standby mode will result in the deletion of all PVCs for this cluster!
Data will only be retained if the proper retention policy is configured for any associated storage classes and/or persistent volumes.
Please proceed with caution.
WARNING: Are you sure? (yes/no): yes
updated pgcluster hippo
```

To verify that standby has been enabled, you can check the DCS configuration for
the cluster and confirm that the proper standby settings are present:

```
kubectl get cm hippo-config -o yaml | grep standby
  %f \"%p\""},"use_pg_rewind":true,"use_slots":false},"standby_cluster":{"create_replica_methods":["pgbackrest_standby"],"restore_command":"source
```

Also, the PVCs for the cluster should now only be a few seconds old, since they
were recreated:

```
kubectl get pvc --selector pg-cluster=hippo
NAME              STATUS   VOLUME          CAPACITY   AGE
hippo             Bound    crunchy-pv251   1Gi        33s
hippo-kvfo        Bound    crunchy-pv174   1Gi        29s
hippo-lkge        Bound    crunchy-pv228   1Gi        26s
hippo-pgbr-repo   Bound    crunchy-pv295   1Gi        22s
```

And finally, the cluster can be restarted:

```
pgo update cluster hippo --startup
```

At this point, the cluster will reinitialize from scratch as a standby, just
like the original standby that was created above. Therefore, any transactions
written to the newly promoted active cluster should now replicate back to this
cluster.
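
Once it is back online, you can confirm that the re-initialized cluster is
operating as a standby in the same way as before:

```
# the re-initialized cluster should now report standby : true
pgo show cluster hippo
```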