4.3.0 Beta 1
Crunchy Data announces the release of the PostgreSQL Operator 4.3.0 Beta 1.
Details on the new major features, as well as other features and fixes, follow
below. While we aim to have our betas feature complete, new features and
bug fixes will be introduced in subsequent beta releases of PostgreSQL
Operator 4.3.
We also aim to have quarterly releases. Due to some disruptions related to
ongoing world events, we are delaying the release but still aiming to make it
available by mid-April. This will give us some time to get a few more
features in and tighten up existing functionality. We hope that the wait will be
worth it for you.
With that said, we did want to make the beta available sooner so you could begin
evaluating it for your purposes.
The PostgreSQL Operator 4.3.0 release includes the following software version upgrades:
- The PostgreSQL containers now use versions 12.2, 11.7, 10.12, 9.6.17, and 9.5.21
  - This now includes support for using the JIT compilation feature introduced
    in PostgreSQL 11
- PostgreSQL containers now support PL/Python3
- pgBackRest is now at version 2.24
- Patroni is now at version 1.6.4
- postgres_exporter is now at version 0.7.0
The PostgreSQL Operator is released in conjunction with the Crunchy Container Suite.
PostgreSQL Operator is tested with Kubernetes 1.13 - 1.17, OpenShift 3.11+,
Google Kubernetes Engine (GKE), and VMware Enterprise PKS 1.3+.
Major Features
- Standby Clusters + Multi-Kubernetes Deployments
- Set custom PVC sizes for PostgreSQL clusters on creation and clone
- Tablespaces
- All Operator commands now support TLS-only PostgreSQL workflows
Standby Clusters + Multi-Kubernetes Deployments
A key component of building database architectures that can ensure continuity of
operations is having the database available across multiple data centers. In
Kubernetes, this means being able to run the PostgreSQL Operator in multiple
Kubernetes clusters, have PostgreSQL clusters exist in these Kubernetes
clusters, and ensure the "standby" deployment is promoted only in the event of
an outage or planned switchover.
As of this release, the PostgreSQL Operator supports standby PostgreSQL
clusters that can be deployed across namespaces or other Kubernetes or
Kubernetes-enabled clusters (e.g. OpenShift). This is accomplished by leveraging
the PostgreSQL Operator's support for pgBackRest and an intermediary, i.e. S3,
to provide the ability for the standby cluster to read in the PostgreSQL
archives and replicate the data. This allows a user to quickly promote a standby
PostgreSQL cluster in the event that the primary cluster suffers downtime
(e.g. data center outage), or for planned switchovers such as Kubernetes cluster
maintenance or moving a PostgreSQL workload from one data center to another.
To support standby clusters, there are several new flags available on
pgo create cluster that are required to set up a new standby cluster. These
include:
- --standby: If set, creates the PostgreSQL cluster as a standby cluster
- --pgbackrest-repo-path: Allows the user to override the pgBackRest
  repository path for a cluster. While this setting can now be utilized when
  creating any cluster, it is typically required for the creation of standby
  clusters as the repository path will need to match that of the primary cluster.
- --password-superuser: When creating a standby cluster, allows the user to
  specify a password for the superuser that matches the superuser account in the
  cluster the standby is replicating from
- --password-replication: When creating a standby cluster, allows the user to
  specify a password for the replication user that matches the replication user
  account in the cluster the standby is replicating from
Note that the --password flag must be used to ensure the password of the main
PostgreSQL user account matches that of the primary PostgreSQL cluster, if you
are using Kubernetes to manage the user's password.
For example, if you have a cluster named hippo and want to create a standby
cluster called hippo-standby, and assuming the S3 credentials are using the
defaults provided to the PostgreSQL Operator, you could execute a command
similar to:
    pgo create cluster hippo-standby --standby \
      --pgbackrest-repo-path=/backrestrepo/hippo-backrest-shared-repo \
      --password-superuser=superhippo \
      --password-replication=replicahippo
To shut down the primary cluster (if you can), you can execute a command similar
to:
    pgo update cluster hippo --shutdown
To promote the standby cluster to be able to accept write traffic, you can
execute the following command:
    pgo update cluster hippo-standby --promote-standby
To convert the old primary cluster into a standby cluster, you can execute the
following command:
    pgo update cluster hippo --enable-standby
Once the old primary is converted to a standby cluster, you can bring it online
with the following command:
    pgo update cluster hippo --startup
For information on the architecture and how to
set up a standby PostgreSQL cluster, please visit url
At present, streaming replication between the primary and standby clusters is
not supported, but the PostgreSQL instances within each cluster do support
streaming replication.
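The switchover steps above can be summarized as a small dry-run sketch. The function name standby_switchover_plan is hypothetical and exists only for illustration; it prints the pgo commands in the order described rather than executing them, so the sequence can be reviewed before it is run against real clusters. Only the pgo commands and flags themselves come from these notes.

```shell
#!/bin/sh
# Dry-run sketch: print the switchover steps in order instead of running them.
# standby_switchover_plan is a hypothetical helper, not part of the pgo client.
standby_switchover_plan() {
    primary="${1:-}"   # cluster that is currently the primary, e.g. hippo
    standby="${2:-}"   # cluster that is currently the standby, e.g. hippo-standby

    if [ -z "$primary" ] || [ -z "$standby" ]; then
        echo "usage: standby_switchover_plan PRIMARY STANDBY" >&2
        return 1
    fi

    # 1. Shut down the primary (if it is still reachable).
    printf 'pgo update cluster %s --shutdown\n' "$primary"
    # 2. Promote the standby so it can accept write traffic.
    printf 'pgo update cluster %s --promote-standby\n' "$standby"
    # 3. Convert the old primary into a standby cluster.
    printf 'pgo update cluster %s --enable-standby\n' "$primary"
    # 4. Bring the old primary back online, now as a standby.
    printf 'pgo update cluster %s --startup\n' "$primary"
}

standby_switchover_plan hippo hippo-standby
```

Printing the plan first keeps the ordering explicit: the old primary is only converted and restarted after the standby has been promoted.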
Customize PVC Size on PostgreSQL cluster Creation & Clone
The PostgreSQL Operator lets you customize how large a PVC can be via the
"storage config" options available in the PostgreSQL Operator configuration
file (aka pgo.yaml). While these provide a baseline level of customizability,
it is often important to be able to set the size of the PVC that a PostgreSQL
cluster should use at cluster creation time. In other words, users should be
able to choose exactly how large they want their PostgreSQL PVCs to be.
PostgreSQL Operator 4.3 introduces the ability to set the PVC sizes for the
PostgreSQL cluster, the pgBackRest repository for the PostgreSQL cluster, and
the PVC size for each tablespace at cluster creation time. Additionally,
this behavior has been extended to the clone functionality as well, which is
helpful when trying to resize a PostgreSQL cluster. Here is some information on
the flags that have been added:
pgo create cluster
--pvc-size - sets the PVC size for the PostgreSQL data directory
--pgbackrest-pvc-size - sets the PVC size for the PostgreSQL pgBackRest
repository
For tablespaces, one can use the pvcsize option to set the PVC size for that
tablespace.
pgo clone cluster
--pvc-size - sets the PVC size for the PostgreSQL data directory for the newly
created cluster
--pgbackrest-pvc-size - sets the PVC size for the PostgreSQL pgBackRest
repository for the newly created cluster
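As a rough illustration of the pgo create cluster flags listed above, the following sketch assembles a command with explicit PVC sizes. The helper name create_cluster_with_pvc_cmd is hypothetical, and the sizes 20Gi and 40Gi assume Kubernetes-style quantity strings, which these notes do not spell out. The command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: assemble (but do not run) a pgo create cluster command with PVC sizes.
# create_cluster_with_pvc_cmd is a hypothetical helper for illustration only.
create_cluster_with_pvc_cmd() {
    cluster="${1:-}"     # name of the new PostgreSQL cluster
    data_size="${2:-}"   # value for --pvc-size (PostgreSQL data directory)
    repo_size="${3:-}"   # value for --pgbackrest-pvc-size (pgBackRest repository)

    if [ -z "$cluster" ] || [ -z "$data_size" ] || [ -z "$repo_size" ]; then
        echo "usage: create_cluster_with_pvc_cmd CLUSTER DATA_SIZE REPO_SIZE" >&2
        return 1
    fi

    printf 'pgo create cluster %s --pvc-size=%s --pgbackrest-pvc-size=%s\n' \
        "$cluster" "$data_size" "$repo_size"
}

# Assumed example sizes; adjust to your storage class.
create_cluster_with_pvc_cmd hippo 20Gi 40Gi
```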
Tablespaces
Tablespaces can be used to spread out PostgreSQL workloads across
multiple volumes, which can be used for a variety of use cases:
- Partitioning larger data sets
- Putting data onto archival systems
- Utilizing hardware (or a storage class) for a particular database
object, e.g. an index
and more.
Tablespaces can be created via the pgo create cluster command using
the --tablespace flag. The arguments to --tablespace can be passed
in using one of several key/value pairs, including:
- name (required) - the name of the tablespace
- storageconfig (required) - the storage configuration to use for the tablespace
- pvcsize - if specified, the size of the PVC. Defaults to the PVC size in the storage configuration
Each value is separated by a :, for example:
    pgo create cluster hacluster --tablespace=name=ts:storageconfig=nfsstorage
All tablespaces are mounted in the /tablespaces directory. The
PostgreSQL Operator manages the mount points and persistent volume
claims (PVCs) for the tablespaces, and ensures they are available
throughout all of the PostgreSQL lifecycle operations, including:
- Provisioning
- Backup & Restore
- High-Availability, Failover, Healing
- Clone
etc.
One additional value is added to the pgcluster CRD:
- TablespaceMounts: a map of the name of the tablespace and its
associated storage.
Tablespaces are automatically created in the PostgreSQL cluster. You can
access them as soon as the cluster is initialized. For example, using
the tablespace created above, you could create a table on the tablespace
ts with the following SQL:
    CREATE TABLE example_table (id int) TABLESPACE ts;
Tablespaces can also be added to existing PostgreSQL clusters by using the
pgo update cluster command. The syntax is similar to that of creating a
PostgreSQL cluster with a tablespace, i.e.:
    pgo update cluster hacluster --tablespace=name=ts2:storageconfig=nfsstorage
As additional volumes need to be mounted to the Deployments, this action can
cause downtime, though the expectation is that the downtime is brief.
Based on usage, future work will look to making this
more flexible. Dropping tablespaces can be tricky, as a tablespace must
contain no objects before PostgreSQL can drop it (i.e. there is
no DROP TABLESPACE .. CASCADE command).
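The colon-separated key/value format accepted by --tablespace can be sketched as a small helper. The function name tablespace_spec is hypothetical, and the 10Gi value assumes a Kubernetes-style quantity string; the keys name, storageconfig, and pvcsize are the ones described in this section. The pgo commands are printed, not executed.

```shell
#!/bin/sh
# Sketch: build the value passed to --tablespace from its key/value pairs.
# tablespace_spec is a hypothetical helper, not part of the pgo client.
tablespace_spec() {
    name="${1:-}"            # required: name of the tablespace
    storageconfig="${2:-}"   # required: storage configuration to use
    pvcsize="${3:-}"         # optional: PVC size for this tablespace

    if [ -z "$name" ] || [ -z "$storageconfig" ]; then
        echo "tablespace_spec: name and storageconfig are required" >&2
        return 1
    fi

    spec="name=${name}:storageconfig=${storageconfig}"
    # Append pvcsize only when given; otherwise the size from the storage
    # configuration applies.
    if [ -n "$pvcsize" ]; then
        spec="${spec}:pvcsize=${pvcsize}"
    fi
    printf '%s\n' "$spec"
}

# Print the commands for creating a cluster with a tablespace and for
# adding a second tablespace to it later.
echo "pgo create cluster hacluster --tablespace=$(tablespace_spec ts nfsstorage)"
echo "pgo update cluster hacluster --tablespace=$(tablespace_spec ts2 nfsstorage 10Gi)"
```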
Enhanced pgo df
pgo df provides information on the disk utilization of a PostgreSQL cluster.
Previously, it did not report accurate numbers. The new pgo df looks
at each PVC that is mounted to each PostgreSQL instance in a cluster, including
the PVCs for tablespaces, and computes the overall utilization. Even better,
the data is returned in a structured format for easy scraping. This
implementation also leverages Golang concurrency to help compute the results
quickly.
Enhanced pgBouncer Integration
The pgBouncer integration was completely rewritten to support TLS-only
operations via the PostgreSQL Operator. While most of the work was internal,
you should now see a much more stable pgBouncer experience. Additionally, a few
new commands were added:
- pgo show pgbouncer shows information about a pgBouncer deployment
- pgo update pgbouncer --rotate-password allows one to rotate the service
  account password for pgBouncer
Rewritten pgo User Management commands
The user management commands were rewritten to support the TLS-only workflow.
These commands now return additional information about a user when actions are
taken. Several new flags have been added too, including the option to view all
output in JSON. Other flags include:
- pgo update user --rotate-password to automatically rotate the password
- pgo update user --disable-login which disables the ability for a PostgreSQL user to log in
- pgo update user --enable-login which enables the ability for a PostgreSQL user to log in
- pgo update user --valid-always which sets a password to always be valid, i.e. it has no
  expiration
- pgo show user does not show system accounts by default now, but can be made
  to show the system accounts by using pgo show user --show-system-accounts
Another major change is that the default password expiration is now unlimited
(i.e. the password never expires), which aligns with typical PostgreSQL
workflows.
Breaking Changes
- pgo create cluster will now set the default database name to be the name of the cluster. For example, pgo create cluster hippo would create the initial database named hippo.
- The Database configuration parameter in pgo.yaml (db_name in the Ansible inventory) is now set to "" by default.
- The --password/-w flag for pgo create cluster now only sets the password for the regular user account that is created, not all of the system accounts (e.g. the postgres superuser).
- A default postgres-ha.yaml file is no longer created by the Operator for every PostgreSQL cluster.
- The pgbackups.crunchydata.com CRD, deprecated since 4.2.0, has now been completely removed, along with any code that interfaced with it.
- Remove --series flag from pgo create cluster command. This affects API calls more than actual usage of the pgo client.
- pgo benchmark, pgo show benchmark, pgo delete benchmark are removed. PostgreSQL benchmarks with pgbench can still be executed using the crunchy-pgbench container.
- pgo ls is removed.
- The API that is used by pgo create cluster now returns its contents in JSON. The output now includes information about the user that is created.
- The API that is used by pgo show backup now returns its contents in JSON. The output view of pgo show backup remains the same.
Features
- Several additions to pgo create cluster around PostgreSQL users and databases, including:
  - --cpu flag that sets the amount of CPU to use for the PostgreSQL instances in the cluster
  - --database / -d flag that sets the name of the initial database created.
  - --memory flag that sets the amount of memory to use for the PostgreSQL instances in the cluster
  - --user / -u flag that sets the PostgreSQL username for the standard database user
  - --password-length sets the length of the password that should be generated, if --password is not set.
  - --show-system-accounts returns the credentials of the system accounts (e.g. the postgres superuser) along with the credentials for the standard database user
- Added the PodAntiAffinityPgBackRest and PodAntiAffinityPgBouncer to the pgo.yaml configuration file to set specific Pod anti-affinity rules for pgBackRest and pgBouncer Pods that are deployed along with PostgreSQL clusters that are managed by the Operator. The default for pgBackRest and pgBouncer is to use the value that is set in PodAntiAffinity.
- pgo create cluster now supports the --pod-anti-affinity-pgbackrest and --pod-anti-affinity-pgbouncer flags to specifically overwrite the pgBackRest repository and pgBouncer Pod anti-affinity rules on a specific PostgreSQL cluster deployment, which overrides any values present in PodAntiAffinityPgBackRest and PodAntiAffinityPgBouncer respectively. The default for pgBackRest and pgBouncer is to use the value for pod anti-affinity that is used for the PostgreSQL instances in the cluster.
- pgo clone now supports the --enable-metrics flag, which will deploy the monitoring sidecar along with the newly cloned PostgreSQL cluster.
- The pgBackRest repository now uses ED25519 SSH key pairs.
- Add the --enable-autofail flag to pgo update to make it clear how the autofailover mechanism can be re-enabled for a PostgreSQL cluster.
Changes
- POSIX shared memory is now used for the PostgreSQL Deployments.
- The number of unsupported pgBackRest flags on the deny list has been reduced.
- The liveness and readiness probes for a PostgreSQL cluster now reference /opt/cpm/bin/health
- wal_level is now defaulted to logical to enable logical replication
- archive_timeout is now a default setting in the crunchy-postgres-ha and crunchy-postgres-ha-gis containers and is set to 60
- ArchiveTimeout, LogStatement, LogMinDurationStatement are removed from pgo.yaml, as these can be customized either via a custom postgresql.conf file or postgres-ha.yaml file
- Quoted identifiers for the database name and user name in bootstrap scripts for the PostgreSQL containers
- The PostgreSQL Operator now logs its timestamps using RFC3339 formatting as implemented by Go
- SSH key pairs are no longer created as part of the Operator installation process. This was a legacy behavior that had not been removed
- The pv/create-pv-nfs.sh script has been modified to create persistent volumes with their own directories on the NFS filesystems. This better mimics production environments. The older version of the script still exists as pv/create-pv-nfs-legacy.sh
- Load pgBackRest S3 credentials into environmental variables as Kubernetes Secrets, to avoid revealing their contents in Kubernetes commands or in logs
- Update how the pgBackRest and pgMonitor parameters are loaded into Deployment templates to no longer use JSON fragments
- Remove using expenv in the add-targeted-namespace.sh script
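For readers who correlate Operator logs with their own tooling, the sketch below produces a UTC timestamp in the general RFC 3339 shape using the standard date utility. The helper name rfc3339_now is hypothetical, and the exact precision and offset style the Operator emits is not stated in these notes, so treat this only as an illustration of the format family.

```shell
#!/bin/sh
# Sketch: emit the current time as an RFC 3339 style UTC timestamp,
# i.e. YYYY-MM-DDTHH:MM:SSZ. rfc3339_now is a hypothetical helper; the
# Operator's own precision and offset formatting may differ.
rfc3339_now() {
    date -u +"%Y-%m-%dT%H:%M:%SZ"
}

rfc3339_now
```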
Fixes
- Ensure PostgreSQL clusters can be successfully restored via pgo restore after pgo scaledown is executed
- Allow the original primary to be removed with pgo scaledown after it is failed over
- Report errors in a SQL policy at the time pgo apply is executed, which was the previous behavior. Reported by José Joye (@jose-joye)
- pgo scaledown now works on a failed primary that has recovered and become a replica
- Ensure all replicas are listed out via the --query flag in pgo scaledown and pgo failover. This now follows the pattern outlined by the Kubernetes safe random string generator
- The pgo-rmdata Job will not fail if a PostgreSQL cluster has not been properly initialized
- Fixed a separate pgo-rmdata crash related to an improper SecurityContext
- Allow the standard PostgreSQL user created with the Operator to be able to create and manage objects within its own user schema. Reported by Nicolas HAHN (@hahnn)
- Honor the value of "PasswordLength" when it is set in the pgo.yaml file for password generation. The default is now set at 24
- Do not log pgBackRest environmental variables to the Kubernetes logs
- By default, exclude using the trusted OS certificate authority store for the Windows pgo client.
- Update the pgo-client imagePullPolicy to be IfNotPresent, which is the default for all of the managed containers across the project
- Set UsePAM yes in the sshd_config file to fix an issue with using SSHD in newer versions of Docker
- Only add Operator labels to a managed namespace if the namespace already exists when executing the add-targeted-namespace.sh script