| 1 | +# PostgreSQL units |
| 2 | + |
| 3 | +Each [HA](https://en.wikipedia.org/wiki/High_availability)/[DR](https://en.wikipedia.org/wiki/IT_disaster_recovery) implementation has a primary site and one or more secondary (standby) sites. |
| 4 | +A Charmed PostgreSQL cluster can be [easily scaled](/t/11863) from 0 to 10 units ([contact us](/t/11863) for clusters larger than 10 units). A cluster of 3 or more units is recommended in production (due to [Raft consensus](https://en.wikipedia.org/wiki/Raft_(algorithm)) requirements). Each unit has one of the following roles: |
| 5 | + * **Primary**: the unit that accepts all writes and guarantees [no split brain](https://en.wikipedia.org/wiki/Split-brain_(computing)). |
| 6 | + * **Sync Standby** (synchronous copy): designed for fast automatic failover. Used for read-only queries and guarantees availability of the latest transaction. |
| 7 | + * **Replica** (asynchronous copy): designed for long-running and resource-consuming queries without affecting Primary performance. Used for read-only queries, with no guarantee that the latest transaction is available. |
| 8 | + |
| 9 | +> **Warning**: every SQL transaction has to be confirmed by all Sync Standby units before the Primary unit confirms the commit to the client. Therefore, the balance between performance and availability is a trade-off determined by the number of "Sync Standby" and "Replica" units in the cluster. |
| 10 | + |
| 11 | +> **Note**: starting from revision 561, all Charmed PostgreSQL units are configured as Sync Standby members. This provides better guarantees of data survival when two of three units are lost simultaneously. Users can reconfigure the number of synchronous units using the Juju config option '[synchronous_node_count](https://charmhub.io/postgresql/configurations?channel=14/edge#synchronous_node_count)'. |
| 12 | + |
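As a minimal sketch of the reconfiguration mentioned in the note above, the helper below only builds and prints the `juju config` command so it can be reviewed before being run. The function name `sync_count_command` is hypothetical, the application name `postgresql` is taken from the examples on this page, and the accepted values (`all`, `majority` or a positive integer) are an assumption to verify against the linked configuration reference.

```shell
# Hypothetical helper: validate a synchronous_node_count value and print
# (not run) the matching juju command.
# Assumption: the option accepts "all", "majority" or a positive integer.
sync_count_command() {
  case "$1" in
    all|majority) ;;
    ''|*[!0-9]*) echo "unsupported value: $1" >&2; return 1 ;;
    *) [ "$1" -gt 0 ] || { echo "unsupported value: $1" >&2; return 1; } ;;
  esac
  printf 'juju config postgresql synchronous_node_count=%s\n' "$1"
}

# Dry run: show the command that would keep one synchronous unit.
sync_count_command 1
```

A lower synchronous units count favours write performance, while a higher count favours availability of the latest transactions, as described in the warning above.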
| 13 | + |
| 14 | + |
| 15 | +## Primary |
| 16 | + |
| 17 | +The simplest way to find the Primary unit is to run `juju status`. Be aware that this information can be outdated, as it is refreshed only on the [Juju event 'update-status'](https://documentation.ubuntu.com/juju/3.6/reference/hook/#update-status): |
| 18 | +```shell |
| 19 | +ubuntu@juju360:~$ juju status postgresql |
| 20 | +Model Controller Cloud/Region Version SLA Timestamp |
| 21 | +postgresql lxd localhost/localhost 3.6.5 unsupported 13:04:15+02:00 |
| 22 | + |
| 23 | +App Version Status Scale Charm Channel Rev Exposed Message |
| 24 | +postgresql 14.15 active 3 postgresql 14/stable 553 no |
| 25 | + |
| 26 | +Unit Workload Agent Machine Public address Ports Message |
| 27 | +postgresql/0* active idle 0 10.189.210.53 5432/tcp Primary <<<<<<<<<<<<<< |
| 28 | +postgresql/1 active idle 1 10.189.210.166 5432/tcp |
| 29 | +postgresql/2 active idle 2 10.189.210.188 5432/tcp |
| 30 | + |
| 31 | +Machine State Address Inst id Base AZ Message |
| 32 | +0 started 10.189.210.53 juju-422c1a-0 ubuntu@22.04 Running |
| 33 | +1 started 10.189.210.166 juju-422c1a-1 ubuntu@22.04 Running |
| 34 | +2 started 10.189.210.188 juju-422c1a-2 ubuntu@22.04 Running |
| 35 | +``` |
| 36 | + |
| 37 | +The up-to-date Primary unit can be retrieved using the Juju action `get-primary`: |
| 38 | +```shell |
| 39 | +> juju run postgresql/leader get-primary |
| 40 | +... |
| 41 | +primary: postgresql/0 |
| 42 | +``` |
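For scripting, the unit name can be extracted from the action output. The sketch below assumes the output contains a `primary: <unit>` line as in the example above; the function name `primary_unit` is hypothetical, and the sample input is copied from this page rather than produced by a live model.

```shell
# Hypothetical helper: read get-primary output on stdin and print the unit
# name from the first "primary: <unit>" line.
primary_unit() {
  awk '$1 == "primary:" { print $2; exit }'
}

# Sample output copied from the example above. In a live model, pipe in
# the output of `juju run postgresql/leader get-primary` instead.
primary_unit <<'EOF'
...
primary: postgresql/0
EOF
```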
| 43 | + |
| 44 | +This information can also be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8). |
| 45 | + |
| 46 | +## Standby / Replica |
| 47 | + |
| 48 | +Currently, the Sync Standby and Replica roles can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for access details). Example: |
| 49 | +```shell |
| 50 | +> ... patronictl ... list |
| 51 | ++ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+ |
| 52 | +| Member | Host | Role | State | TL | Lag in MB | |
| 53 | ++--------------+----------------+--------------+-----------+----+-----------+ |
| 54 | +| postgresql-0 | 10.189.210.53 | Leader | running | 1 | | |
| 55 | +| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 | |
| 56 | +| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 | |
| 57 | ++--------------+----------------+--------------+-----------+----+-----------+ |
| 58 | +``` |
| 59 | +In the example above: |
| 60 | +* `postgresql-0` is the PostgreSQL Primary unit (Patroni Leader), which accepts all writes. |
| 61 | +* `postgresql-1` is a PostgreSQL/Patroni Sync Standby unit, which can be promoted to a new Primary using a manual switchover (safe). |
| 62 | +* `postgresql-2` is a PostgreSQL/Patroni Replica unit, which can NOT be directly promoted to a new Primary using a manual switchover. It must first be automatically promoted from Replica to Sync Standby, which guarantees that the latest SQL transactions are available on this unit, before it can be promoted to a new Primary. Alternatively, a manual failover to the Replica unit can be performed, accepting the risk of losing the last transaction(s) that lagged behind the Primary. |
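The role of each member can also be pulled out of the table programmatically. This is a minimal sketch that assumes the column layout shown in the example above (Member first, Role third); the function name `member_roles` is hypothetical, and the sample table is copied from this page.

```shell
# Hypothetical helper: read a `patronictl list` table on stdin and print
# "<member>: <role>" for every data row. Border lines and the header row
# are skipped.
member_roles() {
  awk -F'|' '
    NF >= 7 && $2 !~ /Member/ {
      member = $2; role = $4
      gsub(/^ +| +$/, "", member)   # trim surrounding spaces
      gsub(/^ +| +$/, "", role)
      print member ": " role
    }'
}

# Sample table copied from the example above.
member_roles <<'EOF'
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 |
+--------------+----------------+--------------+-----------+----+-----------+
EOF
```

Only members reported as `Sync Standby` are safe targets for a manual switchover, so a script could filter on that role before proceeding.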
| 63 | + |
| 64 | +## Replica lag distance |
| 65 | + |
| 66 | +Currently, the replica lag can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for access details). Example: |
| 67 | +```shell |
| 68 | +> ... patronictl ... list |
| 69 | ++ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+ |
| 70 | +| Member | Host | Role | State | TL | Lag in MB | |
| 71 | ++--------------+----------------+--------------+-----------+----+-----------+ |
| 72 | +| postgresql-0 | 10.189.210.53 | Leader | running | 1 | | |
| 73 | +| ... |
| 74 | +| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 | <<<<< |
| 75 | ++--------------+----------------+--------------+-----------+----+-----------+ |
| 76 | + |
| 77 | +> curl ... x.x.x.x:8008/cluster | jq |
| 78 | + "members": [ |
| 79 | + { |
| 80 | + "name": "postgresql-0", |
| 81 | + "role": "leader", |
| 82 | + "state": "running", |
| 83 | + ... |
| 84 | + }, |
| 85 | +... |
| 86 | + { |
| 87 | + "name": "postgresql-2", |
| 88 | + "role": "replica", |
| 89 | + "state": "streaming", |
| 90 | + ... |
| 91 | + "lag": 42 <<<<<<<<<<<< Lag in MB |
| 92 | + } |
| 93 | +``` |
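To flag lagging members automatically, the `Lag in MB` column of the `patronictl` table can be compared against a threshold. The sketch below assumes the column layout shown above (lag in the sixth column); the function name `lagging_members` and the 10 MB threshold are hypothetical, and the sample rows are copied from this page.

```shell
# Hypothetical helper: read a `patronictl list` table on stdin and print
# "<member> <lag>" for members whose "Lag in MB" exceeds the threshold
# given as the first argument. Rows with an empty lag (the Leader) are skipped.
lagging_members() {
  awk -F'|' -v max="$1" '
    NF >= 7 && $2 !~ /Member/ {
      member = $2; lag = $7
      gsub(/ /, "", member)
      gsub(/ /, "", lag)
      if (lag != "" && lag + 0 > max + 0) print member, lag
    }'
}

# Sample rows copied from the example above, checked against 10 MB.
lagging_members 10 <<'EOF'
| Member | Host | Role | State | TL | Lag in MB |
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 |
EOF
```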